How do I upload many files to Google Colab?

cam*_*tor 8 python machine-learning jupyter google-colaboratory

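I am working on an image segmentation machine learning project, and I would like to test it on Google Colab.

For the training dataset, I have about 700 images of 256x256 that need to be loaded into Python numpy arrays for my project. I also have thousands of corresponding mask files. They currently live in various subfolders on Google Drive, but I have not been able to get them into Google Colab for use in my project.

So far I have tried Google Drive FUSE, which seems to upload very slowly, and PyDrive, which gave me a variety of authentication errors. Most of the time I have been using the Google Colab I/O example code.

How should I go about this? Would PyDrive be the way to go? Is there code for uploading a folder structure or many files at once?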

Abd*_*han 9

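You can put all your data on Google Drive and then mount the drive. This is how I did it. Let me explain step by step.

Step 1: Transfer your data to your Google Drive.

Step 2: Run the following code to mount Google Drive.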

# Install a Drive FUSE wrapper.
# https://github.com/astrada/google-drive-ocamlfuse
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse



# Generate auth tokens for Colab
from google.colab import auth
auth.authenticate_user()


# Generate creds for the Drive FUSE library.
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}


# Create a directory and mount Google Drive using that directory.
!mkdir -p Drive
!google-drive-ocamlfuse Drive


!ls Drive/

# Create a file in Drive.
!echo "This newly created file will appear in your Drive file list." > Drive/created.txt

Step 3: Run the following line to check that you can see the data you need in the mounted drive.

!ls Drive
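
With hundreds of images spread over subfolders, a plain `ls` only shows the top level. As a rough sketch, the same check can be done from Python by counting the files in each subfolder. The helper name `count_files_by_folder` is made up for this illustration, and `Drive` is assumed to be the mount point from Step 2.

```python
import os
from collections import Counter


def count_files_by_folder(root):
    """Return a Counter mapping each folder (relative to root) to its file count."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)  # the root itself shows up as "."
        counts[rel] = len(filenames)
    return counts


# "Drive" is the mount point assumed from Step 2; adjust it if yours differs.
if os.path.isdir("Drive"):
    for folder, n in sorted(count_files_by_folder("Drive").items()):
        print(f"{folder}: {n} files")
```

If a folder reports fewer files than you expect, the mount may still be syncing or the files may sit in a different subfolder.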

Step 4:

Now load the data into numpy arrays as shown below. My Excel files contain my train, CV (cross-validation), and test data.

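import pandas as pd

# read_excel returns a DataFrame; call .to_numpy() on it if you need a NumPy array.
train_data = pd.read_excel(r'Drive/train.xlsx')
test = pd.read_excel(r'Drive/test.xlsx')
cv = pd.read_excel(r'Drive/cv.xlsx')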
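
The question is about images and masks rather than Excel files, so here is a minimal sketch of how the same idea might look for that case. It assumes each mask has the same relative path as its image under a separate root folder, that all images share one size (256x256 in the question), and that Pillow and NumPy are installed. The folder names `Drive/images` and `Drive/masks` and both helper names are hypothetical.

```python
from pathlib import Path


def pair_images_and_masks(image_root, mask_root, pattern="*.png"):
    """Match each image to the mask that has the same relative path."""
    image_root, mask_root = Path(image_root), Path(mask_root)
    pairs = []
    for image_path in sorted(image_root.rglob(pattern)):
        mask_path = mask_root / image_path.relative_to(image_root)
        if mask_path.is_file():  # images without a mask are skipped
            pairs.append((image_path, mask_path))
    return pairs


def load_stack(paths):
    """Read equally sized image files into one array with N as the first axis."""
    # Imported here so the path-matching helper above needs only the standard library.
    import numpy as np
    from PIL import Image
    return np.stack([np.asarray(Image.open(p)) for p in paths])


# Hypothetical folder layout; replace it with your own Drive structure.
if Path("Drive/images").is_dir():
    pairs = pair_images_and_masks("Drive/images", "Drive/masks")
    X = load_stack([img for img, _ in pairs])
    y = load_stack([mask for _, mask in pairs])
    print(X.shape, y.shape)
```

Reading many small files through a FUSE mount can be slow, so it may be worth saving the resulting arrays once (for example with `np.save`) and reloading them in later sessions.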

I hope this helps.

Edit

To save data from the Colab notebook environment to your Drive, you can run the following code.

# Install the PyDrive wrapper & import libraries.
# This only needs to be done once in a notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials



# Authenticate and create the PyDrive client.
# This only needs to be done once in a notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)



# Create & upload a file.
# The metadata key is 'title'; its value is the file name shown in Drive.
uploaded = drive.CreateFile({'title': 'data.xlsx'})
uploaded.SetContentFile('data.xlsx')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))
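
To upload several files at once, the single-file upload can be wrapped in a loop. This is only a sketch: `upload_folder` is a hypothetical helper, it expects an already authenticated PyDrive `GoogleDrive` client as its first argument, and it skips subfolders rather than recreating the folder structure.

```python
import os


def upload_folder(drive, local_dir):
    """Upload every regular file directly inside local_dir; return the Drive file IDs."""
    file_ids = []
    for name in sorted(os.listdir(local_dir)):
        path = os.path.join(local_dir, name)
        if not os.path.isfile(path):
            continue  # subfolders are skipped in this sketch
        item = drive.CreateFile({'title': name})
        item.SetContentFile(path)
        item.Upload()
        file_ids.append(item.get('id'))
    return file_ids


# Example call, assuming `drive` is an authenticated GoogleDrive client
# and 'results' is a hypothetical local folder:
# ids = upload_folder(drive, 'results')
```

Each file is sent in its own request, so for hundreds of small files it may be faster to zip the folder first and upload the single archive.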