
How to read data in Google Colab from my Google drive?

Tags: python, pandas, dataframe, google-colab
by Anton Shumikhin · Jan 22, 2025
TLDR

Mounting your Google Drive in your Colab environment is as simple as running two lines of code:

from google.colab import drive
drive.mount('/content/drive')

Next, access your files with:

file_path = '/content/drive/My Drive/yourfile.ext'
with open(file_path, 'r') as file:
    data = file.read()

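Replace yourfile.ext with your actual filename and extension. And, voila! The data variable now holds the file's contents as a string, ready for analysis in your Colab notebook. If the path is not found, run !ls /content/drive to check the folder name: newer Colab runtimes typically expose it as MyDrive, without the space.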
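
When the path is mistyped or Drive is not mounted, open fails with a bare FileNotFoundError. Below is a minimal sketch of a more forgiving reader. The helper name read_text_file is hypothetical (it is not part of Colab), and the path in the usage comment is the placeholder from above.

```python
from pathlib import Path


def read_text_file(file_path, encoding="utf-8"):
    """Return the file's text, or raise FileNotFoundError listing the folder's files."""
    path = Path(file_path)
    if not path.is_file():
        parent = path.parent
        # Show what is actually in the folder so typos are easy to spot.
        nearby = sorted(p.name for p in parent.iterdir()) if parent.is_dir() else []
        raise FileNotFoundError(
            f"{path} not found. Is Drive mounted? Files in {parent}: {nearby}"
        )
    return path.read_text(encoding=encoding)


# Usage in Colab, after drive.mount('/content/drive'):
# data = read_text_file('/content/drive/My Drive/yourfile.ext')
```

The error message lists the files that do exist next to the requested path, which is usually enough to tell a typo from a missing mount.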

Getting more from Google Drive in Colab

As convenient as the drive-thru at your favourite fast-food chain, pandas can read CSV files straight from your mounted Google Drive with read_csv:

import pandas as pd

# Configuring pandas to have a feast on a CSV file
csv_file_path = '/content/drive/My Drive/data.csv'
df = pd.read_csv(csv_file_path)

When you work with larger data sets or folders with many files, shell commands such as !ls are a quick way to see what a directory contains.
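
Before loading a large CSV into pandas, it can help to look at the header and the first few rows. The sketch below uses only the standard library. The helper name peek_csv is hypothetical, and it assumes a comma-separated, UTF-8 encoded file.

```python
import csv
from itertools import islice


def peek_csv(csv_path, n_rows=5, encoding="utf-8"):
    """Return (header, first n_rows data rows) without reading the whole file."""
    with open(csv_path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file has no rows
        rows = list(islice(reader, n_rows))
    return header, rows


# Usage in Colab, after mounting Drive:
# header, rows = peek_csv('/content/drive/My Drive/data.csv')
# print(header)
```

If the file turns out to be too big to load at once, pd.read_csv also accepts nrows and chunksize to read it in pieces.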

Practical tips and tricks

Using UI for easy integration

The 'Mount Drive' button on Colab's UI offers one-click convenience for mounting your Drive. Think of it as your express elevator to your data floor.

Implementing command line for file management

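Use the %cd magic to change directory, much like walking from room to room in your house. Unlike !cd, which only affects a throwaway subshell, %cd persists for the following cells: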

%cd /content/drive/My Drive/

Once inside the room (or directory), use !ls to switch the light on and see what is there, and !cp to copy files:

!ls  # Turn the light on to see what's around
!cp source_path destination_path  # Copy an item to another spot in the room
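
The same copy can be done from Python, which is handy inside functions or loops. This is a minimal sketch, assuming you want a working copy on the runtime's local disk, since reading repeatedly through the Drive mount can be slow. The helper name copy_to_local and the default folder /content/data are hypothetical.

```python
import shutil
from pathlib import Path


def copy_to_local(source_path, local_dir="/content/data"):
    """Copy one file into local_dir (created if needed) and return the new path."""
    target_dir = Path(local_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also tries to preserve file metadata such as modification time.
    return Path(shutil.copy2(source_path, target_dir))


# Usage in Colab, after mounting Drive:
# local_csv = copy_to_local('/content/drive/My Drive/data.csv')
```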

When to remount

Sometimes the elevator breaks: the mount goes stale or stops responding. In such cases, pass force_remount=True to mount Drive again:

drive.mount('/content/drive', force_remount=True)
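
To decide whether a remount is needed, one option is to check whether the mount point can still be listed. This is only a heuristic sketch: the helper name drive_looks_mounted is hypothetical, and an empty or unreadable folder is treated as "not mounted".

```python
import os


def drive_looks_mounted(mount_point="/content/drive"):
    """Heuristic: the mount point exists and contains at least one entry."""
    try:
        with os.scandir(mount_point) as entries:
            return any(True for _ in entries)
    except OSError:
        # Missing directory, or a stale mount that can no longer be listed.
        return False


# Usage in Colab:
# if not drive_looks_mounted():
#     drive.mount('/content/drive', force_remount=True)
```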

Mastering PyDrive

Meet PyDrive, your personal butler who can fetch specific files using unique IDs:

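from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate the Colab user and hand those credentials to PyDrive
# (assumes pydrive and oauth2client are available in the runtime)
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)  # note: this name shadows google.colab's drive module

# When PyDrive says, 'Voila! Presenting you the contents of your digital vault, Sir/Madam.'
file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
for file1 in file_list:
    print('title: %s, id: %s' % (file1['title'], file1['id']))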
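
Once you know a file's ID, fetching it takes two PyDrive calls: CreateFile with the ID, then GetContentFile. The wrapper below is a small sketch. The name download_by_id is hypothetical, and drive is assumed to be an authenticated GoogleDrive instance like the one above.

```python
def download_by_id(drive, file_id, local_path):
    """Download the Drive file with this ID to local_path and return local_path."""
    # CreateFile with an existing 'id' gives a handle to that file; it does not
    # create a new file on Drive.
    drive_file = drive.CreateFile({'id': file_id})
    drive_file.GetContentFile(local_path)
    return local_path


# Usage, with an ID printed by the listing loop:
# download_by_id(drive, 'your_file_id_here', '/content/data.csv')
```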

Iterating over Drive files

Loop through the files in a mounted folder directly, just like a Roomba works its way through your entire house:

import os

# Assuming '/content/drive/My Drive/data_folder/' is your digital house full of files
for file in os.listdir('/content/drive/My Drive/data_folder/'):
    print(file)  # Presenting each file item with utter respect
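
Often you only want files of one type from the folder. This is a minimal sketch using pathlib. The helper name list_files is hypothetical, and the folder in the usage comment is the example path from above.

```python
from pathlib import Path


def list_files(folder, suffix=".csv"):
    """Return sorted paths of regular files in folder whose extension matches suffix."""
    return sorted(
        p for p in Path(folder).iterdir()
        # Skip subfolders and compare extensions case-insensitively.
        if p.is_file() and p.suffix.lower() == suffix.lower()
    )


# Usage in Colab, after mounting Drive:
# for csv_path in list_files('/content/drive/My Drive/data_folder/'):
#     print(csv_path.name)
```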

Advanced tactics for data handling

Maximizing efficiency

Access Google Drive directly to bypass cumbersome manual uploads. It's like having special VIP access to your data nightclub.

Query optimization with PyDrive

PyDrive's 'q' parameter becomes your laser-guided file retrieval tool, locating the exact files you need:

folder_id = 'your_folder_id_here'
file_list = drive.ListFile({'q': f"'{folder_id}' in parents and trashed=false"}).GetList()
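
Query strings get fiddly once you combine several conditions. The sketch below assembles one from parts. The helper name build_drive_query is hypothetical. The title and mimeType terms assume the Drive API v2 query syntax, which matches the 'title' field PyDrive returns in the listing example.

```python
def build_drive_query(folder_id, title_contains=None, mime_type=None):
    """Build a Drive API v2 search string for files inside one folder."""
    def quote(value):
        # Backslashes and single quotes must be escaped inside query literals.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    clauses = [f"{quote(folder_id)} in parents", "trashed=false"]
    if title_contains:
        clauses.append(f"title contains {quote(title_contains)}")
    if mime_type:
        clauses.append(f"mimeType = {quote(mime_type)}")
    return " and ".join(clauses)


# Usage with an authenticated GoogleDrive instance named drive:
# q = build_drive_query('your_folder_id_here', title_contains='sales', mime_type='text/csv')
# file_list = drive.ListFile({'q': q}).GetList()
```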

With PyDrive, complex file operations become a walk in the park.