
How to save a dictionary to a file?

python
serialization
data-serialization
pickle
by Alex Kataev · Sep 22, 2024
TLDR

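The quickest way to serialize your dictionary is Python's json.dump(). It's simple, portable, and human-readable, as long as the dictionary holds JSON-compatible data (string keys, and values that are numbers, strings, booleans, None, lists, or other dicts):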

    import json

    # Welcome to the fruit collection!
    my_dict = {'apple': 1, 'banana': 2, 'cherry': 3}

    with open('my_dict.json', 'w') as f:
        json.dump(my_dict, f)  # Careful, fruits inside!

The above snippet swiftly packages my_dict into a JSON file, my_dict.json, that can be easily digested by most programming languages and, most importantly, by human eyes!
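Saving is only half of the trip, so here is a minimal sketch of the way back with json.load(). The temporary directory and the indent=2 argument are choices made for this illustration, not something the snippet above requires:

```python
import json
import os
import tempfile

my_dict = {'apple': 1, 'banana': 2, 'cherry': 3}

# Write to a temporary directory so the example cleans up after itself
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'my_dict.json')

    with open(path, 'w', encoding='utf-8') as f:
        json.dump(my_dict, f, indent=2)  # indent=2 makes the file easier to read

    with open(path, encoding='utf-8') as f:
        loaded_dict = json.load(f)  # Parse the JSON text back into a dict

print(loaded_dict == my_dict)
```

For a dictionary like this one, with string keys and integer values, the loaded copy compares equal to the original.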

Your toolbox for dictionary serialization

Going binary with pickle

When the mission is to save everything Pythonic about your dictionary, call for pickle:

    import pickle

    # We ain't sure if a pickle can fruit, but it surely can save fruits!
    with open('my_dict.pkl', 'wb') as f:
        pickle.dump(my_dict, f)  # Fruits put into the pickle jar!

On retrieval, watch as the preserved Python objects return to life:

    with open('my_dict.pkl', 'rb') as f:
        loaded_dict = pickle.load(f)  # Fruits are fresh again, straight from the pickle jar!

This approach comes with a small warning, though: the jar isn't transparent! A pickle file is binary, so you can't peek at your pickled objects in a text editor; you have to unpickle them first.
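To see what "everything Pythonic" buys you, here is a small sketch comparing pickle and json on a dictionary that uses tuple keys and set values. The grid dictionary is made up for this illustration:

```python
import json
import pickle

# Tuple keys and set values are valid Python, but not valid JSON
grid = {(0, 0): {'apple'}, (0, 1): {'banana', 'cherry'}}

# pickle round-trips the dictionary with its types intact
restored = pickle.loads(pickle.dumps(grid))

# json refuses the same dictionary because of the tuple keys
try:
    json.dumps(grid)
    json_ok = True
except TypeError:
    json_ok = False

print(restored == grid, json_ok)
```

The trade-off is the one described above: the pickled bytes are readable only by Python, while JSON stays portable but supports fewer types.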

The numerical way with NumPy

If your dictionary is a feast of numerical data, count on NumPy:

    import numpy as np

    # allow_pickle: a green flag to pickle with NumPy
    np.save('my_dict.npy', my_dict, allow_pickle=True)

    # np.load returns a 0-d object array; .item() unwraps the dictionary inside it
    loaded_dict = np.load('my_dict.npy', allow_pickle=True).item()

    # From here on it is a regular dict
    apples = loaded_dict['apple']  # Hola! Just single-handedly fetched a fruity item!

Remember to call .item() on the loaded array: np.save wraps the dictionary in a zero-dimensional object array, and .item() unwraps it back into a regular dict.

Race ahead with orjson

Swift processing required for large datasets? orjson steps in:

    import orjson

    # We're minimalists! Let's shrink those fruits to bytes!
    with open('my_dict.json', 'wb') as f:
        f.write(orjson.dumps(my_dict))  # And off they go!

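Keep in mind that orjson is a third-party package (pip install orjson) and that orjson.dumps() returns bytes, not str, which is why the file is opened in binary mode ('wb').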

Space-Saver bz2

When space is at a premium, bz2 compresses your serialized data:

    import bz2
    import pickle

    # With bz2, your giant fruit collection can fit into a lunch box!
    with bz2.open('my_dict.pbz2', 'wb') as f:
        pickle.dump(my_dict, f)  # Please keep the box closed until lunch time!

A little extra time spent compressing in exchange for a significant cut in file size is not a bad trade, right?
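How much space you save depends on your data, so it is worth measuring. The sketch below uses a made-up, highly repetitive dictionary (big_dict is hypothetical) to compare a plain pickle with a bz2-compressed one, and shows how to read the compressed file back:

```python
import bz2
import os
import pickle
import tempfile

# Hypothetical, repetitive data so that compression has something to work with
big_dict = {f'fruit_{i}': 'apple' for i in range(10_000)}

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, 'big_dict.pkl')
    packed_path = os.path.join(tmp_dir, 'big_dict.pbz2')

    with open(plain_path, 'wb') as f:
        pickle.dump(big_dict, f)
    with bz2.open(packed_path, 'wb') as f:
        pickle.dump(big_dict, f)

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

    # Reading mirrors writing: bz2.open in 'rb' mode decompresses transparently
    with bz2.open(packed_path, 'rb') as f:
        restored = pickle.load(f)

print(plain_size, packed_size)
```

Small or already-compact dictionaries may gain little, so treat the numbers from your own data as the deciding factor.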

Things to keep in mind

Watch out for security pitfalls

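Remember that pickle can execute arbitrary code during load, so never unpickle data you don't trust. Beware of jars from unknown sources! Likewise, avoid eval when reading a dictionary that was saved as text: it will run whatever code the file contains. You don't want to invite risk!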
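If you do meet a dictionary stored as Python-literal text (for example, written with str(my_dict)), the standard library's ast.literal_eval is a safer stand-in for eval. A minimal sketch, with the text variable standing in for the file contents:

```python
import ast

# Text as it might appear in a file written with str(my_dict)
text = "{'apple': 1, 'banana': 2, 'cherry': 3}"

# literal_eval only accepts literals: strings, numbers, tuples, lists, dicts, sets, booleans, None
loaded_dict = ast.literal_eval(text)

# Anything that is not a plain literal is rejected instead of executed
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(loaded_dict, rejected)
```

For new files, JSON remains the better choice; this is mainly useful for reading data that already exists in that form.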

Be resource aware

Make it a habit to open and close files properly. Python's with statement gives us the gift of context managers, which close the file for you even if an error occurs. Always use them for foolproof I/O operations:

    with open("path/to/dictionary/file") as f:
        contents = f.read()  # Operation here...
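To make the benefit concrete, here is a small sketch that simulates a failure in the middle of a write and then checks that the file was still closed. The RuntimeError and the file name are invented for the demonstration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'my_dict.txt')

    try:
        with open(path, 'w') as f:
            f.write('apple=1\n')
            raise RuntimeError('simulated failure mid-write')
    except RuntimeError:
        pass  # The with block has already closed the file at this point

    closed_after_error = f.closed

print(closed_after_error)
```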

Be mindful of compatibility issues

Your JSON file is a letter that can reach any part of the programming world. The json module ships with both Python 2.x (2.6 and later) and 3.x, and the format makes inter-language communication easy.

Write your own rules with custom serialization

When standard libraries just won't do, why not invent your own format?

    def save_custom_dict(file_name, dictionary):
        # Duty calls for a new data format!
        pass

Be the judge and architect of your format with custom serialization! Remember, with great power comes great responsibility.
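As one possible way to fill in that stub, here is a sketch of a made-up line-based format: each line holds a key, a tab, and the JSON-encoded value. The format itself and the load_custom_dict helper are assumptions for illustration, and the sketch assumes string keys that contain no tabs or newlines:

```python
import json
import os
import tempfile


def save_custom_dict(file_name, dictionary):
    """Write one 'key<TAB>JSON-encoded value' pair per line (hypothetical format)."""
    with open(file_name, 'w', encoding='utf-8') as f:
        for key, value in dictionary.items():
            f.write(f"{key}\t{json.dumps(value)}\n")


def load_custom_dict(file_name):
    """Read back a file written by save_custom_dict (hypothetical helper)."""
    result = {}
    with open(file_name, encoding='utf-8') as f:
        for line in f:
            # Split on the first tab only; the rest of the line is the value
            key, _, raw_value = line.rstrip('\n').partition('\t')
            result[key] = json.loads(raw_value)
    return result


my_dict = {'apple': 1, 'banana': 2, 'cherry': 3}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'my_dict.tsv')
    save_custom_dict(path, my_dict)
    round_tripped = load_custom_dict(path)

print(round_tripped == my_dict)
```

Encoding the values with json.dumps keeps each pair on a single line and preserves basic types such as numbers and lists. A real custom format would also need a plan for escaping keys and for versioning, which is part of that great responsibility.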

All in your hands

When dealing with .npy files, NPY file viewers are your magnifying glasses to look into the file contents without writing any script. Similarly, tools specific to your file format surely exist – so wield them right!