
Psycopg2: insert multiple rows with one query

python
bulk-insert
performance-optimization
postgresql
by Nikita Barsukov · Mar 3, 2025
TLDR

To insert multiple rows with psycopg2, the straightforward tool is the executemany() method. Store your data as a list of tuples, one tuple per row, then pair executemany() with a parameterized INSERT statement and the data list to run the operation.

Example:

```python
data = [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie')]
query = "INSERT INTO table (column1, column2) VALUES (%s, %s)"
# And BOOM... data inserted ;)
cursor.executemany(query, data)
connection.commit()
```

The need for speed: mogrify()

By comparison, executemany() can feel like a turtle next to the cheetah that is cursor.mogrify(). mogrify() renders each row into a safely escaped, encoded query fragment, which you can then join into one complete SQL command.

Here’s how the magic happens:

```python
data = [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie')]
# Prepare the strings... no, not the guitar ones.
args_str = ','.join(cursor.mogrify("(%s,%s)", x).decode('utf-8') for x in data)
# Just doing my INSERT magic trick...
cursor.execute("INSERT INTO table (column1, column2) VALUES " + args_str)
connection.commit()
```

When dealing with mass data, this mogrify() approach can outpace executemany(), often by around 10x.
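If you want to verify that speedup on your own hardware, here is a minimal timing sketch. It assumes an open connection and cursor and reuses this article's placeholder table; the timed() helper and row count are hypothetical:

```python
import time

def timed(label, fn):
    # Crude single-run timer; good enough for a sanity check.
    start = time.perf_counter()
    fn()
    connection.commit()
    print(f"{label}: {time.perf_counter() - start:.3f}s")

rows = [(i, f"user_{i}") for i in range(10_000)]

timed("executemany", lambda: cursor.executemany(
    "INSERT INTO table (column1, column2) VALUES (%s, %s)", rows))

timed("mogrify", lambda: cursor.execute(
    "INSERT INTO table (column1, column2) VALUES " +
    ','.join(cursor.mogrify("(%s,%s)", r).decode('utf-8') for r in rows)))
```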

Steps ahead: execute_values

execute_values from psycopg2.extras is the VIP service of bulk inserts. It packs the ease of executemany() with the speed of mogrify() and gift-wraps the whole thing in a one-liner.

Example:

```python
from psycopg2.extras import execute_values

data = [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie')]
# Batch insert coming in... hold on tight.
execute_values(cursor, "INSERT INTO table (column1, column2) VALUES %s", data)
connection.commit()
```

Among the INSERT-based approaches, this is typically the fastest way to insert multiple rows, outperforming executemany() by as much as 60x!

The Fast and Furious: copy_from()

And now, the ultimate Ferrari of batch operations - cursor.copy_from(). Built on PostgreSQL's COPY protocol and made for massive bulk, it streams your data directly into the table.

```python
from io import StringIO

# String stream... not the streaming service you're used to.
data = '1\tAlice\n2\tBob\n3\tCharlie\n'
buffer = StringIO(data)
# Acquiring target... target locked. Go for copy_from.
cursor.copy_from(buffer, 'table', columns=('column1', 'column2'))
connection.commit()
```

This is a memory savior for bulk data: copy_from reads from any file-like object, so instead of building one enormous SQL string, you can stream rows into the database as they are read.

Merging flexibility and performance

For solutions requiring multiple data types and complex operations, executemany() with dictionaries and named placeholders is your friend. It combines power with readability, as the sketch below shows.
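A minimal sketch of that pattern, reusing the article's placeholder table: each row is a dict, matched to %(name)s placeholders in the query.

```python
rows = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Charlie"},
]
# Named placeholders pull values out of each dict by key.
query = "INSERT INTO table (column1, column2) VALUES (%(id)s, %(name)s)"
cursor.executemany(query, rows)
connection.commit()
```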

Python 3 and byte strings

In Python 3, cursor.mogrify() returns a bytes object. Decode it (e.g., with .decode('utf-8')) before splicing it into the query string you hand to cursor.execute(), as the mogrify() example above does.
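A quick illustration of why those .decode('utf-8') calls appear, assuming the same open cursor:

```python
fragment = cursor.mogrify("(%s, %s)", (1, 'Alice'))
print(type(fragment))  # <class 'bytes'> on Python 3
sql_fragment = fragment.decode('utf-8')  # "(1, 'Alice')", ready to join into a query
```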

The PostgreSQL record format

execute_values effortlessly converts each Python tuple into a PostgreSQL record: the single %s placeholder in the query expands into a comma-separated list of (value, value) groups for a smooth insertion.
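execute_values also accepts optional template and page_size arguments that control the per-row format and how many rows are batched into each generated statement - a sketch with assumed values:

```python
from psycopg2.extras import execute_values

data = [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie')]
execute_values(
    cursor,
    "INSERT INTO table (column1, column2) VALUES %s",
    data,
    template="(%s, %s)",  # explicit per-row format (optional)
    page_size=1000,       # rows per generated statement (default is 100)
)
connection.commit()
```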

The memory saviors

Handling large data insertions? cursor.copy_from() to the rescue! Because it reads from a file-like object, it sidesteps memory limits, making it the natural choice for recurring large-scale loads.
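For truly huge datasets, skip the in-memory StringIO and stream straight from disk - a sketch assuming a tab-separated file named data.tsv (hypothetical) with the same two columns:

```python
# Stream rows from disk; only a small buffer lives in memory at any moment.
with open('data.tsv', 'r') as f:
    cursor.copy_from(f, 'table', columns=('column1', 'column2'))
connection.commit()
```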

The knowledge extensions

The references linked from this article dig deeper into these functions, their usage, and further examples for a more rounded understanding.