Using Python to Load Binary Files into Postgres
Many times developers may find it desirable to store binary files within a database. For example, an online store may find it easier to store the pictures of their merchandise in the database along side the item description, quantity, price, etc. Doing this can have it’s advantages and disadvantages. Advantages may include easier maintenance, as there are no image files to backup and no need to worry about filename collisions. However the disadvantages of the overhead of database queries, especially if you are dealing with large image files, will have to be weighed.
The primary obstacle in importing binary data into Postgres is making sure that your data is properly escaped. For example when you look at an image file, it may contain the following binary data
\234\370\260\260\357
This data will need to be escaped as
\\234\\370\\260\\260\\357
in order for Postgres to store it correctly. Fortunately Psycopg2, which we will use in our import script, takes care of the escaping for you. Below is a script that will take an image file (or any binary file) and a user name, stored in the column username and update the column avatar_image (which is of type bytea) with the supplied image file. Make special note of psycopg2.Binary() which wraps the binary representation of the image file, and takes care of the necessary escaping.
#!/usr/bin/env python
import psycopg2
import sys
conn = psycopg2.connect("dbname='myforum' user='johndoe' host='localhost'")
curs = conn.cursor()
f = open(sys.argv[1], 'rb')
binary = f.read()
curs.execute("UPDATE usertable SET avatar_image = %s WHERE username = %s", (psycopg2.Binary(binary), sys.argv[2]))
conn.commit()