iamtgc

Importing data from a file into a PostgreSQL database

August 6th, 2007 by tgc

To follow up on our previous article on importing data into a MySQL database, is how to Import data from a file into a PostgreSQL database.

In this example, we’ll focus on the data contained here, in an IP-to-Country csv file. This data consists of the first IP address in the range (in long format), the last IP address in the range (again, in long format), and the alpha-2, alpha-3 and country name (according to ISO 3166-1) of the country which the IP range is assigned to.

Here is a brief excerpt to get an idea of the information we’ll be importing.
"3739572224","3739574271","AU","AUS","AUSTRALIA" "3739574272","3739680767","JP","JPN","JAPAN" "3739680768","3739697151","KR","KOR","REPUBLIC OF KOREA" "3739697152","3739746303","JP","JPN","JAPAN" "3739746304","3740270591","KR","KOR","REPUBLIC OF KOREA" "3740270592","3740925951","CN","CHN","CHINA" "3740925952","3741024255","TW","TWN","TAIWAN" "3741024256","3741057023","KR","KOR","REPUBLIC OF KOREA" "3741057024","3741319167","VN","VNM","VIET NAM" "3758096384","4294967295","US","USA","UNITED STATES"

Let’s start by creating the table we need to support this data.
ipdb=# create table ip_to_country ( startrange bigint, endrange bigint, country_alpha2 varchar(2), country_alpha3 varchar(3), country varchar(50) ); CREATE TABLE

In the csv above, you’ll notice that the fields are comma separated and also enclosed in double quotations. Postgres’ COPY function allows you to define what the DELIMITER and QUOTE are, among other variables, you can type \h copy at the psql prompt to see all the arguments COPY takes.

Here is how we import our ip-to-country data.
ipdb=# copy ip_to_country from '/tmp/ip-to-country.csv' WITH DELIMITER AS ',' CSV QUOTE AS '"'; COPY 79440

There were 79440 entries copied, a quick line count on the file can help you determine if all the data was imported.

Now let’s see if it imported correctly, we’ll take the last entry in the above excerpt and query for the startrange
ipdb=# select * from ip_to_country where startrange = 3758096384; startrange | endrange | country_alpha2 | country_alpha3 | country ------------+------------+----------------+----------------+--------------- 3758096384 | 4294967295 | US | USA | UNITED STATES (1 row)
Perfect, and as you’ll notice the quotes enclosing the data were stripped during the import as well.

Posted in Postgres

Leave a Comment

Please note: Comment moderation is enabled and may delay your comment. There is no need to resubmit your comment.