<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>iamtgc &#187; Postgres</title>
	<atom:link href="http://iamtgc.com/category/postgres/feed/" rel="self" type="application/rss+xml" />
	<link>http://iamtgc.com</link>
	<description></description>
	<lastBuildDate>Wed, 21 Jul 2010 12:13:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Returning Composite Types in Postgres Stored&#160;Procedures</title>
		<link>http://iamtgc.com/2009/10/01/returning-composite-types-in-postgres-stored-procedures/</link>
		<comments>http://iamtgc.com/2009/10/01/returning-composite-types-in-postgres-stored-procedures/#comments</comments>
		<pubDate>Thu, 01 Oct 2009 14:52:38 +0000</pubDate>
		<dc:creator>tgc</dc:creator>
				<category><![CDATA[Postgres]]></category>

		<guid isPermaLink="false">http://iamtgc.com/?p=123</guid>
		<description><![CDATA[Expanding on the PostgreSQL examples in our previous post, here we will look at taking advantage of some of the features in PostgreSQL 8.3 and modifying our zip_proximity function to avoid using cursors and instead define and return a set of our own composite type. From the postgres documentation &#8220;A composite type describes the structure [...]]]></description>
			<content:encoded><![CDATA[<p>Expanding on the PostgreSQL examples in our <a href="http://iamtgc.com/2009/01/14/implementing-zip-code-proximity-functions-in-mysql-and-postgresql/">previous post</a>, here we will look at taking advantage of some of the features in PostgreSQL 8.3 and modifying our zip_proximity function to avoid using cursors and instead define and return a set of our own <a href="http://www.postgresql.org/docs/8.3/static/sql-createtype.html">composite type</a>.  From the postgres documentation &#8220;A composite type describes the structure of a row or record&#8230;&#8221;.  Since we know that our stored procedure will return a row including: zip code, latitude, longitude, city, state, state abbreviation, and distance, we can create a composite type called, for example, ziprowtype.<br />
<span id="more-123"></span></p>
<p>Here we will create the composite type.<br />
<code>testdb=# create type ziprowtype as (zip varchar, lat float, lon float, city varchar, state varchar, state_abbrev varchar, distance float);
CREATE TYPE</code></p>
<p>Now, to modify the stored procedure, you will need to change the return type from refcursor to SETOF ziprowtype.  The body of the function changes a bit too.<br />
First we load the result of our query into record type &#8220;r&#8221;, then loop and &#8220;RETURN NEXT r&#8221;.<br />
<code>testdb=# CREATE OR REPLACE FUNCTION zip_proximity2(varchar, double precision, varchar) RETURNS SETOF ziprowtype
    AS $_$
   DECLARE
      home_lat float;
      home_lon float;
      r record;
   BEGIN
      SELECT lat, lon INTO home_lat, home_lon FROM zipcodes WHERE zip = $1;
      FOR r IN
      SELECT zip, lat, lon, city, state, state_abbrev, calculate_distance($3, home_lat, home_lon, lat, lon) AS distance
          FROM zipcodes WHERE calculate_distance($3, home_lat, home_lon, lat, lon) &lt; $2 ORDER BY distance
      LOOP
         RETURN NEXT r;
      END LOOP;
   END;
   $_$
    LANGUAGE plpgsql;</code><br />
<strong>NOTE:</strong> To see how we implemented calculate_distance, please read <a href="http://iamtgc.com/2009/01/14/implementing-zip-code-proximity-functions-in-mysql-and-postgresql/">this post</a>.</p>
<p>Now, instead of using cursors and transactions, we can use the following query to return the desired results.<br />
<code>testdb=# select * from zip_proximity2('94043', 3.0, 'mi');
  zip  |   lat    |    lon     |     city      |   state    | state_abbrev |     distance
-------+----------+------------+---------------+------------+--------------+-------------------
 94043 | 37.42337 | -122.07981 | MOUNTAIN VIEW | CALIFORNIA | CA           |                 0
 94039 | 37.41884 | -122.09124 | MOUNTAIN VIEW | CALIFORNIA | CA           | 0.701004112864842
 94035 | 37.41753 | -122.05283 | MOUNTAIN VIEW | CALIFORNIA | CA           |  1.53459076775345
 94042 | 37.39314 | -122.07827 | MOUNTAIN VIEW | CALIFORNIA | CA           |  2.09052870375317
 94041 | 37.38961 | -122.07715 | MOUNTAIN VIEW | CALIFORNIA | CA           |  2.33729808540454
 94306 | 37.41478 | -122.12139 | PALO ALTO     | CALIFORNIA | CA           |  2.35776644118448
 94303 | 37.44424 | -122.11736 | PALO ALTO     | CALIFORNIA | CA           |  2.51480871338068
(7 rows)</code><br />
<strong>NOTE:</strong> If you&#8217;re getting <strong>ERROR:  wrong record type supplied in RETURN NEXT</strong>, then it&#8217;s likely your composite type does not match the columns you&#8217;re querying.</p>
<p>You can also select any subset of the row (composite type) using standard SQL.<br />
<code>testdb=# select zip, distance from zip_proximity2('94043', 3.0, 'mi') limit 5;
  zip  |     distance
-------+-------------------
 94043 |                 0
 94039 | 0.701004115082249
 94035 |  1.53459076784732
 94042 |  2.09052870372395
 94041 |  2.33729808557031
(5 rows)</code></p>
<p>As always, please feel free to leave a comment with any questions or suggestions.</p>
]]></content:encoded>
			<wfw:commentRss>http://iamtgc.com/2009/10/01/returning-composite-types-in-postgres-stored-procedures/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Implementing Zip Code Proximity Functions in MySQL and&#160;PostgreSQL</title>
		<link>http://iamtgc.com/2009/01/14/implementing-zip-code-proximity-functions-in-mysql-and-postgresql/</link>
		<comments>http://iamtgc.com/2009/01/14/implementing-zip-code-proximity-functions-in-mysql-and-postgresql/#comments</comments>
		<pubDate>Wed, 14 Jan 2009 17:26:08 +0000</pubDate>
		<dc:creator>tgc</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Postgres]]></category>

		<guid isPermaLink="false">http://iamtgc.com/?p=95</guid>
		<description><![CDATA[On most retail and social networking websites (along with many others), you&#8217;ll have the capability to search for people, businesses, store locations, etc within a given distance of your location. This can be implemented in a number of ways both mathematically and programmatically. In an attempt to reduce the amount of code I (or others) [...]]]></description>
			<content:encoded><![CDATA[<p>On most retail and social networking websites (along with many others), you&#8217;ll have the capability to search for people, businesses, store locations, etc. within a given distance of your location.  This can be implemented in a number of ways both mathematically and programmatically.  In an attempt to reduce the amount of code I (or others) have to write, be it in PHP, Python, or any number of languages that may interface with our database, I have chosen to implement these zip code proximity and distance functions as stored procedures in the database.<br />
<span id="more-95"></span><br />
To start, you may already have a database with zip code coordinates.  If you are looking for one, Team RedLine offers an excellent <a href="http://teamredline.com/zc">Zip Code Database</a> for $5 US that can be easily imported into the database of your choice.</p>
<p>The table that we will be using was created as follows:<br />
<code>CREATE TABLE zipcodes (
    zip varchar(5),
    lat double precision,
    lon double precision,
    city varchar(30),
    state varchar(30),
    state_abbrev varchar(2));</code></p>
<p>I use <strong>57.2958</strong> as a constant for <strong>180 / &pi;</strong> to convert between degrees and radians.  Beyond this I will focus on the implementation and point you to the <a href="http://en.wikipedia.org/wiki/Haversine_formula">Haversine Formula</a> and the <a href="http://en.wikipedia.org/wiki/Law_of_cosines_(spherical)">Law of Cosines</a> if you wish to read more on the math.</p>
<p>First we will examine how these functions are written for PostgreSQL.  (MySQL functions are further down)</p>
<p>The first function implements the Law of Cosines to determine the distance between two coordinates, and allows the user to specify which unit of measurement (miles or kilometers) to use.<br />
<code>CREATE FUNCTION calculate_distance(varchar, double precision, double precision, double precision, double precision) RETURNS double precision
    AS $_$
   DECLARE
      earth_radius double precision;
   BEGIN
      IF $1 = 'mi' THEN
         earth_radius := 3959.0;
      ELSIF $1 = 'km' THEN
         earth_radius := 6371.0;
      END IF;
      RETURN earth_radius * acos(sin($2 / 57.2958) * sin($4 / 57.2958) + cos($2/ 57.2958) * cos($4 / 57.2958) * cos(($5 / 57.2958) - ($3 / 57.2958)));
   END;
   $_$
    LANGUAGE plpgsql;</code><br />
The calculate_distance function can certainly be used on its own, but in our example it is called exclusively from our zip_proximity function below.</p>
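<p>For readers who want to check the arithmetic outside the database, the same calculation can be sketched in Python.  This is a minimal sketch rather than part of the original functions: using math.radians in place of the 57.2958 constant, clamping the cosine before calling acos, and raising an error for an unknown unit are all assumptions of this example.</p>

```python
import math

# Mean Earth radius per unit, the same constants the SQL function uses.
EARTH_RADIUS = {"mi": 3959.0, "km": 6371.0}


def calculate_distance(unit, lat1, lon1, lat2, lon2):
    """Great-circle distance via the spherical law of cosines."""
    if unit not in EARTH_RADIUS:
        # Assumption: fail loudly; the SQL version returns NULL here instead.
        raise ValueError("unit must be 'mi' or 'km'")
    # math.radians replaces the division by the 57.2958 constant.
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_lon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(delta_lon))
    # Rounding can push the value just past 1.0 for identical points,
    # which would make acos fail, so clamp it to the valid range first.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS[unit] * math.acos(cos_angle)


# Coordinates for 94043 and 94039, taken from the query output in this post.
home = (37.42337, -122.07981)
other = (37.41884, -122.09124)
miles = calculate_distance("mi", home[0], home[1], other[0], other[1])
km = calculate_distance("km", home[0], home[1], other[0], other[1])
print(round(miles, 2), round(km, 2))
```

<p>For these two Mountain View zip codes the result should come out at roughly 0.70 miles, in line with the distance column in the query output further down.</p>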
<p>zip_proximity takes a <a href="http://www.postgresql.org/docs/8.1/static/plpgsql-cursors.html">refcursor</a>, a zipcode, a distance, and a distance metric (&#8216;mi&#8217; or &#8216;km&#8217;).</p>
<p>First we retrieve the coordinates for the &#8220;home&#8221; zipcode, then query the database, performing calculate_distance on each entry and capturing only those that fall within the given distance.  We add a field to the cursor we return: the distance from the home zipcode to each zipcode that fell within our provided distance.</p>
<p><code>CREATE FUNCTION zip_proximity(refcursor, character, double precision, varchar) RETURNS refcursor
    AS $_$
   DECLARE
      home_lat float;
      home_lon float;
   BEGIN
      SELECT lat, lon INTO home_lat, home_lon FROM zipcodes WHERE zip = $2;
      OPEN $1 FOR
         SELECT zip, lat, lon, city, state, state_abbrev, calculate_distance($4, home_lat, home_lon, lat, lon) AS distance FROM zipcodes
             WHERE calculate_distance($4, home_lat, home_lon, lat, lon) &lt; $3 ORDER BY distance;
      RETURN $1;
   END;
   $_$
    LANGUAGE plpgsql;</code></p>
<p>Now we will query all zipcodes within a 3 mile radius of Mountain View, California 94043.<br />
Here is how we call our new function(s).  Since we are using cursors, we need to be in a transaction.<br />
<code>testdb=# BEGIN;
BEGIN
testdb=# SELECT zip_proximity('zc', '94043', 3, 'mi');
 zip_proximity
---------------
 zc
(1 row)

testdb=# FETCH ALL FROM zc;
  zip  |   lat    |    lon     |     city      |   state    | state_abbrev |     distance
-------+----------+------------+---------------+------------+--------------+-------------------
 94043 | 37.42337 | -122.07981 | MOUNTAIN VIEW | CALIFORNIA | CA           |                 0
 94039 | 37.41884 | -122.09124 | MOUNTAIN VIEW | CALIFORNIA | CA           | 0.701004112864842
 94035 | 37.41753 | -122.05283 | MOUNTAIN VIEW | CALIFORNIA | CA           |  1.53459076775345
 94042 | 37.39314 | -122.07827 | MOUNTAIN VIEW | CALIFORNIA | CA           |  2.09052870375317
 94041 | 37.38961 | -122.07715 | MOUNTAIN VIEW | CALIFORNIA | CA           |  2.33729808540454
 94306 | 37.41478 | -122.12139 | PALO ALTO     | CALIFORNIA | CA           |  2.35776644118448
 94303 | 37.44424 | -122.11736 | PALO ALTO     | CALIFORNIA | CA           |  2.51480871338068
(7 rows)

testdb=# END;
COMMIT</code></p>
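<p>The same query shape can be tried without a Postgres server.  The sketch below is an illustration only: it uses Python with an in-memory SQLite database, registers a Python version of calculate_distance as a SQL function, and computes the distance once in a subquery instead of twice.  The trimmed table layout and the far-away 10001 row are assumptions added for the example.</p>

```python
import math
import sqlite3


def calculate_distance(unit, lat1, lon1, lat2, lon2):
    """Spherical law of cosines, mirroring the SQL function of the same name."""
    radius = {"mi": 3959.0, "km": 6371.0}.get(unit)
    if radius is None:
        return None  # surfaces as NULL in SQL, like the original
    p1, p2 = math.radians(lat1), math.radians(lat2)
    c = (math.sin(p1) * math.sin(p2)
         + math.cos(p1) * math.cos(p2) * math.cos(math.radians(lon2 - lon1)))
    return radius * math.acos(max(-1.0, min(1.0, c)))


conn = sqlite3.connect(":memory:")
conn.create_function("calculate_distance", 5, calculate_distance)
conn.execute("CREATE TABLE zipcodes (zip TEXT, lat REAL, lon REAL, city TEXT)")
conn.executemany(
    "INSERT INTO zipcodes VALUES (?, ?, ?, ?)",
    [
        # Rows copied from the query output above.
        ("94043", 37.42337, -122.07981, "MOUNTAIN VIEW"),
        ("94039", 37.41884, -122.09124, "MOUNTAIN VIEW"),
        ("94035", 37.41753, -122.05283, "MOUNTAIN VIEW"),
        ("94042", 37.39314, -122.07827, "MOUNTAIN VIEW"),
        ("94041", 37.38961, -122.07715, "MOUNTAIN VIEW"),
        ("94306", 37.41478, -122.12139, "PALO ALTO"),
        ("94303", 37.44424, -122.11736, "PALO ALTO"),
        # Hypothetical far-away row (approximate coordinates) to show filtering.
        ("10001", 40.75, -73.99, "NEW YORK"),
    ],
)


def zip_proximity(zipcode, radius, unit):
    """Return (zip, distance) rows within radius of zipcode, nearest first."""
    home = conn.execute(
        "SELECT lat, lon FROM zipcodes WHERE zip = ?", (zipcode,)
    ).fetchone()
    if home is None:
        return []
    # ":radius > distance" is the same test as "distance less than radius".
    sql = """
        SELECT zip, distance FROM (
            SELECT zip,
                   calculate_distance(:unit, :lat, :lon, lat, lon) AS distance
            FROM zipcodes
        )
        WHERE :radius > distance
        ORDER BY distance
    """
    params = {"unit": unit, "lat": home[0], "lon": home[1], "radius": radius}
    return conn.execute(sql, params).fetchall()


for zip_code, distance in zip_proximity("94043", 3.0, "mi"):
    print(zip_code, round(distance, 2))
```

<p>With a 3 mile radius this should list the same seven zip codes in the same order as the FETCH above, and leave out the 10001 row.</p>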
<p>Now let&#8217;s take a look at the MySQL functions.  The calculate_distance function is essentially identical to the PostgreSQL function above.<br />
<code>DELIMITER //
CREATE FUNCTION calculate_distance(measurement varchar(2), base_lat double precision, base_lon double precision, lat double precision, lon double precision) RETURNS double precision
   BEGIN
      DECLARE earth_radius double precision;
      IF measurement = 'km' THEN
         SET earth_radius = 6371.0;
      ELSEIF measurement = 'mi' THEN
         SET earth_radius = 3959.0;
      END IF;
      RETURN earth_radius * ACOS(SIN(base_lat / 57.2958) * SIN(lat / 57.2958) + COS(base_lat / 57.2958) * COS(lat / 57.2958) * COS((lon / 57.2958) - (base_lon / 57.2958)));
   END //
DELIMITER ;</code></p>
<p>The MySQL version of zip_proximity differs slightly from its Postgres counterpart.  Since MySQL does not currently support functions that return a cursor, we will create a procedure which will execute the same query that we would have returned in the cursor.<br />
<code>DELIMITER //
CREATE PROCEDURE zip_proximity(zipcode varchar(5), radius double precision, measurement varchar(2))
   BEGIN
   DECLARE base_lat double precision;
   DECLARE base_lon double precision;
   SELECT lat, lon INTO base_lat, base_lon FROM zipcodes WHERE zip = zipcode;
   SELECT zip, lat, lon, city, state, state_abbrev, calculate_distance(measurement, base_lat, base_lon, lat, lon) AS distance FROM zipcodes
      WHERE calculate_distance(measurement, base_lat, base_lon, lat, lon) &lt; radius ORDER BY distance;
   END //
DELIMITER ;</code></p>
<p>Here is how we would call the procedure.  Again, we are querying for all zip codes within a three mile radius of 94043.<br />
<code>mysql&gt; call zip_proximity('94043', 3, 'mi');
+-------+----------+------------+---------------+------------+--------------+-------------------+
| zip   | lat      | lon        | city          | state      | state_abbrev | distance          |
+-------+----------+------------+---------------+------------+--------------+-------------------+
| 94043 | 37.42337 | -122.07981 | MOUNTAIN VIEW | CALIFORNIA | CA           |                 0 |
| 94039 | 37.41884 | -122.09124 | MOUNTAIN VIEW | CALIFORNIA | CA           | 0.701004115082249 |
| 94035 | 37.41753 | -122.05283 | MOUNTAIN VIEW | CALIFORNIA | CA           |  1.53459076784732 |
| 94042 | 37.39314 | -122.07827 | MOUNTAIN VIEW | CALIFORNIA | CA           |  2.09052870372395 |
| 94041 | 37.38961 | -122.07715 | MOUNTAIN VIEW | CALIFORNIA | CA           |  2.33729808557031 |
| 94306 | 37.41478 | -122.12139 | PALO ALTO     | CALIFORNIA | CA           |  2.35776644108718 |
| 94303 | 37.44424 | -122.11736 | PALO ALTO     | CALIFORNIA | CA           |  2.51480871282271 |
+-------+----------+------------+---------------+------------+--------------+-------------------+
7 rows in set (0.75 sec)

Query OK, 0 rows affected (0.75 sec)</code></p>
<p>Feel free to leave a comment with any questions or suggestions.</p>
]]></content:encoded>
			<wfw:commentRss>http://iamtgc.com/2009/01/14/implementing-zip-code-proximity-functions-in-mysql-and-postgresql/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Exploring Stored Procedures in MySQL and&#160;PostgreSQL</title>
		<link>http://iamtgc.com/2008/12/19/exploring-stored-procedures-in-mysql-and-postgresql/</link>
		<comments>http://iamtgc.com/2008/12/19/exploring-stored-procedures-in-mysql-and-postgresql/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 14:38:57 +0000</pubDate>
		<dc:creator>tgc</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Postgres]]></category>

		<guid isPermaLink="false">http://iamtgc.com/?p=69</guid>
		<description><![CDATA[The PostgreSQL PL/pgSQL procedural language is well documented here and the MySQL Reference Manual is available here. The MySQL documentation is, in my opinion, lacking, however the MySQL Stored Procedure Forum helps a great deal in making up for the lack of documentation. Now, let&#8217;s take for example this simplified user table. In reality it [...]]]></description>
			<content:encoded><![CDATA[<p>The PostgreSQL PL/pgSQL procedural language is well documented <a href="http://www.postgresql.org/docs/8.3/static/plpgsql.html">here</a> and the MySQL Reference Manual is available <a href="http://dev.mysql.com/doc/refman/5.1/en/stored-routines.html">here</a>.  The MySQL documentation is, in my opinion, lacking, however the <a href="http://forums.mysql.com/list.php?98">MySQL Stored Procedure Forum</a> helps a great deal in making up for the lack of  documentation.<br />
<span id="more-69"></span><br />
Now, let&#8217;s take for example this simplified user table.  In reality it would contain additional information, a hashed password, real name, who knows, but for the sake of our example, we&#8217;ll keep it minimal.  Once the table is created, we will want to create a stored procedure that automatically expires accounts that have not logged on in more than a month.<br />
<code>CREATE TABLE users (
   username varchar(12),
   last_login timestamp,
   expired boolean
);</code></p>
<p>Now, we could create a stored procedure that expires all accounts that have not logged on in a month, but to make it more versatile, and to demonstrate more of the capabilities, we will allow the user to define the number of days since last logged on before the account expires.</p>
<p>First, let&#8217;s see how we could write this for PostgreSQL.  We will be using the PL/pgSQL procedural language.<br />
<code>CREATE OR REPLACE FUNCTION mark_expired(days INT) RETURNS VOID AS
$$
DECLARE 
   delta BIGINT; 
BEGIN
   delta := $1*86400;
   UPDATE users SET expired=true WHERE EXTRACT(epoch from age(CURRENT_TIMESTAMP, last_login)) &gt; delta;
END;
$$
LANGUAGE plpgsql;</code></p>
<p>Let&#8217;s look at the user table before running the stored procedure<br />
<code>username  |     last_login      | expired
----------+---------------------+---------
 homer    | 2008-12-18 00:00:00 | f
 marge    | 2008-11-18 00:00:00 | f
 bart     | 2008-12-14 00:00:00 | f
 lisa     | 2008-12-17 00:00:00 | f
 maggie   | 2008-12-19 00:00:00 | f</code></p>
<p>Here is how you would execute the stored procedure, with our user defined 30 day argument.<br />
<code>testdb=# select mark_expired(30);
testdb=# select * from users;
 username |     last_login      | expired
----------+---------------------+---------
 homer    | 2008-12-18 00:00:00 | f
 bart     | 2008-12-14 00:00:00 | f
 lisa     | 2008-12-17 00:00:00 | f
 maggie   | 2008-12-19 00:00:00 | f
 marge    | 2008-11-18 00:00:00 | t</code></p>
<p>Success! Marge&#8217;s account has now been marked expired, since she has not logged on in the last 30 days.</p>
<p>Your first inclination may be to extract the day from age, but this does not work as one may think.  Take for example the following&#8230;<br />
<code>testdb=# select age(now(), CURRENT_DATE-31);
             age
-----------------------------
 1 mon 1 day 13:02:30.065623
(1 row)</code></p>
<p>Now if we extract day, this is what we get&#8230;<br />
<code>testdb=# select extract(day from age(now(), CURRENT_DATE-31));
 date_part
-----------
         1
(1 row)</code></p>
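<p>This is not in fact what we were looking for.  If we were to use this method to expire accounts, this account would appear to have been logged in as recently as one day ago, when in fact 31 days have elapsed.  This is why we chose to use epoch, which converts the whole interval to seconds.  Note that when the interval comes from age(), PostgreSQL counts a month as 30 days (and a full year as 365.25 days) in that conversion, so the result is a close approximation of the elapsed time rather than an exact count.</p>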
<p>Now, let&#8217;s review how you might write this for MySQL.<br />
<code>DELIMITER //
CREATE PROCEDURE mark_expired (days INT) 
BEGIN 
   UPDATE users SET expired = true 
              WHERE last_login &lt; SUBDATE(CURRENT_TIMESTAMP, INTERVAL days DAY); 
END //
DELIMITER ;</code><br />
In this example we take a slightly different approach: we subtract the days argument from the current timestamp and check whether the last login timestamp is older than the result.  This leverages MySQL&#8217;s SUBDATE function and avoids having to convert days into seconds.</p>
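<p>The two tests (elapsed seconds versus a cutoff timestamp) can be compared side by side in a short Python sketch.  It is an illustration, not part of either stored procedure: the function names are hypothetical, and the value used for the current time is an assumption, since it is not stated here.</p>

```python
from datetime import datetime, timedelta

# Sample rows from the users table shown in this post.
users = {
    "homer": datetime(2008, 12, 18),
    "marge": datetime(2008, 11, 18),
    "bart": datetime(2008, 12, 14),
    "lisa": datetime(2008, 12, 17),
    "maggie": datetime(2008, 12, 19),
}

# Assumption: "now" is midday on 2008-12-19.
now = datetime(2008, 12, 19, 12, 0, 0)


def expired_by_seconds(days):
    """Elapsed-seconds test, like the PostgreSQL function."""
    limit = days * 86400
    return sorted(name for name, last_login in users.items()
                  if (now - last_login).total_seconds() > limit)


def expired_by_cutoff(days):
    """Cutoff-timestamp test, like the MySQL procedure."""
    cutoff = now - timedelta(days=days)
    return sorted(name for name, last_login in users.items()
                  if cutoff > last_login)


print(expired_by_seconds(30), expired_by_cutoff(30))
```

<p>Both functions should pick out only marge for a 30 day limit.  Python subtracts the two timestamps exactly, so the two tests agree for any number of days.</p>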
<p>Let&#8217;s take a look at our table again, before calling the procedure<br />
<code>mysql&gt; select * from users;
+----------+---------------------+---------+
| username | last_login          | expired |
+----------+---------------------+---------+
| homer    | 2008-12-18 00:00:00 |       0 |
| marge    | 2008-11-18 00:00:00 |       0 |
| bart     | 2008-12-14 00:00:00 |       0 |
| lisa     | 2008-12-17 00:00:00 |       0 |
| maggie   | 2008-12-19 00:00:00 |       0 |
+----------+---------------------+---------+</code></p>
<p>And how you would call the procedure in MySQL.<br />
<code>mysql&gt; CALL mark_expired(30);
mysql&gt; select * from users;
+----------+---------------------+---------+
| username | last_login          | expired |
+----------+---------------------+---------+
| homer    | 2008-12-18 00:00:00 |       0 |
| marge    | 2008-11-18 00:00:00 |       1 |
| bart     | 2008-12-14 00:00:00 |       0 |
| lisa     | 2008-12-17 00:00:00 |       0 |
| maggie   | 2008-12-19 00:00:00 |       0 |
+----------+---------------------+---------+</code></p>
<p>Again, success, Marge&#8217;s account has been expired.</p>
]]></content:encoded>
			<wfw:commentRss>http://iamtgc.com/2008/12/19/exploring-stored-procedures-in-mysql-and-postgresql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Importing XML into a database with Python and&#160;SQLAlchemy</title>
		<link>http://iamtgc.com/2008/01/29/importing-xml-into-a-database-with-python-and-sqlalchemy/</link>
		<comments>http://iamtgc.com/2008/01/29/importing-xml-into-a-database-with-python-and-sqlalchemy/#comments</comments>
		<pubDate>Tue, 29 Jan 2008 21:19:47 +0000</pubDate>
		<dc:creator>tgc</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Postgres]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://iamtgc.com/2008/01/29/importimg-xml-into-a-database-with-python-and-sqlalchemy/</guid>
		<description><![CDATA[Let&#8217;s begin by analyzing the XML we want to import into our database, it consists of a book&#8217;s ISBN, title, and author. &#60;!-- books.xml --&#62; &#60;catalog&#62; &#60;book isbn="1-880985-26-8"&#62; &#60;title&#62;The Consumer&#60;/title&#62; &#60;author&#62;M. Gira&#60;/author&#62; &#60;/book&#62; &#60;book isbn="0-679775-43-9"&#62; &#60;title&#62;The Wind-Up Bird Chronicle&#60;/title&#62; &#60;author&#62;Haruki Murakami&#60;/author&#62; &#60;/book&#62; &#60;!-- imagine more entries here... --&#62; &#60;/catalog&#62; Now we can create the database [...]]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s begin by analyzing the XML we want to import into our database.  It consists of a book&#8217;s ISBN, title, and author.<br />
<code>&lt;!-- books.xml --&gt;
&lt;catalog&gt;
  &lt;book isbn="1-880985-26-8"&gt;
    &lt;title&gt;The Consumer&lt;/title&gt;
    &lt;author&gt;M. Gira&lt;/author&gt;
  &lt;/book&gt;
  &lt;book isbn="0-679775-43-9"&gt;
    &lt;title&gt;The Wind-Up Bird Chronicle&lt;/title&gt;
    &lt;author&gt;Haruki Murakami&lt;/author&gt;
  &lt;/book&gt;
  &lt;!-- imagine more entries here... --&gt;
&lt;/catalog&gt;</code></p>
<p><span id="more-13"></span><br />
Now we can create the database table.<br />
<code>create table books
(
isbn varchar(14) primary key not null,
title varchar(50),
author varchar(50)
);</code></p>
<p>Our example depends on <a href="http://sqlalchemy.org">SQLAlchemy</a>, which is a &#8220;Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.&#8221;</p>
<p>While our example leverages a Postgres database, given SQLAlchemy&#8217;s database agnostic approach, it can trivially be modified to access a MySQL database.</p>
<p>This code is derived from a SAX parser originally found in <a href="http://www.amazon.com/gp/redirect.html?ie=UTF8&#038;location=http%3A%2F%2Fwww.amazon.com%2FProgramming-Python-Mark-Lutz%2Fdp%2F0596009259%3Fie%3DUTF8%26s%3Dbooks%26qid%3D1189212366%26sr%3D8-1&#038;tag=iamtgc-20&#038;linkCode=ur2&#038;camp=1789&#038;creative=9325">Programming Python 3rd Edition</a><img src="http://www.assoc-amazon.com/e/ir?t=iamtgc-20&amp;l=ur2&amp;o=1" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />.</p>
<p><code># bookhandler.py
from sqlalchemy import *
from sqlalchemy.orm import *

import xml.sax.handler

pg_db = create_engine('postgres:///testdb?user=homer')

metadata = MetaData(pg_db)

books_table = Table('books', metadata, autoload=True)

class Book(object):
    pass

mapper(Book, books_table)

class BookHandler(xml.sax.handler.ContentHandler):
    def __init__(self):
        self.buffer = ""
        self.inField = 0
        self.session = create_session(bind=pg_db)

    def startElement(self, name, attributes):
        if name == "book":
            self.isbn = attributes["isbn"]
        elif name == "title":
            self.inField = 1
        elif name == "author":
            self.inField = 1

    def characters(self, data):
        if self.inField:
            self.buffer += data

    def endElement(self, name):
        if name == "book":
            self.session.begin()
            self.newbook = Book()
            self.newbook.isbn = self.isbn
            self.newbook.title = self.title
            self.newbook.author = self.author
            self.session.save(self.newbook)
            self.session.commit()
        elif name == "title":
            self.inField = 0
            self.title = self.buffer
        elif name == "author":
            self.inField = 0
            self.author = self.buffer
        self.buffer = ""</code><br />
Now that we&#8217;ve set up our SAX handler to parse and load the entries from books.xml into the table, let&#8217;s set up a small script to drive it:<br />
<code># runit.py
import bookhandler
import xml.sax

parser = xml.sax.make_parser()
handler = bookhandler.BookHandler()
parser.setContentHandler(handler)
parser.parse("books.xml")</code></p>
<p>Now let&#8217;s see if it works:<br />
<code>$ ls
bookhandler.py  books.xml       runit.py
$ python ./runit.py
$ psql testdb

testdb=# select * from books;
     isbn      |           title            |     author
---------------+----------------------------+-----------------
 1-880985-26-8 | The Consumer               | M. Gira
 0-679775-43-9 | The Wind-Up Bird Chronicle | Haruki Murakami
(2 rows)</code></p>
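<p>The handler logic can also be tried without SQLAlchemy or a Postgres server.  The following is a minimal sketch under stated assumptions: it swaps in Python&#8217;s built-in sqlite3 module for the database, builds the sample catalog in memory with ElementTree instead of reading books.xml, and stores each book as a plain dictionary.  None of these choices come from the original code.</p>

```python
import sqlite3
import xml.sax
import xml.sax.handler
import xml.etree.ElementTree as ET


class BookHandler(xml.sax.handler.ContentHandler):
    """Inserts one (isbn, title, author) row per book element."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        self.buffer = ""
        self.in_field = False
        self.row = {}

    def startElement(self, name, attributes):
        if name == "book":
            self.row = {"isbn": attributes["isbn"]}
        elif name in ("title", "author"):
            self.in_field = True
            self.buffer = ""

    def characters(self, data):
        # SAX may deliver one text node in several chunks, so accumulate.
        if self.in_field:
            self.buffer += data

    def endElement(self, name):
        if name in ("title", "author"):
            self.row[name] = self.buffer
            self.in_field = False
        elif name == "book":
            self.conn.execute(
                "INSERT INTO books VALUES (:isbn, :title, :author)", self.row)


# Build the sample catalog with ElementTree so the sketch needs no file.
catalog = ET.Element("catalog")
for isbn, title, author in [
    ("1-880985-26-8", "The Consumer", "M. Gira"),
    ("0-679775-43-9", "The Wind-Up Bird Chronicle", "Haruki Murakami"),
]:
    book = ET.SubElement(catalog, "book", isbn=isbn)
    ET.SubElement(book, "title").text = title
    ET.SubElement(book, "author").text = author

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT, author TEXT)")
xml.sax.parseString(ET.tostring(catalog), BookHandler(conn))
conn.commit()
rows = conn.execute(
    "SELECT isbn, title, author FROM books ORDER BY rowid").fetchall()
print(rows)
```

<p>Resetting the buffer when a field starts, rather than after every closing tag, keeps whitespace between elements out of the stored values.</p>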
]]></content:encoded>
			<wfw:commentRss>http://iamtgc.com/2008/01/29/importing-xml-into-a-database-with-python-and-sqlalchemy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Importing data from a file into a PostgreSQL&#160;database</title>
		<link>http://iamtgc.com/2007/08/06/importing-data-from-a-file-into-a-postgresql-database/</link>
		<comments>http://iamtgc.com/2007/08/06/importing-data-from-a-file-into-a-postgresql-database/#comments</comments>
		<pubDate>Mon, 06 Aug 2007 14:21:13 +0000</pubDate>
		<dc:creator>tgc</dc:creator>
				<category><![CDATA[Postgres]]></category>

		<guid isPermaLink="false">http://iamtgc.com/2007/08/06/importing-data-from-a-file-into-a-postgresql-database/</guid>
		<description><![CDATA[To follow up on our previous article on importing data into a MySQL database, is how to Import data from a file into a PostgreSQL database. In this example, we&#8217;ll focus on the data contained here, in an IP-to-Country csv file. This data consists of the first IP address in the range (in long format), [...]]]></description>
			<content:encoded><![CDATA[<p>As a follow-up to <a href="http://iamtgc.com/2007/05/26/importing-data-from-a-file-into-a-mysql-database/">our previous article</a> on importing data into a MySQL database, this article covers how to import data from a file into a PostgreSQL database.  </p>
<p>In this example, we&#8217;ll focus on the data contained <a href="http://ip-to-country.webhosting.info/node/view/6">here</a>, in an IP-to-Country csv file.  This data consists of the first IP address in the range (in long format), the last IP address in the range (again, in long format), and the alpha-2, alpha-3 and country name (according to ISO 3166-1) of the country which the IP range is assigned to.<br />
<span id="more-10"></span><br />
Here is a brief excerpt to get an idea of the information we&#8217;ll be importing.<br />
<code>"3739572224","3739574271","AU","AUS","AUSTRALIA"
"3739574272","3739680767","JP","JPN","JAPAN"
"3739680768","3739697151","KR","KOR","REPUBLIC OF KOREA"
"3739697152","3739746303","JP","JPN","JAPAN"
"3739746304","3740270591","KR","KOR","REPUBLIC OF KOREA"
"3740270592","3740925951","CN","CHN","CHINA"
"3740925952","3741024255","TW","TWN","TAIWAN"
"3741024256","3741057023","KR","KOR","REPUBLIC OF KOREA"
"3741057024","3741319167","VN","VNM","VIET NAM"
"3758096384","4294967295","US","USA","UNITED STATES"</code></p>
<p>Let&#8217;s start by creating the table we need to support this data.<br />
<code>ipdb=# create table ip_to_country (
startrange bigint,
endrange bigint,
country_alpha2 varchar(2),
country_alpha3 varchar(3),
country varchar(50)
);
CREATE TABLE</code></p>
<p>In the csv above, you&#8217;ll notice that the fields are comma separated and also enclosed in double quotes.  Postgres&#8217; COPY command allows you to define what the DELIMITER and QUOTE are, among other options.  You can type <i>\h copy</i> at the psql prompt to see all the arguments COPY takes.</p>
<p>Here is how we import our ip-to-country data.<br />
<code>ipdb=# copy ip_to_country from '/tmp/ip-to-country.csv' WITH DELIMITER AS ',' CSV QUOTE AS '"';
COPY 79440</code></p>
<p>There were 79440 entries copied.  A quick line count on the file can help you determine if all the data was imported.</p>
<p>Now let&#8217;s see if it imported correctly.  We&#8217;ll take the last entry in the above excerpt and query for its startrange.<br />
<code>ipdb=# select * from ip_to_country where startrange = 3758096384;
 startrange |  endrange  | country_alpha2 | country_alpha3 |    country
------------+------------+----------------+----------------+---------------
 3758096384 | 4294967295 | US             | USA            | UNITED STATES
(1 row)</code><br />
Perfect, and as you&#8217;ll notice the quotes enclosing the data were stripped during the import as well.</p>
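<p>To get a feel for the data before loading it, the file format can be read with a few lines of Python.  This is a sketch for illustration only: the three sample lines come from the excerpt above, while the helper names to_dotted and country_for are hypothetical and not part of the import itself.</p>

```python
import csv
import io
import ipaddress

# Three lines from the excerpt above; the real file has many more.
SAMPLE = '''"3739572224","3739574271","AU","AUS","AUSTRALIA"
"3739574272","3739680767","JP","JPN","JAPAN"
"3758096384","4294967295","US","USA","UNITED STATES"
'''

# csv.reader strips the enclosing double quotes while parsing.
rows = [
    (int(start), int(end), alpha2, alpha3, country)
    for start, end, alpha2, alpha3, country
    in csv.reader(io.StringIO(SAMPLE), delimiter=",", quotechar='"')
]


def to_dotted(value):
    """Convert a long-format IPv4 address to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


def country_for(ip):
    """Return the country whose range contains ip, or None."""
    target = int(ipaddress.IPv4Address(ip))
    for start, end, _alpha2, _alpha3, country in rows:
        if target in range(start, end + 1):
            return country
    return None


print(to_dotted(3758096384), country_for("222.230.0.1"))
```

<p>The long format is simply the 32-bit integer value of the address, so 3758096384 corresponds to 224.0.0.0 and 4294967295 to 255.255.255.255.</p>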
]]></content:encoded>
			<wfw:commentRss>http://iamtgc.com/2007/08/06/importing-data-from-a-file-into-a-postgresql-database/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
