Thursday, April 15, 2010

dbapiext.py - making dealing with SQL easier in Python

Sorry there was no post last week - I’ve been a little busy organising PyCon Australia.

This week I’d like to share dbapiext.py by Martin Blais with you. It provides really nice interface for making dealing with SQL easier in Python.

Introduction

In the Python world we have a standard specification of how database interfaces should look and we call it PEP 294, or more commonly DB-API 2.0. One of the things about DB-API 2.0 is that it’s a little loose in some areas - the stated goal of the PEP is “to encourage similarity between the Python modules that are used to access databases” (my emphasis) rather than to dictate a precise API.

In practical terms what this means is that you can’t just swap DB-API 2.0 modules like DCOracle2 for cx_Oracle, even for simple operations and even though they both run on top of Oracle. The core reason for this is that they use different parameter specifications. For example, a query in DCOracle2 might look like:

cursor.execute('SELECT * FROM customers WHERE name=:1', 'Richard Jones')

the same query in cx_Oracle might look like:

cursor.execute('SELECT * FROM customers WHERE name=:name', name='Richard Jones')

Of course one solution is to not use parameters and format the arguments directly into the SQL. But that’s just inviting disaster.

The dbapiext module attempts to normalise these interfaces and add some new convenience features. At its simplest it allows one to translate the above two statemens into a single form:

execute_f(cursor, 'SELECT * FROM customers WHERE name=%S', 'Richard Jones')

Where the “%S” indicates that the value being passed should be escaped and quoted as appropriate. If you don’t wish for the value to be treated as such you may use “%s” which formats the value directly into the SQL. This will work on top of either backend, given the following definition of execute_f:

from dbapiext import execute_f
import functools
if using_cx_Oracle:
    execute_f = functools.partial(execute_f, paramstyle='named')
else:
    execute_f = functools.partial(execute_f, paramstyle='numeric')

Other Capabilities

The query parsing and argument handling is quite flexible. You can mix positional arguments and keyword arguments - and refer to the keyword arguments by name in the SQL, especially if the underlying database connection implementation doesn’t offer the facility:

execute_f(cursor, '''SELECT * FROM account WHERE
     account.active = %S AND (
     account.number = %(number)S OR
     account.number = (
         SELECT account FROM mobile_account WHERE
         mobile_account.msisdn = %(number)S
     ))''', activated, number=number)

In this situation the “%S” SQL parameter will use the fixed argument “activated” and the “%(number)S” SQL parameter will use the keyword argument “number”. Pretty cool.

dbapiext doesn’t stop there - it introduces a bunch of other really neat extensions. How many times has your code included something like this?

# given some variable list of data to update
data = dict(name='Richard', size='medium', alignment='neutral')

# figure the SQL and values argument
columns = ', '.join('%s=:1'%k for k in data)
sql = 'UPDATE person set %s'%columns
values = [data[k] for k in data]

# update the information in a table
cursor.execute(sql, *values)

With dbapiext you can do the much more pythonic:

# given some variable list of data to update
data = dict(name='Richard', size='medium', alignment='neutral')
execute_f(cursor, 'UPDATE person %S', data)

The “%S” argument here renders the dictionary in the form suitable for the UPDATE statement. If your SQL was a SELECT instead you may use “%A” which joins the dictionary items with “AND” instead:

# given some variable list of data to match
data = dict(name='Richard', size='medium', alignment='neutral')
execute_f(cursor, 'SELECT * FROM person WHERE %A', data)

Note that “data” could also be a list of pairs instead of a dictionary. How convenient is that!

4 comments:

  1. Any reason for not using sqlalchemy? They have a really nice abstraction layer between the different databases, which really makes life easier. It also has an ORM, but you don't have to use that to get decent benefit from it.

    http://www.sqlalchemy.org/

    ReplyDelete
  2. Yes, there's reason: SQLAlchemy is a large, complex package and dbapiext is a single, small module that does one thing very well.

    ReplyDelete
  3. I wrote the inrex module commented on in your first post. I just started working on sqldict, a module providing a complete dict call-interface while being database backed. Its backed by sqlalchemy so it provides a solid database-agnostic basck-end.

    Full object serialization support and a dict of unlimited capacity might give you an edge, who knows.

    As much of a pythonic interface you ask for.

    ReplyDelete
  4. Looks nice! Thanks for pointing it out!

    Aha! And looking through the code, it looks like he built in named tuples as results, too! That's something I really want - named tuples seem designed to represent database rows, and it drives me crazy that the DB-API doesn't return them! (Granted, the DB-API preceded them...)

    ReplyDelete