Python and MySQLdb

Posted by: Keith Brown

Introduction

We're going to explain a few things about accessing a MySQL database server from Python using MySQLdb here. MySQLdb is a Python DB API 2.0-compliant database adapter for the MySQL RDBMS. It wraps _mysql, which is a Python module written in C that implements most of the MySQL C API in similarly-named Python calls. Installation requires the MySQL client to be installed because it requires the MySQL client libraries.

MySQLdb was written by Andy Dustman for Python 2. A Python 3-compatible version was a stated goal but never appeared, so the project was forked. This has led to some confusion regarding the package names on the Python Package Index (pypi). The original Python 2-only package is now named mysql-python and supports Python 2.4-2.7 at the time of this writing. The fork (the focus of this article) is named mysqlclient. It's a drop-in replacement that supports Python 2 and Python 3. It's recommended by e.g. the Django project.

Links:
https://pypi.python.org/pypi/mysqlclient/
https://pypi.python.org/pypi/MySQL-python/
http://sourceforge.net/projects/mysql-python/
https://www.python.org/dev/peps/pep-0249; Python Database API Specification v2.0

We'll be using Python 3 on Ubuntu.

Installation

mysqlclient is available on pypi as the mysqlclient package and can be installed using pip. Let's create a virtual environment and install the package there.

Note that on Ubuntu you might need to install the libmysqlclient-dev package from the Ubuntu repo first (e.g. sudo apt-get install libmysqlclient-dev).

$ python3 -m venv myvenv

$ source myvenv/bin/activate

(myvenv) $ pip install mysqlclient
Collecting mysqlclient
  Downloading mysqlclient-1.3.12.tar.gz (89kB)
  ...
Installing collected packages: mysqlclient
  Running setup.py install for mysqlclient ... done
Successfully installed mysqlclient-1.3.12

Getting Started

Access to a database requires a connection object and a cursor object. To create a connection object use the MySQLdb connect function with the typical MySQL client connection arguments e.g. user name and password. To create a cursor object use the cursor method on the resulting connection object. By default a cursor is tuple-based but a dictionary-based cursor can be created by passing the MySQLdb.cursor.DictCursor cursor class to the cursor factory.

>>> import MySQLdb
>>> from getpass import getpass
>>> 
>>> conn = MySQLdb.connect(host="localhost", user="root", passwd=getpass(), charset="utf8")
Password: 
>>> 
>>> conn
<_mysql.connection open to 'localhost' at 1dfe058>
>>> 
>>> cur = conn.cursor(MySQLdb.cursors.DictCursor)
>>> cur
<MySQLdb.cursors.DictCursor object at 0x7fdbd6cf0da0>
>>> 
>>> curalt = conn.cursor()
>>> curalt
<MySQLdb.cursors.Cursor object at 0x7fdbd9be6c88>

Connection error examples:

_mysql_exceptions.OperationalError: (1045, "Access denied for user 'root'@'localhost' (using password: YES)")
_mysql_exceptions.OperationalError: (2003, "Can't connect to MySQL server on '192.168.2.150' (101)")
_mysql_exceptions.OperationalError: (2005, "Unknown MySQL server host 'dbserver' (2)")

If the MySQL server is restarted or the MySQL connection has timed out based on the MySQL server wait_timeout value then the connection object must be recreated. Otherwise trying to use an existing connection object produces an error message

_mysql_exceptions.OperationalError: (2006, 'MySQL server has gone away')

Executing Queries

To execute a SQL query call the execute method on a cursor object. execute accepts either one or two arguments. In the one-argument form the argument is simply a string expression representing a SQL query. The return value from execute depends on the query but it's typically the number of rows affected. In cases where this doesn't apply the return value is 0.

>>> cur.execute("use test")
0
>>> cur.execute("create table person (fname varchar(255) not null, lname varchar(255) not null, primary key (lname, fname))")
0
>>> cur.execute("insert into person (fname, lname) values ('homer', 'simpson'), ('marge', 'simpson'), ('bart', 'simpson'), ('lisa', 'simpson'), ('maggie', 'simpson')")
5

If using string literals, then of course escaping might be necessary. Effectively mixing single, double, and triple quotes can be helpful. Choose a style and be consistent.

In the one-argument form of execute, query parameterization (any part of it) can be done by using Python's usual string mechanisms.

>>> fname = "marge"
>>> lname = "simpson"
>>> 
>>> cur.execute("select * from person where fname = '" + fname + "' and  lname = '" + lname + "'")   # concatenation
1
>>> cur.execute("select * from person where fname = '%s' and lname = '%s'" % (fname, lname))   # interpolation
1
>>> cur.execute("select * from person where fname = '%(fname)s' and lname = '%(lname)s'" % {"fname":fname, "lname":lname})    # interpolation
1
>>> cur.execute("select * from person where fname = '{0}' and lname = '{1}'".format(fname, lname))   # format method
1
>>> cur.execute("select * from person where fname = '{fname}' and lname = '{lname}'".format(fname=fname, lname=lname))   # format method
1
>>> cur.execute(f"select * from person where fname = '{fname}' and lname = '{lname}'")   # f-strings
1
>>> cur.execute(string.Template("select * from person where fname = '$fname' and lname = '$lname'").substitute(fname=fname, lname=lname))   # Template class
1
>>> cur.execute(string.Template("select * from person where fname = '$fname' and lname = '$lname'").substitute("fname":fname, "lname":lname))   # Template class
1

All of these forms require quotes around string parameters. Types must be differentiated when using interpolation e.g. %s vs. %d. If substituting objects then dictionary keys and keyword arguments duplicate the object identifiers. F-strings (new in Python 3.6) are probably the best option here.

In the two-argument form of execute, the first argument is a string expression representing a query but also containing the placeholder %s like that used in string interpolation. The second argument is a tuple/list of values used to replace the placeholder(s). This is again like string interpolation but the string interpolation/modulo operator is not used, i.e. the value tuple/list is simply the second argument in a method call and so it follows a comma. The restriction is that only column values can be substituted, i.e. not database names, table names, etc.

Note that in this MySQLdb-style of interpolation:

%s is always used even if the value is e.g. an integer
no quotes are used around %s
substituted values are automatically escaped e.g. quotes in strings are automatically handled
a tuple/list is required; older versions of the original MySQLdb allow a single value instead of a tuple/list to be used if only one placeholder is present

>>> fname = "homer"
>>> cur.execute("select * from person where fname = %s", (fname,))
1
>>> fname = "marge"
>>> lname = "simpson"
>>> cur.execute("select * from person where fname = %s and lname = %s", (fname, lname))
1

In either case, to actually retrieve the results of a query use the fetchone or fetchall method on the cursor object directly following the query. fetchone returns a single tuple/dictionary or None if the set is empty. fetchall returns a tuple of tuples/dictionaries or an empty tuple if the set is empty.

>>> fname = "marge"
>>> lname = "simpson"
>>> 
>>> cur.execute("select * from person where fname = %s and lname = %s", (fname, lname))
1
>>> cur.fetchone()
{'fname': 'Marge', 'lname': 'Simpson'}
>>> 
>>> cur.execute("select * from person where fname = %s and lname = %s", (fname, lname))
1
>>> cur.fetchall()
({'fname': 'Marge', 'lname': 'Simpson'},)
>>> 
>>> cur.execute("select * from person where lname = %s", (lname,))
5
>>> cur.fetchall()
({'fname': 'Marge', 'lname': 'Simpson'}, {'fname': 'Homer', 'lname': 'Simpson'}, {'fname': 'Lisa', 'lname': 'Simpson'}, {'fname': 'Bart', 'lname': 'Simpson'}, {'fname': 'Maggie', 'lname': 'Simpson'})

If using a dictionary-based cursor, then each dictionary's keys are column names and match the column names used in the query, unless the AS SQL qualifier was used, or unless there is ambiguity. A tuple of dictionaries can be viewed as a table by arranging the dictionaries vertically.

({"fname":"homer", "lname":"simpson"},
 {"fname":"marge", "lname":"simpson"},
 {"fname":"bart",  "lname":"simpson"})

If there is ambiguity in the column names then MySQLdb does the minimal amount necessary to resolve the ambiguity (a dictionary cannot contain duplicate keys), for example by using a table name when doing a join.

select responsepoint.id, responsepoint.name, project.id from ...`

This query yields keys of id, name, and project.id. To be safe use the AS operator.

Here's an example using an auto-increment integer field and a date field:

>>> cur.execute("drop table person")
0
>>> cur.execute("create table person (id serial not null, fname varchar(255) not null, lname varchar(255) not null, bdate date, primary key (id), unique (lname, fname))")
0
>>> cur.execute("insert into person (fname, lname, bdate) values ('homer', 'simpson', '1982-03-01'), ('marge', 'simpson', '1984-10-01'), ('bart', 'simpson', '0000-00-00')")
3
>>> cur.execute("select * from person")
3
>>> cur.fetchall()
({'id': 1, 'fname': 'homer', 'lname': 'simpson', 'bdate': datetime.date(1982, 3, 1)}, {'id': 2, 'fname': 'marge', 'lname': 'simpson', 'bdate': datetime.date(1984, 10, 1)}, {'id': 3, 'fname': 'bart', 'lname': 'simpson', 'bdate': None})
>>>

Note that MySQLdb returns "zero" MySQL dates or datetimes, e.g. 0000-00-00, as None (likewise for NULL dates). If you're reading a row of values and then generically writing the row back to the database, then None must be converted to the proper MySQL zero value (based on the field type) before writing back to the database. In fact, as in the MySQL client, using 0 works in all cases (i.e. the integer; '0' also works in later versions).

And there we are. A few things about accessing a MySQL database server from Python using MySQLdb.

Authors

Keith Brown (13)

Feeds

RSS / Atom