Python unit testing and MongoDB

One of the most recent projects I've worked on uses MongoDB to collect events, manipulate them, and compute statistics. MongoDB is quite popular nowadays, but I couldn't find any documentation about integrating it with Python's unittest.

When I run tests, I like to have a self-configuring environment that just works and doesn't take too long to finish. In previous projects, whenever I used a database I could set up a SQLite driver that stores the temporary data in memory. That makes tests fast and very easy to run, since you don't have to make sure a properly configured database server is running. Django actually has a nice approach to database testing: whenever the backend is SQLite, it uses an in-memory database by default.
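To make the SQLite approach concrete, here is a minimal sketch using only the standard library. The class name, the `events` table, and the test itself are made up for illustration; they are not taken from any of the projects mentioned above.

```python
import sqlite3
import unittest


class InMemorySQLiteTestCase(unittest.TestCase):
    """Each test gets a fresh SQLite database that lives only in memory."""

    def setUp(self):
        # ':memory:' keeps the whole database in RAM; it is discarded
        # as soon as the connection is closed.
        self.conn = sqlite3.connect(':memory:')
        self.conn.execute('CREATE TABLE events (name TEXT, value INTEGER)')

    def tearDown(self):
        self.conn.close()

    def test_insert_and_count(self):
        self.conn.execute("INSERT INTO events VALUES ('click', 1)")
        count = self.conn.execute('SELECT COUNT(*) FROM events').fetchone()[0]
        self.assertEqual(count, 1)
```

No server has to be started and nothing is left on disk, which is the property I wanted to reproduce with MongoDB.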

Browsing the web, I found out that Ming, the MongoDB ORM, provides a "mim" module, which stands for Mongo-In-Memory. I guess this module is mostly used internally for Ming's own unit tests, which would explain why it is not advertised much in the documentation, but I wanted to give it a chance. I created my own TestCase that instantiates a mim Connection and runs my tests against it. My first attempts looked promising, but then I had to write tests for the routines that use Map Reduce, and MIM showed some limits: for instance, it is not possible to write the result of a Map Reduce job to another database, and I got some weird errors with more complex objects. I wrote a few workarounds and started investigating the bugs, but eventually gave up in favor of a simpler alternative that I had initially set aside as my plan B.

Plan B consisted of starting a singleton MongoDB instance from my TestCase class itself. This is more or less the code I used; hopefully it will be helpful to others:

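import os
import time
import atexit
import shutil
import tempfile
import unittest
import subprocess

import pymongo

from util import load_fixture

# Assumption: in the original project this value comes from the settings.
# Any free local port works; 27018 is just a placeholder.
MONGODB_TEST_PORT = 27018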


class MongoTemporaryInstance(object):
    """Singleton that manages a temporary MongoDB instance.

    Use this for testing purposes only. The instance is automatically
    destroyed when the program exits.

    """
    _instance = None

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
            atexit.register(cls._instance.shutdown)
        return cls._instance

    def __init__(self):
        self._tmpdir = tempfile.mkdtemp()
        self._process = subprocess.Popen(['mongod', '--bind_ip', 'localhost',
                                          '--port', str(MONGODB_TEST_PORT),
                                          '--dbpath', self._tmpdir,
                                          '--nojournal', '--nohttpinterface',
                                          '--noauth', '--smallfiles',
                                          '--syncdelay', '0',
                                          '--maxConns', '10',
                                          '--nssize', '1', ],
                                         stdout=open(os.devnull, 'wb'),
                                         stderr=subprocess.STDOUT)

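        # XXX: wait for the instance to be ready.
        #      mongod starts almost instantly, so we just retry until we
        #      are able to open a Connection.
        #      Note: pymongo.Connection was removed in PyMongo 3.0 in favor
        #      of pymongo.MongoClient, which connects lazily, so a straight
        #      swap would not raise ConnectionFailure here.
        for i in range(3):
            time.sleep(0.1)
            try:
                self._conn = pymongo.Connection('localhost', MONGODB_TEST_PORT)
            except pymongo.errors.ConnectionFailure:
                continue
            else:
                break
        else:
            self.shutdown()
            # Raise explicitly: a bare `assert` is stripped under `python -O`.
            raise AssertionError('Cannot connect to the mongodb test instance')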

    @property
    def conn(self):
        return self._conn

    def shutdown(self):
        if self._process:
            self._process.terminate()
            self._process.wait()
            self._process = None
            shutil.rmtree(self._tmpdir, ignore_errors=True)


class TestCase(unittest.TestCase):
    """TestCase with an embedded MongoDB temporary instance.

    Each test runs against a temporary instance of MongoDB. Please note
    that these tests are not thread-safe, and that different processes
    should each use a different listening port for the MongoDB instance,
    set via `MONGODB_TEST_PORT`.

    A test can access the connection using the attribute `conn`.

    """
    fixtures = []

    def __init__(self, *args, **kwargs):
        super(TestCase, self).__init__(*args, **kwargs)
        self.db = MongoTemporaryInstance.get_instance()
        self.conn = self.db.conn

    def setUp(self):
        super(TestCase, self).setUp()

        for db_name in self.conn.database_names():
            self.conn.drop_database(db_name)

        for fixture in self.fixtures:
            load_fixture(self.conn, fixture)

MongoTemporaryInstance is the singleton class that sets up the temporary instance. I know the singleton pattern can be written in other ways in Python; I just prefer this one because it is more explicit.
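The reason a singleton matters here is that unittest builds one TestCase object per test method, so `get_instance()` is called many times during a run. The sketch below isolates that behaviour with a hypothetical `SharedResource` class standing in for the mongod process, so it runs without MongoDB installed; the `starts` counter exists only for the demonstration.

```python
import atexit


class SharedResource(object):
    """Hypothetical stand-in for something expensive, such as mongod."""
    _instance = None
    starts = 0  # how many times the expensive start-up ran

    @classmethod
    def get_instance(cls):
        # Start the resource on first use only, then keep handing out
        # the same object.
        if cls._instance is None:
            cls._instance = cls()
            # Clean up once, when the interpreter exits.
            atexit.register(cls._instance.shutdown)
        return cls._instance

    def __init__(self):
        SharedResource.starts += 1
        self.running = True

    def shutdown(self):
        # Safe to call more than once.
        self.running = False


# Simulate several TestCase objects asking for the shared resource.
handles = [SharedResource.get_instance() for _ in range(5)]
print(SharedResource.starts)                  # 1
print(all(h is handles[0] for h in handles))  # True
```

However many test methods there are, the start-up cost is paid once per process.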

At the start of each test, all the databases in the instance are dropped and all the fixtures listed in the fixtures attribute of the TestCase subclass are loaded. load_fixture is a function defined elsewhere; it just loads a JSON file and puts the data it contains into the database. When the program exits, the temporary MongoDB instance is killed and its data is destroyed.
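Since load_fixture is not shown, here is one way it could look. This is a sketch under assumptions, not the function from my project: the JSON layout (database name, then collection name, then a list of documents) is invented for the example, and `insert_many` requires PyMongo 3 or later, whereas the older driver used in the code above would call `insert` instead.

```python
import json


def load_fixture(conn, fixture):
    """Load a JSON fixture file into the databases reachable from `conn`.

    Assumed (hypothetical) file layout:
        {"db_name": {"collection_name": [{...}, {...}]}}
    """
    with open(fixture) as fixture_file:
        data = json.load(fixture_file)

    for db_name, collections in data.items():
        for collection_name, documents in collections.items():
            # insert_many (PyMongo 3+) rejects an empty list, so skip those.
            if documents:
                conn[db_name][collection_name].insert_many(documents)
```

A test class would then only need to declare something like `fixtures = ['events.json']` (a made-up file name) and rely on setUp to load it before every test.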

Angry Bits

Words on bytes and bits
