Google App Engine version migration

Jonathan Ricketson

When writing your application on Google App Engine, you are inevitably going to deploy a version to production that does not have a final set of features. This is (of course) unavoidable. So, then for version 2, when creating new features you will also (probably) have to refactor the data model, at least adding new fields, and potentially renaming fields, or creating composed entities. So, when putting version 2 into production the data that all of your users have created in version 1 will need to be migrated to the new data-model.

Nothing exists as of now to do something similar to Rails Migrations, so everything pretty much needs to be done yourself.

The solution that I outline here is an automated way of processing over all entities in a list of Model classes. This solution loads the entities up into the v2 model classes and then calls your code to modify them (and create any new objects that might be required). This solution will not work for the case where the v2 model is missing fields that need to be read to migrate to the new state. That solution would need to be written using the underlying Entity classes and Query. (I haven’t needed to do that yet). This is really more of a recipe that can be modified to your purposes, than a generic drop-in migration tool.

The idea came from code copied from a google demo.

Here is the file: migrate.py

Directions for use (these directions are for Django, but if you are using the basic handler it would look very similar):

1. Rename migrate.py to migrateXToY.py (for your X and your Y)

2. Modify your urls.py to contain:

patterns("migrateXToY",
                       (r'^migrate/migrateXToY$', 'migrateXToY'),
                       (r'^worker/migrateXToY/(?P[^/]+)$', 'migrateModel'),
)

3. Create a migrateXToY method in the migrate.py like the following (modelsToMigrate is the list of model names that will be processed):

def migrate1To2(request):
    modelsToMigrate = ['User', 'Foo', 'Bar', 'BarDetail', 'Image', 'Report']
    [taskqueue.add(url='/worker/migrateXToY/%s' % i) for i in modelsToMigrate]
    return HttpResponse("created all tasks for processing")

4. Create a FooWorker class like the following (the migrateItem method will receive each item that is being migrated, the method should return the item after processing and should not put it. For efficiency puts are batched):

class PlacementWorker(MigrationWorker):
    kind=Foo
    kindName="Foo"
    def migrateItem(self, item):
        logging.info("processing %s %s" % (self.kindName, item.key()))
        #whatever processing you need to do
        item.cancelled=False
        return item

This entry was posted on June 28th, 2009.