This test compares the speed of various object serializers available for Python. I should have run it long ago. It opened my eyes a bit!

    import time
    import json
    import simplejson
    import marshal
    import pickle
    import cPickle
    import cerealizer

    def time_module(module):
        d = {}
        for k in range(100000):
            d[str(k)] = -k
        start = time.time()
        enc = module.dumps(d)
        stop = time.time()
        enc_time = stop - start
        start = time.time()
        got = module.loads(enc)
        stop = time.time()
        dec_time = stop - start
        if got != d:
            raise AssertionError("Module %s failed encoding" % module.__name__)
        print '%s: %.3f encode, %.3f decode' % (
            module.__name__, enc_time, dec_time)

    if __name__ == '__main__':
        time_module(marshal)
        time_module(simplejson)
        time_module(cPickle)
        time_module(pickle)
        time_module(cerealizer)

        class EnhancedCPickle:
            __name__ = 'EnhancedCPickle'
            def dumps(self, obj):
                return cPickle.dumps(obj, cPickle.HIGHEST_PROTOCOL)
            def loads(self, data):
                return cPickle.loads(data)

        time_module(EnhancedCPickle())
        time_module(json)

Here are the results I got (times in seconds).

    marshal: 0.010 encode, 0.016 decode
    simplejson: 0.056 encode, 0.035 decode
    cPickle: 0.136 encode, 0.051 decode
    pickle: 0.570 encode, 0.553 decode
    cerealizer: 0.268 encode, 0.249 decode
    EnhancedCPickle: 0.115 encode, 0.030 decode
    json: 0.214 encode, 1.195 decode

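Each number above comes from a single timed call, so it can be noisy. Here is a minimal sketch of a variant that repeats every measurement and keeps the best time. It assumes Python 3 and only standard-library modules (so no simplejson, cPickle or cerealizer), and the `repeats` and `size` parameters are my own additions for illustration. It is not the script that produced the results above.

```python
import json
import marshal
import pickle
import time


def time_module(name, dumps, loads, repeats=5, size=100000):
    """Return (best_encode_seconds, best_decode_seconds) over `repeats` runs."""
    # Same test data as the original script: string keys, negative int values.
    d = {str(k): -k for k in range(size)}
    enc_times, dec_times = [], []
    for _ in range(repeats):
        start = time.perf_counter()
        enc = dumps(d)
        enc_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        got = loads(enc)
        dec_times.append(time.perf_counter() - start)

        # Make sure the serializer actually round-trips the data.
        if got != d:
            raise AssertionError("Module %s failed round trip" % name)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(enc_times), min(dec_times)


if __name__ == '__main__':
    candidates = [
        ('marshal', marshal.dumps, marshal.loads),
        ('json', json.dumps, json.loads),
        ('pickle',
         lambda obj: pickle.dumps(obj, pickle.HIGHEST_PROTOCOL),
         pickle.loads),
    ]
    for name, dumps, loads in candidates:
        enc, dec = time_module(name, dumps, loads)
        print('%s: %.3f encode, %.3f decode' % (name, enc, dec))
```

Passing the `dumps` and `loads` callables directly removes the need for a wrapper class like EnhancedCPickle. Timings from this sketch are not comparable with the Python 2 numbers above.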
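According to the Python documentation, the built-in json module is based on simplejson. What the documentation fails to mention is that simplejson is much faster than the built-in module, at least in this microbenchmark: about 4 times faster at encoding (0.056 vs. 0.214) and over 30 times faster at decoding (0.035 vs. 1.195). It’s even faster than cPickle with the default protocol, although cPickle with HIGHEST_PROTOCOL decodes slightly faster (0.030). I would use the marshal module if I could, since it is the fastest by far, but that would introduce security issues in my application (as would the pickle modules), because neither is safe to use on untrusted data.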
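To illustrate the security concern with pickle: the data being loaded can name a callable, and the unpickler calls it. Below is a harmless sketch, assuming Python 3. The `Payload` class is a hypothetical stand-in for attacker-controlled data, and `sorted` stands in for whatever callable a hostile sender might choose.

```python
import json
import pickle


class Payload:
    """Hypothetical stand-in for data crafted by an untrusted sender."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call sorted([3, 1, 2])".
        # A hostile blob could name any importable callable instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading runs the callable named inside the blob; the result is whatever
# that callable returns, not a Payload instance.
result = pickle.loads(blob)
print(result)

# json.loads, by contrast, only builds dicts, lists, strings, numbers,
# booleans and None from its input.
print(json.loads('{"1": -1}'))
```

The Python documentation carries similar warnings for both pickle and marshal: neither is meant to be secure against maliciously constructed data.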