This test compares the speed of various object serializers available for Python. I should have run it long ago. It opened my eyes a bit!
import time
import json
import simplejson
import marshal
import pickle
import cPickle
import cerealizer


def time_module(module):
    """Time dumps() and loads() on a 100,000-entry dict of str -> int."""
    d = {}
    for k in range(100000):
        d[str(k)] = -k
    start = time.time()
    enc = module.dumps(d)
    stop = time.time()
    enc_time = stop - start
    start = time.time()
    got = module.loads(enc)
    stop = time.time()
    dec_time = stop - start
    if got != d:
        raise AssertionError("Module %s failed to round-trip the data"
                             % module.__name__)
    print '%s: %.3f encode, %.3f decode' % (
        module.__name__, enc_time, dec_time)


if __name__ == '__main__':
    time_module(marshal)
    time_module(simplejson)
    time_module(cPickle)
    time_module(pickle)
    time_module(cerealizer)

    class EnhancedCPickle:
        """Wrapper that makes cPickle use its fastest pickle protocol."""
        __name__ = 'EnhancedCPickle'
        def dumps(self, obj):
            return cPickle.dumps(obj, cPickle.HIGHEST_PROTOCOL)
        def loads(self, data):
            return cPickle.loads(data)

    time_module(EnhancedCPickle())
    time_module(json)
Here are the results I got.
marshal: 0.010 encode, 0.016 decode
simplejson: 0.056 encode, 0.035 decode
cPickle: 0.136 encode, 0.051 decode
pickle: 0.570 encode, 0.553 decode
cerealizer: 0.268 encode, 0.249 decode
EnhancedCPickle: 0.115 encode, 0.030 decode
json: 0.214 encode, 1.195 decode
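The gap between cPickle and EnhancedCPickle comes down to the pickle protocol: by default, dumps() uses the old ASCII protocol 0, while HIGHEST_PROTOCOL selects a denser binary format. A quick sketch of the difference (written for modern Python, where cPickle has been folded into plain pickle; the data shape mirrors the benchmark, just smaller):

```python
import pickle

# Same shape of data as the benchmark, just 1,000 entries instead of 100,000.
d = {str(k): -k for k in range(1000)}

p0 = pickle.dumps(d, 0)                        # protocol 0: verbose ASCII
ph = pickle.dumps(d, pickle.HIGHEST_PROTOCOL)  # dense binary protocol

# Both round-trip the data intact, but the binary payload is
# noticeably smaller (and cheaper to parse).
assert pickle.loads(p0) == d
assert pickle.loads(ph) == d
print(len(p0), len(ph))
```

Passing an explicit protocol is a one-argument change, which is all the EnhancedCPickle wrapper above really does.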
According to the Python documentation, the built-in json module is based on simplejson. What the documentation fails to mention is that, in this microbenchmark, the standalone simplejson encodes about four times faster and decodes over thirty times faster than the built-in module; it even beats cPickle. I would use the marshal module if I could, but that would introduce security issues into my application (as would the pickle modules).
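To see why unpickling untrusted data is dangerous: pickle lets any class define __reduce__, and loads() will call whatever function that method names. A harmless sketch, using eval where a real attack would substitute something like os.system:

```python
import pickle

class NotSoInnocent(object):
    def __reduce__(self):
        # Tells pickle: "to reconstruct me, call eval('6 * 7')".
        # A malicious payload would name os.system and a shell command.
        return (eval, ('6 * 7',))

payload = pickle.dumps(NotSoInnocent())
# loads() dutifully runs the call embedded in the byte string.
print(pickle.loads(payload))  # 42
```

No method of the class ever runs on the loading side; the code to execute is carried entirely inside the serialized bytes, which is why deserializing pickle (or marshal) data from an untrusted source is unsafe no matter how it is wrapped.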