Python JSON Performance

This test compares the speed of various object serializers available for Python. I should have run it long ago. It opened my eyes a bit!

import time
import json
import simplejson
import marshal
import pickle
import cPickle
import cerealizer

def time_module(module):
    # Build the test data: 100,000 string keys mapped to negative integers.
    d = {}
    for k in range(100000):
        d[str(k)] = -k

    start = time.time()
    enc = module.dumps(d)
    stop = time.time()
    enc_time = stop - start

    start = time.time()
    got = module.loads(enc)
    stop = time.time()
    dec_time = stop - start

    if got != d:
        raise AssertionError("Module %s failed encoding" % module.__name__)

    print '%s: %.3f encode, %.3f decode' % (
        module.__name__, enc_time, dec_time)

if __name__ == '__main__':

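    class EnhancedCPickle:
        # Wrapper exposing the dumps/loads interface that time_module expects,
        # but pickling with the highest (binary) protocol.
        __name__ = 'EnhancedCPickle'
        def dumps(self, obj):
            return cPickle.dumps(obj, cPickle.HIGHEST_PROTOCOL)
        def loads(self, data):
            return cPickle.loads(data)

    # Time each serializer, in the order the results are listed below.
    for module in (marshal, simplejson, cPickle, pickle, cerealizer,
                   EnhancedCPickle(), json):
        time_module(module)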



Here are the results I got (times in seconds).

marshal: 0.010 encode, 0.016 decode
simplejson: 0.056 encode, 0.035 decode
cPickle: 0.136 encode, 0.051 decode
pickle: 0.570 encode, 0.553 decode
cerealizer: 0.268 encode, 0.249 decode
EnhancedCPickle: 0.115 encode, 0.030 decode
json: 0.214 encode, 1.195 decode
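
Each figure above comes from a single pass timed with time.time(), so it is sensitive to whatever else the machine was doing. As a rough sketch of one way to reduce that noise, the version below repeats each call and keeps the best time. It assumes Python 3 and uses only standard-library serializers: cPickle no longer exists there (pickle picks up its C implementation automatically), and the third-party simplejson and cerealizer modules are left out. The helper names best_time and run_benchmark are made up for this illustration and are not part of the original script.

```python
import json
import marshal
import pickle
import time


def best_time(func, arg, repeats=5):
    """Call func(arg) several times; return (best elapsed seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(arg)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result


def time_module(name, dumps, loads, data, repeats=5):
    """Time one dumps/loads pair and check that the data round-trips."""
    enc_time, enc = best_time(dumps, data, repeats)
    dec_time, got = best_time(loads, enc, repeats)
    if got != data:
        raise AssertionError("%s failed to round-trip the data" % name)
    return name, enc_time, dec_time


def run_benchmark(size=100000, repeats=5):
    # Same shape of test data as the original script: string keys, negative ints.
    data = {str(k): -k for k in range(size)}
    candidates = [
        ("marshal", marshal.dumps, marshal.loads),
        ("json", json.dumps, json.loads),
        ("pickle (default protocol)", pickle.dumps, pickle.loads),
        ("pickle (HIGHEST_PROTOCOL)",
         lambda obj: pickle.dumps(obj, pickle.HIGHEST_PROTOCOL), pickle.loads),
    ]
    return [time_module(name, dumps, loads, data, repeats)
            for name, dumps, loads in candidates]


if __name__ == "__main__":
    for name, enc_time, dec_time in run_benchmark():
        print("%s: %.3f encode, %.3f decode" % (name, enc_time, dec_time))
```

Timings from this sketch are not comparable with the numbers listed above, since the interpreter, the modules and the machine all differ.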

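According to the Python documentation, the built-in json module is based on simplejson. What the documentation fails to mention is that simplejson is far faster than the built-in module: in the results above it encodes about 4 times faster (0.056 vs. 0.214) and decodes more than 30 times faster (0.035 vs. 1.195). It’s even faster than cPickle with its default protocol, according to this microbenchmark; only with HIGHEST_PROTOCOL does cPickle decode slightly faster (0.030 vs. 0.035), and it still encodes more slowly. I would use the marshal module if I could, but that would introduce security issues in my application (as would the pickle modules).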
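
To make the security concern with pickle concrete, here is a minimal and deliberately harmless sketch. It relies on the documented __reduce__ hook, which lets a pickled object name a callable for the loader to invoke. The Payload class is hypothetical and exists only for this illustration, and the expression it evaluates is just arithmetic.

```python
import json
import pickle


class Payload(object):
    """Hypothetical object whose pickle tells the loader to call eval()."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of any object state.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading the bytes runs the embedded call; the caller never asked for eval().
result = pickle.loads(blob)
print(result)

# json.loads only builds dicts, lists, strings, numbers, booleans and None,
# so the same text stays inert data.
decoded = json.loads('{"expr": "6 * 7"}')
print(decoded)
```

Anyone who can supply the bytes passed to pickle.loads can choose the callable and its arguments, which is why the pickle modules are a poor fit for data from untrusted sources.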