Long time no blog. I’d like to break the ice with a lighthearted Python code readability optimization discussion.
I need a Python function that generates short, random, numeric codes of a specified length using a secure random number generator. The codes must not start with 0, but the output should be a string. This was my first draft:
import struct, os
def make_code(digits):
"""Generate a random short code."""
if digits > 9:
return make_code(9) + make_code(digits - 9)
[rand] = struct.unpack('>L', os.urandom(4))
min_code = 10 ** (digits - 1)
code = min_code + rand % (9 * min_code)
return unicode(code)
Two things about this solution bothered me:
- The math I used looks too clever. I’m not sure future maintainers will be able to understand why it works.
- It discards a lot of the entropy provided by /dev/urandom, which could be expensive on some servers. (I confess this thought is premature optimization.)
To try to clean this up, I created a generator that produces random secure digits and based the make_code function on it:
def random_digits():
while True:
[n] = struct.unpack('>L', os.urandom(4))
s = ('%d' % n)[1:]
for c in s:
yield c
next_random_digit = random_digits().next
def make_code(digits):
return ''.join(next_random_digit() for _ in range(digits))
This felt better, but it has two critical problems. It can return 0 for the first digit, and due to the next_random_digit global, this solution is not thread safe! I’ve never tried to run a generator in multiple concurrent threads. I can only imagine the possible fireworks.
So I solved the problems as follows.
class RandomDigitStream(object):
def __init__(self):
self.streams = threading.local()
def stream(self):
while True:
[n] = struct.unpack('>L', os.urandom(4))
s = ('%d' % n)[1:]
for c in s:
yield c
def make_code(self, digits):
"""Generate a random short code containing only digits."""
try:
next_digit = self.streams.next_digit
except AttributeError:
self.streams.next_digit = next_digit = self.stream().next
while True:
c = next_digit()
# Don't use 0 for the first digit.
if c != '0':
break
lst = [c]
lst.extend(next_digit() for _ in range(1, digits))
return ''.join(lst)
make_code = RandomDigitStream().make_code
Now the solution is large and I’m pretty sure it’s less readable than the original solution. Anyone want to take a shot at making this better? I’m currently thinking that the original solution was better than I thought.