Lottery

Clojure is winning the lottery

This post describes a lottery simulation. The goal is to see how long it takes to win the jackpot.

The lottery rules

The rules are simple. Seven different numbers between 1 and 39 are drawn at random. Seven correct numbers earns you the jackpot! These are the Finnish lottery rules if someone was wondering.

The program

Generating a random lottery row

A random lottery number is generated with the rand-int function. We must inc the given number to get a number in the range [1,39] instead of [0,38].

(defn rand-lottery-number []
  (inc (rand-int 39)))

Generating seven lottery numbers at random is now easy, for example (repeatedly 7 rand-lottery-number) will do it. The problem of course is that there might be duplicates in the generated sequence which is not allowed!

Another approach would be to shuffle the numbers between 1 and 39 and take the first 7 numbers. The following code does exactly that:

(defn rand-lottery-row []
  (->> (range 1 40)
       shuffle
       (take 7)))

This works perfectly fine but since this function will be run millions of times in the simulation I would like to do some optimization at this point. The function bench is used to check the performance of different approaches.

(defn bench [f]
  (dotimes [i 10]
    (printf "run %d:" i)
    (time (dotimes [_ 100000] (f)))))

On my machine I get the following results:

user=> (bench rand-lottery-row)
Run 0: "Elapsed time: 1124.027378 msecs"
Run 1: "Elapsed time: 1090.87605 msecs"
Run 2: "Elapsed time: 1109.219982 msecs"
Run 3: "Elapsed time: 1089.985089 msecs"
Run 4: "Elapsed time: 1088.312398 msecs"
Run 5: "Elapsed time: 1085.102858 msecs"
Run 6: "Elapsed time: 1088.914707 msecs"
Run 7: "Elapsed time: 1089.41358 msecs"
Run 8: "Elapsed time: 1090.293018 msecs"
Run 9: "Elapsed time: 1095.931276 msecs"

About 1.1 seconds to generate 100 000 lottery rows, I would like to do better.

Another aproach is to draw numbers with rand-lottery-number until we have seven unique numbers. Let's see if this is faster.

An infinite sequence of lottery-numbers can be generated with (repeatedly rand-lottery-number). The function distinct is used to ensure unique values.

(defn rand-lottery-row' []
  (->> (repeatedly rand-lottery-number)
       distinct
       (take 7)))

(note the use of the tick mark in rand-lottery-row'. I use clojure version 1.3.0-alpha8 to run these simulations. If you use an older version you must change the name as it is not valid prior to the 1.3.0-alpha releases)

It's very interesting to think about how differently these two algorithms are executed while the "shape" of the code is identical.

Here are the results of (bench rand-lottery-row'):

user=> (bench rand-lottery-row')
Run 0: "Elapsed time: 873.123392 msecs"
Run 1: "Elapsed time: 872.26316 msecs"
Run 2: "Elapsed time: 879.863531 msecs"
Run 3: "Elapsed time: 877.989559 msecs"
Run 4: "Elapsed time: 871.191241 msecs"
Run 5: "Elapsed time: 875.531018 msecs"
Run 6: "Elapsed time: 885.829832 msecs"
Run 7: "Elapsed time: 874.263615 msecs"
Run 8: "Elapsed time: 879.954463 msecs"
Run 9: "Elapsed time: 880.150228 msecs"

The new function is about 20% faster than the original.

Let's finally try explicit looping. The algorithm is simple, conj (rand-lottery-number) onto a set until it has seven items. The use of sets here is essential due to the guarantee of uniqueness of the set items.

(defn rand-lottery-row'' []
  (loop [lottery-row #{}]
    (if (= (count lottery-row) 7)
      lottery-row
      (recur (conj lottery-row (rand-lottery-number))))))

This is essantially the same algorithm as before but the performance is much better:

user=> (bench rand-lottery-row'')
Run 0: "Elapsed time: 269.215 msecs"
Run 1: "Elapsed time: 249.544113 msecs"
Run 2: "Elapsed time: 251.791943 msecs"
Run 3: "Elapsed time: 254.1995 msecs"
Run 4: "Elapsed time: 247.825468 msecs"
Run 5: "Elapsed time: 246.724914 msecs"
Run 6: "Elapsed time: 247.895657 msecs"
Run 7: "Elapsed time: 247.201648 msecs"
Run 8: "Elapsed time: 249.169627 msecs"
Run 9: "Elapsed time: 247.542752 msecs"

An additional 70% performance increase, though I have to admit the code is not as nice anymore.

Running the simulation

From now on the rand-lottery-row'' function is used. An infinite sequence of lottery rows is defined as

(defn lottery-rows [] 
  (repeatedly rand-lottery-row''))

The function correct counts how many numbers of my-row are in correct-row.

(defn correct [correct-row my-row]
  (count (set/intersection correct-row my-row)))

The function run-lottery runs the simulation until the first occurance of n correct numbers is found. The time it took to run the simulation is reported.

(defn run-lottery [my-nums n]
  (time
   (->> (lottery-numbers)
	(map #(correct my-nums %))
	(some #(= % n)))))

An example run of the simulation might look like this:

user=> (def my-nums #{4 13 16 19 21 27 30 36})
#'user/my-nums
user=> (run-lottery my-nums 7)
"Elapsed time: 35842.890297 msecs"

Note that i play with 8 numbers (as I always do to increase my chances for jackpot). The time it takes to run the simulation varies greatly as demonstrated if we run it several times:

user=> (dotimes [i 10] (printf "run %d: " i) (run-lottery my-nums 7))
run 0: "Elapsed time: 17974.975082 msecs"
run 1: "Elapsed time: 701.597192 msecs"
run 2: "Elapsed time: 1783.32394 msecs"
run 3: "Elapsed time: 3488.578423 msecs"
run 4: "Elapsed time: 15711.582978 msecs"
run 5: "Elapsed time: 693.50612 msecs"
run 6: "Elapsed time: 22030.678959 msecs"
run 7: "Elapsed time: 1214.262498 msecs"
run 8: "Elapsed time: 2816.32497 msecs"
run 9: "Elapsed time: 2358.954691 msecs"

Things to consider

Why is rand-lottery-row' slower than rand-lottery-row''?
Is there a better algorithm to generate unique numbers?
Is there a faster way to implement the correct function?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly