-
Notifications
You must be signed in to change notification settings - Fork 4
Lottery
This post describes a lottery simulation. The goal is to see how long it takes to win the jackpot.
The rules are simple. Seven different numbers between 1 and 39 are drawn at random. Seven correct numbers earns you the jackpot! These are the Finnish lottery rules if someone was wondering.
A random lottery number is generated with the rand-int
function. We
must inc
the given number to get a number in the range [1,39]
instead of [0,38].
(defn rand-lottery-number []
(inc (rand-int 39)))
Generating seven lottery numbers at random is now easy, for example
(repeatedly 7 rand-lottery-number)
will do it. The problem of course
is that there might be duplicates in the generated sequence which is
not allowed!
Another approach would be to shuffle the numbers between 1 and 39 and take the first 7 numbers. The following code does exactly that:
(defn rand-lottery-row []
(->> (range 1 40)
shuffle
(take 7)))
This works perfectly fine but since this function will be run millions
of times in the simulation I would like to do some optimization at
this point. The function bench
is used to check the performance of
different approaches.
(defn bench [f]
(dotimes [i 10]
(printf "run %d:" i)
(time (dotimes [_ 100000] (f)))))
On my machine I get the following results:
user=> (bench rand-lottery-row)
Run 0: "Elapsed time: 1124.027378 msecs"
Run 1: "Elapsed time: 1090.87605 msecs"
Run 2: "Elapsed time: 1109.219982 msecs"
Run 3: "Elapsed time: 1089.985089 msecs"
Run 4: "Elapsed time: 1088.312398 msecs"
Run 5: "Elapsed time: 1085.102858 msecs"
Run 6: "Elapsed time: 1088.914707 msecs"
Run 7: "Elapsed time: 1089.41358 msecs"
Run 8: "Elapsed time: 1090.293018 msecs"
Run 9: "Elapsed time: 1095.931276 msecs"
About 1.1 seconds to generate 100 000 lottery rows, I would like to do better.
Another aproach is to draw numbers with rand-lottery-number
until we
have seven unique numbers. Let's see if this is faster.
An infinite sequence of lottery-numbers can be generated with
(repeatedly rand-lottery-number)
. The function distinct
is used to
ensure unique values.
(defn rand-lottery-row' []
(->> (repeatedly rand-lottery-number)
distinct
(take 7)))
(note the use of the tick mark in rand-lottery-row'
. I use clojure
version 1.3.0-alpha8 to run these simulations. If you use an older
version you must change the name as it is not valid prior to the
1.3.0-alpha releases)
It's very interesting to think about how differently these two algorithms are executed while the "shape" of the code is identical.
Here are the results of (bench rand-lottery-row')
:
user=> (bench rand-lottery-row')
Run 0: "Elapsed time: 873.123392 msecs"
Run 1: "Elapsed time: 872.26316 msecs"
Run 2: "Elapsed time: 879.863531 msecs"
Run 3: "Elapsed time: 877.989559 msecs"
Run 4: "Elapsed time: 871.191241 msecs"
Run 5: "Elapsed time: 875.531018 msecs"
Run 6: "Elapsed time: 885.829832 msecs"
Run 7: "Elapsed time: 874.263615 msecs"
Run 8: "Elapsed time: 879.954463 msecs"
Run 9: "Elapsed time: 880.150228 msecs"
The new function is about 20% faster than the original.
Let's finally try explicit looping. The algorithm is simple, conj
(rand-lottery-number)
onto a set until it has seven items. The use
of sets here is essential due to the guarantee of uniqueness of the
set items.
(defn rand-lottery-row'' []
(loop [lottery-row #{}]
(if (= (count lottery-row) 7)
lottery-row
(recur (conj lottery-row (rand-lottery-number))))))
This is essantially the same algorithm as before but the performance is much better:
user=> (bench rand-lottery-row'')
Run 0: "Elapsed time: 269.215 msecs"
Run 1: "Elapsed time: 249.544113 msecs"
Run 2: "Elapsed time: 251.791943 msecs"
Run 3: "Elapsed time: 254.1995 msecs"
Run 4: "Elapsed time: 247.825468 msecs"
Run 5: "Elapsed time: 246.724914 msecs"
Run 6: "Elapsed time: 247.895657 msecs"
Run 7: "Elapsed time: 247.201648 msecs"
Run 8: "Elapsed time: 249.169627 msecs"
Run 9: "Elapsed time: 247.542752 msecs"
An additional 70% performance increase, though I have to admit the code is not as nice anymore.
From now on the rand-lottery-row''
function is used. An infinite
sequence of lottery rows is defined as
(defn lottery-rows []
(repeatedly rand-lottery-row''))
The function correct
counts how many numbers of my-row
are in
correct-row
.
(defn correct [correct-row my-row]
(count (set/intersection correct-row my-row)))
The function run-lottery
runs the simulation until the first
occurance of n
correct numbers is found. The time it took to run
the simulation is reported.
(defn run-lottery [my-nums n]
(time
(->> (lottery-numbers)
(map #(correct my-nums %))
(some #(= % n)))))
An example run of the simulation might look like this:
user=> (def my-nums #{4 13 16 19 21 27 30 36})
#'user/my-nums
user=> (run-lottery my-nums 7)
"Elapsed time: 35842.890297 msecs"
Note that i play with 8 numbers (as I always do to increase my chances for jackpot). The time it takes to run the simulation varies greatly as demonstrated if we run it several times:
user=> (dotimes [i 10] (printf "run %d: " i) (run-lottery my-nums 7))
run 0: "Elapsed time: 17974.975082 msecs"
run 1: "Elapsed time: 701.597192 msecs"
run 2: "Elapsed time: 1783.32394 msecs"
run 3: "Elapsed time: 3488.578423 msecs"
run 4: "Elapsed time: 15711.582978 msecs"
run 5: "Elapsed time: 693.50612 msecs"
run 6: "Elapsed time: 22030.678959 msecs"
run 7: "Elapsed time: 1214.262498 msecs"
run 8: "Elapsed time: 2816.32497 msecs"
run 9: "Elapsed time: 2358.954691 msecs"
- Why is
rand-lottery-row'
slower thanrand-lottery-row''
? - Is there a better algorithm to generate unique numbers?
- Is there a faster way to implement the
correct
function?