Skip to content

Latest commit

 

History

History
60 lines (44 loc) · 2.4 KB

README.md

File metadata and controls

60 lines (44 loc) · 2.4 KB

benchmark-awk-vs Build Status

Let's make a challenge versus awk and other implementations.

Source idea was here on StackOverflow

Reported here as the original question from Ed Morton

Given a 10 Million line input file created by this script:

$ awk 'BEGIN{for (i=1;i<=10000000;i++) print (i%5?"miss":"hit"),i,"  third\t \tfourth"}' > file

$ wc -l file
10000000 file

$ head -10 file
miss 1   third          fourth
miss 2   third          fourth
miss 3   third          fourth
miss 4   third          fourth
hit 5   third           fourth
miss 6   third          fourth
miss 7   third          fourth
miss 8   third          fourth
miss 9   third          fourth
hit 10   third          fourth

and given this awk script which prints the 4th then 1st then 3rd field of every line that starts with "hit" followed by an even number:

$ cat tst.awk
/hit [[:digit:]]*0 / { print $4, $1, $3 }

Here are the first 5 lines of expected output:

$ awk -f tst.awk file | head -5
fourth hit third
fourth hit third
fourth hit third
fourth hit third
fourth hit third

and here is the result when piped to a 2nd awk script to verify that the main script above is actually functioning exactly as intended:

$ awk -f tst.awk file |
awk '!seen[$0]++{unq++;r=$0} END{print ((unq==1) && (seen[r]==1000000) && (r=="fourth hit third")) ? "PASS" : "FAIL"}'
PASS

How it works:

On each commit a build is launched on travis CI wich produce a markdown file with the results. The build ensure each script pass the test and then run it without output enclose by time to measure the time taken The resulting file is inserted in a gh-pages site visible here which is updated at end of each build

How to participate:

  • Fork this repository
  • Add your script to be tested in the test/<desired language> directory
  • If there's not already a launcher, create one (try to keep close from others to not loose time)
  • Make a pull request ;)

ToDo:

  • Find participant to get tests in other languages