This is the distribution for the program snppit:
(SNP) (P)rogram for (I)ntergenerational (T)agging
Full documentation is found in the report to the PSC about the program, in the file doc/snppit_doc.pdf in this repository. Part II has the user instructions.
An example data set is included in the ExampleData directory.
All the source code for the program is found in the src directory. However, it is probably easiest to use the precompiled binaries snppit and snppit.exe in this repository.
I just added a few new options to exclude additional parent pairs on the basis of the log-likelihood ratio and a limit on the number of non-excluded parent pairs. These are for specialized applications in situations where some populations have many unexcluded parents. Read about it here:
http://eriqande.github.io/snppit/logl-and-rank-thresholding.nb.html
NOTE: These changes have not been compiled into the Linux binary yet. If you want them, you must compile the program yourself on your Linux box for now.
If you don't use git and don't want to clone this repository, you can download a compressed zip of all the contents from:
https://github.com/eriqande/snppit/archive/master.zip
Warning! The executable files snppit-windows.exe, snppit.exe, snppit-Darwin, and snppit-Linux are provided as a courtesy, but are not guaranteed to have been compiled from the latest commit. For that, you should compile the program yourself (or check when each executable was last committed).
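One way to check when an executable was last committed is sketched here, assuming you have cloned the repository with git and are in its top-level directory. The helper names last_commit_time and is_binary_stale are made up for this illustration and are not part of snppit; the only git feature used is git log with the %ct (committer timestamp) format.

```shell
# Hypothetical helpers (not part of snppit). Run from the top of a git clone.

# Print the Unix timestamp of the most recent commit touching the given path.
last_commit_time() {
  git log -1 --format=%ct -- "$1"
}

# Succeed (exit 0) when the binary was last committed before the src directory.
is_binary_stale() {
  bin_time="$(last_commit_time "$1")"
  src_time="$(last_commit_time src)"
  [ "${bin_time:-0}" -lt "${src_time:-0}" ]
}

# Example use:
#   if is_binary_stale snppit-Linux; then echo "recompile recommended"; fi
```

If the check reports a stale binary, compiling from source is the safer choice.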
To get the program running quickly, do the following:
Note: Jon Hess at CRITFC compiled snppit-windows.exe on his own Windows box. It works a whole lot better than windows.exe which was cross-compiled on a Mac. So Windows users should use snppit-windows.exe
- Copy the executable "snppit-windows.exe" to your Desktop.
- Copy the data file ExampleDataFile1.txt in the ExampleData directory to your Desktop.
- Open the Command Prompt application (under Start->All Programs->Accessories)
- Type:
cd Desktop
into the Command Prompt window and hit return. You should now be in the Desktop directory.
- Type:
snppit-windows.exe -f ExampleDataFile1.txt
into the command prompt.
On a Mac:
- Copy the executable snppit-Darwin to your Desktop.
- Copy the data file ExampleDataFile1.txt in the ExampleData directory to your Desktop.
- Open the Terminal application (in the Utilities folder inside the Applications folder)
- Type:
cd Desktop
into the Terminal window and hit return. You should now be in the Desktop directory.
- Type:
./snppit-Darwin -f ExampleDataFile1.txt
into the command prompt, and the program should run.
Be sure to read the documentation on how to use snppit. If you are comfortable moving the program to a location other than the Desktop, feel free to do so. If you prepare your own data file named MyFile.txt, then you run it with the option -f MyFile.txt instead of -f ExampleDataFile1.txt.
On Linux, follow the directions for the Mac version, but use snppit-Linux instead of snppit-Darwin. You probably don't have a Desktop on Ubuntu the same as on a Mac, so just use some other directory. Of course, it is probably best to compile the program anew and then put it in /usr/local/bin.
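Since the precompiled binaries are named snppit-Darwin and snppit-Linux, a small shell snippet can choose between them. This is only a sketch: it assumes that the suffix matches what uname -s prints on each system and that the binary sits in the current directory. The variable name snppit_bin is made up for the example.

```shell
# Sketch: choose the precompiled binary for this operating system.
# Assumption: the binary suffix equals the output of `uname -s`.
os="$(uname -s)"
case "$os" in
  Darwin|Linux)
    snppit_bin="./snppit-$os"
    ;;
  *)
    # Anything else is treated as unsupported by this sketch.
    snppit_bin=""
    echo "No precompiled binary chosen for: $os" >&2
    ;;
esac

# Print the command rather than running it, so nothing happens if the file is absent.
if [ -n "$snppit_bin" ]; then
  echo "$snppit_bin -f ExampleDataFile1.txt"
fi
```

Printing the command first makes it easy to confirm the choice before running the program for real.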
For a full list of all program options, do:
snppit --help-full
NOTE: The program seems to run about 20 times faster on my Apple computer when running natively (OSX) than when I run the PC version through VMWare Fusion virtualization software. I don't know if this is because the compiler I used for PC is lousy (I doubt it, since other programs compiled with it run just fine) or because the VM software is slow (quite possible, since there may be some operations on arrays that just go slowly on the virtual machine). I hope that it runs faster on a native Windows machine than when Windows is running virtually on my Mac.
To compile the program yourself, clone the repository and run the compile script, like so:
git clone https://github.com/eriqande/snppit.git
cd snppit
./Compile_snppit.sh
On Linux there may be a lot of warnings about not catching return values from fscanf. I've got to deal with those eventually, but for now it is fine: they are just warnings and not errors. The script compiles the program into snppit-Darwin (on a Mac) and snppit-Linux (on Linux).
I have some limited tests in the directory test. Basically, the test script runs some data sets and then checks whether the results are identical to some stored results. Currently, some data sets give different orderings of output individuals on Linux vs Mac, so it checks consistency across operating systems too. To run the tests, do this from the main repository directory:
cd test
./run_all_tests.sh
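The idea of comparing fresh output against stored results, while allowing for different line orderings across operating systems, can be sketched in shell. This is not the code in run_all_tests.sh; the function names and the file arguments are hypothetical, and the sketch assumes results are plain text files with one record per line.

```shell
# Hypothetical helpers; not taken from the snppit test scripts.

# Succeed when the two files are byte-for-byte identical.
results_identical() {
  cmp -s "$1" "$2"
}

# Succeed when the two files hold the same lines, in any order.
results_same_any_order() {
  a_sorted="$(mktemp)"
  b_sorted="$(mktemp)"
  sort "$1" > "$a_sorted"
  sort "$2" > "$b_sorted"
  cmp -s "$a_sorted" "$b_sorted"
  status=$?
  rm -f "$a_sorted" "$b_sorted"
  return "$status"
}
```

A strict comparison catches any change at all, while the sorted comparison tolerates the kind of reordering described above.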
This work was funded by the Pacific Salmon Commission Chinook Technical Committee Letter of Agreement.
Carried out by Eric C. Anderson and Veronica Mayorga.
Thanks to Matt Campbell for suggesting the program name.