Skip to content

CRISPRleader - An efficient tool to determining CRISPR leader boundaries

Notifications You must be signed in to change notification settings

amri2k/CRISPRleader

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CRISPRleader - An efficient tool to determining CRISPR leader boundaries

CRISPRleader version 1.0.3

CRISPRleader takes a complete genome or draft genome as input and selects all possible CRISPR arrays in the correct orientation, and annotates the CRISPR leader boundaries.

Please run test_bin.py script before executing the main program. It will check if executable files can be run and change the path in the mview file to the correct one.

To execute the program, please open your terminal console. Make sure "CRISPRleader.py" file and "bin","lib" and "Dataset" folder exists in your current working directory.

Requirements

  • Python 2.7. Additionally, please install the corresponding libraries(Docopt, Numpy, Urllib3 and Biopython). For example: pip install numpy --user OR sudo pip install numpy

Dependencies The following dependencies should be present in the respective folders for the successful execution of the program:

  • [bin folder]

    • CRT1.2-CLI.jar
    • EDeN executable file
    • prodigal executable file
    • hmmsearch executable file
    • mview executable file
    • Mafft folder
    • needleall executable file (please install EMBOSS 6.3.1 or above). Please note that if 'needleall' does not work, please copy the 'needleall' executable file from the EMBOSS directory to the CRISPRleader/bin folder
  • [lib folder]

    • Archaea_Final_Repeat_dataset.fa
    • Archaea_LeaderClusters_mean_std.tab
    • Bacteria_Final_Repeat_dataset.fa
    • Bacteria_LeaderClusters_mean_std.tab
    • clustInfos.tab
    • DR_Repeat_model

The program execution can be carried out with three input categories as given below:

  • Input with Repeat, Leader and the organism type. Execution is proceed as follows:

    CRISPRleader.py r_l_o "repeat" "leader" "organism" .

    For Repeat and Leader DNA sequances are expected as input Organism type "a" for archea and "b" for bacteria

    Example: python CRISPRleader.py r_l_o GAAATCAAAAGATAGTTGAAAC AAGAATGGCGTTGGGCCTCGGCGTTTTCTCAACCTCCACGTTGCTGTGCTTGCGAAGAATGGCGGCCGCCCACCCCGAAAAGAGACATATATGAATGTAAAACGCGGCAGAAAAGCGTCCACCGAAGA TACAAAAAACCTACAAAAAACTTAAAAACCCACAAAAACCAACAAAACCAGCCCCA a

  • Input with Accession number. Run the program by giving the following command:

    CRISPRleader.py d_f_o "acc_num" "organism"

    Provides an option to enter desired accession number and the organism type. Corresponding fasta file will be downloaded from NCBI Organism type 'a' for archea and 'b' for bacteria

    Example: python CRISPRleader.py d_f_o CP003098 a

  • Input with a file which contains Genome in fasta format. Execute by:

CRISPRleader.py f_c_o "file" "completeness" "organism"

Path to the file, completeness and the organism type should be provided Input completeness: "COMPLETE" or "PARTIAL" respectivelly Organism type 'a' for archea and 'b' for bacteria

Example: python CRISPRleader.py f_c_o NC017453.fa partial a

In all the above executions, the user should provide the organism type. -o provides an option to choose Archea "a" or Bacteria "b"

The ouput files are saved in a folder with name "Output" in the current working directory

Contribution

Feel free to contribute to this project by writing Issues with feature requests, bug reports, or just contact messages.

Citation

If you use CRISPRleader, please cite our article

About

CRISPRleader - An efficient tool to determining CRISPR leader boundaries

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published