add genome data
- sample.fasta
- README file

fixes #161
alexg9010 committed Jul 30, 2020
1 parent d24d985 commit 9a8654c
Showing 2 changed files with 149 additions and 0 deletions.
13 changes: 13 additions & 0 deletions tests/genome/README
@@ -0,0 +1,13 @@

## sample.fasta

The `sample.fasta` is extracted from chr1:7291263-7302041 of the hg38 human genome assembly.

## cpgIslandExt.hg19.bed.gz

The `cpgIslandExt.hg19.bed.gz` is downloaded from UCSC.

## refGene.hg19.bed.gz

The `refGene.hg19.bed.gz` is downloaded from UCSC.

136 changes: 136 additions & 0 deletions tests/genome/sample.fasta
@@ -0,0 +1,136 @@
>chr1
TCGTGCTGGAGTGGGGAATCTCCCTAGAGTGTATCTCGGGAGAGTGCCATGTTGTTAGGGAGAACCTAGGCTATGGAGTC
AGCCTGACCCAGTTCTGCGCTGGAGTAGTTACGTGAAAACAGTGGGCCTCAGTCTCGTCATCTGAAATGGAATTTCAGTG
CACAACTCCTAGGAGTTGTTCAGAGATGAATGTTAAGTGCCTGGCCCAGAGCCTGTTCCCAGTTGGCTGAGCTTTCCTCT
GTGTGCACGCCCCTCATCACACCAGCAAATGGTCAGCTTGTCTTCTGGGCCCTAGCGATGAATTAAGCGCTGCTGGGAGA
TCCCCTCTGGGTGTTGCTTCAGCTGCCTGACCTCAGGAGGCTGATATCACAGCAACAGCACTCACTGGGCTTTAAGCCTT
CACATTTCTTTTACTGGCAGAACTGGAAGGGCAGAGAGCCTAGCCCAGGAGTGGGTGGCCAGTGAGAACTGGGCAGCCAG
CCTGGCTCCACTCCATTGGCGGCATTGCCCTTGGACAGTTTTGGTTGTCACCCACATTTGGCTTCTTAGCTGAAACTCTA
GAGGTACATTCCTCCTTCCAACAACTGTCTGGAAACAGGAAGTGGTGCAGGGAAGAAAACCAGATGCCAGCTTCCCCCGC
TGTTTTGACTGGGCTCTGTAAATGGAAACAGCTGAGAATACTATCCAAGGGTGCGTGTGCCCATCTTAAAGCTGGTGTGT
GCATTGCAGGGAGGATGCTGGGTGCAAACATACTCTGACGTTTATATTGAATATTTTAGTTTGCAGAAATTCGCAATTAA
ATCTACTTTCCACCTACACTTTGAAAAATCCATTTATAGAAACCAAATCAAATTGGAGATCTAAGGACTGGTGGCCAGAA
ATGAAATATTTTTAAAATAAAATATATGGAGATTGAATTGCTGAAGCGCTGGCTTGTTCTGCGGCTTCCTGGGATGTGAG
CTAGAGGGAAGGCTTATTGGACAGAGCCTTCTCTGGGGATCAATTATGAACTACTCTTACGTCCTGGCAGTTTCAAAAGG
GAAAAAAAAAGTTCTTTGAAAGAACTTATTTAGCAGGCAGAGCCAATATTCTGTGTTTTCTTGAAGCCACCTTGTATTTC
AGCTTCGGTGAATATGTGTGTCTCCTGTGGATCTTTATGGGAATTGCTTGAACACTCCGGAGAGGCTGGATGTGGCATTA
AAACCACCTAACCTTTGCTCCCCCAGCCTCGCCATTCATCCTGAAGACTTAATTATTTCTCGAAAGATGATCCATACGGG
TACTGCCTTTTTCTTAGGAATGTGAGCCCAGATGCTGAATTCCCCACTCTCCGTACTCAGGAGAGAGAAGCTCAGACTCA
GCTCTGGCTTAAATTTCTGAATTGATCCCTGATTGAATTCCTCCTTTGAAAATTATCAAGAGCTGCTTAGTTTTTTCCCA
GTTCCTATGAGGGATGCCAAATCCTTGCATGCGTTTCTGCACTCCTCTGCACCCACGGACCACTCAGAGAAAATGCAACC
ATTTTGTTCAAGTGAGTTCTTATTGCCCTTCATGGGCTCAGTTCCCTTTGAGAAGACAACTGATCCCATGCCACGTGTTT
CCTGCAGCCTCGCTTCCGCCTTTAGGAAGCAGGCAGAGGAATTCCTACAGAGCACTAGGCCACCCCAGCGCCACCTCCCA
TGTCACACCAGGCAGAGGTTGGGTGGGATCTCTGGGAAGGCATGTCCTCCCTGCCTCCTGGGAATCCTGGGGAAGAACCA
CCCTAGGAAGTGGCAAGCTCCGGCGGGGTGTCCTACTCCCTGCTGGGTCACTGGAGCTTCCAGAGCCCAGGCTTCGTTGC
TTTTCTGCTGTATTCCTGCCCCCTTTCTACTGCTTTCCAGTGGACAGGGACGGGGTGGGGCTGCCTTCCTGGGAAGGGAT
TGCGCTGCAGAAGCTGGGGCTGGTAAAGGCTGGGGAGCAAATGCAGACCTTGTCTGGGTGATGAGAGACCGCCACTCGTC
TCCAGGAGCTCCCGCCAGAGGCCTGCTGTGACAAGGGATAACGTGCTGGCAGGCCACTTGTCCAAGAGAATCATTCCTTC
CTGCACCTGGACACTGTTTCAGGGCTGGCATTTCTCTGGCACTCGGTGTGAGCAAAGCCCATTTGGTCGTTTGTCAGTGA
AGACAGAAGAGTCACAATGAGCCGATTTATAGAGTACATGGGGTGTTGTGGGGCAGAGGCCCGGATGGCTCCATGATCCA
TGTGCATGTGTTCCCGGTTTTGTTTGTCATTTGCTTTCTTGCCATCCACTGGGAAATGTATTTATCACTACAGTTTGTAT
AATTTGGTGGTCTGTTTTCATCTTTTACAAGCAAAATGCTTGGTATTTCAATAAGATTGAAGGGGGGTGGAAATAGATTG
CATCCCACTTAGGACTCTCAGTGGGGGAGTCTAAACATGAATTATCCCAGCTTAATGCTTCAAGTTATTGGGGTTCCCCT
GGGGAGCATCTTCTTGGTTCTTGTAATGTTGCTGCCTCCCGATGGCTGCCCAGGGTTTCCAGCTCTCCCGTGCCCCTCGC
TTTTCTCTGAAAACTCGGCATGCCCCGGGGGGCCTCTTTGGAGCCTGTTGGAGGAGATTTTGTAGTAATGAGCATTTGGC
TTTTGTAATCCTTTTTCTGCTTCGTGCTAGCAGCCTCGCACTTCCCACCTGTTAAAGCTTGCAAAGATATCATCGGCAGG
CATTTTCCCAGGGGAAGCACAGTTGTGTCCAAAGCCCAGAATAATCCAGGCTTTCCTCATTTCTATGCAGAGGGCGTTAG
TTGATCCTGGTGCCCTCAAGAGAAGGTCCCTCCAGAACAAGCTCACCCTCCATCACAAGCACTGATGATGGGGGTGCTCA
AGCTGGGGCAGGGAGAGGCTGAGAGGGGATAACCCAAGGTGGCAGGAGGCTCCCTCGGTGGTTTGTGGTGGGGCTGTTCT
GTTCAGCCAGGATGTGGCCCTTTCAGGGGTTTACGCTGTTTAACACTGGCTGAGTGTCAGGACTGGGGGGCTATGGATGG
TGAGCTTCTCCCTGGAGCTGGCTGGTAGGGAAGTTCAGCAGCTCTCTGGATGTTATGAAACCACGGAGAGGCACAGTCAC
CTTAAAAGGACAGCAGCTCTCACTGGCTGCTAGACTGCCTCTGATGGAAGGTGTCTGCAAGACAAAGGAGAGGCTGCGGG
CAGCAGTGTGCGGGGGACAGGTGATCTTGGCCAGCATCCTGCTTGCTCATATTCCTCGGTGAGACCAGCTCTGCAGAGAA
AGGGCTGCTTCAGCCCTATTCACCCAAAACCTCAATGTAGAAACCACCCACTTTGCTCTCGTTCAGCTTCCAGAGGGAAT
GCTGTCAGTGACACAGCATTCTGGTCCCTGGAGTAGTGGGTTCTTGACATTAACGAATGGGCATTGTGATTTTATCCTTG
TGCCTTATGAACTTGGTTAAAGATTTGACAGTCACTGTCAGAGGCTGCAAGTCATGCAGGGGCTTCTGAGCCTGCAGGTG
AGGACCAGCCCAGACCAGGGTGAAGTCTCAGAGAGACTCTCCGTGGACCCTTCACCAAATGGCCTCCAAGGCAAGGCCCT
TGATCACAAGTGTGTCCCTGGTTCTCCCTGGTCCTCTCGCATCTGCCCCGTGACTCCCGACTCCCCTCTGCACTTGGTCT
GGTCTCCAAGATCGTTCTGATGTGGACTTTCCTCCTTGTCTATGCTGGCAGACCTCAGGAGGAATCCACCACCTCCGCTT
TGCTGCCAGGCCAATCACTGATAGCATCAAGCACACTGGCTTGAAGTCAGACTTCAGGTGTTCAAATACTGACTCCACCT
TCACTTACATCTGACTCGGGCAGTTATGTAACCTTCTGGCTTTGGGGTTTTTTTTGTTGTAAACAGGGACCCTAATGGCA
TCTGCCTCCCTGGAGTTGGAAGCTGGAACGGGGCAACACATGGACAATGCTGGGAACAGTGCCTCACACATGCCCGTACT
CAATGCTGTCATGGGACCCATTCCGGCCCTGTCTCCTACATTGCATCTCCATCTGTGATCTTTCACTCAGTGCCCCGGGG
TTGAGAACTAAACCCCTCTACTCATTCTCACAAGCAGGCCTTTTCCTCTTCAACCAGATTGTAACTCCCTGAGGGGAGGG
CTTGTTGTGTGTTTCCTTTGGATCTGCACAGTGGTACTTAGCAGCGTTCTAGGCACCTGGTAAGGGTTTGAGTAAAACAG
ACTGATTGGATAGTTAGTGATGCTGGGAATCTGGAGTGTGAATTAAGCAGTATATTCCAAATTACTTCCAGGCGTTCATC
ATTTCAGTATCACACGGCCACATGGGAAAATGCCCACTGTGTGAAATTTATTCATCAGTTCAACAAATATTTATCGAGTA
CTTTCCATGGCATGGTTTTGGTTCCTGAAGATTCAAAAATGAACAGACAGAAAATATCCACGCCCTTGTCACTTACCTGC
AGCAGACTGACAGTGGTCAGATACAATCTCAAGAGGTGATCAGTGCCACAGAGAAAATAGAGCAGGGTGTATTGGTTATC
TATCTCTGTGTAACAAATTATCCCCAAACATAGCAGTTTAAAACAATGCATATTTATTATCTGACCATTTCTGTGGGCTG
GGGCCAGGCATAGCTCTGCTGATCCTCCCACTCAGGGCTCCTTACAAGGCTACCGAAAAGGTGCTGGCCAGGGCCGTGGT
CATTGCAAGGTTCAACTGAGTCAAAATCCACTTCTAAACTCGATCATGGGGTTGTTGGGAGGATACAGTTTCTCATGAGC
TGTTGAACTAAGAGCTCAGGTTCCAGCTGTTTGGTGAATGCCATGTGGGCCTCTCCATGGAACACAACAGGAAAGCTTGC
TTCATCAGAGTGAGCAAGTGAGAAGAGCCAGAGAGAGTGAACACGCCAGGGACCTTGAGTCACCATCTTCCCAGTAAAGT
AAGTGCCATTAGGAACCCCACTTTACAGACGAAAATGGAGGCCCAGAGAGTTAAGTATTTTGGCCAAGGTTACATAACTA
TGTAACTAGTGAGTGGCAGGGCCGAGGTTTGATCCCAACCTGTTTGGCTCCAGTACTTGTGCTCCTAACTACCACAGTGG
ACCATTGCTCTGCCTTTTAGTTGCACACACTGTCTGCTGCAGTCTTCTTGACAAGAGAAGATGATGACTCAGGAACTTCA
GCAGTAGTGGGGATGGGAAGAAGTGGGTTGACTGGGAATACATTTTGAAGAGATTGTTGCTAGAACTTCTTGATAATTTG
GGTGTGGGAATAAGGTGAAAGAAAGATGTTAACAATGATCCCAAAGTTGGGTGCCTAAGAAACTGAGTGAATGACGGTGC
TATTCCTCAAGATAGAAGATGATGGTAGTGGAGCAGGTCTAGGGGGAGAATAAAGAGTTCTGTTTTGGCCACGTCAAGGG
TGATGTTCAGATGTAGATGTCAAGTGGCCCATTGAATACATATGTCTGGAGCTCAGGGGAGCAGTCAGACCAGAGATACA
GACTTTTGTAGTCATTAATCACGGATGACCACCGAGAGAAAGAGTATAAAGGAGAGAAAATGACTAAAGACTTAGACCCC
TGACACCTAAAACATTTAGCTGCCTGGACAAAAAGGGAGAGCCAGCAAAGGAAATGGAGACACAGGGGGCAAGAAAACCA
GGAAAGCATGGTGACCTGGAAGCCAGGTGAAGAGGGGCTTCCAAGAAGGAGGAAATGGTTAGCTGTGCCAAATGCTTGGT
GAGAGGGGATGGAGGACCAAATGGAAAAAGAAAGCTACAAAACTGTATTTTCAGAAATGTCCTAATTTTGCAAAAATAAT
TTCTCCCTGCCCAAAAAGAGTAAAAAGCTCTTTGAGTTTGTTAAATCTGTAGTCCAATTTTAATCCATGAACATGGTATA
TCTCTATTCTAGATATTTTAGAAGATATGTGGTAGGGGAGTGTGGGTTATGTTGTTTTCTATAAAAGGTGTTGAATTTTG
TTCTGGCAGGTCAGTTAATTACTGGCAGATCCTCTTGGTCTGGTCAGGCTTGGTTTGAATCTTTATTAGAGTTGGTCCAT
TTTTATTTTGCCCTTATTCCTAGGCTTTGGCCTTTCCTGTAGGGTATAGTCCTTGTTCCTAAGGCCTGGCCTTTTGGGGA
ACAAATGCCCAAGGTACTCAGCAAGGGTTCTCTCGTCTGGCTTGGCCAGAATTCAAAGCTCTGCCAGCTCTGCATGACTT
CTAGTATCTCTGCTTAGCTTTCAGTCTTATAACAGCTATGGAGTTTTGCCCTGCATGTGGGCAGTTCTGCCCTCAACCAA
GTACTCACAGGGTACCTCCAGGCAGACTTCTAGGGCCTACTTTCTGCACAAGATTCTTCTCTCCAGAGCCCAGACCTGTA
AGTTCTAGCTGCTTTAGCAGCCCCAAACTCTTGATTTTTGCCTTCTCCTATAATCTAGGCTGCTGTGCCCCATCGTCCTG
CATAATGGGCCAAAAGGCTGGGCAAACATGGGGCTTATCTTGTGTGTTTTTTTTCTGGCAGAACCACAGTCCTGAACTGC
CTGTTGTACAATCCCTACAAAGAGTAGCCTCAGATATTTTTGTCTAGTTTCATGATTGTTTACAGCAGGAGGGCAAATCC
AATCCTAGTCACTTCATTGTGGCAGGAATCAATATCTATTTGGTTTAATGTTTACCATTCCTTCCCATTCCCTATACCAG
CTCATCACCCTTGTCAGCGGCAGTTCTATATATGTTTGTAGTCCTTTCAGTTGAAATAAAATTGAGTCAAACAATTTATA
ATCTGAAATGAACAAAGACCATGTAAATTCTTTTCAGTTGCCTTTTTAGGCTGGTATCTGTCTGTAGAGGACTCCCAGGG
AAGGATTCTCAAGGAAGACTCAATGGGAAGCCTTTGGATGGCAAGCTACAGGGCTGAGTACGAGGGTGGCCTTCCTGTGC
GGGGTGTAGGATTAGCATCAGAGAAACAAATCAGGAATCCCTGGGTATAACAGTAGGGGCTGCATTGTCATTTTAGCCTT
CTCCAGAGATGCATGATCCACCCCCCACCCCGCAGGTCTTTTCCTCTCTATGAAAGTCACTGACAAGTTCCTGTGGATCC
CCTGGGGTCCCCAGGAATGTATTTCTACTCTGAGGGCCAGTGAGTATCTTTTTCTTGGCCATGAGCCATGAATAGAAACG
GGACCCTTAATACTGCCTGACCTGCCACATGCAGCCCTCTCTTCTGGGACTGTCTACACAAGTACATTTCCACCATCTAC
AGGGGCGGAATTCCGGTCATTTGCTACAGGGGAAATTCTAGATTTTTGTTTCTAAAGGGCCCCAGGTTACCAAGGTCTCT
TGTATATTAGAATTTTACCACTTCTCTCTGGGGGAATTAGGTATTTATCAATACATTTTGTAATATATATTTACCAAATT
GTTTATCCAGCAAACTTTTATTAAGCAAATGCCAGGCATTGTGTTAGGTTCTGAAGCTCTCCCAGTTCCTTATCCAGGGC
CCCAGGAACGCACTCAGGGACCATTTCAGACACGTGGGTGAGTCAGTCCCTAATGGGCTCAGAAGAGAATCACCAAAATG
TTTAAGACAGAACAGTCCCGAGCAGCCAACATTTCACCCTTCCTTCCTCTCTCTTCCCTTGCCGACCAGACCCAAGGGTT
TTGGACTCAACTGTTTCCTTTTGTAGCAATGAGCTAAGGAGGCTATTTGCTTTGACTAAGGGGCCAAGATGATGAAGTCA
GATGTCCTGAGTTTGCATCTTGCCTCTCCACTGACTTGCTGTGTGACCTTAAGCAAGTACCTTCCCCTCTCTAGGTCTGC
TTCCTGAAAATTCAAAGTCTCTGAAAATTCTGTTTTAAAATATGTGCTGGAGGTGGGGAGACCTGATTCCCTGCCTAGTG
CTTCCTTCGTTTTCATGGATTCTTTATTAAGCAGATATTACTTCTTCATAATAAAATCGTATATGAAACAATGACTTGAT
TTCTTTATATGAAACATACTTTGTACATCTAAAAATGTTGTTGAAATCATCTGAGCCTCTGAGCAAGAGCTTCGCCTTTG
ACTCTGAAGCCAGTGCTGTCAGCAGTTTCTCCCCCTTAATGTAATGCTGTATGTGACTCAATATGAACTAAGCAGGGGGT
GGGAGGTACAGACCACCCCAGTAATCCAGGCTTTCTCTGGAATTCTCCCACTTGTTTATAGTCTGGGCAATTTGCCTCCT
GCTGCTGCATGACTCAGACTCGGAGAGCTTCCAACCTCGCAATGAGCCAGCGGCTCCCCGGAGTCAGACGCTGGCCCTTG
TGTCTCAGGCTCCTGGAACCTTCCGATTAAGCCTCCACTTCCATCTCCTCCCTGGCCACATTCAGGCAGCAATGCCCCAA
CCCCACACAGAGACACACGCCCAATCAGTGCCCTGTACAAGCAGGGACTGTGGGTGAGAGGGTACAAGGAGGTGTTCCAA
CCCCTCACTGAGACAGGATTTCTAGCCCCTCTCTTTGGTTTGTGTCATTTTGCTTTGAAATGTAAGTTGACTTCTGATCA
GGGCAGATCTCAGTTTTATAGGACCTTGTGTAAGAATAAGCAATACAAAATAAGGAACACAAAATGAGACCTAGAGTCTT
AGAAGGGGCCCGTGCAACCAGAGGTGCCCCGAAAGCTTAAGTGTCATTGACTTCCTAGCAACCCTGACTGTGGTCTGATT
CCTTTAAGATCCAGAACAACTGCTCCCTGTACCTTAAGTAATTATCATTTAGAATAAAGTCAGAAAATTGCTTGCTGTCT
GTTTATTTTAGAATCAAGCAGCTTCTGTATAATTTGAATTTGAGATAGCATTCTTAAAAAAAAATTCCCTTCGCAATAAA
GCACCAGATTGAATGTCCGATGAGTCACAAATGCCCTTCTGAAAACAGCTGCATTAGAGGGACTCACTCAGAAAACATGA
TTTTCCTTCCTCCAAGGGTCTCCCCTAGACCCTGGCACAGAATCACCCTGGTCCCCACTCTGGTCCTCCCTTCCAGGCTC
ATTCCTGGTGCAGCAGATCACTTCCACATTTACTGCATCACATCCAGCATCAATGTTTTTAACTCCATGAATTGTTTAGA
AAATTGTATAGAGGCCAGGCGCAGTGGCTCACGCCTGTAATCCCCATACTTTGGGAGGCCAAGGCAGGCGGATCACCTGA
GGTCAGGAGTTCGAGACCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAATTAGTCAGGCCTGGT
GGTAGGCATCTGTAACCCCAGCTATTCGGGAGGCTGAGGTAGGAGAATCACTTGAACCCGGGAGGCAGAGGTTGCAGTGA
GCCAAGATGGTGCCACTGCACTCCAGCCTGGACAACAAAGAGCAAAACTCTGTCTCAAAAAAAAAAAAATTGTATAGATA
TTGATTAAAGCAGAACCCTGAAAGATGGTGAATGATTGGCAGGCAATGAGAGGAAGTGTCGTGTTGTCCGTTGGTGACAT
TTGCAAACTCTCCTGGTTTAAAGTGATTCCAGAGCAACTTTATTTCCATAGTCTGTGTTGCACTCAAAGTAAATATTAAT
TCCATTATGCCATAAACTTTTATTTATCATGACAGCAAGTCATAAAGTTCTCTTACTAGAATATAGCTTCTAAATTGCAT
ATCATAGATGAGATACAACACACACATTGGATTCTAGCAGCAATTACCCACCATCCTTTATGCAAGTCGGTATTAAATGC
ACTTTCTAAATATGAGTCACAGTCTGAGCTTATGCAGAAAATCTTCCATGGCTATTACCCAGGAGCCCAGAAATATGACG
TGAGGGTGACAGAGGAGCCACAGCCAGCAGGGGCCAGACACAAGGATCAAAGTGCACCTAGTCACCCGGCAGTAATTCAT
CTGAAGCATTCTTTGCATATTATGGGTCTGACTTTTTAAATTACGTAAATTGCTCATTAGAAGTGCCTTGTAATTGCCTT
CACCTGCGGTATTGCGAGAGGTGTAGATTTTAATTATTCTTTTTTGATTTGTAATGTGGTTTGACTTTTTCAGGGTGATA
GTTTGGAAAAATACAGATCTGCTCTAATTGTTGTACCAAGTGAAAATAGATGGCTCTGTTGATCAACTTCCTTGACTTTG
TGGTGGAGCGTACAAAGGAGTCTTATAGTTTCCCAGCCCTTGTTGATGACTGAGGGGCTTTGCCCGTGGGGCCCGGGGAG
GATCCCTGTGAGCCTGTAGTGCTGGTGCCTTGGGCTCAGCCAGCCCCTCAGATGACTTGTGTGGAGCCAGGCAGTGTGTT
CATGTTGGATGTAACCAGATCTCGGGGGAGTCTGCTCACCTCTTCCCCACAGCATGGATAGGACAGAAAAAGCCAAGGCC
CATTCATCATCTTTGAAGACTGGGACAGACATGATGAGTGCTGGGGGCGCCCAGGTCCGGGGTGCTCCTGGGATGCCTGC
TGCAGAGATGCTTTTCTCTCTAGCCCACTTCTCGAGGGCAGCCTGGCCTCCTGCTACCCAGCAAGTGCTCTCACTTCCTG
CAGCACAGAGGCCTCCACCTGTCCAGCCCACCCTGAGCCAAGCTGCTCTGGGCTCTGGAACGGGCTGCAGGCTTTCTCGG
GCCCCTGGGCCTGGCCCCCTGTCATCCAGCACACGGAATTGATGAGTTGTGAGACCAGAGCTGTGGCTTCGGGGGTAAAC
TTCCAGGACTCGTTCTGGCTCATCTGCAGCCATGGATGAATAAAAACTCGATCAGGACG
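
The README states that `sample.fasta` covers chr1:7291263-7302041. A minimal sketch of how that claim could be sanity-checked is to compare the region length with the number of bases in the file. It assumes the coordinates are 1-based and inclusive (the README does not say); the helper names `region_length` and `fasta_length` are hypothetical and not part of the repository.

```python
import re


def region_length(region: str) -> int:
    """Return the length of a region string such as 'chr1:100-200'.

    Assumption: coordinates are 1-based and inclusive on both ends.
    """
    match = re.fullmatch(r"(\w+):(\d+)-(\d+)", region)
    if match is None:
        raise ValueError(f"not a region string: {region!r}")
    start, end = int(match.group(2)), int(match.group(3))
    return end - start + 1


def fasta_length(lines) -> int:
    """Count the bases on all non-header lines of a FASTA file."""
    return sum(
        len(line.strip()) for line in lines if not line.startswith(">")
    )


# Possible usage inside a test (path taken from the diff above):
# with open("tests/genome/sample.fasta") as handle:
#     assert fasta_length(handle) == region_length("chr1:7291263-7302041")
```

If the two numbers disagree, either the coordinate convention differs from the assumption or the file was extracted from a different region.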

2 comments on commit 9a8654c

@Blosberg
Collaborator

"The sample.fasta is extracted from chr1:7291263-7302041 of the hg38 human genome assembly."

This could be confusing, since the cpgIsland and refGene files are for hg19. Also, the name "sample" suggests something specific to the reads being aligned. I might rename it to something like "RefGenSubset_hg19.fasta".

@alexg9010
Member Author

The FASTA file is just example data, and in that case the naming is not really relevant; only the content and the extension are important.
I do not really see the need to change the sample data, as it is only used by us developers and works as expected.

We could, however, highlight in the documentation that the user should use only data from the same assembly.
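
As a rough illustration of the "same assembly" rule, a check along these lines could flag mixed inputs. It is only a sketch: it assumes the assembly name (e.g. `hg19`, `hg38`) appears in the file name, which is true for the two BED files here but not for `sample.fasta`, and the names `assembly_tags` and `same_assembly` are hypothetical.

```python
import re

# Assumption: assemblies are named like hg19, hg38, mm10 in the file name.
ASSEMBLY_TAG = re.compile(r"(hg\d+|mm\d+)")


def assembly_tags(filenames):
    """Collect the distinct assembly tags found in the given file names."""
    tags = set()
    for name in filenames:
        match = ASSEMBLY_TAG.search(name)
        if match:
            tags.add(match.group(1))
    return tags


def same_assembly(filenames) -> bool:
    """Return True when at most one assembly tag appears across all names.

    Files without a recognizable tag are ignored, so this cannot prove
    that the data really come from the same assembly.
    """
    return len(assembly_tags(filenames)) <= 1
```

A file-name check like this only catches obvious mix-ups; the file contents would still need to be verified against the intended assembly.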
