ADD: cross-compatibility with either Python2 or Python3 #52

Open
wants to merge 91 commits into
base: master


kevinkle
Collaborator

@kevinkle kevinkle commented Feb 4, 2018

No description provided.

@kevinkle kevinkle reopened this Feb 12, 2018
@kevinkle
Collaborator Author

The main functions work in py27 as of 0f8cdf2. There's no rush to merge this PR, but I would recommend doing so before making any other changes.

@kevinkle
Collaborator Author

Tests passing in py36, 1 failing in py27:

$ nosetests
2018-02-12 01:16:48,609 ectyper.ectyper INFO     Temporary files and directory created
2018-02-12 01:16:48,609 ectyper.ectyper INFO     Temporary files and directory created
.2018-02-12 01:16:48,610 ectyper.ectyper INFO     Starting ectyper
Serotype prediction with input:
             Namespace(detailed=False, input=u'test/Data/test_dir', output=u'test_folder_input', percentIdentity=90, percentLength=50, species=False, verify=False)
             Log file is: /home/travis/build/phac-nml/ecoli_serotyping/ectyper.log
2018-02-12 01:16:48,613 ectyper.ectyper INFO     Temporary files and directory created
2018-02-12 01:16:48,613 ectyper.ectyper INFO     Gathering genome files
2018-02-12 01:16:48,613 ectyper.genomeFunctions INFO     Gathering genomes from directory test/Data/test_dir
2018-02-12 01:16:48,613 ectyper.ectyper INFO     Removing invalid file types
2018-02-12 01:16:48,618 ectyper.genomeFunctions WARNING  test/Data/test_dir/badfasta.fasta is not a valid file
2018-02-12 01:16:48,619 ectyper.genomeFunctions WARNING  test/Data/test_dir/sample.fasta.zip is not a valid file
2018-02-12 01:16:48,621 ectyper.genomeFunctions WARNING  Compressed file is not supported: test/Data/test_dir/sampletar
2018-02-12 01:16:48,622 ectyper.genomeFunctions WARNING  test/Data/test_dir/test_junk.txt is not a fasta/fastq file
2018-02-12 01:16:48,622 ectyper.ectyper INFO     Preparing genome files for blast alignment
2018-02-12 01:16:48,625 ectyper.ectyper INFO     4 final fasta files
2018-02-12 01:16:48,625 ectyper.ectyper INFO     Standardizing the genome headers
2018-02-12 01:16:48,627 ectyper.ectyper INFO     Start creating blast database #1
2018-02-12 01:16:48,627 ectyper.ectyper INFO     Using SEROTYPE_FILE: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_data.fasta
2018-02-12 01:16:48,627 ectyper.ectyper INFO     Using SEROTYPE_ALLELE_JSON: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_dict.json
2018-02-12 01:16:48,754 ectyper.ectyper INFO     Start blast alignment on database #1
2018-02-12 01:16:50,872 ectyper.ectyper INFO     Start serotype prediction for database #1
2018-02-12 01:16:50,872 ectyper.predictionFunctions INFO     Predicting serotype from blast output
2018-02-12 01:16:50,913 ectyper.predictionFunctions INFO     Serotype prediction completed
2018-02-12 01:16:50,921 ectyper.ectyper INFO     Outputs stored in /home/travis/build/phac-nml/ecoli_serotyping/output/test_folder_input
2018-02-12 01:16:50,921 ectyper.ectyper INFO     
Reporting result...
2018-02-12 01:16:50,926 ectyper.predictionFunctions INFO     
genome O_prediction              O_info H_prediction           H_info
 sample            -  No alignment found          H10  Alignment found
sample2         O148     Alignment found          H44  Alignment found
sample3         O148     Alignment found          H44  Alignment found
sample4         O148     Alignment found          H44  Alignment found
.2018-02-12 01:16:50,928 ectyper.ectyper INFO     Starting ectyper
Serotype prediction with input:
             Namespace(detailed=False, input=u'', output=None, percentIdentity=90, percentLength=50, species=False, verify=False)
             Log file is: /home/travis/build/phac-nml/ecoli_serotyping/ectyper.log
2018-02-12 01:16:50,928 ectyper.ectyper INFO     Temporary files and directory created
2018-02-12 01:16:50,929 ectyper.ectyper INFO     Gathering genome files
2018-02-12 01:16:50,929 ectyper.ectyper INFO     Removing invalid file types
2018-02-12 01:16:50,929 ectyper.ectyper INFO     Preparing genome files for blast alignment
2018-02-12 01:16:50,929 ectyper.ectyper INFO     0 final fasta files
2018-02-12 01:16:50,929 ectyper.ectyper INFO     No valid genome files. Terminating the program.
.usage: nosetests [-h] -i INPUT [-d PERCENTIDENTITY] [-l PERCENTLENGTH]
                 [--verify] [-s] [--detailed] [-o OUTPUT]
nosetests: error: argument -d/--percentIdentity: 999 is an invalid positive int percentage value
.2018-02-12 01:16:50,933 ectyper.ectyper INFO     Starting ectyper
Serotype prediction with input:
             Namespace(detailed=False, input=u'test/Data/test_dir/sample.fasta,test/Data/test_dir/sample.fasta', output=u'test_list_input', percentIdentity=90, percentLength=50, species=False, verify=False)
             Log file is: /home/travis/build/phac-nml/ecoli_serotyping/ectyper.log
2018-02-12 01:16:50,934 ectyper.ectyper INFO     Temporary files and directory created
2018-02-12 01:16:50,934 ectyper.ectyper INFO     Gathering genome files
2018-02-12 01:16:50,934 ectyper.genomeFunctions INFO     Using genomes in the input list
2018-02-12 01:16:50,934 ectyper.ectyper INFO     Removing invalid file types
2018-02-12 01:16:50,936 ectyper.ectyper INFO     Preparing genome files for blast alignment
2018-02-12 01:16:50,937 ectyper.ectyper INFO     2 final fasta files
2018-02-12 01:16:50,937 ectyper.ectyper INFO     Standardizing the genome headers
2018-02-12 01:16:50,938 ectyper.ectyper INFO     Start creating blast database #1
2018-02-12 01:16:50,938 ectyper.ectyper INFO     Using SEROTYPE_FILE: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_data.fasta
2018-02-12 01:16:50,939 ectyper.ectyper INFO     Using SEROTYPE_ALLELE_JSON: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_dict.json
2018-02-12 01:16:51,064 ectyper.ectyper INFO     Start blast alignment on database #1
2018-02-12 01:16:52,525 ectyper.ectyper INFO     Start serotype prediction for database #1
2018-02-12 01:16:52,525 ectyper.predictionFunctions INFO     Predicting serotype from blast output
2018-02-12 01:16:52,553 ectyper.predictionFunctions INFO     Serotype prediction completed
2018-02-12 01:16:52,562 ectyper.ectyper INFO     Outputs stored in /home/travis/build/phac-nml/ecoli_serotyping/output/test_list_input
2018-02-12 01:16:52,562 ectyper.ectyper INFO     
Reporting result...
2018-02-12 01:16:52,567 ectyper.predictionFunctions INFO     
genome O_prediction              O_info H_prediction           H_info
sample            -  No alignment found          H10  Alignment found
sample            -  No alignment found          H10  Alignment found
.2018-02-12 01:16:52,569 ectyper.ectyper INFO     Starting ectyper
Serotype prediction with input:
             Namespace(detailed=False, input=u'test/Data/Escherichia.fastq', output=u'test_valid_fastq_file', percentIdentity=90, percentLength=50, species=False, verify=False)
             Log file is: /home/travis/build/phac-nml/ecoli_serotyping/ectyper.log
2018-02-12 01:16:52,569 ectyper.ectyper INFO     Temporary files and directory created
2018-02-12 01:16:52,569 ectyper.ectyper INFO     Gathering genome files
2018-02-12 01:16:52,570 ectyper.genomeFunctions INFO     Using genomes in file test/Data/Escherichia.fastq
2018-02-12 01:16:52,570 ectyper.ectyper INFO     Removing invalid file types
2018-02-12 01:16:52,571 ectyper.ectyper INFO     Preparing genome files for blast alignment
E2018-02-12 01:16:56,271 ectyper.genomeFunctions INFO     Using genomes in the input list
.2018-02-12 01:16:56,271 ectyper.genomeFunctions WARNING  123 is not found
2018-02-12 01:16:56,272 ectyper.genomeFunctions WARNING  test/Data/test_dir/test_junk.txt is not a fasta/fastq file
2018-02-12 01:16:56,273 ectyper.genomeFunctions WARNING  Compressed file is not supported: test/Data/test_dir/sampletar
2018-02-12 01:16:56,273 ectyper.genomeFunctions WARNING  test/Data/test_dir/sample.fasta.zip is not a valid file
.2018-02-12 01:16:56,275 ectyper.ectyper INFO     Start creating blast database #1
2018-02-12 01:16:56,275 ectyper.ectyper INFO     Using SEROTYPE_FILE: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_data.fasta
2018-02-12 01:16:56,275 ectyper.ectyper INFO     Using SEROTYPE_ALLELE_JSON: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_dict.json
2018-02-12 01:16:56,394 ectyper.ectyper INFO     Start blast alignment on database #1
2018-02-12 01:16:57,566 ectyper.ectyper INFO     Start serotype prediction for database #1
2018-02-12 01:16:57,566 ectyper.predictionFunctions INFO     Predicting serotype from blast output
2018-02-12 01:16:57,593 ectyper.predictionFunctions INFO     Serotype prediction completed
.2018-02-12 01:16:57,605 ectyper.ectyper INFO     Start creating blast database #1
2018-02-12 01:16:57,605 ectyper.ectyper INFO     Using SEROTYPE_FILE: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_data.fasta
2018-02-12 01:16:57,605 ectyper.ectyper INFO     Using SEROTYPE_ALLELE_JSON: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_dict.json
2018-02-12 01:16:57,733 ectyper.ectyper INFO     Start blast alignment on database #1
2018-02-12 01:16:58,612 ectyper.ectyper INFO     Start serotype prediction for database #1
2018-02-12 01:16:58,612 ectyper.predictionFunctions INFO     Predicting serotype from blast output
2018-02-12 01:16:58,613 ectyper.predictionFunctions INFO     No hit found for this blast query
2018-02-12 01:16:58,632 ectyper.predictionFunctions INFO     Serotype prediction completed
.2018-02-12 01:16:58,650 ectyper.ectyper INFO     Start creating blast database #1
2018-02-12 01:16:58,650 ectyper.ectyper INFO     Using SEROTYPE_FILE: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_data.fasta
2018-02-12 01:16:58,650 ectyper.ectyper INFO     Using SEROTYPE_ALLELE_JSON: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_dict.json
2018-02-12 01:16:58,773 ectyper.ectyper INFO     Start blast alignment on database #1
2018-02-12 01:16:59,867 ectyper.ectyper INFO     Start serotype prediction for database #1
2018-02-12 01:16:59,867 ectyper.predictionFunctions INFO     Predicting serotype from blast output
2018-02-12 01:16:59,893 ectyper.predictionFunctions INFO     Serotype prediction completed
.2018-02-12 01:16:59,906 ectyper.ectyper INFO     Start creating blast database #1
2018-02-12 01:16:59,907 ectyper.ectyper INFO     Using SEROTYPE_FILE: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_data.fasta
2018-02-12 01:16:59,907 ectyper.ectyper INFO     Using SEROTYPE_ALLELE_JSON: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_dict.json
2018-02-12 01:17:00,034 ectyper.ectyper INFO     Start blast alignment on database #1
2018-02-12 01:17:01,593 ectyper.ectyper INFO     Start serotype prediction for database #1
2018-02-12 01:17:01,593 ectyper.predictionFunctions INFO     Predicting serotype from blast output
2018-02-12 01:17:01,623 ectyper.predictionFunctions INFO     Serotype prediction completed
.2018-02-12 01:17:01,635 ectyper.ectyper INFO     Start creating blast database #1
2018-02-12 01:17:01,636 ectyper.ectyper INFO     Using SEROTYPE_FILE: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_data.fasta
2018-02-12 01:17:01,636 ectyper.ectyper INFO     Using SEROTYPE_ALLELE_JSON: /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/ectyper_dict.json
2018-02-12 01:17:01,766 ectyper.ectyper INFO     Start blast alignment on database #1
2018-02-12 01:17:03,172 ectyper.ectyper INFO     Start serotype prediction for database #1
2018-02-12 01:17:03,172 ectyper.predictionFunctions INFO     Predicting serotype from blast output
2018-02-12 01:17:03,198 ectyper.predictionFunctions INFO     Serotype prediction completed
..2018-02-12 01:17:04,113 ectyper.speciesIdentification INFO     Salmonella.fasta is identified as an invalid e.coli genome file by marker approach
.
======================================================================
ERROR: test_valid_fastq_file (test.ectyper_test.TestEctyper)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/phac-nml/ecoli_serotyping/test/ectyper_test.py", line 58, in test_valid_fastq_file
    ectyper.run_program()
  File "/home/travis/build/phac-nml/ecoli_serotyping/ectyper/ectyper.py", line 69, in run_program
    raw_files_dict, temp_files, verify=args.verify, species=args.species
  File "/home/travis/build/phac-nml/ecoli_serotyping/ectyper/ectyper.py", line 265, in filter_for_ecoli_files
    ffile, f, temp_dir, verify=verify, species=species)
  File "/home/travis/build/phac-nml/ecoli_serotyping/ectyper/ectyper.py", line 294, in filter_file_by_species
    genomeFunctions.assemble_reads(genome_file, combined_file, temp_dir)
  File "/home/travis/build/phac-nml/ecoli_serotyping/ectyper/genomeFunctions.py", line 260, in assemble_reads
    return split_mapped_output(output)
  File "/home/travis/build/phac-nml/ecoli_serotyping/ectyper/genomeFunctions.py", line 284, in split_mapped_output
    SeqIO.write(identif_seqs, output_handle, "fasta")
  File "/home/travis/miniconda/envs/test-environment/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 491, in write
    count = writer_class(fp).write_file(sequences)
  File "/home/travis/miniconda/envs/test-environment/lib/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 214, in write_file
    count = self.write_records(records)
  File "/home/travis/miniconda/envs/test-environment/lib/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 199, in write_records
    self.write_record(record)
  File "/home/travis/miniconda/envs/test-environment/lib/python2.7/site-packages/Bio/SeqIO/FastaIO.py", line 200, in write_record
    self.handle.write(">%s\n" % title)
TypeError: write() argument 1 must be unicode, not str
-------------------- >> begin captured logging << --------------------
ectyper.ectyper: INFO: Starting ectyper
Serotype prediction with input:
             Namespace(detailed=False, input=u'test/Data/Escherichia.fastq', output=u'test_valid_fastq_file', percentIdentity=90, percentLength=50, species=False, verify=False)
             Log file is: /home/travis/build/phac-nml/ecoli_serotyping/ectyper.log
ectyper.ectyper: DEBUG: Namespace(detailed=False, input=u'test/Data/Escherichia.fastq', output=u'test_valid_fastq_file', percentIdentity=90, percentLength=50, species=False, verify=False)
ectyper.ectyper: INFO: Temporary files and directory created
ectyper.ectyper: DEBUG: {u'output_file': u'/home/travis/build/phac-nml/ecoli_serotyping/output/test_valid_fastq_file/output.csv', u'assemble_temp_dir': u'/tmp/tmpGQSplF/assemblies', u'output_dir': u'/home/travis/build/phac-nml/ecoli_serotyping/output/test_valid_fastq_file', u'fasta_temp_dir': u'/tmp/tmpGQSplF/fastas'}
ectyper.ectyper: INFO: Gathering genome files
ectyper.genomeFunctions: INFO: Using genomes in file test/Data/Escherichia.fastq
ectyper.ectyper: DEBUG: [u'/home/travis/build/phac-nml/ecoli_serotyping/test/Data/Escherichia.fastq']
ectyper.ectyper: INFO: Removing invalid file types
ectyper.ectyper: DEBUG: raw fasta files: []
ectyper.ectyper: DEBUG: raw fastq files: [u'/home/travis/build/phac-nml/ecoli_serotyping/test/Data/Escherichia.fastq']
ectyper.ectyper: DEBUG: {u'fasta': [], u'fastq': [u'/home/travis/build/phac-nml/ecoli_serotyping/test/Data/Escherichia.fastq']}
ectyper.ectyper: INFO: Preparing genome files for blast alignment
ectyper.subprocess_util: DEBUG: Running: [u'bowtie2', u'--score-min L,1,-0.5', u'--np 5', u'-x', u'/home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/bowtie_index/combined', u'-U', u'/home/travis/build/phac-nml/ecoli_serotyping/test/Data/Escherichia.fastq', u'-S', u'/tmp/tmpGQSplF/assemblies/reads.sam']
ectyper.subprocess_util: DEBUG: Subprocess finished successfully in 0.640 sec.
ectyper.subprocess_util: DEBUG: Running: [u'samtools', u'view', u'-F 4', u'-q 1', u'-bS', u'/tmp/tmpGQSplF/assemblies/reads.sam', u'-o', u'/tmp/tmpGQSplF/assemblies/reads.bam']
ectyper.subprocess_util: DEBUG: Subprocess finished successfully in 0.206 sec.
ectyper.subprocess_util: DEBUG: Running: [u'samtools', u'sort', u'/tmp/tmpGQSplF/assemblies/reads.bam', u'-o', u'/tmp/tmpGQSplF/assemblies/reads.sorted.bam']
ectyper.subprocess_util: DEBUG: Subprocess finished successfully in 0.179 sec.
ectyper.subprocess_util: DEBUG: Running: samtools mpileup -uf /home/travis/build/phac-nml/ecoli_serotyping/ectyper/Data/combined.fasta /tmp/tmpGQSplF/assemblies/reads.sorted.bam | bcftools call -c | vcfutils.pl vcf2fq | seqtk seq -A - > /tmp/tmpGQSplF/assemblies/Escherichia.fasta
ectyper.subprocess_util: DEBUG: Subprocess finished successfully in 2.669 sec.
--------------------- >> end captured logging << ---------------------
----------------------------------------------------------------------
Ran 15 tests in 15.825s
FAILED (errors=1)
The command "nosetests" exited with 1.
Done. Your build exited with 1.
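
For context on the py27 failure above: the message `write() argument 1 must be unicode, not str` is consistent with a text-mode handle from Python 2's `io` layer being handed a byte `str`, which is what `">%s\n" % title` produces on py27. How `output_handle` is opened in `genomeFunctions.py` is not shown here, so the following is only a minimal sketch of one possible workaround, not the fix applied in this PR. The helper names `open_for_native_str` and `write_fasta_record` are hypothetical.

```python
import io
import os
import sys
import tempfile

PY2 = sys.version_info[0] == 2


def open_for_native_str(path):
    """Hypothetical helper: return a write handle that accepts native str.

    Python 2: binary mode, so byte strings (py2 ``str``) are accepted.
    Python 3: text mode, so unicode strings (py3 ``str``) are accepted.
    """
    if PY2:
        return io.open(path, "wb")
    return io.open(path, "w", encoding="utf-8")


def write_fasta_record(handle, title, sequence):
    # Uses the same native-str formatting as the failing line in the traceback.
    handle.write(">%s\n" % title)
    handle.write("%s\n" % sequence)


# Write one record to a temporary file, then read it back to check it.
tmp_dir = tempfile.mkdtemp()
fasta_path = os.path.join(tmp_dir, "example.fasta")
with open_for_native_str(fasta_path) as out:
    write_fasta_record(out, "sample|contig1", "ACGT")

with io.open(fasta_path, "r", encoding="utf-8") as check:
    content = check.read()
print(content)
```

The idea is to let the library keep writing whatever `str` means on the running interpreter, rather than converting every record to unicode before the write.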

@kevinkle
Collaborator Author

Nosetests passing on both py27 and py36 now.

@kevinkle kevinkle self-assigned this Feb 12, 2018