Strand alignment error #587

Open
ConnieXuhm opened this issue Dec 3, 2020 · 1 comment

Comments

@ConnieXuhm

Hi, when I try to run the strand-flip step on pre-phased SHAPEIT2 data, using the 1000 Genomes Phase 3 panel (.vcf.gz format) as the reference, I run into the following problem. Please help, thanks!

My command is:
java -jar GenotypeHarmonizer.jar --input 1118_noflip.phased1 --inputType SHAPEIT2 --ref ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes --refType VCF --output 1202_GH --update-id --outputType SHAPEIT2

And this is the error output:

Started logging

Interpreted arguments: 
 - Input base path: 1118_noflip.phased1 
 - Input data type: Shapeit2 output
 - Output base path: 1202_GH
 - Output data type: Impute2 haplotypes haps / sample files
 - Reference base path: ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes
 - Reference data type: VCF file
 - Number of flank variants to consider for LD alignment: 100
 - Minimum LD of flanking variants before using for LD alignment: 0.3
 - Minimum number of variants needed to for LD alignment: 3
 - Maximum MAF of variants to use minor allele as backup for alignment: 0.0
 - Update study IDs: yes
 - Match study reference alleles: no
 - Keep variants not in reference data: no
 - Minimum posterior probability for input data: 0.4
 - LD checker off
 - Force input sequence name: not forcing

Input data loaded
Exception in thread "main" java.lang.RuntimeException: BGZF file has invalid uncompressedLength: -408592984
	at net.sf.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:380)
	at net.sf.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:365)
	at net.sf.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:113)
	at net.sf.samtools.util.BlockCompressedInputStream.readLine(BlockCompressedInputStream.java:181)
	at org.molgenis.vcf.meta.VcfMetaParser.readLine(VcfMetaParser.java:126)
	at org.molgenis.vcf.meta.VcfMetaParser.parse(VcfMetaParser.java:41)
	at org.molgenis.vcf.VcfReader.parseVcfMeta(VcfReader.java:70)
	at org.molgenis.vcf.VcfReader.getVcfMeta(VcfReader.java:57)
	at org.molgenis.genotype.vcf.VcfGenotypeData.<init>(VcfGenotypeData.java:124)
	at org.molgenis.genotype.vcf.VcfGenotypeData.<init>(VcfGenotypeData.java:80)
	at org.molgenis.genotype.RandomAccessGenotypeDataReaderFormats.createGenotypeData(RandomAccessGenotypeDataReaderFormats.java:184)
	at org.molgenis.genotype.RandomAccessGenotypeDataReaderFormats.createGenotypeData(RandomAccessGenotypeDataReaderFormats.java:158)
	at org.molgenis.genotype.RandomAccessGenotypeDataReaderFormats.createGenotypeData(RandomAccessGenotypeDataReaderFormats.java:133)
	at nl.umcg.deelenp.genotypeharmonizer.GenotypeHarmonizer.main(GenotypeHarmonizer.java:325)
Caused by: java.lang.NegativeArraySizeException: -408592984
	at net.sf.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:378)
	... 13 more
@PatrickDeelen
Member

It seems that either the vcf.gz file or the tbi file is corrupt. You could try running vcf-validator (https://vcftools.github.io/perl_module.html#vcf-validator) to verify this.
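As a quicker first check before a full validator run, something like the sketch below may help narrow it down. It only tests whether the compressed stream decompresses cleanly with `gzip -t` (BGZF files are gzip-compatible); it does not validate the VCF content or the .tbi index. `check_vcf_gz` is a hypothetical helper name, and it assumes the reference file on disk carries a `.vcf.gz` suffix, which the command in the report does not show.

```shell
# Sketch only: check_vcf_gz is a hypothetical helper, not part of GenotypeHarmonizer.
# It reports whether a gzip/BGZF-compressed file can be decompressed end to end.
check_vcf_gz() {
    # $1: path to the compressed VCF (assumed to end in .vcf.gz)
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 2
    fi
    # gzip -t reads the whole file and fails on truncated or damaged blocks
    if gzip -t "$1" 2>/dev/null; then
        echo "ok: $1"
        return 0
    fi
    echo "corrupt: $1"
    return 1
}
```

For example, `check_vcf_gz ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz` (file name assumed). If this reports `corrupt`, re-downloading the file is probably the simplest fix. If it reports `ok`, the index may be the damaged part; with htslib's tabix installed, `tabix -f -p vcf <file>` rebuilds it.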
