Strand alignment error #587

ConnieXuhm · 2020-12-03T05:15:14Z

Hi, when I try to run the strand-flip step on pre-phased SHAPEIT2 data, using 1000 Genomes Phase 3 (.vcf.gz format) as the reference panel, I get the following error. Please help me, thanks!

My command is:
java -jar GenotypeHarmonizer.jar --input 1118_noflip.phased1 --inputType SHAPEIT2 --ref ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes --refType VCF --output 1202_GH --update-id --outputType SHAPEIT2

And the error is:

Started logging

Interpreted arguments: 
 - Input base path: 1118_noflip.phased1 
 - Input data type: Shapeit2 output
 - Output base path: 1202_GH
 - Output data type: Impute2 haplotypes haps / sample files
 - Reference base path: ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes
 - Reference data type: VCF file
 - Number of flank variants to consider for LD alignment: 100
 - Minimum LD of flanking variants before using for LD alignment: 0.3
 - Minimum number of variants needed to for LD alignment: 3
 - Maximum MAF of variants to use minor allele as backup for alignment: 0.0
 - Update study IDs: yes
 - Match study reference alleles: no
 - Keep variants not in reference data: no
 - Minimum posterior probability for input data: 0.4
 - LD checker off
 - Force input sequence name: not forcing

Input data loaded
Exception in thread "main" java.lang.RuntimeException: BGZF file has invalid uncompressedLength: -408592984
	at net.sf.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:380)
	at net.sf.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:365)
	at net.sf.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:113)
	at net.sf.samtools.util.BlockCompressedInputStream.readLine(BlockCompressedInputStream.java:181)
	at org.molgenis.vcf.meta.VcfMetaParser.readLine(VcfMetaParser.java:126)
	at org.molgenis.vcf.meta.VcfMetaParser.parse(VcfMetaParser.java:41)
	at org.molgenis.vcf.VcfReader.parseVcfMeta(VcfReader.java:70)
	at org.molgenis.vcf.VcfReader.getVcfMeta(VcfReader.java:57)
	at org.molgenis.genotype.vcf.VcfGenotypeData.<init>(VcfGenotypeData.java:124)
	at org.molgenis.genotype.vcf.VcfGenotypeData.<init>(VcfGenotypeData.java:80)
	at org.molgenis.genotype.RandomAccessGenotypeDataReaderFormats.createGenotypeData(RandomAccessGenotypeDataReaderFormats.java:184)
	at org.molgenis.genotype.RandomAccessGenotypeDataReaderFormats.createGenotypeData(RandomAccessGenotypeDataReaderFormats.java:158)
	at org.molgenis.genotype.RandomAccessGenotypeDataReaderFormats.createGenotypeData(RandomAccessGenotypeDataReaderFormats.java:133)
	at nl.umcg.deelenp.genotypeharmonizer.GenotypeHarmonizer.main(GenotypeHarmonizer.java:325)
Caused by: java.lang.NegativeArraySizeException: -408592984
	at net.sf.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:378)
	... 13 more


PatrickDeelen · 2020-12-03T08:15:02Z

It seems that either the vcf.gz file or the tbi file is corrupt. You could try running vcf-validator (https://vcftools.github.io/perl_module.html#vcf-validator) to verify this.
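
If vcf-validator is not at hand, the compressed container itself can be checked first. The following is only a minimal sketch, assuming Python 3 is available: `check_bgzf` is a hypothetical helper name (not part of Genotype Harmonizer or vcftools) that walks the BGZF blocks of a file and verifies each block's size and CRC. It does not validate the VCF content or the tbi index.

```python
import struct
import zlib


def check_bgzf(path):
    """Walk every BGZF block in `path` and return (ok, message)."""
    offset = 0          # byte offset of the block being checked
    blocks = 0          # number of blocks verified so far
    last_isize = None   # uncompressed size of the most recent block
    with open(path, "rb") as fh:
        while True:
            header = fh.read(12)
            if not header:
                break  # reached the end of the file on a block boundary
            if len(header) < 12:
                return False, f"truncated gzip header at byte {offset}"
            id1, id2, method, flags, _mtime, _xfl, _os, xlen = struct.unpack(
                "<BBBBIBBH", header)
            # A BGZF block is a gzip member with the FEXTRA flag (bit 2) set.
            if (id1, id2, method) != (0x1F, 0x8B, 8) or not flags & 4:
                return False, f"not a BGZF block at byte {offset}"
            extra = fh.read(xlen)
            if len(extra) < xlen:
                return False, f"truncated extra field at byte {offset}"
            # Find the 'BC' subfield, which stores BSIZE (block size minus 1).
            bsize = None
            pos = 0
            while pos + 4 <= len(extra):
                si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
                if (si1, si2, slen) == (66, 67, 2) and pos + 6 <= len(extra):
                    bsize = struct.unpack_from("<H", extra, pos + 4)[0]
                pos += 4 + slen
            if bsize is None:
                return False, f"no BC subfield at byte {offset} (plain gzip?)"
            rest_len = bsize + 1 - 12 - xlen  # deflate data + CRC32 + ISIZE
            if rest_len < 8:
                return False, f"invalid BSIZE at byte {offset}"
            rest = fh.read(rest_len)
            if len(rest) < rest_len:
                return False, f"truncated block at byte {offset}"
            crc, isize = struct.unpack("<II", rest[-8:])
            try:
                data = zlib.decompress(rest[:-8], -15)  # raw deflate stream
            except zlib.error as exc:
                return False, f"cannot inflate block at byte {offset}: {exc}"
            if len(data) != isize or zlib.crc32(data) != crc:
                return False, f"size or CRC mismatch in block at byte {offset}"
            blocks += 1
            last_isize = isize
            offset += bsize + 1
    if last_isize != 0:
        return False, "no empty EOF block; the file may be truncated"
    return True, f"{blocks} BGZF blocks verified"


# Example call; the file name below assumes the reference path from the
# command above with the .vcf.gz extension appended:
# print(check_bgzf(
#     "ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"))
```

A failure here would point to an incomplete or damaged download of the vcf.gz file. If every block verifies, the tbi index or the VCF content itself is the next thing to check, for example by re-downloading the index.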
