multi_intersect confused when using header=True #239

Closed
Xparx opened this issue Feb 28, 2018 · 3 comments

Xparx commented Feb 28, 2018

I want to use the header argument with the multi_intersect command. The problem is that any subsequent command fails. The code below produces the following error:

import pybedtools as bt

bedfiles = ["a.bed", "b.bed"]
x = bt.BedTool()
bfl = []
for bf in bedfiles:
    bfl.append(bt.example_bedtool(bf))

intsect = x.multi_intersect(i=[b.fn for b in bfl], header=True)

print(intsect.head())

MalformedBedLineError: Unable to detect format from ['chrom', 'start', 'end', 'num', 'list', '/home/user/.virtualenvs/default/lib/python3.5/site-packages/pybedtools/test/data/a.bed', '/home/user/.virtualenvs/default/lib/python3.5/site-packages/pybedtools/test/data/b.bed']

daler (Owner) commented Feb 28, 2018

Thanks for reporting. The issue here is that multiinter does not output an actual BED file; it outputs a report of the various intersections. Since that report is tab-delimited, you can read it with pandas instead:

import pybedtools as bt
import pandas

bedfiles = ["a.bed", "b.bed"]
x = bt.BedTool()
bfl = []
for bf in bedfiles:
    bfl.append(bt.example_bedtool(bf))

intsect = x.multi_intersect(i=[b.fn for b in bfl], header=True)
df = pandas.read_table(intsect.fn)

See also #113; I'll add a note to the docs for multi_intersect as a reminder that it doesn't give BED output.
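To illustrate why the header row trips up BED parsing, here is a minimal sketch. The `looks_like_bed` helper is hypothetical and is not pybedtools' actual format detection; it only assumes that a BED-like line needs integer start and end coordinates in its second and third fields. The header fields are taken from the error message above, with the file paths shortened.

```python
# Illustrative only: this is NOT pybedtools' real format-detection logic.
def looks_like_bed(fields):
    """Return True if the fields start with chrom, integer start, integer end."""
    if len(fields) < 3:
        return False
    try:
        start, end = int(fields[1]), int(fields[2])
    except ValueError:
        # Non-numeric coordinates, e.g. the literal strings "start" and "end".
        return False
    return 0 <= start <= end


# Header row as reported in the error message (paths shortened for readability).
header = ["chrom", "start", "end", "num", "list", "a.bed", "b.bed"]
# A data row from the same kind of report.
row = ["chr1", "155", "200", "2", "1,2", "1", "1"]

print(looks_like_bed(header))  # False
print(looks_like_bed(row))     # True
```

The data rows pass the check while the header row does not, which is consistent with the report only becoming unparseable as BED once header=True is set.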

daler (Owner) commented Feb 28, 2018

To clarify, multiinter does not produce a valid BED file when header=True. If you're OK without the header, another solution here is to use the default header=False:

import pybedtools as bt
import pandas

bedfiles = ["a.bed", "b.bed"]
x = bt.BedTool()
bfl = []
for bf in bedfiles:
    bfl.append(bt.example_bedtool(bf))

intsect = x.multi_intersect(i=[b.fn for b in bfl])
intsect.head()

gives

chr1    1       155     1       1       1       0
chr1    155     200     2       1,2     1       1
chr1    200     500     1       1       1       0
chr1    800     900     1       2       0       1
chr1    900     901     2       1,2     1       1
chr1    901     950     1       1       1       0
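If the column names are still wanted when running with header=False, they can be attached afterwards. The following is a rough sketch using only the standard library. It assumes the first five columns are chrom, start, end, num, and list (the names shown in the header from the error message), followed by one 0/1 column per input file. The `read_multiinter` helper and the short labels "a.bed" and "b.bed" are made up for illustration.

```python
import csv
import io

# Header-less report copied from the output above, written as tab-separated text.
REPORT = (
    "chr1\t1\t155\t1\t1\t1\t0\n"
    "chr1\t155\t200\t2\t1,2\t1\t1\n"
    "chr1\t200\t500\t1\t1\t1\t0\n"
    "chr1\t800\t900\t1\t2\t0\t1\n"
    "chr1\t900\t901\t2\t1,2\t1\t1\n"
    "chr1\t901\t950\t1\t1\t1\t0\n"
)


def read_multiinter(text, file_labels):
    """Parse a header-less report into dicts (hypothetical helper)."""
    # Assumed layout: five fixed columns, then one presence flag per input file.
    names = ["chrom", "start", "end", "num", "list"] + list(file_labels)
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        rec = dict(zip(names, fields))
        for key in ("start", "end", "num"):
            rec[key] = int(rec[key])
        for label in file_labels:
            rec[label] = rec[label] == "1"  # convert the 0/1 flag to a bool
        rows.append(rec)
    return rows


rows = read_multiinter(REPORT, ["a.bed", "b.bed"])

# Keep only the intervals covered by both input files.
shared = [(r["chrom"], r["start"], r["end"]) for r in rows if r["num"] == 2]
print(shared)  # [('chr1', 155, 200), ('chr1', 900, 901)]
```

With a real run, the text would come from the file behind intsect.fn rather than from an inline string.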

daler closed this as completed Feb 28, 2018
daler added a commit that referenced this issue Feb 28, 2018

Xparx (Author) commented Feb 28, 2018

Thanks for the quick response and fix!
I had been using it with header=False but got confused when switching to True.
