pipe blocking #49

Open
daler opened this issue Feb 12, 2012 · 1 comment


daler commented Feb 12, 2012

For streaming intersections of moderate-sized files (say, >5000 features), the following chained call hangs::

    z = a.intersect(b, stream=True).intersect(c, stream=True)
    len(z)

The schematic below shows what's happening with stdin/stdout and pipes. The above command hangs when trying to write to the stdin of the second process, marked below as ^^^^^^.


    FILE -> stdin-|------------------|-stdout  -> PIPE ->  stdin-|------------------|-stdout -> PIPE -> IntervalIterator
                  | intersectBed (1) |                           | intersectBed (2) |
                  |------------------|-stderr         ^^^^^^     |------------------|-stderr

Despite a forced flush of the stdout of command (1) and the stdin of command (2) in helpers.call_bedtools, as well as a forced flush of the stdout of command (2) in the IntervalIterator, this still blocks.

In the Popen call, setting bufsize=1 or bufsize=0 doesn't help. The docs for Popen.communicate() advise against using it when the data is large or unlimited, since it buffers everything it reads in memory.

Various Stack Overflow answers to similar problems suggest using a separate thread for each call. However, in initial tests this made interactive work in IPython a little crazy.
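
For reference, a minimal sketch of the thread-per-call idea, not the pybedtools implementation. `ECHO` is a hypothetical stand-in for intersectBed (a child process that copies stdin to stdout), and `feed`, `stream_through` and `count_chained` are names made up for this illustration. Each process gets a background thread that writes to its stdin, so the caller is free to keep draining stdout and neither pipe fills up:

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for a filter such as intersectBed:
# a child process that copies stdin to stdout line by line.
ECHO = [sys.executable, "-c",
        "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line)"]


def feed(lines, pipe):
    """Write every line to the pipe, then close it so the child sees EOF."""
    try:
        for line in lines:
            pipe.write(line)
    finally:
        pipe.close()


def stream_through(lines, cmd=ECHO):
    """Yield the output lines of cmd while a background thread feeds its stdin."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    writer = threading.Thread(target=feed, args=(lines, proc.stdin), daemon=True)
    writer.start()
    for line in proc.stdout:
        yield line
    writer.join()
    proc.stdout.close()
    proc.wait()


def count_chained(n):
    """Push n made-up features through two chained processes and count them."""
    features = ("chr1\t%d\t%d\n" % (i, i + 100) for i in range(n))
    return sum(1 for _ in stream_through(stream_through(features)))
```

Calling `count_chained(20000)` sends far more data than fits in a typical OS pipe buffer through both processes. The cost is one extra thread per process in the chain, which may be part of what makes interactive sessions awkward.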

My guess is that workarounds like "rendering" a streaming BedTool to disk will be needed for the near-to-mid-future, since fixes to this will be difficult.
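
A minimal sketch of the "render to disk" workaround using only the standard library. `ECHO` is again a hypothetical stand-in for intersectBed, and `run_to_file` and `count_chained_via_disk` are illustrative names. In pybedtools itself this would presumably amount to saving the intermediate BedTool (for example with saveas()) before the second intersect; that mapping is an assumption here. Because each step reads from and writes to a regular file, no pipe can fill up:

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for intersectBed: copies stdin to stdout.
ECHO = [sys.executable, "-c",
        "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line)"]


def run_to_file(cmd, in_path):
    """Run cmd with in_path as stdin and return the path of a temp file
    holding everything it wrote to stdout."""
    out = tempfile.NamedTemporaryFile(mode="w", suffix=".bed", delete=False)
    with open(in_path) as src, out:
        subprocess.run(cmd, stdin=src, stdout=out, check=True)
    return out.name


def count_chained_via_disk(n):
    """Run two steps in sequence, materializing the intermediate result."""
    with tempfile.NamedTemporaryFile(mode="w", suffix=".bed", delete=False) as f:
        for i in range(n):
            f.write("chr1\t%d\t%d\n" % (i, i + 100))
    step1 = run_to_file(ECHO, f.name)   # first result rendered to disk
    step2 = run_to_file(ECHO, step1)    # second step reads a file, not a pipe
    with open(step2) as result:
        count = sum(1 for _ in result)
    for path in (f.name, step1, step2):
        os.remove(path)
    return count
```

The trade-off is the extra disk I/O and the loss of true streaming between the two steps.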

daler added a commit that referenced this issue Feb 13, 2012
daler added a commit that referenced this issue May 24, 2016

daler commented May 24, 2016

Try the select module for non-blocking IO, as suggested by John in this biostars question
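
A rough sketch of what a select-based approach could look like for a single process. This is an illustration under assumptions rather than a tested fix for pybedtools: `ECHO` is a hypothetical stand-in for intersectBed, `pump` is a made-up name, and `select` works on pipes only on Unix-like systems. The loop writes only when the child's stdin is writable and reads whenever its stdout has data, so neither side waits on a full pipe:

```python
import os
import select
import subprocess
import sys

# Hypothetical stand-in for intersectBed: copies stdin to stdout.
ECHO = [sys.executable, "-c",
        "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line)"]


def pump(data, cmd=ECHO, chunk=65536):
    """Send bytes to cmd and return its full output without blocking on a full pipe."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    in_fd, out_fd = proc.stdin.fileno(), proc.stdout.fileno()
    os.set_blocking(in_fd, False)      # allow partial writes instead of blocking
    view = memoryview(data)
    sent = 0
    received = []
    writing = True
    reading = True
    while reading:
        rlist, wlist, _ = select.select([out_fd], [in_fd] if writing else [], [])
        if wlist:
            try:
                sent += os.write(in_fd, view[sent:sent + chunk])
            except BlockingIOError:
                pass                   # pipe filled up in the meantime; retry later
            except BrokenPipeError:
                sent = len(data)       # child stopped reading; give up writing
            if sent >= len(data):
                proc.stdin.close()     # signal EOF to the child
                writing = False
        if rlist:
            block = os.read(out_fd, chunk)
            if block:
                received.append(block)
            else:
                reading = False        # EOF on the child's stdout
    proc.stdout.close()
    proc.wait()
    return b"".join(received)
```

For example, `pump(b"chr1\t0\t100\n" * 20000)` should round-trip the input unchanged. Chaining two processes this way would need one loop that watches the descriptors of both, which is where most of the complexity would be.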
