-
Part of the reason for the increased runtime is that the function that advances to a given tree takes linear time, so your total runtime is closer to quadratic in the number of trees. Another issue depends on how you are using the multiprocessing module: if you are using it with threads, the global interpreter lock may be grinding performance to a halt.
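For example, here is a sketch of the two access patterns (the file name and population are assumptions, and the per-tree MRCA is computed by folding `Tree.mrca()` over the sample set):

```python
import functools
import tskit

ts = tskit.load("relate_output.trees")  # hypothetical file name
samples = ts.samples(population=0)      # hypothetical population of interest

# Seek-per-tree pattern: each at_index() call has to advance to tree i from
# an end of the sequence, so the whole loop is ~quadratic in the number of trees.
mrca_times = []
for i in range(ts.num_trees):
    tree = ts.at_index(i)
    mrca_times.append(tree.time(functools.reduce(tree.mrca, samples)))

# Single-pass pattern: trees() advances each tree incrementally from the
# previous one, so the whole loop is linear in the number of trees.
mrca_times = []
for tree in ts.trees():
    mrca_times.append(tree.time(functools.reduce(tree.mrca, samples)))
```

Also note that if the pool is thread-based (e.g. `multiprocessing.dummy.Pool` or `ThreadPoolExecutor`), this CPU-bound work still serialises on the GIL; a process-based `multiprocessing.Pool` sidesteps that.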
-
Great question @stsmall - how long does the straightforward loop over ts.trees() take? Since these are Relate trees you might be better off splitting the chromosome into chunks - they are slow to iterate over because the tree sequence is stored "tree-by-tree". This also means that seeking to an arbitrary tree is not a cheap operation.
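One way to do that chunking is a sketch like the following (the file name, chunk count, and population are assumptions): give each worker process a genomic interval, cut the tree sequence down to that interval with `keep_intervals()` and `trim()`, and iterate only over the trimmed trees.

```python
import functools
import multiprocessing as mp
import tskit

TS_PATH = "relate_output.trees"   # hypothetical path to the Relate tree sequence
N_CHUNKS = 8                      # hypothetical number of worker processes

def mrca_times_for_interval(interval):
    """Return (position, MRCA time) for every tree in one genomic chunk."""
    left, right = interval
    ts = tskit.load(TS_PATH)              # each worker loads its own copy
    samples = ts.samples(population=0)    # hypothetical population of interest
    # keep_intervals(..., simplify=False) preserves node IDs; trim() drops the
    # empty flanks so this worker only iterates over its own trees.
    chunk = ts.keep_intervals([[left, right]], simplify=False).trim()
    out = []
    for tree in chunk.trees():
        mrca = functools.reduce(tree.mrca, samples)
        out.append((left + tree.interval.left, tree.time(mrca)))
    return out

if __name__ == "__main__":
    ts = tskit.load(TS_PATH)
    breaks = [i * ts.sequence_length / N_CHUNKS for i in range(N_CHUNKS + 1)]
    intervals = list(zip(breaks[:-1], breaks[1:]))
    with mp.Pool(processes=N_CHUNKS) as pool:
        per_tree = [x for chunk in pool.map(mrca_times_for_interval, intervals)
                    for x in chunk]
```

A caveat: a tree that spans a chunk boundary will appear (truncated) in two chunks, so deduplicate at the boundaries if exact per-tree output matters.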
-
Hi,
I am calculating the MRCA of a population on each tree in a tree sequence.
I have been attempting to use the Python multiprocessing module to split the trees into chunks. Why? Because these are trees made in Relate, they are uncompressed, and when I used ts.aslist() it consumed 400 GB of memory.
I tried to create a list of index chunks, [[0,1,2,3], [4,5,6], [7,8,9]], and pass that to pool.map to split up the tree sequence (roughly as sketched below).
That actually took longer than running over each tree in sequence with ts.trees(). This may reflect my poor understanding of how the multiprocessing module works in Python and my inexperience with the tree_sequence object.
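Roughly, what I tried looks like this (the file name and population here are placeholders):

```python
import functools
import multiprocessing as mp
import tskit

ts = tskit.load("relate_output.trees")   # placeholder file name
samples = ts.samples(population=0)       # placeholder population of interest

def mrca_times(indexes):
    out = []
    for i in indexes:
        tree = ts.at_index(i)            # jump to tree i inside this worker
        out.append(tree.time(functools.reduce(tree.mrca, samples)))
    return out

if __name__ == "__main__":
    index_chunks = [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
    with mp.Pool() as pool:
        results = pool.map(mrca_times, index_chunks)
```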
any advice?
thanks,
scott