Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in modelNFR when correcting subtracted fragment-size distribution (no +ve values after subtraction) #98

Open
pjbriggs opened this issue Oct 3, 2024 · 0 comments

Comments

@pjbriggs
Copy link

pjbriggs commented Oct 3, 2024

We have encountered a crash in the FragmentMixDistribution.modelNFR method (in Occupancy.py) for two samples in a recent experiment, when running nucleoatac version 0.3.4. (The program runs successfully on all other samples in the same experiment.)

For both problem samples the command line is of the form:

nucleoatac run --bed <broadpeaks> --bam <bamfile> --fasta <fastafile> --cores 8 --out <samplename>

where broadpeaks is output from MACS.

The crash produces the following (redacted) traceback:

Traceback (most recent call last):
  File "XXXX/nucleoatac034/bin/nucleoatac", line 33, in <module>
    nucleoatac_main(args)
  File "XXXX/nucleoatac034/lib/python2.7/site-packages/nucleoatac/cli.py", line 56, in nucleoatac_main
    run_occ(occ_args)
  File "XXXX/nucleoatac034/lib/python2.7/site-packages/nucleoatac/run_occ.py", line 96, in run_occ
    fragment_dist.modelNFR()
  File "XXXX/nucleoatac034/lib/python2.7/site-packages/nucleoatac/Occupancy.py", line 65, in modelNFR
    nuc[nuc<=0]=min(min(nfr)*0.1,min(nuc[nuc>0])*0.001)
ValueError: min() arg is an empty sequence

which I've identified as being due to the fact that in each case nuc[nuc>0] is empty, as all values in the nuc array are negative. As a result min(nuc[nuc>0] produces the ValueError exception above.

My understanding is that nuc is the "subtracted difference between the extrapolated nucleosome-free model and the observed fragment distribution" mentioned in the "Occupancy determination" section of the NucleoATAC paper. The specific line of code where the failure occurs appears to be a subsequent "correction" to the subtracted distribution which replaces any zero or negative values with a small positive value.

The patch below is a workaround to avoid the error, by trapping for the ValueError exception and setting the zero and negative values to min(nfr)*0.1 (the other expression that appears in the min statement above):

--- a/nucleoatac/Occupancy.py
+++ b/nucleoatac/Occupancy.py
@@ -62,7 +62,10 @@ class FragmentMixDistribution:
         nuc = np.concatenate((np.zeros(boundaries[1]-self.lower),
                             self.fragmentsizes.get(boundaries[1],self.upper) -
                             self.nfr_fit.get(boundaries[1],self.upper)))
-        nuc[nuc<=0]=min(min(nfr)*0.1,min(nuc[nuc>0])*0.001)
+        try:
+            nuc[nuc<=0]=min(min(nfr)*0.1,min(nuc[nuc>0])*0.001)
+        except ValueError:
+            nuc[nuc<=0]=min(nfr)*0.1
         self.nuc_fit = FragmentSizes(self.lower, self.upper, vals = nuc)
     def plotFits(self,filename=None):
         """plot the Fits"""

With this patch the program is then able to continue to completion for both samples.

We would have two questions:

  1. Is this workaround a sensible fix for the problem? (i.e. can we still trust output from a version of nucleoatac which uses this patch?)
  2. Are there any specific conditions or properties of the input data which might trigger the problem with the subtracted differences all being less than zero? (e.g. could it be caused by the inclusion of a specific "problem" region, or by a particular distribution of fragment length sizes etc.)

We're aware that the program is no longer actively maintained. However we would welcome any comments on the above - thanks for your help!

(Also adding @IanCodes to this issue)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant