2 channel files #12

Open
romaingroux opened this issue Sep 16, 2021 · 3 comments

Comments

@romaingroux

Hello,

First of all, thank you very much for this helpful tool :-)

However, I encountered an unexpected issue.

The problem: I tried to generate a 2-channel BNX file. To do so, I used the provided example.xml config file located in example/ecoli and simply changed the enzyme labels so that both are the same (as instructed in the XML file). The enzymes section looks like this:

<enzymes>
    <!--Location of the enzymes file-->
    <file>enzymes.xml</file>
    <!--Specification of one enzyme and its label-->
    <enzyme>
        <!--id of the enzyme-->
        <id>BspQI</id>
        <label>label_0</label>
        <!--label used for this enzyme, different labels will result in different output files-->
    </enzyme>
    <!--Multiple enzymes can be specified, possibly with different labels-->
    <enzyme>
        <id>BssSI</id>
        <label>label_0</label>
    </enzyme>
</enzymes>

What I got: The BNX generation runs fine and a single file is created. Its content looks like this (showing only the header and the first two molecules):

# BNX File Version:     1.2
# Label Channels:       1
# Nickase Recognition Site 1:   GCTCTTC
# Nickase Recognition Site 2:   CACGAG
# Bases per Pixel:      494
# Software Version:     omsim-v1.0
#rh     SourceFolder    InstrumentSerial        Time    NanoChannelPixelsPerScan        StretchFactor   BasesPerPixel   NumberofScans   ChipId  FlowCell        LabelSNRFilterType      MinMoleculeLength       MinLabelSNR     RunId
# Run Data      /usr/local/omsim-1.0.2-lw/test/ecoli/Detect Molecules   omsim-v1.0      09/16/21 11:37 AM       3033450 0.859417154047  494.521197301   30      20249,11843,07/17/2014,840014289        1       Static  20      0.0     1
# Number of Molecules:  282809
#0h     LabelChannel    MoleculeID      Length  AvgIntensity    SNR     NumberofLabels  OriginalMoleculeId      ScanNumber      ScanDirection   ChipId  Flowcell        RunId   GlobalScanNumber
#0f     int     int     float   float   float   int     int     int     int     string  int     int     int
#1h     LabelChannel    LabelPositions[N]
#1f     int     float
#Qh     QualityScoreID  QualityScores[N]
#Qf     string  float[N]
# Quality Score QX11: Label SNR for channel 1
# Quality Score QX12: Label Intensity for channel 1
0       1       536308.01       0.27    2.36    91      1       1       -1      20249,11843,07/17/2014,840014289        1       1       1
1       5845.95 11535.19        18323.42        24478.07        27754.13        40476.98        46002.43        49860.99        53862.52        56925.06        62799.93        65014.39        69441.45        72046.91        75174.04        77644.46        82837.95        90448.45        91873.76        93875.32        96576.20        99675.74        105127.86       110858.47       112577.48       116512.14       118930.49       122734.40       124257.18       131541.49       135134.48       138489.18       146498.47       148622.37       153210.31       155180.09       161605.41       163598.80       175765.77       181337.67       190371.00       199956.76       202477.45       217681.32       220990.82       226537.26
       228432.26       230922.89       238164.57       242694.01       244488.69       250213.63       253284.58       261006.98       267399.05
       288489.28       296296.50       299103.22       306651.45       312991.56       332852.93       336394.91       340286.80       343146.52
       348799.75       365307.06       394242.59       403081.81       405914.82       422278.99       427894.53       431631.10       443037.20
       444554.66       448497.26       456172.77       460303.24       464572.50       469023.32       473902.23       477747.56       483829.37
       489140.12       499219.28       502142.72       506690.18       508580.98       516444.36       527755.67       531716.50       536023.79
       536308.01
QX11    11.3248 26.0220 6.1136  5.1985  10.6919 7.4039  6.7257  15.0430 11.8760 9.9746  7.1683  11.4347 14.6543 17.4445 11.5783 6.2301  10.1057 9.3727  6.4544  9.1212  15.5943 7.8081  22.3259 10.4775 24.7351 5.8935  8.2796  6.8363  19.5513 27.4655 6.8465  12.8575 7.8373  5.3468  9.4584  22.8599 40.6151 5.8017  17.6557 10.6642 27.4564 4.8391  15.6541 10.7837 18.0035 15.0177 9.9969  9.6208  4.5049  66.5454 7.8915  14.3392 10.5303 20.9704 16.5207 20.6540 22.3292 20.5428 6.7104  18.5969 9.1778  53.2601 9.9391  12.9255 10.3755 14.9125 8.2381  14.6157 10.2374 6.2536  9.8079  16.8139 12.3500 20.0131 28.8947 13.1106 9.5460  9.5231  5.4160  10.8018 12.2161 3.4229  29.0833 7.1468  5.5263  8.7348  17.6320 5.9391  11.5840 18.9972 11.4166
QX12    0.0448  0.0607  0.0295  0.1375  0.0427  0.0973  0.0422  0.0534  0.1047  0.0612  0.0900  0.0347  0.0297  0.0730  0.0779  0.0938  0.0441  0.0266  0.0221  0.0940  0.1171  0.0634  0.0854  0.0467  0.0549  0.0302  0.0254  0.0860  0.1043  0.0521  0.0618  0.0981  0.0074  0.0036  0.0499  0.0277  0.1194  0.0753  0.0082  0.0803  0.1018  0.0483  0.1126  0.1163  0.0681  0.0487  0.0528  0.0397  0.0673  0.0873  0.0118  0.0723  0.0920  0.1128  0.0176  0.0293  0.1237  0.0765  0.1304  0.0747  0.1018  0.1093  0.0104  0.1311  0.1266  0.0467  0.0860  0.0443  0.1298  0.0552  0.0557  0.0547  0.0663  0.0442  0.1086  0.1269  0.0863  0.1276  0.1673  0.0070  0.1434  0.0478  0.0718  0.1726  0.0277  0.0719  0.0847  0.0587  0.1015  0.0339  0.0139
0       2       224592.01       0.27    2.70    37      2       1       -1      20249,11843,07/17/2014,840014289        1       1       1
1       3014.53 4614.89 20269.36        25294.99        27942.85        31562.84        34762.62        42170.20        46046.34        48408.35
        52273.74        54797.90        57482.95        62385.90        65234.99        73309.36        77059.59        82146.31        85363.80
        88798.08        112187.82       119134.80       125954.41       138311.62       141393.92       145794.94       153520.34       155272.32
       159009.55       162399.77       171160.96       177515.82       183810.75       185780.37       191570.06       198357.73       215461.23
       224592.01
QX11    23.7897 12.5608 11.0305 8.1858  5.2397  16.1350 7.6597  31.1956 11.5797 5.9489  10.1537 13.7131 13.5428 6.1943  11.9492 7.6853  10.6203 107.1927        8.8574  16.2148 14.6434 15.0047 31.1486 19.0375 5.0293  13.9101 9.1936  12.3377 9.6316  23.3678 10.4011 10.9505 4.7670  7.3086  18.0303 12.3900 11.8448
QX12    0.0349  0.1098  0.1621  0.1133  0.0920  0.0544  0.1237  0.0414  0.0831  0.0775  0.0534  0.1074  0.0342  0.0874  0.0820  0.1313  0.0257  0.0797  0.0154  0.0631  0.0840  0.0436  0.1807  0.1674  0.0872  0.0518  0.0288  0.0164  0.1152  0.1077  0.0710  0.0396  0.0646  0.1591  0.0815  0.0342  0.0413

What I expected: Based on the BioNano documentation (https://bionanogenomics.com/wp-content/uploads/2018/04/30038-BNX-File-Format-Specification-Sheet.pdf), I expected something different:

  1. The header lists both enzymes, but the # Label Channels: field should read 2 rather than 1.
  2. Each molecule should contain 7 lines (0, 1, 2, QX11, QX21, QX12, QX22). Here, the generated file seems to contain only one channel.
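The two expectations above can be checked mechanically. The following is a minimal Python sketch, not part of OMSim: the function names `summarize_bnx` and `label_channels` are made up for illustration, and it assumes each BNX record sits on a single line (unlike the wrapped paste above), with the record type as the first whitespace-separated field.

```python
def summarize_bnx(text):
    """Return (declared_channels, molecules).

    declared_channels is the integer from the '# Label Channels:' header line
    (None if absent); molecules is a list of record-type lists, one list per
    molecule, e.g. ['0', '1', 'QX11', 'QX12'] for a single-channel molecule.
    """
    declared = None
    molecules = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            if line.startswith("# Label Channels:"):
                declared = int(line.split(":", 1)[1])
            continue
        key = line.split(None, 1)[0]  # record type: 0, 1, 2, QX11, ...
        if key == "0":                # a '0' line starts a new molecule
            molecules.append([])
        if molecules:
            molecules[-1].append(key)
    return declared, molecules


def label_channels(record_keys):
    """Channels that have a label-position line (numeric record types except 0)."""
    return sorted(int(k) for k in record_keys if k.isdigit() and k != "0")


# Shortened, made-up sample in the shape of the output shown above.
SAMPLE = (
    "# Label Channels:\t1\n"
    "0\t1\t536308.01\t0.27\t2.36\t91\n"
    "1\t5845.95\t11535.19\t536308.01\n"
    "QX11\t11.3248\t26.0220\n"
    "QX12\t0.0448\t0.0607\n"
)

declared, molecules = summarize_bnx(SAMPLE)
print(declared, molecules[0], label_channels(molecules[0]))
```

For a 2-channel file as described in the list, one would expect the declared value to be 2 and each molecule to yield the record types 0, 1, 2, QX11, QX21, QX12, QX22.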

Overall, I am a bit puzzled :-) Did I misunderstand something, or is there a way to generate a 2-channel file as described by BioNano?

Thanks in advance

gmiclotte added a commit that referenced this issue Sep 16, 2021
@gmiclotte
Collaborator

Hello

You understood correctly: OMSim v1.0.x could not generate a multi-channel file. What you could do was generate multiple single-channel files by setting a different label for each enzyme (you used label_0 for both enzymes in your XML). Together, those files contain the same information, just in a less convenient form.

I have therefore just pushed an update, OMSim v1.1.1, which adds a post-processing step that merges these single-channel files into one multi-channel file. You still have to specify a different label string for each enzyme.
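To illustrate what such a merge involves for one molecule, here is a rough Python sketch. It is not OMSim's actual implementation: the function name `merge_two_channels` is hypothetical, and it assumes both single-channel files describe the same molecule with tab-separated records of types 0, 1, QX11 and QX12. The backbone (0) line is copied unchanged from the first file, which is a simplification; a real merge would also have to update the file header and possibly per-molecule fields.

```python
def merge_two_channels(ch1_records, ch2_records):
    """Interleave one molecule's records from two single-channel BNX files.

    Each argument is a list of tab-separated record lines for the same
    molecule, with record types '0', '1', 'QX11' and 'QX12'.
    The result follows the order 0, 1, 2, QX11, QX21, QX12, QX22.
    """
    def by_type(records):
        # Map record type -> remainder of the line after the first tab.
        return dict(r.split("\t", 1) for r in records)

    a, b = by_type(ch1_records), by_type(ch2_records)
    return [
        "0\t" + a["0"],        # backbone line, taken from the first file as-is
        "1\t" + a["1"],        # channel 1 label positions
        "2\t" + b["1"],        # second file's labels become channel 2
        "QX11\t" + a["QX11"],  # label SNR, channel 1
        "QX21\t" + b["QX11"],  # label SNR, channel 2
        "QX12\t" + a["QX12"],  # label intensity, channel 1
        "QX22\t" + b["QX12"],  # label intensity, channel 2
    ]


# Made-up records for one molecule, one list per single-channel file.
ch1 = ["0\t1\t1000.00", "1\t100.00\t1000.00", "QX11\t10.0", "QX12\t0.05"]
ch2 = ["0\t1\t1000.00", "1\t250.00\t1000.00", "QX11\t12.0", "QX12\t0.07"]
merged = merge_two_channels(ch1, ch2)
print("\n".join(merged))
```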

So if you download the latest version and change the label identifiers in the XML, then it should now create a multi-channel BNX file.

You could for example change the second label identifier to label_1, as I've done here:

<enzymes>
    <!--Location of the enzymes file-->
    <file>enzymes.xml</file>
    <!--Specification of one enzyme and its label-->
    <enzyme>
        <!--id of the enzyme-->
        <id>BspQI</id>
        <label>label_0</label>
        <!--label used for this enzyme, different labels will result in different output files-->
    </enzyme>
    <!--Multiple enzymes can be specified, possibly with different labels-->
    <enzyme>
        <id>BssSI</id>
        <label>label_1</label>
    </enzyme>
</enzymes>

Please let me know if it works for you now, or if there is still an issue.
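Before running the simulation, the label setup can be sanity-checked with a few lines of Python. This is only an illustrative sketch using the standard library; the helper name `enzymes_by_label` is made up, and it assumes the enzymes section has exactly the structure shown above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def enzymes_by_label(xml_text):
    """Group enzyme ids by their label string."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for enzyme in root.iter("enzyme"):
        label = enzyme.findtext("label").strip()
        groups[label].append(enzyme.findtext("id").strip())
    return dict(groups)


CONFIG = """
<enzymes>
    <file>enzymes.xml</file>
    <enzyme>
        <id>BspQI</id>
        <label>label_0</label>
    </enzyme>
    <enzyme>
        <id>BssSI</id>
        <label>label_1</label>
    </enzyme>
</enzymes>
"""

groups = enzymes_by_label(CONFIG)
print(groups)
print("distinct labels:", len(groups))
```

With the corrected config there are two distinct labels, one per enzyme. With label_0 on both enzymes, the same helper would report a single label holding both enzymes, which matches the single-channel output in the original report.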

@romaingroux
Author

Great, thank you very much! I am going to have a look at it and keep you informed.

@romaingroux
Author

As promised, here is some feedback. The latest version works like a charm and generates 2-channel files that, to my knowledge, comply with the BioNano format. Thank you very much :)
