
Version 4 is more accurate but much slower than version 3? #307

Closed
irleader opened this issue Feb 28, 2024 · 4 comments
Labels
question Further information is requested

Comments

@irleader

Hi,

I benchmarked on same data with v4.1.0 and v3.5.0.

I can see some improvement in peptide recall with the same MassIVE-KB-trained model, but inference is much slower (same machine, GPU, and beam number). Does the same happen on your side?

If not, is it because I did not configure the environment for Casanovo v4.1.0 correctly? I see a message like this:
"You are using a CUDA device ('NVIDIA GeForce RTX 3080 Ti Laptop GPU') that has Tensor Cores. To properly utilize them, you should set torch.set_float32_matmul_precision('medium' | 'high') which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision"

Best regards

@bittremieux added the question (Further information is requested) label on Feb 28, 2024
@bittremieux
Collaborator

Normally there should not be a significant performance reduction between v4 and v3. There was a non-negligible slowdown when beam search was introduced, but that already stems from v3.2.0.

Nevertheless, performance is indeed an important point of attention. We are optimizing the beam search decoding code, which is the most time-consuming step, but this is currently still a work in progress (#269).

The notification is purely informational; it appears now because of newer versions of the PyTorch and PyTorch Lightning dependencies, but it doesn't have an impact on performance.

Can you give an estimate of the amount of slowdown you're experiencing? Is this for inference or for training? How many spectra are you processing, and how long does it take?

@irleader
Author

With the same 743 spectra, a beam number of 5, a prediction batch size of 512, and the same machine, GPU, and pretrained model:

v4.1.0:
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████| 2/2 [05:24<00:00,  0.01it/s]

v3.5.0:
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████| 2/2 [04:02<00:00, 121.11s/it]
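
For reference, a rough throughput comparison from these timings (assuming the reported wall-clock times cover all 743 spectra):

n_spectra = 743
t_v4 = 5 * 60 + 24  # 324 s total for v4.1.0
t_v3 = 4 * 60 + 2   # 242 s total for v3.5.0
print(n_spectra / t_v4)  # ~2.3 spectra/s
print(n_spectra / t_v3)  # ~3.1 spectra/s
print(t_v4 / t_v3)       # ~1.34x longer wall-clock time in v4.1.0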

Best regards

@melihyilmaz
Collaborator

I didn't observe any significant difference in speed when I ran v4.1.0 and v3.5.0 with the same configurations (5 beams) on the same set of 14,257 spectra on this Colab GPU runtime.

v4.1.0:
Predicting DataLoader 0: 100% 14/14 [29:52<00:00, 128.04s/it]

v3.5.0:
Predicting DataLoader 0: 100% 14/14 [29:47<00:00, 127.67s/it]
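
For comparison, the per-batch difference between these two runs is negligible (assuming the reported per-iteration times are representative):

rel_diff = (128.04 - 127.67) / 127.67  # ~0.3% slower per batch in v4.1.0
throughput = 14257 / (29 * 60 + 52)    # ~8 spectra/s in both runs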

@bittremieux
Collaborator

Because of the small number of spectra in the first test, fluctuations in the start-up time might dominate the measured runtime. There doesn't seem to be a regression causing a significant slowdown in v4. Nevertheless, computational efficiency is something we're actively investigating, and hopefully we'll be able to release some speed-ups soon.
