
Version 4 is more accurate but much slower than version 3? #307

Closed
irleader opened this issue Feb 28, 2024 · 4 comments
Labels
question Further information is requested

Comments

@irleader

Hi,

I benchmarked on same data with v4.1.0 and v3.5.0.

I can see some improvement in peptide recall with the same MassIVE-KB-trained model, but inference is much slower (same machine, GPU, and beam number). Does the same happen on your side?

If not, is it because I did not configure the environment for Casanovo v4.1.0 correctly? I see a message like this:
"You are using a CUDA device ('NVIDIA GeForce RTX 3080 Ti Laptop GPU') that has Tensor Cores. To properly utilize them, you should set torch.set_float32_matmul_precision('medium' | 'high') which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision"

Best regards

@bittremieux added the question (Further information is requested) label on Feb 28, 2024
@bittremieux
Collaborator

Normally there should not be a significant performance reduction between v4 and v3. There was a non-negligible slowdown when beam search was introduced, but that already stems from v3.2.0.

Nevertheless, performance is indeed an important point of attention. We are optimizing the beam search decoding code, which is the most time-consuming step, but this is currently still a work in progress (#269).

The notification is purely informational; it appears now because of newer versions of the PyTorch and PyTorch Lightning dependencies, but it doesn't have an impact on performance.

Can you give an estimate of the amount of slowdown you're experiencing? Is this for inference or for training? How many spectra are you processing, and how long does it take?

@irleader
Author

With the same 743 spectra, a beam number of 5, a prediction batch size of 512, and the same machine, GPU, and pretrained model:

v4.1.0:
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████| 2/2 [05:24<00:00,  0.01it/s]

v3.5.0:
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████| 2/2 [04:02<00:00, 121.11s/it]
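
For reference, a rough throughput comparison from these timings (assuming the reported wall-clock times cover all 743 spectra):

n_spectra = 743
t_v4 = 5 * 60 + 24  # 324 s total for v4.1.0
t_v3 = 4 * 60 + 2   # 242 s total for v3.5.0
print(n_spectra / t_v4)  # ~2.3 spectra/s
print(n_spectra / t_v3)  # ~3.1 spectra/s
print(t_v4 / t_v3)       # ~1.34x longer wall-clock time in v4.1.0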

Best regards

@melihyilmaz
Collaborator

I didn't observe any significant difference in speed when I ran v4.1.0 and v3.5.0 with the same configurations (5 beams) on the same set of 14,257 spectra on this Colab GPU runtime.

v4.1.0:
Predicting DataLoader 0: 100% 14/14 [29:52<00:00, 128.04s/it]

v3.5.0:
Predicting DataLoader 0: 100% 14/14 [29:47<00:00, 127.67s/it]
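
For comparison, the per-batch difference between these two runs is negligible (assuming the reported per-iteration times are representative):

rel_diff = (128.04 - 127.67) / 127.67  # ~0.3% slower per batch in v4.1.0
throughput = 14257 / (29 * 60 + 52)    # ~8 spectra/s in both runs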

@bittremieux
Collaborator

Because of the small number of spectra in the first test, fluctuations in the start-up time might dominate the measured runtime. There doesn't seem to be a regression causing a significant slowdown in v4. Nevertheless, computational efficiency is something we're actively investigating, and hopefully we'll be able to release some speed-ups soon.
