Hi Jaemin,

Thanks for the very interesting paper and for releasing your codebase!

I have been working with your codebase on a different multimodal text generation task and am observing lower performance with VL-T5 and VL-BART than with other comparable models. I suspect this might be a hyperparameter tuning issue. Do you have any advice on which particular parameters would be most beneficial to tune? I am currently following the Multi30K settings for the learning rate and number of epochs from Table 14 in your paper.
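For context, this is roughly how I am sweeping those two settings. The script name and flag names below are placeholders; I substitute the actual finetuning entry point and argparse options for my task:

```python
# Minimal sweep sketch over the two settings borrowed from Table 14
# (learning rate and number of epochs). "train.py", "--lr", and "--epochs"
# are placeholders for the real entry point and its flags.
import itertools
import subprocess

learning_rates = [5e-5, 1e-4, 3e-4]  # small grid centered on the Multi30K value
epoch_counts = [20, 40]

for lr, epochs in itertools.product(learning_rates, epoch_counts):
    # Launch one finetuning run per (lr, epochs) combination.
    subprocess.run(
        ["python", "train.py", "--lr", str(lr), "--epochs", str(epochs)],
        check=True,
    )
```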
Hi @shrutijpalaskar. Since I had to run all pretraining/finetuning experiments on a 4 x 10GB RTX 2080 Ti server (much smaller than the setups in recent works from big companies), I couldn't run a wide hyperparameter search, so the current hyperparameters are under-tuned and might be far from optimal. I'd guess the VL-T5/VL-BART models could achieve higher scores on benchmarks with better hyperparameters.

In my experiments, I didn't observe much difference when tuning parameters (e.g., batch size, learning rate, number of epochs) during finetuning. I did see improvements from longer pretraining (10 epochs -> 30 epochs; I didn't have time to explore beyond that) and from bigger backbone architectures (e.g., t5-small -> t5-base), which is perhaps unsurprising.
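The backbone swap itself just amounts to loading a larger pretrained checkpoint. A minimal sketch with plain HuggingFace `transformers` (VL-T5 wraps this in its own vision-augmented classes, so treat this as illustrative only):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Swapping the backbone: t5-small -> t5-base. In VL-T5 the model additionally
# consumes visual features, but the choice of pretrained checkpoint works the same way.
backbone = "t5-base"  # was "t5-small"
tokenizer = T5Tokenizer.from_pretrained(backbone)
model = T5ForConditionalGeneration.from_pretrained(backbone)
print(f"{backbone}: {model.num_parameters():,} parameters")
```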
What is your target multimodal text generation task?