How to Fine Tune the translation task #5
Comments
You can put your parallel data in a CSV file separated by tabs (i.e., a TSV file, with the source sentence and target sentence on each line), then run the fine-tuning script:
!python araT5/examples/run_trainier_seq2seq_huggingface.py
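For reference, a fuller training invocation might look like the sketch below. The script appears to follow the standard Hugging Face seq2seq example interface, so the flags, file names, and language codes shown here are assumptions to adapt rather than values documented for this repo (run the script with --help to see its actual arguments):

```bash
# Minimal fine-tuning sketch; file names and hyperparameters are hypothetical.
# In a notebook, prefix the command with "!".
python araT5/examples/run_trainier_seq2seq_huggingface.py \
    --model_name_or_path "UBC-NLP/AraT5-msa-base" \
    --do_train \
    --do_eval \
    --train_file train.tsv \
    --validation_file valid.tsv \
    --source_lang ar \
    --target_lang en \
    --output_dir ./araT5_mt \
    --per_device_train_batch_size 8 \
    --num_train_epochs 3 \
    --overwrite_output_dir
```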
@salma-elshafey @Nagoudi How do I use AraT5 to do machine translation? What is the script for inference?
@hust-kevin You simply add the argument --do_predict to record the predictions (test_preds_seq2seq.txt) for the provided test_file.
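Concretely, a prediction run might look like the following sketch; as above, the file names and extra flags are assumptions based on the usual Hugging Face seq2seq example interface:

```bash
# Prediction sketch; --predict_with_generate makes the trainer decode with generate()
# so that text predictions (test_preds_seq2seq.txt) can be written out.
python araT5/examples/run_trainier_seq2seq_huggingface.py \
    --model_name_or_path ./araT5_mt \
    --do_predict \
    --test_file test.tsv \
    --source_lang ar \
    --target_lang en \
    --output_dir ./araT5_mt \
    --predict_with_generate
```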
@hust-kevin @BarahFazili Do I need to specify --source_lang if I am using a dialectal Arabic as input, for instance arz for Egyptian Arabic, which was part of the training dataset? And what should my source language tag be if I am running inference for a dialectal Arabic that was not part of the training set, e.g. acw (Hijazi Arabic)? I am not sure whether the way I am running inference is correct; I am using pipeline(). Is there an ideal way to run inference? My code starts like this:

```python
model = AutoModelForSeq2SeqLM.from_pretrained("UBC-NLP/AraT5-msa-base")
with open('/content/gdrive/Shareddrives/Gutenberg/MT/experiments/HuggingFace/AraT5-msa-base-acw-v2-en_JHN/acw_pred.txt', 'w', encoding='utf-8') as f:
```
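For anyone landing here, a minimal pipeline()-based inference sketch is shown below. It assumes the checkpoint is used in the usual T5 text-to-text way; the task prefix, file names, and generation settings are assumptions, not values confirmed by the AraT5 authors:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# Assumed checkpoint; point this at your fine-tuned model directory instead.
model_name = "UBC-NLP/AraT5-msa-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# text2text-generation is the standard pipeline task for T5-style seq2seq models.
translator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

# Hypothetical files: one source sentence per line in, one translation per line out.
with open("acw_source.txt", encoding="utf-8") as src, \
     open("acw_pred.txt", "w", encoding="utf-8") as out:
    for line in src:
        # If the model was fine-tuned with a task prefix, the same prefix must be
        # prepended at inference time; the prefix below is an assumption.
        result = translator("translate Arabic to English: " + line.strip(),
                            max_length=256, num_beams=5)
        out.write(result[0]["generated_text"] + "\n")
```

Whether a dialect-specific source tag is needed depends on how such tags were fed to the model during fine-tuning, so checking the preprocessing in run_trainier_seq2seq_huggingface.py is the safest way to answer the arz/acw question.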
I need to fine-tune the model for the translation task. Should I prepare the data in a specific format, and how do I fine-tune your model for that task?