- We used Mistral 7B (https://huggingface.co/mistralai/Mistral-7B-v0.1) as our base model
- 1x RTX 4090 GPU
- 4-bit QLoRA (see the loading sketch after this list)
- Weights & Biases for experiment tracking
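A minimal sketch of how such a setup can be loaded with `transformers`, `bitsandbytes`, and `peft`. The LoRA rank, alpha, dropout, and target modules below are illustrative assumptions, not the exact hyperparameters used for our submissions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization so the 7B model fits on a single RTX 4090
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# LoRA adapter on top of the frozen 4-bit base model (values are assumptions)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```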
We use three kinds of filters:
- We run the base model on the open-source training data and compute the ROUGE score between the model's output and the expected output.
- Using a cutoff threshold, we filter out the data points with a high ROUGE score, i.e. examples the base model already handles well (see the sketch after this list).
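A minimal sketch of this ROUGE filter, assuming the `rouge_score` package and ROUGE-L F1 as the metric; the cutoff value is an illustrative placeholder, not the threshold we actually used.

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
ROUGE_CUTOFF = 0.8  # assumed threshold, not the exact value used

def keep_example(base_output: str, expected_output: str) -> bool:
    """Keep an example only if the base model's output scores below the cutoff."""
    score = scorer.score(expected_output, base_output)["rougeL"].fmeasure
    return score < ROUGE_CUTOFF

def filter_dataset(examples, base_outputs):
    """examples: list of dicts with an 'output' field; base_outputs: base-model generations."""
    return [
        ex for ex, out in zip(examples, base_outputs)
        if keep_example(out, ex["output"])
    ]
```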
We used a Platypus-style (https://arxiv.org/abs/2308.07317) embedding filter on the same data as in the paper, but we discarded the LLM-generated data.
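A minimal sketch of such an embedding-similarity filter, assuming `sentence-transformers` with an off-the-shelf embedding model and a cosine-similarity cutoff; both the model name and the threshold are illustrative assumptions, not necessarily what the Platypus authors or we used.

```python
import torch
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
SIM_THRESHOLD = 0.8  # assumed cosine-similarity cutoff

def dedup(instructions):
    """Drop any instruction that is too similar to one already kept."""
    kept, kept_embs = [], []
    embs = embedder.encode(
        instructions, convert_to_tensor=True, normalize_embeddings=True
    )
    for text, emb in zip(instructions, embs):
        if kept_embs and util.cos_sim(emb, torch.stack(kept_embs)).max() > SIM_THRESHOLD:
            continue  # near-duplicate of an example we already kept
        kept.append(text)
        kept_embs.append(emb)
    return kept

# Example: deduped = dedup([ex["instruction"] for ex in examples])
```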
We extracted random examples from some tasks using this filter. The following table depicts our exact settings for the different model versions.

### Submission 1