add readme
qbc2016 committed Apr 11, 2024
1 parent 546224b commit 38757f8
Showing 2 changed files with 12 additions and 8 deletions.
20 changes: 12 additions & 8 deletions federatedscope/llm/eval/eval_for_rougel/README.md
# Rouge-L

## Dolly-15K
To assess the performance of our fine-tuned model, we use the Rouge-L metric and run experiments with a large number of clients, using the Dolly-15K dataset as the training corpus.
The Dolly-15K dataset contains 15,015 data points spread across eight distinct tasks. For a more comprehensive evaluation, we hold out the final task for evaluation only and use the remaining seven tasks for training. The experimental setup uses 200 clients, with the training data partitioned by a Dirichlet distribution to emulate non-IID conditions across clients.
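
As a rough illustration of Dirichlet-based partitioning, the sketch below splits sample indices across clients so that each client receives a skewed mix of categories. It is not FederatedScope's splitter: the function names, the choice to partition per task category, and the concentration value `alpha=0.5` are all assumptions made for this example.

```python
import random
from collections import defaultdict


def sample_dirichlet(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) of dimension `size`."""
    # Normalized independent Gamma(alpha, 1) draws follow a Dirichlet distribution.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with per-category Dirichlet shares.

    labels: one category id per sample (assumed here to be the task id).
    Returns a list of index lists, one per client.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        shares = sample_dirichlet(alpha, num_clients, rng)
        # Convert the shares into cumulative cut points over this category.
        start = 0
        cumulative = 0.0
        for client_id, share in enumerate(shares):
            cumulative += share
            if client_id == num_clients - 1:
                end = len(indices)  # last client takes whatever remains
            else:
                end = round(cumulative * len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


# Hypothetical toy data: 700 samples spread evenly over 7 training tasks.
toy_labels = [i % 7 for i in range(700)]
toy_parts = dirichlet_partition(toy_labels, num_clients=20, alpha=0.5)
```

A smaller `alpha` concentrates each category on fewer clients (stronger non-IID skew), while a larger `alpha` moves the split toward uniform.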

To run the evaluation, use
```bash
python federatedscope/llm/eval/eval_for_rougel/eval_dolly.py --cfg federatedscope/llm/baseline/xxx.yaml
```
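
For readers unfamiliar with the metric, Rouge-L scores a generated answer by the longest common subsequence (LCS) it shares with the reference. The sketch below is a minimal version that assumes lowercased whitespace tokenization and the balanced F1 form; the evaluation script may tokenize or weight precision and recall differently, and the function names here are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Rouge-L F1 between a reference string and a candidate string."""
    # Assumption: simple lowercased whitespace tokenization.
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)  # share of the candidate covered by the LCS
    recall = lcs / len(ref_tokens)      # share of the reference covered by the LCS
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("the cat sat on the mat", "the cat is on the mat")
```

Here the LCS is "the cat on the mat" (5 tokens out of 6 on each side), so precision and recall are both 5/6.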

## Natural Instructions
We also use the Rouge-L metric to evaluate experiments with a large number of clients trained on the Natural Instructions (NI) dataset. Each of the 738 NI training tasks is assigned exclusively to a distinct client, which produces a non-IID setting characterized by feature distribution skew. Evaluation is performed on separate test tasks.

To run the evaluation, use
```bash
python federatedscope/llm/eval/eval_for_rougel/eval_ni.py --cfg federatedscope/llm/baseline/xxx.yaml
```
