TransAct

This repository is the official implementation of the ACL 2024 paper Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations.

Figures: the TransAct pruning architecture (transact.png) and measured latency on a Xiaomi 14 mobile phone (latency.png).

Training and Evaluation

  1. Prepare the environment following transact.dockerfile; see the environment sketch after this list.
  2. Create symlinks to your data, models, outputs, and HuggingFace cache (mainly for large datasets); see the download sketch after this list.
    ln -s /path/to/data data
    ln -s /path/to/models models
    ln -s /path/to/outputs outputs
    ln -s /path/to/hf-cache hf-cache
  3. Tweak the training configs train_config.yaml and deepspeed.json.
  4. Run the training script run_trainer.sh, for example:
    bash run_trainer.sh -m all \
      -a 768 -f 1536 \
      -n 128 -k 8 -p acts \
      -l 4096 -t 50 \
      -g 64 -b 4 \
      -d togethercomputer/RedPajama-Data-1T \
      -x llama -y llama2 -z 7B
    Run bash run_trainer.sh -h for help.
  5. Run the evaluation script eval.sh, for example:
    bash eval.sh -m all \
      -a 768 -f 1536 \
      -n 128 -k 8 -p acts \
      -l 4096 -t 50 \
      -d togethercomputer/RedPajama-Data-1T \
      -x llama -y llama2 -z 7B
    Run bash eval.sh -h for help.
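
A minimal sketch of the step 1 environment setup, assuming the image is built from the repository root; the transact image tag, /workspace mount, and run flags are illustrative assumptions, not part of the repo:

    # Build the image from the provided dockerfile (the tag name is an assumption)
    docker build -f transact.dockerfile -t transact .
    # Start an interactive GPU container with the repo mounted (flags are illustrative)
    docker run --gpus all -it -v "$PWD":/workspace -w /workspace transact bash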

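For step 2, a hedged example of populating the symlinked directories before training; HF_HOME and huggingface-cli download are standard HuggingFace tooling, but the local layout below is an assumption and should match whatever paths run_trainer.sh expects:

    # Route the HuggingFace cache through the symlink created above
    export HF_HOME="$PWD/hf-cache"
    # Hypothetical local path; meta-llama/Llama-2-7b-hf is gated, request access first
    huggingface-cli download meta-llama/Llama-2-7b-hf --local-dir models/Llama-2-7b-hf
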
Citations

Please cite the paper if you find this repository useful.

@inproceedings{shen-etal-2024-pruning,
    title = "Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations",
    author = "Shen, Bowen  and
      Lin, Zheng  and
      Zha, Daren  and
      Liu, Wei  and
      Luan, Jian  and
      Wang, Bin  and
      Wang, Weiping",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
    year = "2024",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.582",
    doi = "10.18653/v1/2024.findings-acl.582",
    pages = "9781--9793",
}
