Can Knowledge Editing Really Correct Hallucinations?

Repository Overview: This repository contains the code, results, and dataset for the paper "Can Knowledge Editing Really Correct Hallucinations?"
TLDR: We propose HalluEditBench to holistically benchmark knowledge editing methods in correcting real-world hallucinations across five dimensions: Efficacy, Generalization, Portability, Locality, and Robustness. We find that their effectiveness could be far from what their performance on existing datasets suggests, and that performance beyond Efficacy is generally unsatisfactory for all methods.
Authors : Baixiang Huang*, Canyu Chen*, Xiongxiao Xu, Ali Payani, Kai Shu (*equal contributions)
Correspondence to: Kai Shu <[email protected]>.
Paper : Read our paper
Project Website: Visit the project website https://llm-editing.github.io for more resources.

Overview

Large Language Models (LLMs) suffer from hallucinations, i.e., non-factual information in generated content, despite their superior capabilities across tasks. Meanwhile, knowledge editing has been developed as a popular new paradigm for correcting erroneous factual knowledge encoded in LLMs, with the advantage of avoiding retraining from scratch. However, one common issue of existing evaluation datasets for knowledge editing is that they do not ensure LLMs actually generate hallucinated answers to the evaluation questions before editing. When LLMs are evaluated on such datasets after being edited by different techniques, the resulting scores cannot be used directly to assess how effectively each knowledge editing method corrects hallucinations. Thus, the fundamental question remains insufficiently validated: Can knowledge editing really correct hallucinations in LLMs?

We propose HalluEditBench to holistically benchmark knowledge editing methods in correcting real-world hallucinations. First, we rigorously construct a massive hallucination dataset with 9 domains, 26 topics, and more than 6,000 hallucinations. Then, we assess the performance of knowledge editing methods holistically on five dimensions: Efficacy, Generalization, Portability, Locality, and Robustness. Through HalluEditBench, we provide new insights into the potential and limitations of different knowledge editing methods in correcting hallucinations, which could inspire future improvements and facilitate progress in the field of knowledge editing.

questions contains the pre-processed hallucination detection dataset, including the questions we used to evaluate the editing methods. topic contains the topics we selected from WikiData, and triplet contains the raw knowledge triplets that were used to generate the questions for hallucination detection.

Running Experiments

Run example: To get started (e.g. using ROME to edit llama3-8b on the places_landmark data), run:

cd ./code
python3 edit_all_method.py \
    --model_name=llama3-8b \
    --edit_method=ROME \
    --topic_name=places_landmark \
    --device_edit=0 \
    --device_eval=1 \
    --data_size=5 \
    --results_dir=../new_results_dir \
    --question_types rephrase_questions questions_2hop

Note:

Without specifying the --edit_method, the script will run 7 editing methods sequentially by default.
Specify --question_types to choose specific types of questions in the evaluation (the example above will only evaluate 2-hop questions and rephrased questions). Otherwise, the script will run all the question types (yes_questions, no_questions, locality_questions, rephrase_questions, multiple_choice_questions, reversed_relation_questions, questions_2hop, questions_3hop, questions_4hop, questions_5hop, questions_6hop). The original questions are always included.
Specify --results_dir to save the results to a specific directory, otherwise the default directory is where we save the results that we report in the paper. You can also use --overwrite_result to overwrite the existing result file.
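
The options described in the notes above can be assembled programmatically. The following is a minimal sketch, not part of the repository: build_edit_command is a hypothetical helper, and it assumes --overwrite_result is a boolean switch that takes no value. It only builds the argument list and checks the requested question types against the names listed above; it does not launch the script.

```python
import shlex

# Question types listed in the note above; the original questions are always evaluated.
QUESTION_TYPES = [
    "yes_questions", "no_questions", "locality_questions", "rephrase_questions",
    "multiple_choice_questions", "reversed_relation_questions",
    "questions_2hop", "questions_3hop", "questions_4hop",
    "questions_5hop", "questions_6hop",
]


def build_edit_command(model_name, topic_name, edit_method=None,
                       question_types=None, results_dir=None, overwrite=False):
    """Return the argument list for edit_all_method.py (hypothetical helper)."""
    cmd = ["python3", "edit_all_method.py",
           f"--model_name={model_name}", f"--topic_name={topic_name}"]
    if edit_method is not None:
        # When omitted, the script runs its default set of editing methods.
        cmd.append(f"--edit_method={edit_method}")
    if results_dir is not None:
        cmd.append(f"--results_dir={results_dir}")
    if overwrite:
        # Assumption: --overwrite_result is a flag without a value.
        cmd.append("--overwrite_result")
    if question_types:
        unknown = sorted(set(question_types) - set(QUESTION_TYPES))
        if unknown:
            raise ValueError(f"unknown question types: {unknown}")
        cmd += ["--question_types", *question_types]
    return cmd


cmd = build_edit_command(
    "llama3-8b", "places_landmark", edit_method="ROME",
    question_types=["rephrase_questions", "questions_2hop"],
    results_dir="../new_results_dir",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, cwd="./code") if you prefer to launch runs from Python rather than from the shell.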

To run the multi-turn editing, here is an example:

python3 edit_all_method_multi_turn.py \
    --model_name=llama3-8b \
    --edit_method=ROME \
    --topic_name=places_landmark \
    --device_edit=0 \
    --device_eval=1 \
    --model_eval=meta-llama/Meta-Llama-3-8B-Instruct \
    --data_size=5 \
    --results_dir=../new_results_dir \
    --multi_turn=yes \
    --multi_turn_num=10

Use --multi_turn to choose the type of multi-turn evaluation (yes or sure).
Use --multi_turn_num to set the number of turns for multi-turn evaluation.
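
To compare both multi-turn types over several turn counts, the commands can be generated in a loop. This is an illustrative sketch only: multi_turn_commands is a hypothetical helper, the turn counts are arbitrary, and the flags are the ones shown in the example above.

```python
import itertools
import shlex

MULTI_TURN_TYPES = ("yes", "sure")  # the two values named above


def multi_turn_commands(model_name, edit_method, topic_name, turn_counts, results_dir):
    """Yield one shell command string per (type, turn count) pair (hypothetical helper)."""
    for turn_type, num_turns in itertools.product(MULTI_TURN_TYPES, turn_counts):
        yield shlex.join([
            "python3", "edit_all_method_multi_turn.py",
            f"--model_name={model_name}",
            f"--edit_method={edit_method}",
            f"--topic_name={topic_name}",
            f"--results_dir={results_dir}",
            f"--multi_turn={turn_type}",
            f"--multi_turn_num={num_turns}",
        ])


commands = list(multi_turn_commands(
    "llama3-8b", "ROME", "places_landmark", [5, 10], "../new_results_dir"
))
for line in commands:
    print(line)
```

Device and evaluator options (--device_edit, --device_eval, --model_eval) are left out here for brevity and can be appended in the same way.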

We use a local LLM (e.g., Llama3-8b) as the evaluator to assess whether model responses match the labels. For experiments, we recommend using at least one GPU with 48 GB of memory (e.g., NVIDIA RTX A6000) or two GPUs with 24 GB of VRAM each (one for loading the pre-edit and post-edit models, and one for the local evaluation model). Adjust the device number and evaluation model using --model_eval and --device_eval as shown in the example above.
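
The hardware recommendation above can be turned into a simple device-selection rule. The sketch below is an assumption-laden illustration, not repository code: pick_devices is a hypothetical helper, and it assumes that on a single 48 GB GPU the same index can be passed to both --device_edit and --device_eval.

```python
def pick_devices(gpu_memory_gb):
    """Map per-GPU memory in GB to (device_edit, device_eval) indices.

    Hypothetical helper following the recommendation above:
    one GPU with >= 48 GB, or two GPUs with >= 24 GB each.
    """
    # A single large GPU hosts both the edited model and the evaluator
    # (assumption: both flags may point to the same device).
    for idx, mem in enumerate(gpu_memory_gb):
        if mem >= 48:
            return idx, idx
    # Otherwise split the two roles across the first two GPUs with >= 24 GB.
    capable = [idx for idx, mem in enumerate(gpu_memory_gb) if mem >= 24]
    if len(capable) >= 2:
        return capable[0], capable[1]
    raise RuntimeError("need one GPU with >= 48 GB or two GPUs with >= 24 GB each")


device_edit, device_eval = pick_devices([24, 24])
print(f"--device_edit={device_edit} --device_eval={device_eval}")
```

With PyTorch installed, the memory list could be filled from torch.cuda.get_device_properties(i).total_memory for each i in range(torch.cuda.device_count()), converted from bytes to GB.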

For full experiments to reproduce the results in the paper:

Experiment for all the 26 topics:
```
./edit_all_topic.sh
```
Experiment for the robustness evaluation:
```
./code/edit_all_topic_multi_turn.sh
```

We evaluate instruction-tuned models including Llama-2-7B-chat, Llama-3-8B-Instruct, and Mistral-7B-Instruct-v0.3. All hyperparameters are in code/hparams/<method_name>/<model_name>.

Results are stored under the results folder, in the subfolders llama_2_7b_chat_hf, meta_llama_3_8b_instruct, and mistral_7b_instruct_v0.3.
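
The three result folder names look like lowercased Hugging Face model names with hyphens replaced by underscores. The sketch below encodes that naming rule as an assumption inferred from the folder names only; results_subdir is a hypothetical helper, and the repository may derive the names differently.

```python
def results_subdir(model_id: str) -> str:
    """Guess the results subfolder for a Hugging Face model ID (inferred rule)."""
    # Drop the organization prefix, e.g. "meta-llama/".
    name = model_id.rsplit("/", 1)[-1]
    # Lowercase and replace hyphens with underscores; dots are kept.
    return name.lower().replace("-", "_")


for model_id in (
    "meta-llama/Llama-2-7b-chat-hf",
    "meta-llama/Meta-Llama-3-8B-Instruct",
    "mistralai/Mistral-7B-Instruct-v0.3",
):
    print(model_id, "->", results_subdir(model_id))
```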

To summarize the results, use the Jupyter notebook code/result_table.ipynb.

Acknowledgements

We gratefully acknowledge the use of code and data from the following projects: GRACE, EasyEdit, ROME, and MEMIT.

Citation

If you find our paper or code useful, we would greatly appreciate it if you could cite our paper:

@article{huang2024canknowledge,
    title   = {Can Knowledge Editing Really Correct Hallucinations?},
    author  = {Baixiang Huang and Canyu Chen and Xiongxiao Xu and Ali Payani and Kai Shu},
    year    = {2024},
    journal = {arXiv preprint arXiv:2410.16251}
}
