A Confidence-based Knowledge Integration Framework for Cross-Domain Table Question Answering

This is the official repository for the code and models of our paper, A Confidence-based Knowledge Integration Framework for Cross-Domain Table Question Answering, published in Knowledge-Based Systems (KBS), 2024.

Setup

Download the Spider dataset and place the data in the datasets folder in the root directory.
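As a quick sanity check after downloading, the sketch below lists any entries from the standard Spider release that are missing. The datasets/spider subfolder and the helper name missing_spider_files are assumptions made for illustration; adjust the path to match wherever the pipeline scripts expect the data.

```python
from pathlib import Path

# Entries shipped in the standard Spider release.
EXPECTED = ["train_spider.json", "train_others.json", "dev.json", "tables.json", "database"]


def missing_spider_files(root):
    """Return the expected Spider entries that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # "datasets/spider" is an assumed location, not confirmed by this README.
    missing = missing_spider_files("datasets/spider")
    print("Spider data looks complete" if not missing else f"Missing: {missing}")
```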

Postprocessing on model raw output

Before running inference with GAR, the outputs of the Seq2Seq model must be formatted first. Please check the README in the model_output_postprocess folder for details.

Postprocessing for the GAP, RAT-SQL and BRIDGE models is implemented in the model_output_postprocess.py file in the root directory.

Training

Use the following command to train the ranking models:

bash train_pipeline.sh <benchmark name> <seq2seq model name> <train json file> <dev json file> <seq2seq model train output file> <seq2seq model dev output file> <table schema json file> <sqlite database directory>

Inference

Use the following command to run inference:

bash test_pipeline.sh <benchmark name> <seq2seq model name> <seq2seq model output file> <dev/test json file> <gold sql txt file> <table schema json file> <sqlite database directory>
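Both pipeline scripts take a long list of positional arguments, so it is easy to swap two of them. The sketch below assembles the command line from named values in the documented order. The helper build_command, the keyword names, and every example value (including the model name string and the file paths) are hypothetical and only illustrate the ordering.

```python
import shlex

# Positional argument order, copied from the two usage lines in this README.
PIPELINES = {
    "train_pipeline.sh": [
        "benchmark_name", "seq2seq_model_name", "train_json", "dev_json",
        "seq2seq_train_output", "seq2seq_dev_output", "tables_json", "database_dir",
    ],
    "test_pipeline.sh": [
        "benchmark_name", "seq2seq_model_name", "seq2seq_output", "eval_json",
        "gold_sql", "tables_json", "database_dir",
    ],
}


def build_command(script, **values):
    """Return the argv list for `script`, failing early on missing or unknown names."""
    names = PIPELINES[script]
    missing = [n for n in names if n not in values]
    unknown = [n for n in values if n not in names]
    if missing or unknown:
        raise ValueError(f"missing={missing} unknown={unknown}")
    return ["bash", script] + [str(values[n]) for n in names]


# Hypothetical example values; substitute your own names and paths.
cmd = build_command(
    "test_pipeline.sh",
    benchmark_name="spider",
    seq2seq_model_name="gap",
    seq2seq_output="gap_dev_output.txt",
    eval_json="datasets/spider/dev.json",
    gold_sql="datasets/spider/dev_gold.sql",
    tables_json="datasets/spider/tables.json",
    database_dir="datasets/spider/database",
)
print(shlex.join(cmd))  # copy the printed line into a shell, or pass cmd to subprocess.run
```

Building the list rather than a single string keeps paths with spaces intact if the command is later passed to subprocess.run.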

Output files

All inference outputs are written to the output/spider/reranker directory, in a folder named according to the following convention: <benchmark_name>_<model_name>_<candidate_num>_<retrieval_model_name>_<reranker_embedding_name><reranker_model_name>
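To locate a run's results from a script, the folder name can be rebuilt from its components. This is a minimal sketch of the naming convention as written above, not code taken from the repository; the function name and the example values are placeholders. Note that the documented pattern shows no separator between the last two fields, so they are joined directly here.

```python
from pathlib import Path


def reranker_output_dir(benchmark_name, model_name, candidate_num,
                        retrieval_model_name, reranker_embedding_name,
                        reranker_model_name, base="output/spider/reranker"):
    """Build the output folder path following the documented naming convention."""
    folder = "_".join([
        benchmark_name,
        model_name,
        str(candidate_num),
        retrieval_model_name,
        # No underscore between these two fields in the documented pattern.
        reranker_embedding_name + reranker_model_name,
    ])
    return Path(base) / folder


# Placeholder component names, chosen only to show the resulting shape.
print(reranker_output_dir("spider", "gap", 10, "retriever", "emb", "ranker").as_posix())
```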

Debug tips

To debug SQLGenV2, use the reranker_script_debug.py file in the root directory.
To debug Dialect Builder, use the dialect_debug.py file in the root directory.
To debug any of the ranking models, use the code_debug.py file in the root directory.

