Example application that builds and runs an LLM-based LangChain SQL agent, which can dynamically query a database and invoke multiple visualization tools. The language model used is OpenAI's GPT-4o mini.
For this, four datasets from the European Statistical Office (Eurostat) are loaded into a local SQL database, which the LLM can query for up to 15 iterations per run. It can then use the query results to independently call one of three basic Plotly-based visualization functions and display the output.
All four datasets are sourced from the Health determinants part of Eurostat's public dataset API and include statistics on:
- tobacco consumption by country of citizenship for the years 2014 and 2019 (Link)
- body mass index (BMI) by country of citizenship for the years 2014 and 2019 (Link)
- physical exercise by country of citizenship for the years 2014 and 2019 (Link)
- alcohol consumption by country of citizenship for the years 2014 and 2019 (Link)
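As a rough illustration of the "local SQL database" step, the sketch below loads a few rows into an in-memory SQLite database and runs the kind of query an agent might generate. It uses only the standard library rather than the pandas/SQLAlchemy stack this project relies on. The table name, column names, country codes, and values are all hypothetical placeholders, not real Eurostat figures or the project's actual schema.

```python
import sqlite3

# Placeholder rows (country code, year, percentage).
# These are made-up values for illustration, NOT real Eurostat data.
ROWS = [
    ("AA", 2014, 10.0),
    ("AA", 2019, 12.0),
    ("BB", 2014, 20.0),
    ("BB", 2019, 18.0),
]


def build_database(rows):
    """Create an in-memory SQLite database with one (hypothetical) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tobacco_consumption "
        "(country TEXT, year INTEGER, share_pct REAL)"
    )
    conn.executemany("INSERT INTO tobacco_consumption VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def run_query(conn, sql):
    """Execute a SQL query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


conn = build_database(ROWS)
result = run_query(
    conn,
    "SELECT country, share_pct FROM tobacco_consumption "
    "WHERE year = 2019 ORDER BY country",
)
print(result)
```

In the application itself, the SQL string would come from the LLM agent rather than being hard-coded, and the result rows would be passed on to one of the visualization tools.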
The LLM agent can use the following three tool functions to visualize the results (see agent_tools.py):
- output_table(): output 2D table contents as a pretty table using Plotly table viewer
- output_bar_plot(): output a simple bar plot
- output_time_series_plot(): output one or multiple line plots along one main time axis
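To make the time series case more concrete, here is a minimal sketch of the data preparation a tool like output_time_series_plot() might need: grouping flat SQL result rows into one series per label, each sorted along the time axis. The helper name rows_to_series and the row layout are assumptions made for illustration. The actual signatures in agent_tools.py may differ, and the sample values are placeholders rather than real data.

```python
from collections import defaultdict


def rows_to_series(rows):
    """Group (label, year, value) rows into one year-sorted series per label.

    Hypothetical helper: returns {label: {"x": [years], "y": [values]}},
    a shape that could be mapped to one line trace per label.
    """
    grouped = defaultdict(list)
    for label, year, value in rows:
        grouped[label].append((year, value))

    series = {}
    for label, points in grouped.items():
        ordered = sorted(points)  # sort by year (first tuple element)
        series[label] = {
            "x": [year for year, _ in ordered],
            "y": [value for _, value in ordered],
        }
    return series


# Placeholder rows, deliberately out of order; NOT real Eurostat data.
sample_rows = [
    ("AA", 2019, 12.0),
    ("BB", 2014, 20.0),
    ("AA", 2014, 10.0),
    ("BB", 2019, 18.0),
]
series = rows_to_series(sample_rows)
print(series)
```

Each entry of the returned dictionary holds the x and y values for one line, so a plotting function can iterate over it and add one trace per label to a shared time axis.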
Required Python packages:
- langchain
- langchain-community
- langchain-openai
- sqlalchemy
- pydantic
- pandas
- plotly