This is a Spice.ai data and AI app.

Prerequisites:

- Spice.ai CLI installed
- OpenAI API key
- Hugging Face API token (optional, for the LLaMA model)
- curl and jq for API calls
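One way to provide these keys is via environment variables, which Spice can resolve through its secret store (a sketch; the names below match the `secrets:` references used later in this guide, and the values are placeholders):

```bash
# Placeholder values; replace with your own keys.
export SPICE_OPENAI_API_KEY="sk-..."       # referenced as ${ secrets:SPICE_OPENAI_API_KEY }
export SPICE_HUGGINGFACE_API_KEY="hf_..."  # referenced as ${ secrets:SPICE_HUGGINGFACE_API_KEY }
```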
To learn more about Spice.ai, take a look at the following resources:
- Spice.ai - learn about Spice.ai features, data, and API.
- Get started with Spice.ai - try out the API and make basic queries.
- Connect with us on Discord - your feedback is appreciated!
To deploy this app to the Spice.ai Cloud Platform:

- Fork the repository https://github.com/jeadie/evals into your GitHub org.
- Log into the Spice.ai Cloud Platform and create a new app called evals. The app will start empty.
- Connect the app to your repository:
  - Go to the App Settings tab and select Connect Repository.
  - If the repository is not yet linked, follow the prompts to authenticate and link it.
- Set the app to Public:
  - Navigate to the app's settings and toggle the visibility to public.
- Redeploy the app:
  - Click Redeploy to load the datasets and configurations from the repository.
- Check the datasets in the Spice.ai Cloud:
  - Verify that the datasets are correctly loaded and accessible.
- Test public access:
  - Log in with a different account to confirm the app is accessible to external users.
To run the app locally:

- Initialize a new local Spice app:

```bash
mkdir demo
cd demo
spice init
```
- Log in to Spice.ai Cloud:

```bash
spice login
```
- Get the spicepod from Spicerack: navigate to spicerack.org, search for evals, click on /evals, click on Use this app, and copy the spice connect command. Paste it into the terminal:
```bash
spice connect <username>/evals
```
The `spicepod.yml` should be updated to:
```yaml
version: v1beta1
kind: Spicepod
name: demo
dependencies:
  - Jeadie/evals
```
- Add a model to the spicepod:

```yaml
models:
  - name: gpt-4o
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
```
- Start Spice:

```bash
spice run
```
- Run an eval:

```bash
curl -XPOST "http://localhost:8090/v1/evals/taxes" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-4o" }' | jq
```
- Explore incorrect results:

```bash
spice sql
```

```sql
SELECT input, output, actual
FROM eval.results
WHERE value = 0.0
LIMIT 5;
```
- Track the outputs of all AI model calls:

```yaml
runtime:
  task_history:
    captured_output: truncated
```
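With captured output enabled, every model call is recorded in the `runtime.task_history` table. A quick way to inspect what was captured (a sketch; the column names are the ones used by the view defined in the next step):

```sql
-- Peek at recent AI completions and their captured outputs.
SELECT task, input, captured_output
FROM runtime.task_history
WHERE task = 'ai_completion'
LIMIT 5;
```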
- Define new views and an evaluation:

```yaml
views:
  - name: user_queries
    sql: |
      SELECT
        json_get_json(input, 'messages') AS input,
        json_get_str((captured_output -> 0), 'content') AS ideal
      FROM runtime.task_history
      WHERE task = 'ai_completion'
  - name: latest_eval_runs
    sql: |
      SELECT model, MAX(created_at) AS latest_run
      FROM eval.runs
      GROUP BY model
  - name: model_stats
    sql: |
      SELECT
        r.model,
        COUNT(*) AS total_queries,
        SUM(CASE WHEN res.value = 1.0 THEN 1 ELSE 0 END) AS correct_answers,
        AVG(res.value) AS accuracy
      FROM eval.runs r
      JOIN latest_eval_runs lr ON r.model = lr.model AND r.created_at = lr.latest_run
      JOIN eval.results res ON res.run_id = r.id
      GROUP BY r.model

evals:
  - name: mimic-user-queries
    description: |
      Evaluates how well a model can copy the exact answers already returned
      to a user. Useful for testing if a smaller/cheaper model is sufficient.
    dataset: user_queries
    scorers:
      - match
```
- Add a smaller model to the spicepod:

```yaml
models:
  - name: llama3
    from: huggingface:huggingface.co/meta-llama/Llama-3.2-3B-Instruct
    params:
      hf_token: ${ secrets:SPICE_HUGGINGFACE_API_KEY }
  - name: gpt-4o # Keep the previous model.
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
```
- Verify models are loaded:

```bash
spice models
```

You should see both models listed:

```
NAME    FROM                                                          STATUS
gpt-4o  openai:gpt-4o                                                 ready
llama3  huggingface:huggingface.co/meta-llama/Llama-3.2-3B-Instruct  ready
```
- Restart the Spice app:

```bash
spice run
```
- Test the new model in a chat session, or run another eval:

```bash
spice chat
```
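You can also exercise a specific model directly through the runtime's OpenAI-compatible HTTP endpoint (a sketch; it assumes the same local runtime and port used for the eval calls above):

```bash
# Ask the local llama3 model a question via the chat completions API.
curl -XPOST "http://localhost:8090/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Hello! What can you do?"}]
  }' | jq
```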
- Run evaluations on both models:

```bash
# Run eval with GPT-4o
curl -XPOST "http://localhost:8090/v1/evals/mimic-user-queries" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o"}' | jq

# Run eval with LLaMA
curl -XPOST "http://localhost:8090/v1/evals/mimic-user-queries" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3"}' | jq
```
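Since the two requests differ only in the model name, a small loop over the models works just as well (a convenience sketch using the same endpoint):

```bash
for model in gpt-4o llama3; do
  echo "Evaluating $model..."
  curl -s -XPOST "http://localhost:8090/v1/evals/mimic-user-queries" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"$model\"}" | jq
done
```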
- Compare model performance:

```bash
spice sql
```

```sql
SELECT
  model,
  total_queries,
  correct_answers,
  ROUND(accuracy * 100, 2) AS accuracy_percentage
FROM model_stats
ORDER BY accuracy_percentage DESC;
```
This query will show:
- Total number of queries processed
- Number of correct answers
- Accuracy as a percentage
You can use these metrics to decide if the smaller model provides acceptable performance for your use case.
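To see where the models actually disagree, you can also pull individual examples from each model's latest run (a sketch; it reuses the eval.results columns queried earlier and the latest_eval_runs view):

```sql
-- Side-by-side inputs and outputs for examples a model got wrong.
SELECT r.model, res.input, res.output, res.actual
FROM eval.results res
JOIN eval.runs r ON res.run_id = r.id
JOIN latest_eval_runs lr ON r.model = lr.model AND r.created_at = lr.latest_run
WHERE res.value = 0.0
LIMIT 10;
```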
The complete `spicepod.yml`, for reference:
```yaml
version: v1beta1
kind: Spicepod
name: demo

dependencies:
  - Jeadie/evals

runtime:
  task_history:
    captured_output: truncated

views:
  - name: user_queries
    sql: |
      SELECT
        json_get_json(input, 'messages') AS input,
        json_get_str((captured_output -> 0), 'content') AS ideal
      FROM runtime.task_history
      WHERE task = 'ai_completion'
  - name: latest_eval_runs
    sql: |
      SELECT model, MAX(created_at) AS latest_run
      FROM eval.runs
      GROUP BY model
  - name: model_stats
    sql: |
      SELECT
        r.model,
        COUNT(*) AS total_queries,
        SUM(CASE WHEN res.value = 1.0 THEN 1 ELSE 0 END) AS correct_answers,
        AVG(res.value) AS accuracy
      FROM eval.runs r
      JOIN latest_eval_runs lr ON r.model = lr.model AND r.created_at = lr.latest_run
      JOIN eval.results res ON res.run_id = r.id
      GROUP BY r.model

evals:
  - name: mimic-user-queries
    description: |
      Evaluates how well a model can copy the exact answers already returned
      to a user. Useful for testing if a smaller/cheaper model is sufficient.
    dataset: user_queries
    scorers:
      - match

models:
  - name: gpt-4o
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
  - name: llama3
    from: huggingface:huggingface.co/meta-llama/Llama-3.2-3B-Instruct
    params:
      hf_token: ${ secrets:SPICE_HUGGINGFACE_API_KEY }
```