DRAFT: Add jamfile, a JavaScript runtime for creating scripts/CLIs on top of llamafile #661

Draft · wants to merge 2 commits into main

Conversation

asg017 commented Dec 20, 2024

This PR adds a new jamfile project to llamafile: a JavaScript runtime for scripting with llamafile completion models, embeddings, SQLite, and more.

This is a follow-up to my previous "embedfile" PR #644. There I tried to contribute an "embeddings CLI" that could embed/query different filetypes, but the "CLI-only" API wasn't very flexible, and any new feature would require C++ code.

So I came up with a new tool, jamfile. It's a JavaScript runtime built on top of quickjs-ng, a fork of the original quickjs project. quickjs-ng is a tiny JavaScript engine written entirely in C, which we use to build a runtime with llamafile-specific APIs. jamfile ships as a standalone Cosmopolitan binary: you write scripts in .js files and execute them with it. There are JavaScript APIs for interacting with LLMs and embedding models, using the same llamafile library as the llamafile server.

// hello-world.js
import { red } from "jamfile:color";
import { CompletionModel } from "jamfile:llamafile";

const model = new CompletionModel('Llama-3.2-1B-Instruct-Q4_K_M.gguf');  
const prompt = "write a single haiku about spongebob squarepants";
const response = model.complete(prompt);

console.log(red(prompt), response);

$ jamfile run hello-world.js
write a single haiku about spongebob squarepants 
The sponge's optimism
Makes Krabby Patty dreams come true
Joy in every bite

There are a lot of built-in JS APIs and features, including a jamfile:sqlite module for storing data in SQLite, with sqlite-vec support.

import {TextEmbeddingModel} from "jamfile:llamafile";
import {Database} from "jamfile:sqlite";
import {green} from "jamfile:color";
import {open} from "qjs:std";

const README = open('README.md', 'r').readAsString();

const db = new Database();

const model = new TextEmbeddingModel('dist/.models/mxbai-embed-xsmall-v1-f16.gguf');

db.execute(`
  CREATE VIRTUAL TABLE vec_chunks USING vec0(
    +contents TEXT,
    contents_embedding float[384] distance_metric=cosine
  );
`);

// Split the token array into fixed-size chunks.
function* chunks(arr, n) {
  for (let i = 0; i < arr.length; i += n) {
    yield arr.slice(i, i + n);
  }
}

const tokens = model.tokenize(README);

// Embed each 64-token chunk and store it alongside its detokenized text.
for (const chunk of chunks(tokens, 64)) {
  const chunk_embedding = model.embed(chunk);
  db.execute(
    'INSERT INTO vec_chunks(contents, contents_embedding) VALUES (?, ?);',
    [model.detokenize(chunk), chunk_embedding]
  );
}

function search(query) {
  const rows = db.queryAll(
    `
    SELECT
      rowid,
      contents
    FROM vec_chunks
    WHERE contents_embedding MATCH ?
    AND k = 10
    `,
    [model.embed(query)]
  );
  console.log(green(query));

  for (const {rowid, contents} of rows) {
    console.log(contents);
  }
}

search('linux CLI Tools ');
search('money supporting ecosystem ');

There are also a few other JS builtins: jamfile:cli for parsing CLI arguments, jamfile:zod for structured outputs, jamfile:color for printing colored text to the terminal, etc.
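
To give a feel for the jamfile:cli builtin, here's a rough sketch of what argument parsing could look like in a jamfile script. The parseArgs export and its option shape are assumptions on my part for illustration (modeled on Node's util.parseArgs), not the module's confirmed API; check jamfile/examples for the real usage.

// cli-greet.js -- hypothetical sketch; the real jamfile:cli API may differ
import { parseArgs } from "jamfile:cli";             // assumed export name
import { CompletionModel } from "jamfile:llamafile";

// Assumed shape: parseArgs({options}) -> { values }, like Node's util.parseArgs.
const { values } = parseArgs({
  options: {
    model:  { type: "string", default: "Llama-3.2-1B-Instruct-Q4_K_M.gguf" },
    prompt: { type: "string" },
  },
});

const model = new CompletionModel(values.model);
console.log(model.complete(values.prompt ?? "write a single haiku about spongebob squarepants"));

Presumably you'd invoke it with something like jamfile run cli-greet.js --prompt "...", though I haven't checked exactly how arguments are forwarded to the script.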

Structured outputs are possible with CompletionModel, working with either JSON Schema or raw GBNF grammar files. A sample, using the bundled version of zod:

import { z, zodToJsonSchema } from "jamfile:zod";
import {CompletionModel} from "jamfile:llamafile";

const model = new CompletionModel('Llama-3.2-1B-Instruct-Q4_K_M.gguf');

const MathReasoningSchema = z.object({
  steps: z.array(
    z.object({
      explanation: z.string(),
      output: z.string(),
    })
  ),
  final_answer: z.string(),
}).strict();

const schema = zodToJsonSchema(MathReasoningSchema);
const prompt = `
    You are a helpful math tutor. You will be provided with a math problem,
    and your goal will be to output a step by step solution, along with a final answer.
    For each step, just provide the output as an equation use the explanation field to detail the reasoning.

    PROMPT: how can I solve 8x + 7 = -23
`;

const result = model.complete(prompt, { schema });

console.log(result);

{
  "steps": [
    {
      "explanation": "Subtract 7 from both sides of the equation",
      "output": "-8x = -30"
    },
    {
      "explanation": "Divide both sides of the equation by -8",
      "output": "x = 3.75"
    },
    {
      "explanation": "Simplify the fraction",
      "output": "x = 3.75"
    }
  ],
  "final_answer": "But wait, the answer can't be 3.75!"
}

And finally, you can "bundle" a JavaScript file + GGUF models into a jamfile itself, just like llamafile. A developer could "compile" their JS code + GGUF models into a standalone, cross-platform, single-file executable and share it with friends. There's a sample of this in jamfile/tests/standalone; I'll try to get it on Hugging Face soon.
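
As a very rough sketch of what that bundling step might look like, here's the workflow I'd expect, borrowing llamafile's zipalign convention. The exact commands, flags, and file layout for jamfile are assumptions on my part; the real procedure is in jamfile/tests/standalone.

# Hypothetical sketch; jamfile's actual packaging steps may differ.
cp jamfile haiku.jamfile                       # start from the jamfile binary
zipalign -j0 haiku.jamfile \
  hello-world.js \
  Llama-3.2-1B-Instruct-Q4_K_M.gguf            # add the script and its model
./haiku.jamfile                                # one cross-platform executable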

This PR still needs a ton of work: make install support, error handling, man pages, docs, etc. But it works pretty well and is a ton of fun to play with. There are examples in jamfile/examples that you can take a look at, but some may be out of date. I'll be working on this branch throughout the holidays and will update this PR body as features are added.

But overall I'm pretty excited about this! I can see people using jamfile to create standalone CLIs, TUIs, scripts, or other AI applications. In the veeerrry long term I could even see support for custom server-side endpoints written in JavaScript in llamafile itself, to create even more powerful AI apps with frontend support. There's a lot we could build with this!

separate jamfile project, WIP
asg017 (Author) commented Dec 20, 2024

Marking this PR as a draft for now. Feel free to play around with it, but I'll be fixing/cleaning up a lot of the code in the next few days.

Mainly put this up so others could try it out!
