Commit

Apply suggestions from code review
Signed-off-by: redoomed1 <[email protected]>
redoomed1 committed Nov 14, 2024
1 parent abce96b commit 5a3b640
Showing 2 changed files with 12 additions and 12 deletions.
22 changes: 11 additions & 11 deletions docs/ai-chat.md
@@ -19,9 +19,9 @@ Since the release of ChatGPT in 2022, interactions with Large Language Models (L

Data used to train AI models, however, include a massive amount of _private_ data. Developers of AI software often use [Reinforcement Learning from Human Feedback](https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback) (RLHF) to improve the quality of LLMs, which entails the possibility of AI companies reading your private AI chats as well as storing them. This practice also introduces a risk of data breaches. Furthermore, there is a real possibility that an LLM will leak your private chat information in future conversations with other users.

-If you are concerned about these practices, you can either refuse to use AI, or use [truly open-source models](https://proton.me/blog/how-to-build-privacy-first-ai) which publicly release and allow you to inspect their training datasets. One such model is [Olmoe](https://allenai.org/blog/olmoe) made by [Allenai](https://allenai.org/open-data).
+If you are concerned about these practices, you can either refuse to use AI, or use [truly open-source models](https://proton.me/blog/how-to-build-privacy-first-ai) which publicly release and allow you to inspect their training datasets. One such model is [Olmoe](https://allenai.org/blog/olmoe) made by [Ai2](https://allenai.org/open-data).

-Alternatively, you can run AI models locally as a more private and secure alternative to cloud-based solutions, as your data never leaves your device and is therefore never shared with third parties. This also allows you to share sensitive information to the local model without worry.
+Alternatively, you can run AI models locally so that your data never leaves your device and is therefore never shared with third parties. As such, local models are a more private and secure alternative to cloud-based solutions and allow you to share sensitive information to the AI model without worry.

## Hardware for Local AI Models

@@ -43,13 +43,13 @@ To run AI locally, you need both an AI model and an AI client.

### Find and Choose a Model

-There are many permissively licensed models available to download. **[Hugging Face](https://huggingface.co/models?library=gguf)** is a platform that lets you browse, research, and download models in common formats like GGUF. Companies that provide good open-weights models include big names like Mistral, Meta, Microsoft, and Google. However, there are also many community models and 'fine-tunes' available. As mentioned above, [quantized models](https://huggingface.co/docs/optimum/en/concept_guides/quantization) offer the best balance between model quality and performance for those using consumer-grade hardware.
+There are many permissively licensed models available to download. **[Hugging Face](https://huggingface.co/models)** is a platform that lets you browse, research, and download models in common formats like [GGUF](https://huggingface.co/docs/hub/en/gguf). Companies that provide good open-weights models include big names like Mistral, Meta, Microsoft, and Google. However, there are also many community models and 'fine-tunes' available. As mentioned above, quantized models offer the best balance between model quality and performance for those using consumer-grade hardware.

-To help you choose a model that fits your needs, you can look at leaderboards and benchmarks. The most widely-used leaderboard is [LM Arena](https://lmarena.ai/), a "Community-driven Evaluation for Best AI chatbots". There is also the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard), which focus on the performance of open-weights models on common benchmarks like MMLU-PRO. However, there are also specialized benchmarks which measure factors like [emotional intelligence](https://eqbench.com/), ["uncensored general intelligence"](https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard), and many [others](https://www.nebuly.com/blog/llm-leaderboards).
+To help you choose a model that fits your needs, you can look at leaderboards and benchmarks, of which there are many kinds. The most widely-used leaderboard is the community-driven [LM Arena](https://lmarena.ai). Additionally, the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) focuses on the performance of open-weights models on common benchmarks like [MMLU-Pro](https://arxiv.org/abs/2406.01574). Furthermore, there are also specialized benchmarks which measure factors like [emotional intelligence](https://eqbench.com), ["uncensored general intelligence"](https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard), and [many others](https://www.nebuly.com/blog/llm-leaderboards).

### Model Security

-When you have found an AI model of your liking, you should download it in a safe manner. When you use an AI client that maintains their own library of model files (such as [Ollama](#ollama-cli) and [Llamafile](#llamafile)), you should download it from there. However, if you want to download models not present in their library, or use an AI client that doesn't maintain its library (such as [Kobold.cpp](#koboldcpp)), you will need to take extra steps to ensure that the AI model you download is safe and legitimate.
+When you have found an AI model to your liking, you should download it in a safe manner. When you use an AI client that maintains their own library of model files (such as [Ollama](#ollama-cli) and [Llamafile](#llamafile)), you should download it from there. However, if you want to download models not present in their library, or use an AI client that doesn't maintain its library (such as [Kobold.cpp](#koboldcpp)), you will need to take extra steps to ensure that the AI model you download is safe and legitimate.

We recommend downloading model files from Hugging Face, as it provides several features to verify that your download is genuine and safe to use.
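
The specific verification steps are not shown in this hunk, so the following is only a sketch of one check a reader might perform: comparing the SHA-256 digest of a downloaded model file against a checksum copied from the model's download page. It assumes that such a checksum is published for the file; the function names and the file path are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for multi-gigabyte model files.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published_hex: str) -> bool:
    """Compare a local file's digest with a checksum copied by hand.

    `published_hex` is assumed to be the SHA-256 value shown on the
    model's download page; whitespace and letter case are normalized.
    """
    return sha256_of_file(path) == published_hex.strip().lower()
```

For example, `matches_published_checksum(Path("model.gguf"), "<checksum from the model page>")` would return `True` only if the local file is byte-for-byte identical to the one the checksum was computed from. A matching checksum confirms integrity, not that the model itself is trustworthy.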

@@ -80,7 +80,7 @@ A downloaded model is generally safe if it satisfies all of the above checks.

Kobold.cpp is an AI client that runs locally on your Windows, Mac, or Linux computer.

-In addition to supporting a large range of text models, Kobold.cpp also supports image generators such as [Stable Diffusion](https://stability.ai/stable-image), and automatic speech recognition tools, such as [Whisper](https://github.com/ggerganov/whisper.cpp).
+In addition to supporting a large range of text models, Kobold.cpp also supports image generators such as [Stable Diffusion](https://stability.ai/stable-image) and automatic speech recognition tools such as [Whisper](https://github.com/ggerganov/whisper.cpp).

[:octicons-home-16: Homepage](https://github.com/LostRuins/koboldcpp){ .md-button .md-button--primary }
[:octicons-info-16:](https://github.com/LostRuins/koboldcpp/wiki){ .card-link title="Documentation" }
@@ -113,7 +113,7 @@ Kobold shines best when you are looking for heavy customization and tweaking, su

![Ollama Logo](assets/img/ai-chat/ollama.svg){align=right}

-Ollama is a command-line AI assistant that is available on macOS, Linux, and Windows. Ollama is a great choice if you're looking for an AI client that's easy-to-use and widely compatible. It also doesn't involve any manual setup, while still using inference and other techniques to make outputs faster.
+Ollama is a command-line AI assistant that is available on macOS, Linux, and Windows. Ollama is a great choice if you're looking for an AI client that's easy-to-use, widely compatible, and fast due to its use of inference and other techniques. It also doesn't involve any manual setup.

In addition to supporting a wide range of text models, Ollama also supports [LLaVA](https://github.com/haotian-liu/LLaVA) models and has experimental support for Meta's [Llama vision capabilities](https://huggingface.co/blog/llama32#what-is-llama-32-vision).

@@ -132,7 +132,7 @@ In addition to supporting a wide range of text models, Ollama also supports [LLa

</div>

-Ollama simplifies the process of setting up a local AI chat, as it downloads the AI model you want to use automatically. For example, running `ollama run llama3.2` will automatically download and run the Llama 3.2 model. Furthermore, Ollama maintains their own [model library](https://ollama.com/library) where they host the files of various AI models. This ensures models are vetted for both performance and security, eliminating the need to manually verify model authenticity.
+Ollama simplifies the process of setting up a local AI chat, as it downloads the AI model you want to use automatically. For example, running `ollama run llama3.2` will automatically download and run the Llama 3.2 model. Furthermore, Ollama maintains their own [model library](https://ollama.com/library) where they host the files of various AI models. This ensures that models are vetted for both performance and security, eliminating the need to manually verify model authenticity.
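
Beyond the `ollama run` command mentioned above, a model that Ollama has already downloaded can also be queried from a script. The sketch below is not part of this commit; it assumes Ollama is running locally and listening on its default address (`http://localhost:11434`), and the helper names are hypothetical. Because the request goes to `localhost`, the prompt stays on your device.

```python
import json
import urllib.request

# Assumption: Ollama's local HTTP API on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request asking a local Ollama server for one completion."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text.

    Raises urllib.error.URLError if no Ollama server is listening.
    """
    with urllib.request.urlopen(build_generate_request(model, prompt)) as reply:
        return json.loads(reply.read())["response"]
```

With the server running and the model pulled, `ask("llama3.2", "Why is the sky blue?")` would return the model's answer as a string.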

### Llamafile

@@ -145,7 +145,7 @@ Llamafile is a lightweight single-file executable that allows users to run large
Llamafile also supports LLaVA. However, it does not support speech recognition or image generation.

[:octicons-home-16: Homepage](https://github.com/Mozilla-Ocho/llamafile){ .md-button .md-button--primary }
-[:octicons-info-16:](https://github.com/Mozilla-Ocho/llamafile/?tab=readme-ov-file#llamafile){ .card-link title="Documentation" }
+[:octicons-info-16:](https://github.com/Mozilla-Ocho/llamafile#llamafile){ .card-link title="Documentation" }
[:octicons-code-16:](https://github.com/ollama/ollama){ .card-link title="Source Code" }
[:octicons-lock-16:](https://github.com/Mozilla-Ocho/llamafile#security){ .card-link title="Security Policy" }

@@ -158,9 +158,9 @@ Llamafile also supports LLaVA. However, it does not support speech recognition o

</div>

-Mozilla has made llamafiles available for only some Llama and Mistral models, while there are few third-party llamafiles available.
+Mozilla has made llamafiles available for only some Llama and Mistral models, while there are few third-party llamafiles available. Moreover, Windows limits `.exe` files to 4GB, and most models are larger than that.

-If you use Llamafile on Windows, be aware that Windows limits `.exe` files to 4GB, and most models are larger than that. To work around this restriction, you can [load external weights](https://github.com/Mozilla-Ocho/llamafile#using-llamafile-with-external-weights).
+To circumvent these issues, you can [load external weights](https://github.com/Mozilla-Ocho/llamafile#using-llamafile-with-external-weights).

## Criteria

2 changes: 1 addition & 1 deletion includes/abbreviations.en.txt
@@ -50,7 +50,7 @@
*[JNI]: Java Native Interface
*[KYC]: Know Your Customer
*[LLaVA]: Large Language and Vision Assistant (multimodal AI model)
-*[LLMs]: Largue Language Models (AI models such as ChatGPT)
+*[LLMs]: Large Language Models (AI models such as ChatGPT)
*[LUKS]: Linux Unified Key Setup (Full-Disk Encryption)
*[MAC]: Media Access Control
*[MDAG]: Microsoft Defender Application Guard