Replies: 4 comments 2 replies
-
I look at it from two sides: local-only and cloud AI.

For cloud-based AI, I like Claude 3 Haiku: good, fast, and a dirt-cheap API (I've been using it daily for two months and it has cost under $2 so far). Unfortunately, in the most recent version of Perplexica at the time of writing, there seem to be issues with rate limiting.

For local-only setups I got the best results with the popular llama3.1:8b model; my hardware is restricted to a stone-age 4 GB VRAM GPU. Phi-3 was a miss, as it often fails to embed the citations correctly, along with some other minor weaknesses. I also tried the two larger Gemma 2 models with very impressive results, and for sensitive topics dolphin-llama3:8b did the trick, since censorship is a thing with the stock Llama models.

Interestingly, the results are heavily impacted by the embedding model selected, subjectively even more than by the chat model itself. By far the best results I had were with nomic-embed-text served by Ollama. I would recommend it, as performance is not heavily affected.
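For reference, this is roughly how a local setup like the one above talks to Ollama: llama3.1:8b for chat and nomic-embed-text for embeddings. The endpoints and payloads follow Ollama's documented REST API on its default port 11434, but the helper functions are only an illustrative sketch, not Perplexica's actual code, and they assume both models were already pulled with `ollama pull`.

```python
# Minimal sketch: querying a local Ollama server for chat and embeddings.
# Assumes Ollama is running on localhost:11434 and that llama3.1:8b and
# nomic-embed-text have been pulled beforehand.
import requests

OLLAMA_URL = "http://localhost:11434"

def chat(prompt: str, model: str = "llama3.1:8b") -> str:
    """Send a single-turn chat request and return the answer text."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/chat",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Return the embedding vector used for reranking search results."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/embeddings",
        json={"model": model, "prompt": text},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

if __name__ == "__main__":
    print(chat("Summarize why the embedding model matters for search reranking."))
    print(len(embed("Perplexica test query")))
```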
-
I've been using Mistral-Nemo 12B with Perplexica and the results are great.
-
I agree, mistral-nemo 12b seems to be the best, at least on my 2080 Ti; q4_K_M is OK most of the time for me.

What are you all using for num_ctx on the chat model, the stock Ollama 2048? I played around with 4096 and 8192, which seems to help but really eats VRAM. On the 2080 Ti I can set num_ctx to 8k with q4_K_M and still get moderately fast speeds; bumping num_ctx down to 4k with q4_K_M makes it fit entirely in GPU VRAM and it's pretty fast. Has anyone else tried tweaking num_ctx? Is 8k really needed? I'm not sure. I also bumped temperature to 0.7, since I noticed in the code that that is what the cloud models use.

One other thing regarding the embedding model: I also use nomic-embed-text:latest, but I bumped its num_ctx to 8k. Not sure if that's needed, but it's the maximum the model can handle, and it seems to work fine for me.

Update: I did some more testing, and q5_K_M with num_ctx 4096 seems to be the sweet spot for response quality.
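If it helps anyone experimenting with these settings: num_ctx and temperature can be passed per request through Ollama's documented `options` field instead of baking them into a Modelfile. The snippet below is only a sketch; the default model tag and parameter values mirror what I described above (substitute the exact quantization tag you pulled from the Ollama library).

```python
# Sketch: overriding num_ctx and temperature per request via Ollama's
# "options" field. Values mirror the settings discussed above; the model
# tag is whatever mistral-nemo quantization you actually pulled.
import requests

def ask(prompt: str,
        model: str = "mistral-nemo",   # e.g. a q4_K_M / q5_K_M tag if pulled
        num_ctx: int = 4096,
        temperature: float = 0.7) -> str:
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "options": {"num_ctx": num_ctx, "temperature": temperature},
            "stream": False,
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]
```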
-
I'm using LiteLLM connected to OpenRouter with openai/gpt-4o-mini, and it works great. I'm running a small open embedding model alongside it.
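For anyone curious about that path: whether you run LiteLLM as a proxy server or call its SDK directly, the model naming is the same, with an `openrouter/` prefix routing the request to OpenRouter. The sketch below shows the SDK form under that assumption; the prompt and wrapper function are just illustrative.

```python
# Sketch: calling OpenRouter's openai/gpt-4o-mini through LiteLLM.
# Requires OPENROUTER_API_KEY to be set in the environment.
from litellm import completion

def ask(prompt: str) -> str:
    resp = completion(
        model="openrouter/openai/gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    # LiteLLM returns an OpenAI-style response object.
    return resp.choices[0].message.content

if __name__ == "__main__":
    print(ask("What model are you?"))
```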
-
I tried several AI models with Perplexica: Llama 3, Llama 3.1, Phi-3 Medium, and Command-R 28B.
My impression is that even Llama 3 delivers good results, which means that even modestly AI-capable machines can benefit from this project using local models.
With Phi-3, things get more interesting. The model is very good: all of my tests where Perplexica rivals Perplexity Copilot use Phi-3 Medium.
Command-R: even though this model does not rank well on leaderboards (which mainly measure raw reasoning ability), it is a truly astonishing model that excels at creative writing and RAG, and the match with Perplexica is really good.
To add to the complexity, different models perform differently with the same prompt: some deliver better results with simpler prompts, while others really seem to prefer detailed, Markdown-structured prompts.
I did some tests using GPT-4o as an arbiter (a rough sketch of that setup follows below). The results show that even the smallest model, Llama 3.1 8B, does not rank very differently from the bigger models. Two possible conclusions: either the search engine connection is a bottleneck, or this task simply does not need a really big model. The second aligns with my observation that, for basic searches that don't require reasoning, vanilla Perplexity does not perform much worse than Perplexity Copilot and is occasionally on par with it.
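The "arbiter" setup was essentially an LLM-as-judge comparison. The sketch below, using the OpenAI Python SDK, shows the general idea; the prompt wording and scoring scheme are my own illustration rather than the exact script I ran.

```python
# Hedged sketch of a GPT-4o "arbiter": give the judge the question plus two
# anonymized answers from different models and ask which one is better.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def judge(question: str, answer_a: str, answer_b: str) -> str:
    prompt = (
        "You are grading two answers to the same web-search question.\n"
        f"Question: {question}\n\n"
        f"Answer A:\n{answer_a}\n\n"
        f"Answer B:\n{answer_b}\n\n"
        "Reply with exactly 'A', 'B', or 'TIE', then one sentence of justification."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content
```

One practical note: it's worth running each pair twice with the A/B order swapped, since judge models tend to have a position bias.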
On the whole, I have almost completely stopped using Google: I rely on Perplexica, and on Perplexity when I am not at my PC, especially when searching for something like debugging help.