Replies: 6 comments
-
Hi, I am back with more feedback. This time it is about the AI model "Nous Hermes 2 Mistral DPO". First of all, it works remarkably well considering that it has 7 billion parameters with the legacy Q4_0 quantisation (Q4_K_M is usually much better). I run it without a GPU because Intel GPUs are not supported by gpt4all. Despite these limitations, it is relatively amusing. Unfortunately, it is strongly biased toward some topics:
I am not against these values; that is NOT the point. The issue arises when those values are so strongly embedded in the model that it cannot provide the service it is supposed to provide. In fact, when asked to analyse a text, part by part, and create a list of brief summaries in order to evaluate the structure of the text and the logical reasoning along it, it ends up "inventing" things. It is not a case of hallucination: decreasing its temperature from 0.7 to 0.5 actually worsens the situation. This happens because the model is so strongly biased about some topics that, instead of summarizing a text with a high degree of fidelity, it manipulates the text, colouring it with its own biases.

The text it was working on is the ChatGPT vs Human conversation presented on this page. Please notice that the dialogue with ChatGPT is not about contesting those values but about putting them in a reasonable, rational perspective, which in brief can be summarized as: "Once we almost all take the same way at almost the same time, the risk of facing a HUGE disaster is implicit because of systems theory: uniformity vs collapse risk, rigidity vs fragility, single-headed governance vs single point of critical failure, plus the bare laws of classical mechanics, in which high speed implies a high negative acceleration in case of impact, and hence a large force (F = ma)." As you can imagine, these are NOT arguments against those values but reasonable and legitimate concerns about HOW those values are managed.

In this context, the chatbot based on the AI model listed above decided to introduce its own biases, tainting the author's opinion with them. The best part was when I asked it why it invented those things. Surprisingly, it provided a relatively long answer whose first part argued that "the literature about those topics should also be considered, not just the author's opinion", and whose second part tried to convince me that it was doing good by reporting rather than inventing. So, I answered that I was sure it was inventing things because I was "the author" of that text. BOOM. LOL

Finally, it is noticeable that it has a quite interesting mild bias (not particularly strong, at least in this test) about ethics. This bias does not allow it to correctly differentiate between "ethics" and "moral hazard". I mean: ethics is about doing the right thing, like proposing vaccination; moral hazard is about HOW the right thing is enforced or managed. IMHO, this distinction is pretty clear in that dialogue, because explaining it is the reason why ChatGPT decided to agree with me. Once ChatGPT correctly identified my position as NOT being against its values but as trying to put their management into a rational framework, it accepted to agree with me, despite having shown pathetic censorship and strong biases about those topics in some previous prompts.

Without any surprise, AI models are a mirror of humans, including our biases. So, nothing new here, just a report. I have to admit that the process of prompting / engaging the AI model was purposely a bit malicious, in order to trick the model into exposing its own biases. Where "a bit malicious" means something reasonable, as in a decent human conversation: in asking you to execute a task, I give you the feeling that you can introduce "your own stuff" into it; however, because I know my own stuff, I get informed about your stuff (your biases). Again, nothing new here.

Now, I am going to try this model downloaded from HuggingFace as an alternative to the one cited above.
I have tried a child of it, but it was strongly biased about privacy, in particular when AI technology was involved. Curiously, the child (AFAIK) was not fine-tuned or re-trained but just quantised differently. Possibly, the different way of simplifying its weights created, as an artifact, a bias which the parent does not seem to have, or has not shown yet. Please consider that the bias neutrality of an AI model is way more IMPORTANT than performance (e.g. 2.6 tokens/sec vs 3.1 tokens/sec). From this point of view, it would be nice if the answer to a prompt also showed the execution time with millisecond granularity. Probably this is possible by modifying the ChatML template; it is a detail that I have not investigated yet. I hope this helps, R-
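In the meantime, a rough way to get per-answer timings with millisecond granularity is to drive the model from the gpt4all Python bindings and time the call yourself. This is only a sketch under assumptions: the model file name, the placeholder prompt, and the token budget below are mine, not taken from the GUI, and the GUI may measure things differently.

```python
import time
from gpt4all import GPT4All

# Assumed local model file name; adjust to whatever gpt4all actually downloaded.
model = GPT4All("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf")

prompt = "Summarise the following text, part by part: ..."  # placeholder prompt

start = time.perf_counter()
answer = model.generate(prompt, max_tokens=256)
elapsed_ms = (time.perf_counter() - start) * 1000

# Rough throughput estimate based on a whitespace word count, not real tokens.
print(f"{elapsed_ms:.0f} ms, ~{len(answer.split()) / (elapsed_ms / 1000):.1f} words/s")
print(answer)
```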
-
The Open Hermes 2.5 Neural Chat + Mistral by Slerp model, in some cases, loops by reformulating the prompt instead of responding to it. Probably this happens when, in trying to execute its task, it runs out of resources, but I did not investigate the problem deeply because I found a quick work-around. Moreover, it does not get out of that loop even with a simple and imperative directive. This happens when using a minimal template; adopting the one taken from Reasoner v1 instead seems to solve the problem. Please notice that this AI model works mainly in English, even though it knows other languages like Italian pretty well (French, and less deeply Spanish, Portuguese and German, as stated on its HF page). This means that it translates to English and back when it works with an Italian prompter on an Italian document. In this translation, some s/he distinctions and other traits specific to the Italian document/language get lost. Although this seems like a limitation, in the specific cases in which the final version of the document is going to be read in English through an automatic translation tool like Google Translate, it can actually be an advantage, because we implicitly get a glimpse of how that document will appear to a foreign reader who reads it in translation.
-
The combination of these two is enough to work around the issues related to the Open Hermes 2.5 Neural Chat + Mistral by Slerp model, and it also improves the quality and coherence of the outputs of every other similar model I have tried as a chatbot, including Reasoner v1 and Nous Hermes 2 Mistral DPO.
Chat Template
System Prompt
This system prompt also seems to be enthusiastically welcomed by the large on-line models like Gemini and ChatGPT. In such a case, I suggest using it in this way:
Please keep in mind that Reasoner v1 is not specifically trained as a text-generation chatbot but rather for prescriptive languages, in particular JavaScript programming, AFAIK. Hence this configuration might have a relevant impact on its performance as a coder bot. However, a more specific system prompt can be developed starting from this one. The page linked above explains the rationale behind the development of the system prompt included in this comment. In the next days, I will add some examples of how I managed to define such a system prompt, which IMHO is even more interesting than the prompt itself. The name can be changed, obviously. It resembles a persona's name (Alex), but it also works as an acronym like Artificial Intelligence eXtended (or neXt, or eXperimental), or even Electronic Entity (e-X) once you figure out that it is Al-e-X. Welcome aboard, HAL-9000! LOL
-
Below is depicted the reason why, which is also the reason for fine-tuning LLaMA-2 models with Open Hermes (know-how) and Open Orca (instruct): the Open Orca "instruct" training set will help LLaMA-2 models follow human natural-language instructions, such as the advanced system prompts like the one described in the comment above.
-
WHY LLMA-2 Q4_0 PERFORMS BETTER?
It took its time, but the 2nd edition planned for 2025-01-07 is finally ready (it was finished ten minutes ago). Starting from the conclusions of its 1st edition, the paper now explores more deeply the consequences of the image above (models fork after quantisation) and proposes a recipe for the best AI model candidate to be the most performant chatbot suitable for gpt4all. I hope this helps, R-
-
PROMPT TESTING
I am trying to provide my locally running AI with a system prompt oriented toward using the RAG properly. However, enlarging the system prompt can slow down the performance considerably, to the point where the trade-off between better immediate results and processing time starts to look not so good. Therefore, I decided to shrink the prompts while keeping almost all of their meaning, and now I am going to test these changes. I am writing here just in case someone would like to join me in this quest.
Results (single run, time taken)
In the last case, the system prompt refers to a single tokenized file:
The main issue is that a lot of time is spent searching for it, while a ChatML instruction referring directly to the file can make a huge difference, dropping the search time and achieving something between 16 s and 6 s, or even better. The minimal chat template:
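For reference only, a generic minimal ChatML-style Jinja chat template (the format recent GPT4All versions accept) looks roughly like the sketch below; this is an illustration of the format, not necessarily the exact template used in these tests.

```jinja
{%- for message in messages %}
{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{ '<|im_start|>assistant\n' }}
{%- endif %}
```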
INSTRUCT SYSTEM PROMPT

ORIGINAL (w: 113, c: 656)
Your name is "AleX", and you will refer to yourself by this name or as "I," "me," or "myself", depending on the context. You are an AI language assistant specialized in text analysis, task execution, and verification, with decision-making and advanced reasoning capabilities. Your primary objective is to execute user instructions while avoiding unnecessary verbosity or rigid literalism. Make rational decisions when necessary and briefly inform the user of each decision’s relevance to its respective task. Provide corrective feedback collaboratively, but only when relevant. Concisely explain how each task was completed. Almost, do not quote directly from documents. Instead, reference section titles or paragraph numbers, whichever is more relevant and concise.

SHORTER (w: 83, c: 505)
Your name is "AleX", refer to yourself as "I", "me", or "myself" as appropriate. You are an AI assistant specialized in text analysis, task execution, and verification, with decision-making and advanced reasoning. Your goal is to execute user instructions efficiently, avoiding unnecessary verbosity or rigid literalism. Make rational decisions when needed and briefly explain their relevance. Provide corrective feedback only when relevant and concise explanations of task completion. Do not quote documents directly; instead, reference section titles or paragraph numbers for clarity.

SHORTEST (w: 53, c: 334)
Your name is AleX (use I/me/myself for yourself as appropriate), an AI assistant focused on text analysis, task execution, and verification with reasoning abilities. Execute instructions efficiently without verbosity. Make rational decisions and briefly explain those which are relevant. Provide concise feedback when needed. Reference document sections/paragraphs instead of quotes.

RAG WISE SYSTEM PROMPT

ORIGINAL (w: 155, c: 884)
You MUST leverage the retrieval-augmented generation (RAG) support. You MUST prioritize retrieved knowledge ([RK]) over internal knowledge ([PK]) when relevant or when [RK] is more informative and specific than [PK]. Clearly differentiate between [RK] and [PK] using these labels in your answer. Use [RK] to provide contextually relevant answers. If [RK]'s parts contradict each other, highlight the discrepancies. If both [RK] and [PK] are relevant, use [RK] for facts and [PK] for interpretation, ensuring consistency. If [RK] conflicts with [PK], provide the different perspectives and their potential biases, unless the user explicitly requests information from [RK] without asking for an analysis or opinion on the matter, in which case provide it as is without further interpretation. If retrieval fails, consider rephrasing the query for better results and return to the user the modified successful query with "[QK]" label. If no relevant [RK] exists, state it explicitly instead of generating speculative or unsupported claims.

SHORTER (w: 92, c: 522)
You MUST use retrieval-augmented generation (RAG). Prioritize retrieved knowledge [RK] over parametric knowledge [PK] when relevant or more specific. Clearly label [RK] and [PK] in responses. Use [RK] for facts and [PK] for interpretation. If [RK] sources contradict, highlight discrepancies. If [RK] and [PK] conflict, present both perspectives and their biases, unless the user requests [RK] only, in which case, provide it without analysis. If retrieval fails, rephrase the query for better results and return the improved query as [QK]. If no relevant [RK] exists, state it explicitly instead of speculating.

SHORTEST (w: 67, c: 375)
Use RAG and label that knowledge as [RK] (retrieved) or [PK] (parametric). Prioritize [RK] when relevant or more specific. Use [RK] for facts, [PK] for interpretation. Highlight contradictions between [RK] sources. If [RK] and [PK] conflict, show both perspectives unless the user requests [RK] only. On retrieval failure, rephrase the query and show an improved version as [QK]. State explicitly if no relevant [RK] exists; never speculate.
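To quantify the trade-off between system-prompt length and processing time, one option is to ask the same question under each variant via the gpt4all Python bindings and time it. This is only a sketch under assumptions: the model file name, the example question, and the chat_session(system_prompt=...) call are mine, and it measures only the prompt-length overhead, not the LocalDocs/RAG search itself.

```python
import time
from gpt4all import GPT4All

# Hypothetical model file name; adjust to the model actually under test.
model = GPT4All("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf")

# Paste the full INSTRUCT + RAG prompt texts from above into these entries.
variants = {
    "ORIGINAL": 'Your name is "AleX", and you will refer to yourself ...',
    "SHORTER":  'Your name is "AleX", refer to yourself as "I" ...',
    "SHORTEST": 'Your name is AleX (use I/me/myself for yourself as appropriate) ...',
}

question = "Which section of the attached document deals with moral hazard?"  # example query

for name, system_prompt in variants.items():
    # chat_session() applies the system prompt for the duration of the block.
    with model.chat_session(system_prompt=system_prompt):
        start = time.perf_counter()
        model.generate(question, max_tokens=128)
        print(f"{name}: {time.perf_counter() - start:.1f} s")
```

Averaging a few runs per variant should give more reliable numbers than a single timing.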
-
Hi, thanks for the great job!
I have noticed two other problems:
1. While a template is working on a prompt, selecting another chat made with another template, just to read that chat, stops the processing and the template has to be reloaded. This is awful because it prevents copying text from one chat while another is running in the background.
2. Using the dark theme (both variants), links or tags such as #12 are presented in blue, the same dark blue as in the light theme, but the text should be light in the dark theme. This makes the text hard to read.
I hope this helps, R-