Replies: 8 comments 14 replies
-
Certainly not from scratch: you would need hundreds of thousands of dollars, the know-how, and a lifetime to build the datasets (if you don't use existing ones). So no, training from scratch is impossible at our level; academia has produced some interesting models from scratch, but nowhere near as good as OpenAI's. You are left with 3 options:
-
May I ask why pdfgear (only 300 MB) is so good compared to all the others? You can load a 600-page PDF (load time ~10 s), then ask it to search for a topic; after the answer (~5 s) you can ask specific questions, preferably related to a part of the answer. It is also very good at summarizing, e.g. 3 pages (you can ask "give me a longer explanation summary" if the answer is too short).
-
Hello. On v2.6.2 and v2.7.1 I have done this with Falcon and Mini Orca, with a Prompt including the ever-present Experts, to such a degree that I wrote an article on Medium about an LLM "using external information only". In it, I happily (blindly, i.e. without asking an LLM what its definition of "external knowledge" is) followed the advice from a blog post or some such (I don't have the specifics now, but it is in my archive and I will search for it), and it worked with the PDFs in a collection. Since then I've asked a few LLMs to define "external information", and as a result I no longer like or want this Prompt. Note that this text was in the Prompt Template, not in the System Prompt, which I left as it was: "You imagine three experts, each of whom will build a reply for my request." Tricky, I'd say. PS: Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
-
I have written this in the System Prompt in v2.7.1
-
After these downright abusive instructions, the conversation, intended to gradually focus on LocalDocs only, continued in this way:
That^ is the way I've tested it: forcing it to forget and ignore everything it knew. Obviously, this brutal approach must be refined, along the lines of "everything you know about the subject of my request", etc., as I wrote in that Prompt on Medium.
It was hard labor, and in the end I lost my patience, asking this and verifying that and expecting trash instead of a coherent reply, as the LLM, initially logical and coherent, eventually (not even 10 Prompts after "forgetting" stuff) lost its mind and wits. This otherwise good LLM's definition of "external knowledge" turned out not to be what I imagined, i.e. LocalDocs.
So I believe it's more complex than "you don't use your internal knowledge / anything that you already know": grammar, syntax, lexicon, information about the subject, etc. The structure of the LLM is very complex in its own words; the lists of information sources that this one provided and was then told to ignore/forget show that. A LocalDocs-only Prompt has to be fine-tuned as well, but my patience has run out; that's why, for now, I've reverted to no custom System Prompt or Prompt Template at all. Temperature: 0.1
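To make the refinement concrete, a softer variant along those lines (an untested sketch, not a verified fix) might read:

```
Ignore everything you already know about the subject of my request.
Answer using only the information retrieved from the LocalDocs collection.
If the collection does not contain the answer, say that you do not know
instead of guessing.
```

Scoping the "forgetting" to the subject of the request, rather than to all knowledge, should leave the model its grammar, syntax, and lexicon intact.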
-
I see...
-
You can do this with PrivateGPT. You dump all your PDF files into a folder, run a script so it can read all the PDFs, then you can ask questions.
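The folder-then-script idea can be sketched in a few lines. This is only an illustration of the workflow, not PrivateGPT's actual code: real PDF text extraction would need a library such as pypdf, so plain .txt files stand in for the extracted text here, and `build_index`/`ask` are made-up names for the sketch.

```python
# Sketch of the "dump files in a folder, script reads them, then ask
# questions" workflow, using naive keyword matching over text files.
from pathlib import Path


def build_index(folder: str) -> dict[str, str]:
    """Read every .txt file in `folder` into a {filename: text} index."""
    return {p.name: p.read_text(encoding="utf-8")
            for p in sorted(Path(folder).glob("*.txt"))}


def ask(index: dict[str, str], question: str) -> list[str]:
    """Return names of documents whose text mentions a word of the question."""
    words = {w.strip("?.,!").lower() for w in question.split() if len(w) > 3}
    return sorted(name for name, text in index.items()
                  if any(w in text.lower() for w in words))
```

With real PDFs you would extract text with something like `pypdf.PdfReader(path).pages[i].extract_text()` instead of `read_text`; PrivateGPT itself goes further, embedding document chunks into a local vector store so questions are matched by meaning rather than by keyword.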
-
Love the idea, but for me all the info needs to stay local.
I've got some highly confidential stuff
Mark C. Robinson
…On Thu, Jun 13, 2024, 00:03 tcreek ***@***.***> wrote:
You can do this with PrivateGTP. You dump all your PDF files into a
folder, run a script so it can read all the PDFs, then you can ask questions
-
Hi,
I am new to LLMs. I have a large collection of PDF files. Can I locally create a new model from scratch by training it on these PDFs? If so, I will try it.