[addition for LocalDocs: recent news/events] ZIP file with +300 PDF news articles between 2024.01.01 and 2024.02.28 #2050
Replies: 3 comments
-
This ZIP will be deleted today and replaced with the current one, updated with news from February 29, 2024 and possibly older items. Please let me know if I should open a dedicated channel (with Nomic, perhaps) for such ZIPs of current events, so that I don't post the links here and clutter the Discussion list. Thank you.
-
I think it is best if you keep these ZIP distributions to a single discussion thread on GitHub. If you want to notify the community when you post new ones, you could mention them on the Discord in #gpt4all-chat - and of course, users can subscribe to a discussion to receive instant notifications when there is an update.
-
Hello
Thank you for the ideas.
Yes, the updates are the issue: how to announce them. Then again, when a "common" user complains that something is wrong with these local docs, who's to know whether all the files in there have actually been used in the RAG or not...
I'll try doing as you said and post it on Discord. In the meantime, I'll try to delete this old post on GitHub; let's see if I can 🙂
-
Documentation
Hello.
I have put together, for use as a LocalDocs collection, some 300 (three hundred) news articles and a few science papers dated between January 1, 2024 and February 28, 2024.
The set grows by the day. To add an item, I open the news page in a browser and use Print with a PDF printer, or simply save the file if it is already a PDF, as the science papers are. I use the set as a LocalDocs collection to ask the LLMs about recent events. For instance, if you are curious about the weather in Detroit, Mistral (at least) will respond using the PDF dated 2024.02.27 about the record-breaking 73 degrees Fahrenheit in February. Or ask what land area has been engulfed by flames in Texas, or who won the Michigan caucuses today, and you'll be told.
News sources: many, including Al Jazeera, Associated Press, Axios, Bloomberg, CNBC, CNN, Daily Mail (what), DNyuz, Fox, France24, Guardian (UK), Medium, NHK, NPR, PBS, Politico, reddit (yay), Semafor, The Sun (whatwhat), Wired, ZeroHedge.
Language of news: English (almost all items); about 10 items are in other languages (Polish, Russian, Spanish), a negligible share.
The files in the ZIP are named as follows:
yyyyMMdd-nameofinfosource-titleofnewsarticle.pdf
for instance,
20240228-CoinTelegraph-OpenAI accuses New York Times of hacking AI models in copyright lawsuit.pdf
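If you want to sort or filter the collection by date or source, the naming pattern above can be split back into its parts. Here is a minimal Python sketch; the function name `parse_news_filename` and the `NewsFile` type are made up for illustration, and it assumes the source name itself contains no hyphen (the title may).

```python
from datetime import date, datetime
from typing import NamedTuple


class NewsFile(NamedTuple):
    """Hypothetical container for the three parts of a file name."""
    published: date
    source: str
    title: str


def parse_news_filename(filename: str) -> NewsFile:
    """Split 'yyyyMMdd-source-title.pdf' into date, source and title."""
    # Drop the .pdf extension if present.
    stem = filename[:-4] if filename.lower().endswith(".pdf") else filename
    # Split on the first two hyphens only: titles may contain hyphens.
    # Assumption: the source name has no hyphen in it.
    date_part, source, title = stem.split("-", 2)
    published = datetime.strptime(date_part, "%Y%m%d").date()
    return NewsFile(published, source, title)


example = (
    "20240228-CoinTelegraph-OpenAI accuses New York Times "
    "of hacking AI models in copyright lawsuit.pdf"
)
info = parse_news_filename(example)
print(info.published.isoformat(), "|", info.source, "|", info.title)
```

A file whose name does not follow the pattern will raise a `ValueError`, which is a quick way to spot items that were saved under a different name.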
Everyone interested can download it from my website (which is old and legit to boot), at the address:
http://sinapsaro.ro/frees/_infos/_localdocs_for_gpt4all_news_2024jan01feb28.zip
It will be deleted on 2024.02.29, at about 17:00 UTC/GMT.
So maybe someone who downloads it would want to share it, too, or use it to train an LLM...
If anyone is interested in other such packages, feel free to let me know: the "archive" goes back to early 2003 (although most of the files until ~2010 are .mht).
A go to give it, you may.
Thank you.