How to pull files from GitHub, convert from Markdown and place in DocumentStore - all without Docker #5379
-
Hey @eric-cooper, I'd give llama-hub a try. Use the GithubRepositoryReader from https://llamahub.ai/l/github_repo to pull the docs. Once you have them, you can feed them into a Haystack pipeline and use MarkdownConverter or any other component from the Haystack ecosystem.
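If you'd rather not add another dependency while prototyping, here is a rough stdlib-only sketch of the "pull the docs" step. To be clear, this is not the llama-hub reader mentioned above; it is a swapped-in alternative that calls GitHub's Git Trees endpoint and raw.githubusercontent.com directly. The function names (`markdown_paths`, `fetch_markdown`) and the default branch `main` are my own assumptions, and unauthenticated API requests are rate-limited, so treat it as a starting point only.

```python
import json
import urllib.parse
import urllib.request

# Public GitHub endpoints: the Git Trees API lists files, raw.githubusercontent.com serves them.
TREE_URL = "https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"
RAW_URL = "https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}"


def markdown_paths(tree):
    """Return the paths of Markdown files found in a Git Trees API response."""
    return [
        entry["path"]
        for entry in tree.get("tree", [])
        if entry.get("type") == "blob"
        and entry["path"].lower().endswith((".md", ".markdown"))
    ]


def fetch_markdown(owner, repo, branch="main"):
    """Download every Markdown file in a public repo; returns {path: text}.

    Hypothetical helper: no authentication, no retries, no pagination handling.
    """
    url = TREE_URL.format(owner=owner, repo=repo, branch=branch)
    with urllib.request.urlopen(url) as resp:
        tree = json.load(resp)

    files = {}
    for path in markdown_paths(tree):
        raw = RAW_URL.format(
            owner=owner, repo=repo, branch=branch, path=urllib.parse.quote(path)
        )
        with urllib.request.urlopen(raw) as resp:
            files[path] = resp.read().decode("utf-8")
    return files
```

Calling `fetch_markdown("some-owner", "some-repo")` (placeholder names) should give you a dict of Markdown strings that you can write to disk or pass on to whatever converter you choose.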
-
Absolutely. If you want to pull files from a GitHub repo, convert them from Markdown, and insert them into a DocumentStore without Docker, you're on the right track. Consider giving Crawlbase a try: it offers straightforward crawling capabilities, which makes it a reasonable fit here. You can use the MarkdownConverter class to handle the conversion. Once the files are converted, chunk the text and insert it into your DocumentStore.
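To make the "convert, chunk, insert" steps concrete, here is a minimal sketch. The regex-based `markdown_to_text` is only a crude stand-in for MarkdownConverter, and `chunk_words`, `to_documents`, and `write_to_store` are hypothetical helper names. The last function assumes Haystack 1.x, where (as far as I know) `InMemoryDocumentStore.write_documents` accepts dicts with `content` and `meta` keys; check that against your installed version.

```python
import re


def markdown_to_text(md):
    """Crude Markdown-to-text conversion (a stand-in, not Haystack's MarkdownConverter)."""
    text = re.sub(r"```.*?```", "", md, flags=re.DOTALL)            # drop fenced code blocks
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)           # images -> alt text
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)            # links -> link text
    text = re.sub(r"^#{1,6}[ \t]+", "", text, flags=re.MULTILINE)   # strip heading markers
    text = re.sub(r"[*`]", "", text)                                # strip emphasis/code marks
    return re.sub(r"\n{3,}", "\n\n", text).strip()


def chunk_words(text, size=200, overlap=20):
    """Split text into chunks of `size` words, sharing `overlap` words between neighbours."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), size - overlap):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):  # this chunk already reaches the end
            break
    return chunks


def to_documents(files, size=200, overlap=20):
    """Turn {path: markdown} into a list of {"content", "meta"} dicts, one per chunk."""
    docs = []
    for path, md in files.items():
        for n, chunk in enumerate(chunk_words(markdown_to_text(md), size, overlap)):
            docs.append({"content": chunk, "meta": {"path": path, "chunk": n}})
    return docs


def write_to_store(docs):
    """Assumption: Haystack 1.x import path and dict-based write_documents."""
    from haystack.document_stores import InMemoryDocumentStore

    store = InMemoryDocumentStore()  # runs in-process, so no Docker is needed
    store.write_documents(docs)
    return store
```

Word-based chunking with a small overlap is just one simple choice; Haystack also ships its own preprocessing components, which are probably a better fit once you move past the prototype.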
-
The title kind of says it all. I am new to Haystack and looking for pointers/examples on how to
I would like to reduce complexity during initial prototyping, so I'd like to avoid Docker for now.
Thanks in advance