-
👋 On Ubuntu 22.04 with Python 3.8, this works:
from haystack.components.routers import FileTypeRouter
file_type_router = FileTypeRouter(mime_types=["text/plain", "application/pdf", "text/markdown", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"])
path = "MYFILE.docx"
print(file_type_router.run([path]))
>>> {'application/vnd.openxmlformats-officedocument.wordprocessingml.document': [PosixPath('MYFILE.docx')]}
Can you try it on your system? Am I missing something?
-
Hey @jlonge4, @anakin87 and I spoke about this and adding some Would you mind opening a PR for this, @jlonge4? We'll review and integrate it soon after.
-
I have noticed that in a Linux environment, for instance an AWS Lambda (Python 3.12), the FileTypeRouter outputs docx and pptx files (and other Microsoft Office formats) as unclassified unless you first run:
How could we check, at init time, that the MIME types passed to the constructor are registered, and add any that are missing, so that files with legitimate MIME types no longer end up as unclassified?
Any ideas, @anakin87?
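A rough sketch of what such an init-time check might look like, assuming the lookup goes through Python's standard `mimetypes` module. The `FALLBACK_EXTENSIONS` table and the `ensure_mime_types_registered` helper are hypothetical names made up for illustration, not part of Haystack:

```python
import mimetypes

# Hypothetical fallback table (an assumption, not Haystack code): extensions
# for MIME types that minimal Linux images may not register by default.
FALLBACK_EXTENSIONS = {
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": ".docx",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation": ".pptx",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": ".xlsx",
}


def ensure_mime_types_registered(mime_types):
    """Register requested MIME types that the local mimetypes database lacks.

    Returns the list of MIME types that had to be registered.
    """
    added = []
    for mime_type in mime_types:
        extension = FALLBACK_EXTENSIONS.get(mime_type)
        if extension is None:
            # No known fallback extension for this type; leave it alone.
            continue
        guessed, _ = mimetypes.guess_type("file" + extension)
        if guessed != mime_type:
            # The extension is unknown (or mapped elsewhere) on this system.
            mimetypes.add_type(mime_type, extension)
            added.append(mime_type)
    return added
```

Something like this could be called with the `mime_types` list before any files are routed, so that `mimetypes.guess_type("MYFILE.docx")` resolves to the Word MIME type even when the system database does not know it.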