Hi, I am building a Haystack indexing pipeline. As I understand from the documentation, PreProcessors are language-specific: they work well with English and are pretty decent with other languages, but to improve performance I would need to specify the language. Question: I would like to build a CustomLanguageDetector that routes documents into three different PreProcessors, something like this?
But I am getting an error like this one:
Has anyone done the same thing and could help me solve the issue?
Replies: 1 comment 12 replies
Hello! Your approach is 90% correct! There are a couple of things to know.

- `run` takes an argument called `documents`. Mind that it contains a list of documents, so in the body of the function you have to iterate over them.
- `run` can produce output only on a single edge at a time. So if your `documents` list contains docs in different languages, you must discard some or throw an exception. In your case, this might be an issue...

There are other ways to approach this problem, so don't worry yet 😄 First of all I need to know what your pipeline looks like, and how you wrote your documents into the document store. What I have in mind is that you should rather add a `language` metadata tag to …
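
To make the two points about `run` concrete, here is a minimal plain-Python sketch of the routing logic, not actual Haystack code. It assumes a custom node's `run` returns an `(output, edge_name)` pair, uses plain dicts in place of Haystack `Document` objects, and every name in it (`route_by_language`, `toy_detect`, `EDGE_FOR_LANGUAGE`) is hypothetical. The toy detector is only a stand-in for a real language detection library.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mapping from language code to outgoing edge name.
EDGE_FOR_LANGUAGE = {"en": "output_1", "de": "output_2", "fr": "output_3"}


def toy_detect(text: str) -> str:
    """Stand-in detector for this demo only; swap in a real library."""
    lowered = f" {text.lower()} "
    if " der " in lowered or " und " in lowered:
        return "de"
    if " le " in lowered or " et " in lowered:
        return "fr"
    return "en"


def route_by_language(
    documents: List[dict],
    detect: Callable[[str], str],
) -> Tuple[Dict[str, List[dict]], str]:
    """Mimic a routing node's run(): return (output, edge_name)."""
    languages = set()
    for doc in documents:  # `documents` is a list, so iterate over it
        lang = detect(doc["content"])
        doc.setdefault("meta", {})["language"] = lang  # tag each document
        languages.add(lang)

    # Only one edge can be used per call, so a mixed batch is rejected.
    if len(languages) != 1:
        raise ValueError(f"expected exactly one language, got {sorted(languages)}")

    (lang,) = languages
    if lang not in EDGE_FOR_LANGUAGE:
        raise ValueError(f"no edge configured for language {lang!r}")
    return {"documents": documents}, EDGE_FOR_LANGUAGE[lang]


if __name__ == "__main__":
    docs = [{"content": "Der Hund und die Katze"}]
    output, edge = route_by_language(docs, toy_detect)
    print(edge, output["documents"][0]["meta"])
```

Because a mixed-language batch raises here, one way to use this is to call it once per document (or per same-language batch), so each call can pick its single edge.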