Sentence segmentation with 'en_core_web_trf' shows unexpected behavior #13647
Unanswered
igormorgado
asked this question in
Help: Other Questions
Sentence segmentation with the transformer pipeline splits the text in very unusual places. Let me show a small example. The text is the following:
Processing with en_core_web_trf gives the following output:
While using en_core_web_sm (or md or lg) gives the following output:
As expected.
My versions are the following:
How can I improve/correct the segmentation created by the transformer?
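One possible direction, sketched here as an assumption rather than a confirmed fix: in the trained English pipelines, sentence boundaries come from the dependency parser, so the splits follow the parse rather than punctuation. spaCy also ships a rule-based `sentencizer` component, and, as far as I understand, boundaries that are already set before the parser runs are kept by it. The helper name `prefer_rule_based_sentences` below is hypothetical; `sentencizer`, `parser`, `pipe_names` and `add_pipe` are spaCy's own names.

```python
def prefer_rule_based_sentences(nlp):
    """Add spaCy's rule-based "sentencizer" ahead of the parser, if present.

    `nlp` is a loaded spaCy Language object (for example the result of
    spacy.load("en_core_web_trf")). The function name is hypothetical.
    """
    if "sentencizer" in nlp.pipe_names:
        # Already configured, nothing to do.
        return nlp
    if "parser" in nlp.pipe_names:
        # Assumption: the parser keeps sentence boundaries set before it runs.
        nlp.add_pipe("sentencizer", before="parser")
    else:
        # No parser in the pipeline: the sentencizer alone sets the boundaries.
        nlp.add_pipe("sentencizer")
    return nlp


# Usage sketch (assumes spaCy and en_core_web_trf are installed; the text
# is a placeholder, not the text from the question):
#
#   import spacy
#   nlp = prefer_rule_based_sentences(spacy.load("en_core_web_trf"))
#   doc = nlp("This is one sentence. This is another one.")
#   for sent in doc.sents:
#       print(repr(sent.text))
```

The trade-off is that the sentencizer only splits on sentence-final punctuation, so it may do worse than the parser on text without clean punctuation. Whether it helps for the text in question would need to be checked against the outputs above.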