Releases: INL/int-pie
1.0.0
0.9.0
Changelog
- Logging info is displayed again in Python 3.10
- Freezes the pip requirements
- Fixes empty pos and lemma values not being registered as tasks (their \t separators are still taken into account)
- The attentional decoder now truncates its output to the length of the longest token seen in training, instead of a hard limit of 20 characters. Truncation still occurs if the lemma to be produced is longer than any lemma seen in the training set.
- Doesn't tokenize word boundaries on the following 4 punctuation marks: [ ] ' -
- Previously, PIE would split on spaces and then perform an extra split on punctuation. The following cases are now excluded from that extra split and each kept as 1 token: 't, Oorlogh-schepen, ghega[e]n
- Doesn't tokenize word boundaries for any punctuation mark that occurs in a number. E.g.: €1,50
- Uses the decoder optimizations by PaPie.
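
The tokenization rules in this changelog can be illustrated with a minimal sketch. This is not the code used by int-pie. The function name `tokenize`, the sets `KEEP` and `SPLIT`, the use of ASCII punctuation only, and the reading of "occurs in a number" as "sits between two digits" are all assumptions made for illustration.

```python
import string

# Marks that never trigger a split, as listed in the changelog: [ ] ' -
KEEP = set("[]'-")
# Assumption: "punctuation" means ASCII punctuation minus the kept marks.
SPLIT = set(string.punctuation) - KEEP


def tokenize(text):
    """Split on whitespace, then split off punctuation, with two exceptions:
    the marks in KEEP, and punctuation flanked by digits (e.g. 1,50)."""
    tokens = []
    for chunk in text.split():
        current = ""
        for i, ch in enumerate(chunk):
            # Assumption: a mark is "in a number" when both neighbours are digits.
            in_number = (
                0 < i < len(chunk) - 1
                and chunk[i - 1].isdigit()
                and chunk[i + 1].isdigit()
            )
            if ch in SPLIT and not in_number:
                # Flush the token built so far, then emit the mark on its own.
                if current:
                    tokens.append(current)
                    current = ""
                tokens.append(ch)
            else:
                current += ch
        if current:
            tokens.append(current)
    return tokens


print(tokenize("'t Oorlogh-schepen ghega[e]n, €1,50."))
# ["'t", 'Oorlogh-schepen', 'ghega[e]n', ',', '€1,50', '.']
```

In this sketch the three examples from the changelog and the amount €1,50 each stay a single token, while the trailing comma and full stop are still split off.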