Commit

More fixes towards swapped punctuation
XapaJIaMnu committed Aug 8, 2023
1 parent dcaaff0 commit 61f4671
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions opuscleaner/filters/fix_sent_final_punct.py
@@ -17,6 +17,12 @@
# Sometimes two punctuation marks are swapped...
if len(src) >=2 and len(trg) >= 2 and src[-2] == trg[-1] and src[-1] == trg[-2]:
    trg = trg[:-2] + src[-2] + src[-1]
# Sometimes they are swapped with space around eg SPACE». -> .SPACE»
if len(src) >=3 and src[-1] in my_punct and src[-2] == '»' and src[-3] == ' ':
    src = src[:-3] + src[-1] + ' ' + src[-2]
if len(trg) >=3 and trg[-1] in my_punct and trg[-2] == '»' and trg[-3] == ' ':
    trg = trg[:-3] + trg[-1] + ' ' + trg[-2]


# check for the french quotes special case
if (src[-1] == '»' or src[-1] == '«') and trg[-1] not in my_punct:
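
For illustration only, here is a minimal standalone sketch of the rewrite the added lines perform on each side: a sentence ending in SPACE, '»', punctuation becomes punctuation, SPACE, '»'. The helper name move_punct_before_guillemet and the set MY_PUNCT are hypothetical; the filter's real my_punct is defined outside this hunk and may contain different characters.

```python
# Hypothetical stand-in for the filter's `my_punct`; the real definition
# lives outside the hunk shown above and may differ.
MY_PUNCT = {'.', '!', '?'}


def move_punct_before_guillemet(text: str, punct=MY_PUNCT) -> str:
    """Rewrite a trailing ' »<punct>' as '<punct> »'; otherwise return text unchanged."""
    if len(text) >= 3 and text[-1] in punct and text[-2] == '»' and text[-3] == ' ':
        # Same slicing as the commit: drop the last three characters, then
        # re-append them as punctuation, space, closing guillemet.
        return text[:-3] + text[-1] + ' ' + text[-2]
    return text


print(move_punct_before_guillemet('« Bonjour ».'))  # prints: « Bonjour. »
```

In the commit the same check is applied to src and trg independently, so either side can be normalised even when the other already has the expected order.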