Fix newline in Bliss corpus orth
The code's actual behavior does not clearly match the explanatory comment a few lines above it.

To preserve the word boundaries in the writing, each newline should be replaced by a space rather than by an empty string. Example:

```
HELLO
WORLD
```

Current behavior: `HELLOWORLD`.

Proposed behavior, consistent with how a human reads the text: `HELLO WORLD`.
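The difference can be reproduced outside the parser with a minimal sketch. The helper names `normalize_old` and `normalize_new` are hypothetical and exist only for this illustration; they copy the three normalization steps from the changed hunk and are not functions in `lib/corpus.py`.

```python
import re


def normalize_old(chars: str) -> str:
    """Hypothetical helper: the normalization steps before this commit."""
    text = chars.strip()
    text = re.sub(" +", " ", text)  # collapse runs of spaces
    text = re.sub("\n", "", text)   # newline dropped: adjacent words are joined
    return text


def normalize_new(chars: str) -> str:
    """Hypothetical helper: the normalization steps after this commit."""
    text = chars.strip()
    text = re.sub(" +", " ", text)  # collapse runs of spaces
    text = re.sub("\n", " ", text)  # newline becomes a space: words stay apart
    return text


if __name__ == "__main__":
    sample = "HELLO\nWORLD"
    print(repr(normalize_old(sample)))
    print(repr(normalize_new(sample)))
```

Only the replacement string in the last `re.sub` call differs between the two helpers, which mirrors the one-line change in this commit.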
Icemole authored Dec 3, 2024
1 parent 23704ff commit a9b46d4
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion lib/corpus.py
@@ -123,7 +123,7 @@ def endElement(self, name: str):
 # writing, thus we remove multiple spaces and newlines
 text = self.chars.strip()
 text = re.sub(" +", " ", text)
-text = re.sub("\n", "", text)
+text = re.sub("\n", " ", text)
 e.orth = text
 elif isinstance(e, Speaker) and name != "speaker-description":
 # we allow all sorts of elements within a speaker description
