Inconsistent translation result with glossaries #16

Open
boeryepes opened this issue Jun 20, 2022 · 2 comments

Comments

@boeryepes

Hi, first of all I'm really happy with the DeepL translation and the API. Recently I've started using the glossary function and the results are inconsistent. As I am using your API (v1.2) I assume the issue is not on my side.

I've attached a screenshot of the glossary (EN-DE), the source text in ENGLISH, and the translation to GERMAN with and without use of the glossary (with glossary using a PRO account, without glossary using a FREE account). The ID of the glossary is a0f05fa1-c5b5-45b7-824c-222a540168c1.

In yellow I have highlighted where the glossary has not been applied, and in red the weird behavior where changing the first character of the ENGLISH source to a capital affects the translation to GERMAN; at least that translation is closer to the glossary (though not fully matching it).
[Screenshot: Inconsistent application of glossary]

Any clue?

regards, Klaas

@daniel-jones-deepl
Member

Hi Klaas, good to hear :) Thanks for creating this issue, and thanks for all the details.

I reproduced some of the cases you reported, so I can confirm your usage is correct; I reported these cases to the responsible team at DeepL. Unfortunately I don’t have any fixes at the moment.

There are a couple of tips that might help with glossaries in general, though. Firstly, our glossaries (and translations in general) tend to perform better with longer sentences, due to the added context. Usually, though, if the translation input matches a glossary source term, we'd expect the translation output to match the glossary target term.
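The expectation described above (source term in the input implies target term in the output) can be checked mechanically. The following is only a rough sketch in plain Python and does not call the DeepL API: the function name `find_unapplied_terms`, the glossary entries, and the sample sentences are all made up for illustration and are not taken from the glossary in this issue.

```python
from typing import Dict, List, Tuple


def find_unapplied_terms(
    source_text: str, translated_text: str, glossary: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return glossary entries whose source term occurs in the input
    but whose target term does not occur in the translation.

    Matching is a case-insensitive substring test, so inflected target
    forms (plurals, declined adjectives) are reported as missing too.
    """
    source_folded = source_text.casefold()
    output_folded = translated_text.casefold()
    missing = []
    for source_term, target_term in glossary.items():
        in_source = source_term.casefold() in source_folded
        in_output = target_term.casefold() in output_folded
        if in_source and not in_output:
            missing.append((source_term, target_term))
    return missing


# Hypothetical glossary and sentences, for illustration only.
glossary = {"vehicle block": "Umlauf", "shift": "Dienst"}
source = "Assign the shift to a vehicle block."
translation = "Weisen Sie die Schicht einem Umlauf zu."

print(find_unapplied_terms(source, translation, glossary))
# [('shift', 'Dienst')]
```

A check like this only flags candidates for manual review; because it uses substring matching, it cannot tell a legitimately inflected target term from one that was ignored.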

Secondly, you might get better results by using natural capitalisation in glossary terms, i.e. the capitalisation that is most natural for the language, e.g. “to read Shakespeare” -> “Shakespeare lesen”, “red phone” -> “rotes Telefon”.
For your glossary that means some English terms should be lowercased: “vehicle block”, “vehicle rotation”, “vehicle schedules”, “shift”, etc. I tested these changes in your case as well; unfortunately they don't fix the issue. We're looking into it.

@boeryepes
Author

boeryepes commented Jun 20, 2022

Thanks Daniel, happy to contribute. With regard to natural capitalization, you are opening a can of worms, as the Americans would say ... capitalization can be intentional by the author and influenced by the type of document. For instance, if the author is a client whom we cannot influence (e.g. one who has issued a request for proposal) and/or the document is a legal document, then capitals can have a meaning. In general the translation should not depend on additional interpretations in the target language unless it is 'natural' in that language to depend on capitalization. For instance, in German nouns are capitalized.

And indeed, the glossary should be taken literally, but even there DeepL has an issue. The glossary target term is 'Umlaufplan' but DeepL translates to 'umlaufpläne', which in German should always be written with a capital 'U'.
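A case problem like this one can be spotted automatically. Below is a rough sketch, assuming that stripping diacritics is enough to relate an umlauted plural such as 'umlaufpläne' to the target term 'Umlaufplan'; that is a crude heuristic, not proper German morphology. The helper names `fold` and `wrong_case_variants` and the sample sentence are hypothetical, and nothing here calls the DeepL API.

```python
import re
import unicodedata
from typing import List


def fold(text: str) -> str:
    """Casefold and strip diacritics, e.g. 'Umlaufpläne' -> 'umlaufplane'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def wrong_case_variants(translated_text: str, target_term: str) -> List[str]:
    """Return words that loosely match target_term (same folded prefix)
    but whose first character differs from the target term's first character."""
    # Normalise to NFC so that umlauts stay inside a single \w+ match.
    text = unicodedata.normalize("NFC", translated_text)
    folded_target = fold(target_term)
    hits = []
    for word in re.findall(r"\w+", text):
        if fold(word).startswith(folded_target) and word[0] != target_term[0]:
            hits.append(word)
    return hits


# Hypothetical sentence containing the lowercase form reported above.
print(wrong_case_variants("Die umlaufpläne sind fertig.", "Umlaufplan"))
# ['umlaufpläne']
```

The prefix match is intentionally loose, so it will also flag unrelated words that merely begin with the same letters; the results are hints for a reviewer rather than a verdict.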
