Rosetta and TMs (translation memory) #249
Comments
Hi Peter!
That's correct: Rosetta's main task is offering a user-friendly interface to interact (read/write)
Yes, but as every "project" only manages its own gettext catalog, if you manage different projects, then you cannot access translations you've already provided in other projects. From my limited understanding of what a TM is, such a tool should maintain a database of all the corpus a translator has ever produced, so that when a new string needs to be translated, the tool will provide suggestions based on some possibly fuzzy match on the database. So if that's what you're expecting, then no, Rosetta doesn't provide that kind of feature at the moment, because again: the only datasource is the po catalog itself.
This is provided through a series of interfaces to online translation services, such as Google Translate, Bing Translate, Yandex Translate and such. But a professional translator will probably frown upon these services and rather prefer their own TM corpus. 🤷🏼‍♂️
No, you can download the PO catalog for the current project, but that's it.
Ah well, now: if any such thing exists and is well documented (if there is a catalog to upload and/or an API to query) then I don't see why that wouldn't be doable. Hope this helps, further discussion and PRs welcome 😉
Rosetta is simply a Django app that processes PO files and compiles their corresponding MO files. Rosetta does not impose any way of pre- or post-processing your files. You could emulate a translation memory using a PO file compendium. But keep in mind that this process of discovering new fuzzy matches is not managed by Rosetta, but by the scripts written by the developer of the project. I understand the lack of use of Rosetta in the translation industry, because, for example, if you need to go back to translations that were removed from the files, these will not be found in a separate database. If I'm not wrong, you are asking for a PO file compendium that could be added as another PO file of the project and a button (or whatever other system) that could discover new matches, then another button that could download PO files in different formats. Is this correct?
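To make the compendium idea concrete, here is a minimal sketch in Python. It assumes each catalog has already been loaded into a plain dict mapping msgid to msgstr; the function name `build_compendium` and the sample catalogs are hypothetical and not part of Rosetta. The "first translation wins" policy is an assumption chosen for illustration (GNU gettext's `msgcat --use-first` offers a comparable policy for real PO files).

```python
def build_compendium(catalogs):
    """Merge several catalogs into one compendium.

    Each catalog is assumed to be a dict of msgid -> msgstr.
    Untranslated entries (empty msgstr) are skipped, and the first
    non-empty translation seen for a msgid is kept.
    """
    compendium = {}
    for catalog in catalogs:
        for msgid, msgstr in catalog.items():
            if msgstr and msgid not in compendium:
                compendium[msgid] = msgstr
    return compendium


# Two hypothetical project catalogs (English -> German).
shop = {"Save": "Speichern", "Cancel": "Abbrechen", "Delete": ""}
blog = {"Save": "Sichern", "Publish": "Veröffentlichen"}

compendium = build_compendium([shop, blog])
```

Here "Save" keeps the translation from the first catalog, and the untranslated "Delete" entry is left out entirely.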
Alright, so a PO file compendium, which is a concept of GNU gettext, corresponds to what other translation tools maintain as a TM?
Exactly. Basically, I want to satisfy the expectations of translation agencies. They can
According to the Transifex docs, downloading a TMX is possible. I wouldn't be surprised if that was actually a PO file compendium converted to XML. (You need a paid plan to do this, from what I can see on Transifex.) As for Rosetta, in theory, the simplest approach (as a concept) might be
That compendium could then be used to allow for automatic pre-translation or assisted translation (suggestions). It would be an automatic, fully integrated TM that doesn't need any separate management effort by the user. Offering a TMX download could be an optional feature. Would that be realistic?
Having a compendium is only the first step. We'd also need an intelligent way of matching past translations from the compendium and producing fuzzy suggestions in the PO catalog being translated.
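As a rough illustration of what such matching could look like, the sketch below uses Python's standard `difflib` module to rank compendium entries by string similarity. This is only one possible approach under stated assumptions: the compendium is a plain dict of msgid to msgstr, and the `suggest` helper, the cutoff of 0.6 and the sample data are hypothetical. A real TM would likely use a more translation-aware similarity measure.

```python
import difflib


def suggest(msgid, compendium, limit=3, cutoff=0.6):
    """Return (source, translation, score) tuples for the closest entries.

    The score is difflib's similarity ratio between 0.0 and 1.0,
    rounded to two decimals.
    """
    matches = difflib.get_close_matches(
        msgid, list(compendium), n=limit, cutoff=cutoff
    )
    return [
        (
            match,
            compendium[match],
            round(difflib.SequenceMatcher(None, msgid, match).ratio(), 2),
        )
        for match in matches
    ]


# Hypothetical compendium (English -> German).
compendium = {
    "Save changes": "Änderungen speichern",
    "Delete file": "Datei löschen",
}

# A new, slightly different string that has no exact match yet.
suggestions = suggest("Save change", compendium)
```

The first suggestion is the "Save changes" entry, which a front-end could then offer to the translator as a fuzzy match to review.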
True. If we had that, though, we could address one side of the criticism already: "It doesn't have a TM" would cease to be true. And converting a PO file compendium to TMX seems to be a thing that is already addressed by free projects. Just saying.
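For a sense of how small the conversion step is, here is a hedged sketch that writes a compendium as a TMX 1.4 style document using only Python's standard library. It again assumes a dict of msgid to msgstr; the function name `compendium_to_tmx` and the header values (for example the `creationtool` name) are placeholders made up for illustration. An existing, tested converter would be the better choice in practice.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this namespaced attribute as "xml:lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def compendium_to_tmx(compendium, source_lang, target_lang):
    """Serialize a msgid -> msgstr dict as a TMX document string."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "compendium-export",  # hypothetical tool name
        "creationtoolversion": "0.1",         # hypothetical version
        "segtype": "sentence",
        "o-tmf": "po",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for msgid, msgstr in compendium.items():
        # One translation unit per entry, with a source and a target variant.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, msgid), (target_lang, msgstr)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


tmx_text = compendium_to_tmx({"Save": "Speichern"}, "en", "de")
```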
The Translate Toolkit seems to be a very good candidate to manage, import, export and possibly search TMX documents in Python.
Hi there!
We're using Rosetta 0.9.4 on Django 2.2.17, and all is good. Apart from skepticism of professional translators, of course. The main theme is, "The tool doesn't provide a TM, hence we can't use it."
I need some help to understand this topic better.
Note that my wife is a professional translator and project manager in the translation industry, so I am largely informed about the concepts of "traditional translation" of documents (e.g. SDL Trados, Across, OmegaT), but also about the approach that emerged from the software development industry (e.g. Transifex, Crowdin), which I have hands-on experience with.
Where is Rosetta's TM?
From my understanding, Rosetta is more or less a nice front-end to manipulate .po files, extracted by Python's gettext module integrated in Django. There are no models, yet still Rosetta does "automatic translation", which is visible by fuzzy matches (which I assume is also a feature coming from gettext again, really).
So in essence, the .po files themselves are that TM already. There is no additional or separate component, but as the entire "document" is identical to all (successful) translations that have been done in the past, there is not even a need for a separate TM. It's all read into "Rosetta's memory" in its entirety. There is no disadvantage of having "no TM", given we only deal with our domain specific vocabulary.
Is this view correct?
External TMs?
A related question, after having clarified whether Rosetta has a TM or not: is there a way to
to add, say, more flexibility to the translation process?