Add natural translation for DSL #574

Draft
wants to merge 30 commits into
base: master

BrentBlanckaert
Collaborator

@BrentBlanckaert BrentBlanckaert commented Dec 11, 2024

You can run the pre-processor by using

python -m tested.nat_translation ./exercise/simple-example/program_language_map/suite.yaml en # English translation

@pdawyndt
Contributor

Maybe we could support translations of a testplan (and rollout of templates?) as

python -m tested.translate <testplan>

with an extra option (or argument) to pass the natural language for the translation.

@BrentBlanckaert
Collaborator Author

BrentBlanckaert commented Dec 12, 2024

Maybe we could support translations of a testplan (and rollout of templates?) as

python -m tested.translate <testplan>

with an extra option (or argument) to pass the natural language for the translation.

This should work.

@BrentBlanckaert
Collaborator Author

@pdawyndt in #559 it also says that translations for files should be provided. In what sense?
I've got something like the following:

- files: !natural_language
    en:
      - name: "file.txt"
        url: "media/workdir/file.txt"
    nl:
      - name: "fileNL.txt"
        url: "media/workdir/fileNL.txt"

This seems pointless since I could also just do:

- files:
  - name: "file.txt"
    url: "media/workdir/file.txt"
  - name: "fileNL.txt"
    url: "media/workdir/fileNL.txt"

@pdawyndt
Contributor

Not really pointless as TESTed will show all "linked files" to the students. For each file, TESTed will try to find its name in the expression/statement and then turn that into a hyperlink. If it doesn't find the filename, it will add it to a list of files that is displayed for the testcase.
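The linking behaviour described above can be sketched roughly as follows. This is only an illustration, not TESTed's actual code: the function name link_files, the HTML anchor output and the escaping are all assumptions made for the example.

```python
from html import escape


def link_files(statement, files):
    """Hypothetical helper: hyperlink filenames found in the statement.

    Returns the rendered statement and the files whose names were not found,
    which would then be shown as a separate list for the testcase.
    """
    rendered = escape(statement)
    leftover = []
    for file in files:
        name = escape(file["name"])
        if name in rendered:
            link = f'<a href="{escape(file["url"])}">{name}</a>'
            rendered = rendered.replace(name, link)
        else:
            # Not mentioned in the statement: keep it for the extra file list.
            leftover.append(file)
    return rendered, leftover


files = [
    {"name": "file.txt", "url": "media/workdir/file.txt"},
    {"name": "fileNL.txt", "url": "media/workdir/fileNL.txt"},
]
rendered, leftover = link_files('count("file.txt")', files)
```

With the untranslated list, the Dutch file ends up in the leftover list of an English testcase, which is why selecting the files per language can still matter.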

@BrentBlanckaert
Collaborator Author

BrentBlanckaert commented Dec 13, 2024

@pdawyndt, I started looking into adding a translation table like

translation:
  animal:
    en: "animal"
    nl: "dier"
  result:
    en: "result"
    nl: "resultaat"

Is it even useful to then also add support for it in a statement like the following:

- statement: !natural_language
   en: 'result = Trying(10, "{animal}")'
   nl: 'resultaat = Proberen(10, "{animal}")'

I would suggest not looking any deeper once a natural_language map has been found, and only using the translation map when the expected type (such as a string) is given directly.

@pdawyndt
Contributor

pdawyndt commented Dec 13, 2024

I definitely have many exercises where this (the combination of translation and template strings) is useful. So I would say yes. If we use Python format strings, then we could even write your example as

- statement: !natural_language
   en: 'result = Trying(10, {animal!r})'
   nl: 'resultaat = Proberen(10, {animal!r})'

And not even bother about using single or double quotes or escaping any quotes in the thing you put in the placeholders (which otherwise adds a lot of complication on the side of the DSL-author).
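As a small illustration of the !r conversion (the translation values are taken from the table earlier in this thread; the second value containing a quote is made up for the example):

```python
template = "resultaat = Proberen(10, {animal!r})"

# !r applies repr() to the value, so the quotes are added automatically.
statement = template.format(animal="dier")
print(statement)  # resultaat = Proberen(10, 'dier')

# A value containing a single quote needs no manual escaping either:
# repr() switches to double quotes.
quoted = template.format(animal="l'animal")
print(quoted)  # resultaat = Proberen(10, "l'animal")
```

Note that repr() produces a Python literal, so whether the quoting is also valid for other programming languages would still have to be checked.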

Suppose the variable statement holds the format string for the statement, the dictionary translation holds the merged translations from the DSL hierarchy, and the dictionary data holds the testcase data. Turning the template string into the actual string (by filling in the placeholders) then comes down to

statement = statement.format(**translation, **data)
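A minimal sketch of that call with a named data map. The key count and the colliding key are made up for the example:

```python
translation = {"animal": "dier"}
data = {"count": 10}

template = "resultaat = Proberen({count}, {animal!r})"
statement = template.format(**translation, **data)
print(statement)  # resultaat = Proberen(10, 'dier')

# If the same key occurs in both dictionaries, unpacking both into one call
# raises a TypeError (duplicate keyword argument).
collision = None
try:
    template.format(**translation, **{"animal": "kat", "count": 1})
except TypeError as error:
    collision = str(error)
```

If overlapping keys should be allowed, one option is to merge first, for example statement.format(**{**translation, **data}), in which case the later dictionary wins.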

If we also allow data to be a YAML array instead of a YAML map (positional instead of named placeholders), then formatting is done by

statement = statement.format(*data, **translation)

For example, if data = [3, 4, 7] then we could have

'{} +  {} = {}'

or with explicit positions (which would also allow reordering and reusing the array values)

'{0} +  {1} = {2}'
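Put together, positional data and named translations can be combined in one call. A small sketch, reusing the result entry from the translation table above:

```python
data = [3, 4, 7]
translation = {"result": "resultaat"}

# Automatic numbering: placeholders are filled in order.
implicit = "{} + {} = {}".format(*data, **translation)

# Explicit positions allow reordering, mixed with a named translation key.
explicit = "{result}: {2} - {1} = {0}".format(*data, **translation)

print(implicit)  # 3 + 4 = 7
print(explicit)  # resultaat: 7 - 4 = 3
```

One limitation to keep in mind: a single template cannot mix automatic ({}) and explicit ({0}) numbering; str.format raises a ValueError in that case.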

@BrentBlanckaert
Collaborator Author

BrentBlanckaert commented Dec 15, 2024

Currently I've implemented support for !natural_language and a translation map you can define globally, in a tab and in a context. Here is a quick rundown of everything that is possible:

The translation map looks like the following:

translation:
  animal:
    en: "animals"
    nl: "dieren"
  result:
    en: "results"
    nl: "resultaten"

This can be defined

  • Next to the tabs (globally)
  • In a tab
  • In a context
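How the three levels combine is not spelled out here. One plausible reading, sketched below, is that inner levels override outer ones per key and per language. The function names and the precedence rule are assumptions for illustration, not the behaviour of the PR:

```python
def merge_translations(*levels):
    """Merge translation maps from outermost (global) to innermost (context).

    Assumption: a later (inner) level overrides an earlier one.
    """
    merged = {}
    for level in levels:
        for key, per_language in level.items():
            merged[key] = {**merged.get(key, {}), **per_language}
    return merged


def for_language(merged, language):
    """Flatten the merged map to {key: text} for one natural language."""
    return {key: texts[language] for key, texts in merged.items() if language in texts}


global_map = {
    "animal": {"en": "animals", "nl": "dieren"},
    "result": {"en": "results", "nl": "resultaten"},
}
tab_map = {}
context_map = {"animal": {"en": "animal", "nl": "dier"}}

dutch = for_language(merge_translations(global_map, tab_map, context_map), "nl")
print(dutch)  # {'animal': 'dier', 'result': 'resultaten'}
```

The flattened dictionary is what would then be passed to str.format for each template string.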

The !natural_language map can be defined in the following ways:

In a tab

  • If tab (the name) is a dict, it is treated as a !natural_language map, so using !natural_language is not necessary.
    • After the translation of the !natural_language map, the name is assumed to be a string. This string is then formatted using the translation maps.

In a testcase

  • For a statement or expression using !natural_language is mandatory.
    • If it is there it'll first perform the translation.
    • After that, it'll check if it's a dict. If it is, then we do formatting based on the translation maps on each value.
    • If it's a string we just perform formatting on that.
  • When stdin is a dict, it is assumed to be a !natural_language map, so using !natural_language is not necessary.
    • From this dict a translation is performed.
    • The result of that should be a string, which is always formatted even if stdin wasn't a dict.
  • For arguments the same holds as stdin except that the result will be a list and formatting is performed on each item.
  • stderr, exception and stdout follow the same structure:
    • The usage of !natural_language is mandatory. If it's there we'll do the translation. If the result is a string, it will be formatted.
    • If a dict remains, we'll look at the "data" key ("message" for exception)
      • Check if that is a dict:
      • If it is, perform translation (no !natural_language needed).
    • The value of "data" should be a string or should be one after the translation. That string is formatted.
  • For files I've only added support for usage of !natural_language. No formatting is done.
  • If the return is an Oracle:
    • We look at the arguments and do the exact same as specified before.
    • After that we look at the value. If the value is to be translated, using !natural_language is mandatory. The translation will turn it into a list, dict, int or string, which is then parsed and formatted accordingly.
  • If it's not an Oracle, we check if it's a !natural_language map. If it is, we parse the result of the translation for possible formatting.
  • Otherwise just parse the value for possible formatting.
  • When using a description, using !natural_language is also mandatory. The result of that translation will be formatted if it's a string.
  • When it's a dict, check the "description" key. If that is a dict, then it's a translation. After that, the value of the "description" key should always be a string, which is then formatted.
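The recurring pattern in the rundown above (translate first, then format) can be sketched for stdin and arguments as follows. This is a simplified illustration using plain dicts in place of the !natural_language YAML tag; the function names are hypothetical and not taken from tested/nat_translation.py:

```python
def translate(value, language):
    """Pick the entry for `language` if the value is a natural-language map."""
    if isinstance(value, dict):
        return value[language]
    return value


def prepare_stdin(stdin, language, translation):
    # The result of the translation should be a string; it is always
    # formatted, even if stdin was not a dict.
    return translate(stdin, language).format(**translation)


def prepare_arguments(arguments, language, translation):
    # Same as stdin, except that the result is a list and every item is formatted.
    return [item.format(**translation) for item in translate(arguments, language)]


translation = {"animal": "dier"}
stdin = prepare_stdin({"en": "an {animal}", "nl": "een {animal}"}, "nl", translation)
arguments = prepare_arguments(["--name", "{animal}"], "nl", translation)
print(stdin)      # een dier
print(arguments)  # ['--name', 'dier']
```

Because every string passes through str.format, literal braces in a test suite would need to be doubled ({{ and }}) to survive formatting.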
