create a cognates data folder #7

Open
mircealungu opened this issue Apr 12, 2018 · 2 comments

mircealungu commented Apr 12, 2018

Each pair of languages should have a single folder named from-to (no need for a to-from folder, because the cognates file is the same). Convention: only create the folder for the alphabetically ordered pair of language codes (e.g. en-nl, and not nl-en).

- cognates.txt -- format: word_from <space> word_to
- blacklist-<expert-id>.txt  -- format: word_from <space> word_to
- rules.txt -- format: pattern_from <space> pattern_to
- algo-params.cfg -- simple pairs of key/value (e.g. edit_distance=0.2 [1])

optional:
- experts.txt -- details about the <expert-id>
- params.txt -- params that were used by the candidates generation algorithm when generating cognates

[1] example of config file: https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/default_core.cfg
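
To make the proposed layout concrete, here is a minimal Python sketch of how such a folder could be read. It is only an illustration under assumptions: the function names (`pair_folder_name`, `parse_word_pairs`, `parse_params`, `load_pair_data`) are hypothetical, malformed lines are assumed to be skipped, `#` comment lines in algo-params.cfg are an assumption, and parameter values are kept as plain strings.

```python
from pathlib import Path


def pair_folder_name(lang_a: str, lang_b: str) -> str:
    """Return the folder name for a language pair, codes in alphabetical order."""
    first, second = sorted([lang_a.lower(), lang_b.lower()])
    return f"{first}-{second}"


def parse_word_pairs(text: str) -> list[tuple[str, str]]:
    """Parse lines of the form 'word_from word_to' into (word_from, word_to) tuples."""
    pairs = []
    for line in text.splitlines():
        parts = line.split()
        # Assumption: blank or malformed lines are silently skipped.
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
    return pairs


def parse_params(text: str) -> dict[str, str]:
    """Parse simple key=value lines; values are kept as strings."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: empty lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params


def load_pair_data(root: Path, lang_a: str, lang_b: str) -> dict:
    """Load the required files from the folder of one language pair."""
    folder = root / pair_folder_name(lang_a, lang_b)
    return {
        "cognates": parse_word_pairs((folder / "cognates.txt").read_text(encoding="utf-8")),
        "rules": parse_word_pairs((folder / "rules.txt").read_text(encoding="utf-8")),
        "params": parse_params((folder / "algo-params.cfg").read_text(encoding="utf-8")),
    }
```

With this sketch, `pair_folder_name("nl", "en")` and `pair_folder_name("en", "nl")` both resolve to `en-nl`, so callers never need to know which direction was used to name the folder. The blacklist-<expert-id>.txt files share the word-pair format, so `parse_word_pairs` could be reused for them.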



mircealungu commented Apr 12, 2018

@joelgrondman - I think we also need a whitelist-<expert-id>.txt! Agree?

@joelgrondman

OK, so whitelist- instead of cognates.txt, or is the whitelist intended to be the candidate list?
