Loading Lexeme by ID doesn't work #37
Comments
From 2.x we're switching to a lexicon automatically derived from tēzaurs.lv data export - the current default lexicon setup (Lexicon_v2.xml) does not have a lexeme with ID 43716. Explicitly loading the old lexicon ( |
Thanks for the explanation. I took the lexeme ID from here (this is probably lexicon v2) and tried with both |
When will the switch to v2 happen? Does the web functionality use the same version and database that this morphology library has at the moment? For example, your Also, will this library report whether a verb is of perfective or imperfective aspect (pabeigta/nepabeigta veida)? |
Pabeigtība (verbal aspect) in Latvian is not a verb feature that can be easily derived from morphological elements like endings and infixes, so even if it were there (I doubt it), you should not trust it or use it. |
I clearly understand that. But there is other information that cannot be derived from word form only:
but this information is in the lexicon though. That is why I am asking this question: does this information exist anywhere at all (some dictionaries)? |
Some of the Tēzaurs sources contain transitivity; see "trans." or "intrans." near the head of verb entries. For transitivity there is also a more or less straightforward syntactic check: transitivity basically means that the verb can be used together with an object:
Meanwhile, pabeigtība in Latvian is much fuzzier and vaguer than transitivity, so I don't think it will end up in Tēzaurs. Since some languages have a morphological distinction for it, linguists around the world do speak of such a feature, but for Latvian it is purely semantic.
The types of adverbs and adjectives, I think, mostly come from grammar books, where it usually goes like this: one type can be enumerated, and everything else goes into the other type. As the tagset used in korpuss.lv requires this feature, Tēzaurs will probably be augmented with it eventually.
When it comes to proper/common nouns: traditional dictionaries like LLVV or MLVV just do not include proper nouns. Tēzaurs sometimes contains the markings "vietv." or "persv.", e.g., http://tezaurs.lv/#/sv/Liepa, but as with everything in Tēzaurs, coverage is partial. As with the adjective/adverb type, some augmentation will eventually be done, but I don't know when or to what extent yet. For now, the use of capital letters in the entry word is quite telling: if it contains at least one capital letter anywhere, it is either an abbreviation or a proper noun, not your average common noun. |
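The capital-letter heuristic from the comment above can be sketched in a few lines. This is only an illustration: the function name is hypothetical, it is not part of the morphology library or of Tēzaurs, and it cannot tell an abbreviation from a proper noun.

```python
def looks_like_proper_noun_or_abbreviation(entry_word: str) -> bool:
    """Heuristic from the thread: an entry word containing at least one
    capital letter anywhere is either an abbreviation or a proper noun.

    Hypothetical helper, not an API of the morphology library.
    """
    # str.isupper() is Unicode-aware, so Latvian letters such as "Ā" count.
    return any(ch.isupper() for ch in entry_word)


if __name__ == "__main__":
    for word in ["Liepa", "liepa", "ASV", "Ādaži"]:
        print(word, looks_like_proper_noun_or_abbreviation(word))
```

Since Tēzaurs coverage of "vietv." and "persv." markings is partial, a check like this would only be a fallback, not a replacement for the markings where they exist.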
Thank you very much. I didn't know that the information that I put in the list, wasn't fully added to tezaurs. Is it possible to add pabeigtība as a parameter, so that maybe I can try to add it to some verbs? I think I might have an idea how to do that not manually. |
Umm, how do you plan to obtain such info? Usually even linguists struggle to assign pabeigtība unambiguously. |
If we had a Latvian-Russian dictionary in electronic form, we might be able to parse verb entries for the presence of paired verbs. For example, compare perf.-imp. "встречаться-встретиться" vs. imp. "нравиться". We might mine this data and add it to Tēzaurs. Without doubt, this is raw and preliminary data that needs linguists' approval, but it might be a good start, mightn't it? |
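A rough sketch of that mining idea, under loud assumptions: the dictionary is taken to be a plain mapping from a Latvian headword to a Russian translation string, a hyphen-joined pair of Russian infinitives is taken as a hint that the verb has an aspect pair, and the sample entries are made up for illustration rather than drawn from a real dictionary. The output would be candidate data for linguists to review, nothing more.

```python
import re

# A Russian infinitive, roughly: stem + "ть"/"ти"/"чь", optionally reflexive.
_INFINITIVE = r"[а-яё]+(?:ть|ти|чь)(?:ся|сь)?"
# Two infinitives joined by a hyphen, e.g. "встречаться-встретиться".
_PAIR = re.compile(rf"\b({_INFINITIVE})-({_INFINITIVE})\b")


def aspect_pair(translation: str):
    """Return the two hyphen-joined infinitives found in a translation
    string, or None if the string contains no such pair."""
    match = _PAIR.search(translation.lower())
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    # Hypothetical sample entries; the data format is an assumption.
    sample = {
        "satikties": "встречаться-встретиться",
        "patikt": "нравиться",
    }
    for headword, translation in sample.items():
        print(headword, aspect_pair(translation))
```

A missing pair does not show that the Latvian verb is imperfective; it only means the dictionary entry listed a single Russian verb.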
I'm not convinced:
|
Thank you for your explanation. I will try to accept the fact that this parameter is much more ephemeral in Latvian than in Slavic languages. No need to force it into a Procrustean bed =D But in some cases, like "izdarīt", we can always say that it is used only as perfective, can't we? |
Returning to the original question: I eventually got it to work (although it is odd that this method demands the lemma explicitly when the lexeme must already contain it):
|
I tried to use function calls, as you mentioned here:
The returned lexeme is null. What did I do wrong? Thank you, @PeterisP!