lauma/LVSegmenter

LVSegmenter

Domain name segmenter for Latvian.

For a usage sample, see SegmenterUI.java.

As a word list for Latvian, we suggest using the filtered output of https://github.com/PeterisP/morphology/blob/master/src/tools/java/lv/semti/Vardnicas/VarduSaraksts.java and, for English, a list such as http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt

The word lists are stored in a Patricia trie from Apache Commons Collections: https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/trie/PatriciaTrie.html
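To illustrate the general idea of dictionary-based segmentation, here is a minimal sketch. It is not the LVSegmenter implementation: the class `SegmenterSketch`, its methods, and the "fewest words wins" criterion are all assumptions made for this example. A small hand-written character trie stands in for the Apache Commons `PatriciaTrie` so that the code runs without extra dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of trie-based segmentation; not LVSegmenter's own code. */
final class SegmenterSketch {

    /** Minimal character trie node, used here instead of PatriciaTrie. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    /** Adds one dictionary word to the trie. */
    void addWord(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    /**
     * Splits text into dictionary words, preferring the split with the
     * fewest words (an assumed criterion). Returns an empty list if the
     * text cannot be fully covered by dictionary words.
     */
    List<String> segment(String text) {
        int n = text.length();
        int[] best = new int[n + 1]; // fewest words covering text[i..n), -1 if none
        int[] next = new int[n + 1]; // end index of the first word in that split
        Arrays.fill(best, -1);
        best[n] = 0;

        for (int i = n - 1; i >= 0; i--) {
            Node node = root;
            // Walk the trie along text[i..], checking every dictionary prefix.
            for (int j = i; j < n; j++) {
                node = node.children.get(text.charAt(j));
                if (node == null) {
                    break; // no dictionary word continues with this character
                }
                boolean improves = best[i] < 0 || best[j + 1] + 1 < best[i];
                if (node.isWord && best[j + 1] >= 0 && improves) {
                    best[i] = best[j + 1] + 1;
                    next[i] = j + 1;
                }
            }
        }

        List<String> words = new ArrayList<>();
        if (best[0] < 0) {
            return words; // no complete segmentation exists
        }
        for (int i = 0; i < n; i = next[i]) {
            words.add(text.substring(i, next[i]));
        }
        return words;
    }
}
```

With the words `lab`, `a`, `laba` and `maize` loaded, segmenting `labamaize` yields `laba` and `maize` rather than the three-word split `lab`, `a`, `maize`. A real segmenter would likely need a better ranking than word count, for example word frequencies.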
