Try to guess the language in which a (Unicode-encoded) text is written

antoineB/language-guess

This is some Scala code that helps deduce the language of a text.
The approach is simple: using the Unicode standard, it finds the alphabet in which the text is written.
Then, if necessary (some languages are written in an alphabet of their own, so the alphabet alone identifies them), it looks up a number of words from the text in your database.
Finally, it picks the language name that the lookups return most often.
The result may be unreliable when only a few words are used.
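
The steps above can be sketched roughly as follows. This is not the code from this repository, only a minimal illustration under stated assumptions: the object name LanguageGuessSketch, the scriptToLanguage table, the in-memory dictionary (standing in for the database) and the wordLimit parameter are all hypothetical. Only java.lang.Character.UnicodeScript comes from the standard JDK.

```scala
import java.lang.Character.UnicodeScript

object LanguageGuessSketch {
  // Hypothetical table: scripts that identify a language on their own.
  val scriptToLanguage: Map[UnicodeScript, String] = Map(
    UnicodeScript.HANGUL -> "korean",
    UnicodeScript.GREEK  -> "greek",
    UnicodeScript.THAI   -> "thai"
  )

  // Hypothetical in-memory dictionary standing in for the database:
  // each word maps to the languages it belongs to.
  val dictionary: Map[String, Set[String]] = Map(
    "the"  -> Set("english"),
    "is"   -> Set("english"),
    "le"   -> Set("french"),
    "est"  -> Set("french"),
    "chat" -> Set("french", "english")
  )

  // Split a string into Unicode code points (handles surrogate pairs).
  def codePoints(text: String): List[Int] = {
    val out = List.newBuilder[Int]
    var i = 0
    while (i < text.length) {
      val cp = text.codePointAt(i)
      out += cp
      i += Character.charCount(cp)
    }
    out.result()
  }

  // Most frequent script among the letters of the text, if any.
  def dominantScript(text: String): Option[UnicodeScript] = {
    val scripts = codePoints(text)
      .filter(cp => Character.isLetter(cp))
      .map(cp => UnicodeScript.of(cp))
    if (scripts.isEmpty) None
    else Some(scripts.groupBy(identity).maxBy(_._2.size)._1)
  }

  // Step 1: find the script. Step 2: if the script alone does not decide,
  // look up at most wordLimit words and keep the most frequent language.
  def guess(text: String, wordLimit: Int = 20): Option[String] =
    dominantScript(text).flatMap { script =>
      scriptToLanguage.get(script).orElse {
        val votes = text.toLowerCase.split("\\s+").toSeq
          .filter(_.nonEmpty)
          .take(wordLimit)
          .flatMap(w => dictionary.getOrElse(w, Set.empty[String]))
        if (votes.isEmpty) None
        else Some(votes.groupBy(identity).maxBy(_._2.size)._1)
      }
    }
}
```

With this toy dictionary, a text with few known words gives few votes, which is why short inputs are the weak case.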

** Setting up the database
See Mysql in DB.scala.

** Adding a dictionary
See the function DB.init.
See the files in dict for the file format.

** The number of words matched
See language.maxWordsToTest.
See language.textSizeLimit.
See language.phraseSizeLimit.
