Become a sponsor to Paul O'Leary McCann
Howdy. I'm working on open-source NLP projects, with a particular focus on making Japanese text processing software easier to use.
While there are many good tools for working with Japanese text, many are hard to use, and English documentation is often absent, which makes integration into multi-language projects difficult. My goal is to make working with Japanese text easier so that the work of the international NLP research community can be applied to it more readily.
If you have an open-source NLP project and would like to add Japanese support, please feel free to contact me.
Some projects I maintain:
- mecab-python3, a popular MeCab wrapper for Python
- fugashi, a modern Cython wrapper for MeCab
- (unofficially) Japanese support in spaCy
- posuto, a wrapper for Japan Post postal code data
- cutlet, a Japanese to romaji converter
As an independent software developer, I rely on contract work for my primary income. Any income from sponsorships allows me to prioritize open-source work. Thanks for any help you can give ❤
Featured work
- polm/fugashi (C++, 398 stars): A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
- polm/posuto (Python, 204 stars): 🏣📮〠 Japanese postal code data.
- polm/cutlet (Python, 306 stars): Japanese to romaji converter in Python.
- explosion/spaCy (Python, 30,213 stars): 💫 Industrial-strength Natural Language Processing (NLP) in Python.
- WorksApplications/SudachiPy (Python, 391 stars): Python version of Sudachi, a Japanese tokenizer.