Fast Spelling Correction & Approximate String Search
The LinSpell spelling correction algorithm requires neither edit candidate generation, as Norvig's algorithm does, nor a specialized data structure such as a BK-tree. In most cases LinSpell is faster and uses less memory than either a BK-tree or Norvig's algorithm. LinSpell is independent of language and character set.
Copyright (C) 2017 Wolf Garbe
Version: 1.0
Author: Wolf Garbe <[email protected]>
Maintainer: Wolf Garbe <[email protected]>
URL: https://github.com/wolfgarbe/linspell
Description:
https://seekstorm.com/blog/symspell-vs-bk-tree/
License:
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License,
version 3.0 (LGPL-3.0) as published by the Free Software Foundation.
http://www.opensource.org/licenses/LGPL-3.0
Usage:
single word + Enter: Display spelling suggestions
Enter without input: Terminate the program
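To illustrate how a lookup can work without candidate generation or a tree index, here is a minimal Python sketch. It assumes a plain linear scan over the dictionary with a bounded Levenshtein distance; this is not the LinSpell source code. The names `edit_distance` and `lookup`, the `max_distance` parameter, and the sample dictionary are hypothetical and chosen only for illustration.

```python
def edit_distance(a, b, max_distance):
    """Levenshtein distance between a and b.

    Returns max_distance + 1 as soon as the distance is known to exceed
    max_distance, so hopeless candidates are abandoned early.
    """
    # The length difference is a lower bound on the edit distance.
    if abs(len(a) - len(b)) > max_distance:
        return max_distance + 1
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        # Row minima never decrease, so this row bounds the final result.
        if min(current) > max_distance:
            return max_distance + 1
        previous = current
    return min(previous[-1], max_distance + 1)


def lookup(word, dictionary, max_distance=2):
    """Scan every entry of dictionary (term -> frequency count).

    Returns (term, distance, count) tuples, closest and most frequent first.
    """
    suggestions = []
    for term, count in dictionary.items():
        distance = edit_distance(word, term, max_distance)
        if distance <= max_distance:
            suggestions.append((term, distance, count))
    suggestions.sort(key=lambda s: (s[1], -s[2]))
    return suggestions


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the shipped frequency list.
    sample = {"hello": 100, "help": 50, "world": 80, "held": 10}
    for term, distance, count in lookup("helo", sample, max_distance=1):
        print(term, distance, count)
```

The only data structure in this sketch is the term-to-count dictionary itself. The length check and the per-row cutoff keep most comparisons cheap.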
Applications:
- Query correction (10–15% of queries contain misspelled terms),
- Chatbots,
- OCR post-processing,
- Automated proofreading.
The word frequency list was created by intersecting the two lists below, so that only words appearing in both lists are used. Additional filters were applied, and the resulting list was truncated to the ≈ 80,000 most frequent words.
- Google Books Ngram data (License): Provides representative word frequencies
- SCOWL - Spell Checker Oriented Word Lists (License): Ensures genuine English vocabulary
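The intersection step described above can be sketched in Python as follows. This is a hedged illustration, not the script used to build the shipped list: the function name `build_frequency_list`, its parameters, and the in-memory input shapes are assumptions, and the unspecified "additional filters" are not reproduced.

```python
def build_frequency_list(ngram_counts, vocabulary, max_words=80_000):
    """Keep only words present in both sources, ranked by frequency.

    ngram_counts: dict mapping word -> frequency (e.g. derived from n-gram data)
    vocabulary:   set of accepted words (e.g. derived from a curated word list)
    """
    # Intersection: frequencies come from one source, validity from the other.
    common = {word: count for word, count in ngram_counts.items()
              if word in vocabulary}
    # Most frequent first; ties broken alphabetically for a stable result.
    ranked = sorted(common.items(), key=lambda item: (-item[1], item[0]))
    # Truncate to the most frequent entries.
    return ranked[:max_words]


if __name__ == "__main__":
    # Hypothetical toy inputs for illustration only.
    counts = {"the": 1000, "teh": 5, "cat": 300, "dog": 300}
    words = {"the", "cat", "dog", "zebra"}
    print(build_frequency_list(counts, words, max_words=2))
```

In this sketch a misspelling such as "teh" is dropped because it is missing from the vocabulary, and "zebra" is dropped because it has no frequency count.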
SymSpell vs. BK-tree: 100x faster fuzzy string search & spell checking
LinSpell is contributed by SeekStorm - the high-performance Search as a Service & search API