cues are the sentence splitters. tokens is the word splitter.
If you change `var tokens = text.split(' ');` to `var tokens = text.split('');`, you would split the text into characters.
However, it probably won't output anything of value, if anything.
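As a quick illustration of the difference between the two calls, here is a minimal sketch using a made-up sample string (the variable names `wordTokens` and `charTokens` are only for this example and are not taken from the project):

```javascript
// Illustrative input; any string would do.
var text = 'the cat sat';

// Word-level tokens: split on single spaces.
var wordTokens = text.split(' ');

// Character-level tokens: split on the empty string.
// Note that the spaces themselves become tokens too.
var charTokens = text.split('');

console.log(wordTokens);        // [ 'the', 'cat', 'sat' ]
console.log(charTokens.length); // 11
```

With the empty-string separator, every character (including each space) ends up as its own token, so there are no words left to tag.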
The algorithm works by mapping the structure of the sentences: the positions of the verbs, adjectives, subjects, and so on. This is how it learns and reproduces the general style of the training text.
If you split into chars instead of words, the POS (Part of Speech) tagging won't work. It won't be able to learn any style, and therefore it probably won't be able to output much.
The text generation is based on statistics rather than machine learning. During training, a graph is built that maps the relationships between words, and that graph is then used to generate the text. The output only makes sense because the POS tagging makes it possible to rebuild a generally well-structured sentence. Without the POS tagging, the output will probably be nonsense.
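To give a rough idea of what a statistical word graph can look like, here is a minimal sketch. It is not the project's actual code: `buildGraph` and `mostLikelyNext` are hypothetical helpers, the graph only counts which word follows which, and the POS tagging step described above is left out entirely.

```javascript
// Hypothetical helper: count how often each word is followed by each other word.
function buildGraph(text) {
  var tokens = text.split(' ').filter(Boolean); // drop empty tokens
  var graph = new Map();
  for (var i = 0; i < tokens.length - 1; i++) {
    var followers = graph.get(tokens[i]);
    if (!followers) {
      followers = new Map();
      graph.set(tokens[i], followers);
    }
    var next = tokens[i + 1];
    followers.set(next, (followers.get(next) || 0) + 1);
  }
  return graph;
}

// Hypothetical helper: return the most frequent follower of a word,
// or null if the word was never seen with a follower.
function mostLikelyNext(graph, word) {
  var followers = graph.get(word);
  if (!followers) return null;
  var best = null;
  var bestCount = 0;
  followers.forEach(function (count, candidate) {
    if (count > bestCount) {
      best = candidate;
      bestCount = count;
    }
  });
  return best;
}

var graph = buildGraph('the cat sat on the mat and the cat slept');
console.log(mostLikelyNext(graph, 'the')); // cat
```

A graph like this only knows which tokens tend to follow each other. If the tokens were single characters, it would capture letter sequences rather than sentence structure, which fits the point above about the output degrading.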
I guess all I need is to put spaces between the characters in the text file dataset?
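For reference, a minimal sketch of what that preprocessing could look like, assuming the dataset is already loaded into a string. The `_` placeholder for original spaces is an assumption made for this example, not something the project defines:

```javascript
// Illustrative input standing in for the contents of the dataset file.
var text = 'hi yo';

// Naive approach: put a space between every character.
var spaced = text.split('').join(' ');
// The original space is now surrounded by spaces, so splitting on ' '
// yields empty tokens where the word boundary used to be.
var naiveTokens = spaced.split(' ');
console.log(naiveTokens); // [ 'h', 'i', '', '', 'y', 'o' ]

// Variant: replace original spaces with a placeholder first (assumed: '_'),
// so word boundaries survive as their own token.
var marked = text.split(' ').join('_').split('').join(' ');
var markedTokens = marked.split(' ');
console.log(markedTokens); // [ 'h', 'i', '_', 'y', 'o' ]
```

This produces character-level tokens without touching the `split(' ')` call, but as noted in the reply above, the POS tagging would still have nothing meaningful to work with.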