Recommended way to parse surrogate pairs? #64

aggieben · 2021-02-06T21:10:03Z

I'm working on a TOML parser, and I'm at a loss for how to parse Unicode characters that are represented as surrogate pairs in UTF-16 (I mention TOML because these code points are valid in it). I'm not deeply familiar with FParsec's CharStream, but on a first reading it doesn't seem to have any notion of surrogates and deals entirely with sequences of individual values of type char.

Is there a way to parse surrogate pairs?
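
To make the question concrete, here is a minimal sketch of one possible approach, not a confirmed FParsec recommendation. It assumes the standard `satisfy`, `pipe2`, `|>>` and `<|>` combinators together with the .NET `Char.IsHighSurrogate`, `Char.IsLowSurrogate` and `Char.ConvertToUtf32` helpers. The name `codePoint` is hypothetical and not part of FParsec.

```fsharp
open System
open FParsec

// Hypothetical helper (not part of FParsec): parses one Unicode code point,
// combining a UTF-16 high/low surrogate pair into a single int when present.
let codePoint : Parser<int, unit> =
    // A high surrogate must be followed immediately by a low surrogate.
    let pair =
        pipe2 (satisfy (fun c -> Char.IsHighSurrogate c))
              (satisfy (fun c -> Char.IsLowSurrogate c))
              (fun hi lo -> Char.ConvertToUtf32(hi, lo))
    // Any non-surrogate char is already a complete code point.
    let single =
        satisfy (fun c -> not (Char.IsSurrogate c)) |>> int
    pair <|> single

// U+1F600 is stored as two chars in a .NET string, so the input below
// has three chars but should yield two code points.
match run (many codePoint .>> eof) "a\U0001F600" with
| Success (cps, _, _) -> printfn "%A" cps
| Failure (msg, _, _) -> printfn "%s" msg
```

In this sketch a lone surrogate is rejected rather than passed through: `pair` fails after consuming the high surrogate, so `<|>` does not fall back to `single`.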
