UTF-8 #5

Open
tonys0 opened this issue Sep 18, 2019 · 3 comments
Comments

@tonys0

tonys0 commented Sep 18, 2019

Hi, I really appreciate the simplicity of your web tool. It lets you see the characters from the font before you convert it, but after you download the GFX font, you discover that it only contains the low ASCII letters. I know that Adafruit_GFX can only use this part of the character set. But with a few small tricks, one could fake in the few characters needed to complete a national set, if only one had the C code for them. Or is there another (perhaps more correct) way to implement UTF-8 bitmap fonts similar to GFX?

@ropg
Owner

ropg commented Sep 18, 2019

I would love for someone with a more thorough understanding of Unicode and code pages to show me how to extract the relevant bits from TTFs and insert them into the corresponding GFX fonts... It seemed to me at the time that some user interaction is often required in mapping from one to the other, and I remember thinking that was all a bit complicated. But if you know what to do "by hand", I will happily put it in the tool so people can use the automated version.

@tonys0
Author

tonys0 commented Sep 20, 2019

Basically, instead of a one-byte encoding for ASCII characters (actually only using values 32-127), Unicode assigns each character a numeric code point, and the code points needed for most national alphabets fit in a uint16. It is not practical to store text as raw uint16 (two-byte) values, mostly because ASCII documents would double in size and the character codes would cause problems in existing editors. So some ingenious guy invented the UTF-8 encoding, in which ASCII characters are coded with one byte. Accented Latin characters are mostly coded with two bytes, and the rest of the languages (incl. Chinese etc.) are coded with three to four bytes. The first byte decides how many bytes follow in the UTF-8 code for a character. Arduino uses UTF-8 for the Serial Monitor, so the choice is obvious. But do not worry, there is an algorithm for the UTF-8 -> Unicode calculation.

The important thing is that the font descriptor must be able to store the first and last used character values as uint16, which is not the case in the existing GFX font structure. But it should somehow be possible to adapt the Adafruit_GFX library to fonts with a Unicode extension. How to find the actual glyph data for characters > 127 in the .ttf file is more than I can tell. Let's keep in contact and see what we can do.
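
A minimal sketch of the UTF-8 -> Unicode calculation mentioned above, in C++. The function name `utf8Decode` and its signature are assumptions made for illustration; it is not part of Adafruit_GFX. It reads the first byte to decide how many continuation bytes follow, then collects six payload bits from each of them. For brevity it does not reject overlong encodings or surrogate values.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an Adafruit_GFX function).
// Decodes one UTF-8 sequence starting at s[0], with at most len bytes available.
// On success stores the code point in *out and returns the bytes consumed (1..4).
// Returns 0 on truncated or malformed input.
// Simplification: overlong encodings and surrogates are not rejected.
std::size_t utf8Decode(const std::uint8_t *s, std::size_t len, std::uint32_t *out) {
    if (len == 0) return 0;

    const std::uint8_t b0 = s[0];
    std::size_t n;     // total length of the sequence, decided by the first byte
    std::uint32_t cp;  // code point being assembled

    if (b0 < 0x80)                { n = 1; cp = b0; }         // 0xxxxxxx (ASCII)
    else if ((b0 & 0xE0) == 0xC0) { n = 2; cp = b0 & 0x1F; }  // 110xxxxx
    else if ((b0 & 0xF0) == 0xE0) { n = 3; cp = b0 & 0x0F; }  // 1110xxxx
    else if ((b0 & 0xF8) == 0xF0) { n = 4; cp = b0 & 0x07; }  // 11110xxx
    else return 0;  // stray continuation byte or invalid lead byte

    if (len < n) return 0;  // sequence is cut off

    for (std::size_t i = 1; i < n; ++i) {
        if ((s[i] & 0xC0) != 0x80) return 0;  // continuation must be 10xxxxxx
        cp = (cp << 6) | (s[i] & 0x3F);       // append 6 payload bits
    }

    *out = cp;
    return n;
}
```

With this sketch, the two bytes `0xC3 0xA9` decode to U+00E9 (é), and the three bytes `0xE3 0x81 0x82` decode to U+3042, which is 12354 in decimal.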

@Bodmer

Bodmer commented Feb 2, 2020

This is a great utility for generating fonts. I did hack Adafruit_GFX to handle UTF-8 decoding and the 16-bit Unicode range (Basic Multilingual Plane). See here for the hacked library. Note that the Unicode range extends beyond 16 bits (see Wikipedia).

All that is needed from this tool is a range dialogue box in which to specify the start and end characters. I used the YouTube tutorial link (see the README in the hacked library link) and the command line to generate the Japanese font (code point range 12353 - 12435) used in the hacked version of Adafruit_GFX.
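
To illustrate why a first/last range is enough, here is a small C++ sketch of the lookup a font with a 16-bit range implies. The struct `WideFontRange` and the function `glyphIndex` are hypothetical names for illustration only; they are not the layout used by Adafruit_GFX or by the hacked library. The idea is simply that a decoded code point inside the range maps to a position in the glyph table by subtracting the first code point.

```cpp
#include <cstdint>

// Hypothetical descriptor fragment: only the code point range is modelled here.
// A real font would also carry bitmap data and per-glyph metrics.
struct WideFontRange {
    std::uint16_t first;  // first code point in the font
    std::uint16_t last;   // last code point in the font (inclusive)
};

// Returns the zero-based position of codePoint in the glyph table,
// or -1 if the font does not cover that code point.
std::int32_t glyphIndex(const WideFontRange &font, std::uint32_t codePoint) {
    if (codePoint < font.first || codePoint > font.last) return -1;
    return static_cast<std::int32_t>(codePoint - font.first);
}
```

For the Japanese range quoted above (12353 - 12435), code point 12353 maps to index 0 and 12435 maps to index 82, so the font holds 83 glyphs. An ASCII letter such as 'A' falls outside the range and yields -1.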
