Markdown blocks have trouble rendering unicode emoji #749
Comments
@jameschensmith if you're open to looking into this, I would look here to start: d2/lib/textmeasure/markdown.go Line 88 in 8fa00ca
For Markdown to work, D2 has to manually measure everything and match it with CSS. For example, H1 tags have x padding, y line height, z margin. d2/lib/textmeasure/markdown.go Line 33 in 8fa00ca
The way fonts work, we measure each glyph: d2/lib/textmeasure/textmeasure.go Line 172 in 8fa00ca
If you can reproduce this with labels, then it's not a markdown problem. But that might not be ascertainable, since we pad labels, so it won't have that noticeable cutoff. I suspect the emoji unicode gets measured incorrectly. I'm not really sure what the correct measurement should be, whether it's a constant or some multiple of the font size. Good luck and please let me know if you get stuck, happy to dig into it more with you! |
Thanks, @alixander. Did a little more debugging this morning, and wanted to leave some more notes here. I did some more tests with other emojis, and the results are very strange. For instance, the cloud emoji (☁️) renders with proper width, as does a very complex emoji like "couple with heart: woman, man, medium-light skin tone" (👩🏼❤️👨🏼), which consists of multiple runes. Then you have the monkey emoji (along with others), and it gets cropped. Here's a simple playground showing two emojis measured differently. Also, I believe that this playground example shows that it may not be just markdown. Basically, it doesn't look like it has to do with the number of runes an emoji requires. Unicode emoji resources: |
oh interesting. maybe the font we use, Source Sans Pro, actually just has a subset of emojis supported. Maybe we need to package an emoji font like https://github.com/samuelngs/apple-emoji-linux, detect if the user used any emojis, and then inject that font into their SVG if so. Line 1062 in 0da4e15
|
Another update (things are getting clearer the more tests I run). The cloud emoji used in the last example is actually two runes. So basically, the emojis that are only one rune are the ones that get cut, whereas the emojis that are multiple runes can actually render too wide. I've provided an example below which shows emojis using 1-8 runes (ordered from smaller to larger). This may actually be a good regression test to add in the future. |
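To make the rune counts above concrete, here is a minimal Go sketch that counts the code points in a few emoji. The helper name `runeCount` and the sample list are illustrative assumptions and not part of D2; the emoji are written as escapes so the code points are unambiguous.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCount returns the number of Unicode code points (runes) in s.
// Hypothetical helper for illustration; not taken from the D2 codebase.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	samples := []struct {
		name string
		text string
	}{
		// A single code point.
		{"monkey", "\U0001F412"},
		// Base character followed by variation selector-16.
		{"cloud", "\u2601\uFE0F"},
		// Base character followed by a skin tone modifier.
		{"thumbs up, medium-light", "\U0001F44D\U0001F3FC"},
		// Three people joined by zero width joiners.
		{"family", "\U0001F468\u200D\U0001F469\u200D\U0001F467"},
	}
	for _, s := range samples {
		fmt.Printf("%-24s runes=%d bytes=%d\n", s.name, runeCount(s.text), len(s.text))
	}
}
```

A table like this could serve as the starting point for the regression test suggested above, with one entry per rune count.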
hah, wow interesting, til about emojis |
This is pretty complicated 😅 I've read in some places that emojis are double-width characters, but also that not all emojis have the same width (I think flags are one of those exceptions). I also read here that the best solution might be to check the font for the glyph width. In any case, I think one improvement to what exists now would at least be to treat characters that consist of sequences of runes as a single glyph. Then, treating the emoji glyphs as double-width, or some other solution could come after that. |
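The idea of treating a sequence of runes as a single glyph could be sketched roughly as follows. This is a deliberately simplified grouping, assuming that only zero width joiners, variation selectors, skin tone modifiers, and combining marks extend a cluster. It is not the full Unicode segmentation algorithm (UAX #29) and it does not pair regional indicators, so flags would still be split. The names `clusters` and `isExtender` are hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

const zwj = '\u200D' // zero width joiner

// isExtender reports whether r attaches to the preceding rune
// rather than starting a new glyph (simplified assumption).
func isExtender(r rune) bool {
	switch {
	case r == '\uFE0E' || r == '\uFE0F': // variation selectors 15 and 16
		return true
	case r >= 0x1F3FB && r <= 0x1F3FF: // emoji skin tone modifiers
		return true
	}
	// Nonspacing and enclosing combining marks.
	return unicode.In(r, unicode.Mn, unicode.Me)
}

// clusters splits s into groups of runes that should be measured
// as one glyph under the simplified rules above.
func clusters(s string) []string {
	var out []string
	var cur []rune
	joinNext := false // true when the previous rune was a ZWJ
	for _, r := range s {
		switch {
		case len(cur) == 0:
			cur = append(cur, r)
		case joinNext || r == zwj || isExtender(r):
			cur = append(cur, r)
		default:
			out = append(out, string(cur))
			cur = []rune{r}
		}
		joinNext = r == zwj
	}
	if len(cur) > 0 {
		out = append(out, string(cur))
	}
	return out
}

func main() {
	// Couple with heart: woman, man, medium-light skin tone (8 runes).
	couple := "\U0001F469\U0001F3FC\u200D\u2764\uFE0F\u200D\U0001F468\U0001F3FC"
	for _, c := range clusters("a" + couple + "\U0001F412") {
		fmt.Printf("%q has %d rune(s)\n", c, len([]rune(c)))
	}
}
```

With clusters in hand, a measurement routine could assign one width per cluster (for example a double-width cell) instead of one width per rune.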
If my understanding of the code is correct, d2/lib/textmeasure/textmeasure.go Lines 20 to 28 in 0da4e15
And Lines 121 to 123 in 0da4e15
So all non-ASCII codepoints (including emojis) have the same bounding box as the unicode replacement character. What we could do is render any unsupported character as the unicode replacement character, until those other characters are supported. This sounds like a bad idea, I know 🙈 There are quite a few problems involved here, as I've noted in previous comments. |
ohh that makes sense. I don't think the shim is necessary though, better to overflow into the padding a bit imo. My fear is just that even if we measure this "right", if it's not part of the font we use, users on different machines will just fall back to their OS's emoji font, which will over/underflow anyway. I don't think there's an option other than having the TTF support a large corpus of emojis
Might have to implement this #638 first. Emoji packs might be huge to embed just for a 🙈 |
@bo-ku-ra oh wow, good find. this looks incredibly relevant |
Japanese uses double-byte characters. |
Summary
When using unicode emoji in a markdown block, the output appears to be getting trimmed. I tried to see if this was an issue with goldmark, but I don't think it is. I see that there is goldmark-emoji for :joy:-style emojis, but it didn't look like d2 was using this. Issue appears to be for all layout engines.
Open to working on this if provided some guidance (would be my foray into the codebase).
Resources
D2 test file
— Playground Link
Exported SVG & PNG