-
-
Notifications
You must be signed in to change notification settings - Fork 21.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add support for bidi text output #10546
Conversation
cf3636e
to
792cfd6
Compare
@bruvzg Could I ask you to push less often to your PR branch? Normally I'd encourage pushing changes often, but sadly Travis CI is already overloaded and has a huge backlog, and your many updates don't help: https://travis-ci.org/godotengine/godot/pull_requests I really wish Travis CI was clever enough to cancel builds of dangling commits on force pushed branches, but seems they still don't have the feature. |
@akien-mga OK |
51629a1
to
a85f4b6
Compare
I was waiting for this :D. good luck |
5542bb8
to
09aa033
Compare
I did some tests on BidiEditTest project:
|
this is not bidi related, but existing
|
bca2d3d
to
865f429
Compare
scene/resources/paragraph_layout.cpp
Outdated
Vector<String> feature_rng = v_features[i].split(":"); | ||
if (feature_rng.size() == 3) { | ||
if (feature_rng[1].is_valid_integer() && feature_rng[2].is_valid_integer()) { | ||
if ((feature_rng[0].length() == 5) && (feature_rng[0].begins_with("-") || feature_rng[0].begins_with("!"))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hb_feature_from_string()
should correctly handle feature negation and ranges already, see hb-view --help-features
for the supported syntax.
Single line IS a paragraph and DO require layout. There's simply no such thing as plain text. Anyway this arguing is completely pointless, if you think there's reasonable way to implement this in draw_string/get_string_size, just do it in your way. |
No one? Is this why there are close to twenty threads asking for proper Arabic support on Godot? Or why all of the Arab speakers and readers here are pushing for proper BiDi support? Do you think implementing a "simple switch" will also work for Godot's interface in Arabic? No one asked for proper support? Then why even bother having the option to switch the engine language to Arabic? Without proper BiDi support, it's (at best) half-assed, and (at worst) an insult. This thread by Rami Ismail has close to 300 responses showing there is interest in having gamedev tools to support BiDi natively, and this is mainly the Arabic side of things. @reduz Please at least take the time to read this article (which @khaledhosny and I previously shared) by Ramsey Nasser about the complexities of rendering non-Latin text in computing. There is an increase in demand for proper support of many RTL (and U/D) scripts in software as well as in the tools that allow us to build them. And to echo to what's been said before: If it was easy to do, then Mozilla and Google wouldn't have dropped their attempts at implementation for existing solutions like HarfBuzz. Now if you think you can do better, please do. But as others have pointed out: it's not that simple. Because if it was, we wouldn't be seeing this conversation drag on for close to four years. |
I think the no one wanted feature @reduz mentioned is not about BiDi support. |
But as @bruvzg has stated:
This only becomes obvious when you start looking into non-latin scripts. Please read the above linked article on this topic. |
By layout, I mean everything you support in the TextLayout class. This is not needed for single line. You only need to detect that RTL is used and draw that text accordingly. |
@reduz: Arabic support is more than just RTL. Arabic glyphs change shape depending on the position in the string (initial, medial, final), even if we discount diacritics (which we shouldn't). Therefore both RTL and font shaping are absolutely necessary for workable Arabic support even for a single line of text. |
@Zireael07 It's been stated and repeated many times over this thread, and we're still given the same response of "IT CAN'T BE THAT DIFFICULT" even though there have been multiple talks, articles and videos about how difficult it is. @reduz Have you read the article both @khaledhosny and I shared? It goes into detail about the complexities of implementing proper BiDi in any piece of software and gives some good advice as to what could be done. For example: calling on the system implementation of BiDi is one suggestion. |
@Zirael07, @MoustafaC I am not saying shaping should not be supported, only that you can infer how to do it from a single string, without the need of a layout class. |
Guys, you seem to be misunderstanding that I don't want Bidi or that I think no one wants it. This issue is about the technical implementation. I suggested a way that requires no change to most existing controls and also keeps it simple for people working with text lines. Bruvzg is against it because he thinks there is duplicated code, which I told him I don't mind at all. It has nothing to do with the complexity of shaping itself. |
Guys calm down, I want to help but I lack the knowledge in order to do so. However as a developer I want to use Arabic easily and I don't care how did it implemented as long as its easy to use and has good performance. |
Just to agree with above, DirectWrite for example offers nothing other than paragraph layout and offers it as an Inversion-of-Control like API which understands a paragraph and should understand other layout things like images and so if is used on browser layout engine. GUI controls, at least at platforms I had access, support multi-paragraph, and on Android you can pass SpannableString on any UI widget (which even supports text mixed with images), so providing a dedicated single line layout sounds like to me also as an unusual and eager optimization which as you saw above on DirectWrite (and CoreText) doesn't provide any dedicated support to it. If AWT does that, one should consider how archaic it is, SWT which uses native components of each platforms can have multi-paragraph on buttons for example,
$ cat <<EOF | jshell --class-path /usr/lib/java/swt-gtk-4.6.0.jar
import org.eclipse.swt.*
import org.eclipse.swt.graphics.*
import org.eclipse.swt.layout.*
import org.eclipse.swt.widgets.*
var display = new Display()
var shell = new Shell(display)
shell.setLayout(new RowLayout())
var button = new Button(shell, SWT.PUSH)
button.setText("Multi\nline\nlabel")
button.addListener(SWT.Selection, e -> System.exit(0))
button.setImage(display.getSystemImage(SWT.ICON_QUESTION))
shell.pack()
shell.open()
while (!shell.isDisposed())
if (!display.readAndDispatch())
display.sleep()
display.dispose()
EOF |
It's not just about code duplication. I'm against bad, inconvenient, inefficient, misleading, non-expendable API. It's the notion of "simple" text and "simple" controls itself that's wrong. PS: I won't abandon everything that was done here completely, I intend to use this code myself after-all and will eventually make it into module and/or gdnative, probably with one major change inspired by @akien-mga's comment - removing internal border/padding/table and making TextLayout a subclass of Container to accept any other Control/Container as text embedded objects. |
That is exactly where I think you are wrong. You think a good API is a generic one that covers all uses cases by being expandable and modular, while @reduz believes that a good API must first be kept simple for the most common use cases, and then use workaround to solves uncommon problems. This simple design philosophy lead Godot to the point it is now, it should be followed when deciding how to include new features. Drawing a string IS simple, what is complex is drawing string that needs font shaping or bidi. So the aim here is to keep the API simple for "simple to draw" strings, and add workarounds/specific functions for "hard to draw" strings. Something which is not the case in your proposal. |
This is exactly what I mean. You are showing that your problem is not about Bidi, Arabic or Shaping. Your problem is you don't like how Godot draws strings in general. This is what I mean when I state that no one asked for what you want, not Bidi, Shaping or Arabic. This is really a shame, because we are very short on contributors who understand Bidi or Shaping. Many want this feature, you have the ability to make it happen but you don't want to because you want to change something in the engine design that no one asked for and that is unrelated to Bidi, Shaping or Arabic. I was taking a look to minibidi to understand better how shaping works. I think I feel I understand the algorithm a lot more now, so If you feel you can't contribute anything else in this area, I will probably give it a try to implementing this library. It may not be the best solution, but should work for most users. |
@ebraminio Except those UI toolkits are huge and they care more about correctness than performance and resource usage. We don't aim for that, but have enough tools where, if you want something specific, you can do it yourself. Also, it is a false that all controls support spannable strings in toolkits. If you look at Qt or Gtk, most controls don't support it, only a few main ones like Label, Button, etc. The rest (tree, list, combo, popupmenu, etc.) are all single line, which is what we are discussing here. If even dedicated toolkits work like this, why we should go and make everything more complex just for the sake of it? There clearly is not an use case for this here. |
Two different things, shaping can happen without bidi and bidi can happen without shaping. Indic language needs shaping but not bidi necessarily, Hebrew may work without a dedicated library like harfbuzz for shaping but needs bidi more. Arabic needs both of them. How you are going to render Nastaliq script, needed for Urdu people, with some home grown solution... (since then Google has developed a font for Urdu language, and interestingly, Apple has shipped that font for Urdu language)
Right, but DirectWrite, referred on the first part of my comment, both text renderer and shaper of Windows just offers that paragraph API which anyone should use even to get a simple kerning from a font. You see, if one, theoretically, just stick to Microsoft proprietary DirectX family APIs, use DirectWrite for text rendering, and Direct3D for rendering, they won't need to go on these arguments.
I was specifically referring to Android for SpannableString. On Android's Java, SpannableString implements CharSequence (which java.lang.String also implements it) and many Android's Widgets, if not all, can take that as input.
I think open-source projects are about being a little more inclusive if possible and not excluding majority of humans languages that easily. Putting more penalty for "complex" text rendering was the case with many software, even some really performance critical ones (referring to Blink/Chrome and Firefox), and they also find that it is not constructive to have two different path for "Simple" and "Complex" scripts (terms I feel a little discriminating as the native speaker of one those languages) so they've merged those two paths in order to get correct Arabic and also Latin (substituting and kerning, both available on Latin, have a look at Zapfino for example) from one correct and tested text rendering path. |
void Font::draw(RID p_canvas_item, const Point2 &p_pos, const String &p_text, const Color &p_modulate, int p_clip_w, const Color &p_outline_modulate) const {
Vector2 ofs;
int chars_drawn = 0;
bool with_outline = has_outline();
bidi_char* _in = (bidi_char*)memalloc(sizeof(bidi_char) * p_text.length());
bidi_char* _out = (bidi_char*)memalloc(sizeof(bidi_char) * p_text.length());
for (int i = 0; i < p_text.length(); i++) {
_in[i].origwc = p_text[i];
_in[i].wc = p_text[i];
_in[i].index = i;
}
do_bidi(_in, p_text.length());
do_shape(_in, _out, p_text.length());
for (int i = 0; i < p_text.length(); i++) {
int width = get_char_size((CharType)_out[i].wc).width;
if (p_clip_w >= 0 && (ofs.x + width) > p_clip_w)
break; //clip
ofs.x += draw_char(p_canvas_item, p_pos + ofs, (CharType)_out[i].wc, (CharType)_out[i + 1].wc, with_outline ? p_outline_modulate : p_modulate, with_outline);
++chars_drawn;
}
if (has_outline()) {
ofs = Vector2(0, 0);
for (int i = 0; i < chars_drawn; i++) {
ofs.x += draw_char(p_canvas_item, p_pos + ofs, (CharType)_out[i].wc, (CharType)_out[i + 1].wc, p_modulate, false);
}
}
memfree(_in);
memfree(_out);
} For quick and dirty support it's portably OK. Maybe it's actually best solution, |
|
This kind of shaping do_shape offers is only useful for a subset of Arabic script languages and not Urdu, Uighur and .... Also not Indic languages scripts, Khmer, Bengali and many other scripts. If you have issue with ICU and its data particularly, I suggest to replace it. HarfBuzz can work without it using built-in UCDN and feel free to replace your bidi algorithm with something lightweight. I think @khaledhosny knows other alternatives. |
@ebraminio As @bruvzg says, I am more inclined to use a solution that is small and works for most common cases or markets, while a specialized module can be shipped via GDNative that covers the rest, if you really want to support these markets. This is not with the intention to discriminate, but for the sake of being practical,to avoid bloating games if they don't target these markets, or have no intention to support multiple languages (which is the case with most small indie games) |
@bruvzg that sounds good, I can give it a try and will make a modular API so it can be replaced for something more advanced, likely via GDNative. Minibidi will probably work for LineEdit and TextEdit too, but but I think for Label and RichText label it will not be enough, as I am not quite sure what the logic for newlines is.. still, maybe I could just detect that the input is RTL (I saw some code inside minibidi to do this), and simply flip the spanning code? |
@bruvzg you have to think that the intended use for this is game translations. you usually make your game using translation keys (stuff like DIALOG_AT_BATTLE_1 or INVENTORY_MENU_TITLE), and then a big spreadsheet that has the keys translated to text in multiple languages.. so this should definitely work automaticaly, swapping to RTL if detected. |
For multiple lines you should:
For minibidi original order should be preserved in |
First of all, I should thank you again for the efforts you putting on this opensource work and maybe as a outsider it was better to not go for intervening your processes. minibidi feels like kind of hack only to support terminals and even it is not the way macOS's terminal works and it is weird that a game engine wants to go that way.
Do you agree this more like a chicken-egg issue?
Understandable generally, I don't like ICU data overhead. HarfBuzz overhead is totally less than 1MB and forget about Arabic, you can apply it to Latin to get better results (as it applies Kernings and Ligature correctly and it has a super simple API) (disclaimer: I am collaborator on HarfBuzz project but this is not advertising it). If you were targeting only one the platforms, Windows, Linux, macOS, you could use the platform text shaper just reliably so shipping HarfBuzz at least worth the hassle. I am not sure, maybe you can integrate HarfBuzz with minibidi? Not having a correct bidi is not that important to me than not supporting minor languages only HarfBuzz can help to support. At the end, I think in opensource projects we should think more about people count in general than market and $ they can raise, thus, considering all the languages in software design as first class citizen worth the hassle at the end, let alone it is the correct way for implementing text rendering stack anyway. |
I have tested this minibidi stuff and it's utter shit, definitely not worth any effort. At least it will do something, minibidi is really only for fallback monospace fonts.
|
This PR uses ICU and HarfBuzz libraries, to implement rendering of bi-direction text, complex scripts and OpenType typographic features.
Thirdparty libs:
HarfBuzz (full lib except non-relevant backends and UCDN)
ICU4C ("common" library, unicode base data and break iterator data)
Implements:
TextLayout
class to layout paragraph of rich text. Provides bidirectional analysis and reordering, cursor positioning (split cursors for mixed directional text), hit testing, justification, line wrapping.It can be used to layout text and embedded objects (images, spans and tables with nested layouts) in a single flow.
TextLayout
content itemsTextLayoutItem*
:TextLayoutItemText
- text with attributes.TextLayoutItemImage
- embedded image.TextLayoutItemSpan
- embeddedTextLayout
.TextLayoutItemTable
- simple embedded table withTextLayout
s.TextLayoutRect
- control that displaysTextLayout
TextHitInfo
-TextLayout
hit test result, represents a character position, position where new text is inserted into the layout, leading and trailing edges.Changes:
) to do render hex code boxes (
) in place of missing chars. No breaking changes.
Font
/DynamicFont
: adds methods to provide direct access to font glyphs forTextLayout
and embedded 167 byte 'font' (Demo project:
BidiTest.zip
Related:
#3081, #9961, #982