IRC to Discord: ISO 8859-1 encoded umlauts (at least ä, ö) are corrupted #237
Comments
Any plans to do something about this? It looks quite nasty, and it's difficult to get people to change their encodings, since most IRC clients can read both ISO 8859-1 and UTF-8 messages in the same channel by heuristically autodetecting which encoding is used. The usual reply is: "It's worked for years, you are the only one who complains."
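The autodetection heuristic mentioned above (accept the bytes as UTF-8 when they are valid UTF-8, otherwise assume ISO 8859-1) can be sketched roughly as follows. This only illustrates what such clients are described as doing; it is not code from this project or its IRC library, and `decodeIrcLine` is a hypothetical name.

```javascript
// Hypothetical helper; not part of the bot or its IRC library.
function decodeIrcLine(bytes) {
  try {
    // fatal: true makes the decoder throw on byte sequences that are not valid UTF-8.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (err) {
    // Fallback: treat every byte as one ISO 8859-1 character.
    return Buffer.from(bytes).toString('latin1');
  }
}

// The same text sent by an ISO 8859-1 client and by a UTF-8 client.
const fromOldClient = Buffer.from('l\u00e4htis l\u00e4mmitt\u00e4\u00e4n', 'latin1');
const fromNewClient = Buffer.from('l\u00e4htis l\u00e4mmitt\u00e4\u00e4n', 'utf8');

console.log(decodeIrcLine(fromOldClient));
console.log(decodeIrcLine(fromNewClient));
```

The heuristic works in practice because ISO 8859-1 text containing accented letters is rarely valid UTF-8 by accident, but it is still a guess and can misfire on short or unusual byte sequences.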
I'm not sure where in the process the issue arises, but it's plausible that it's in our IRC library, which unfortunately doesn't appear to be very well maintained. If so, it would be an upstream issue and it may be a while before it's fixed. It might not be that it's a problem with
I don't suppose you could give me a server this happens on, or a client that would produce this problem? It might help track it down. It also seems like something to be fixed in the upstream library, irc-upd, which I've now taken over.
Sure I can. I'll need to ask around a bit first, but I suspect mIRC prior to 7.0 will work right off the bat. I'll update with a reproducible test case ASAP.
It looks like the upstream library actually converts these (and displays them) just fine if you add Could you try that and get back?
Ubuntu Server 14.04:
Bot is running with utf-8 enabled under ircOptions. Will report if I still see errors.
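For readers following along, a setup like the one described might look something like the sketch below. Only `ircOptions.encoding` set to `utf-8` is taken from this thread; the other field names reflect my recollection of the bot's README and every value is a placeholder, so check them against the current documentation.

```javascript
// Sketch of a bot config with the encoding option set.
// Only ircOptions.encoding comes from this thread; everything else is a placeholder.
const config = {
  nickname: 'example-bot',
  server: 'irc.example.org',
  discordToken: 'PLACEHOLDER_TOKEN',
  channelMapping: { '#discord-channel': '#irc-channel' },
  ircOptions: {
    encoding: 'utf-8',
  },
};

// Print as JSON, the shape a JSON config file would take.
console.log(JSON.stringify(config, null, 2));
```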
Fixes the issue. Should be the default setting. (Edit: and I'm tired atm so I go with few words.)
Fixed. The three affected users my channel had are no longer pushing � characters into Discord.
If it's the default setting, it seems likely to cause problems given how often it fails to install. I'll suggest a note in the README instead, so users can set it up themselves? (
The problem hits all non-English-speaking channels, so for them there is no option: they must have icu anyway. So instead I'd work on the README (or add separate documentation) so that it has good instructions on how to check whether icu is installed and how to install it if it is missing. Also, the software could check for itself whether icu exists and show a message with a link to the documentation, and set the default encoding value to
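The self-check suggested above could be sketched along these lines. The module names are an assumption about which optional converter packages the IRC library looks for, and both function names are hypothetical; the point is only that a missing optional dependency can be detected at startup and reported with a pointer to the docs.

```javascript
// Hypothetical startup check. The module names are an ASSUMPTION about
// which optional converter packages the IRC library tries to load.
const OPTIONAL_CONVERTER_MODULES = ['iconv', 'node-icu-charset-detector'];

// True if `name` can be resolved from this process (the module is not loaded).
function isModuleInstalled(name) {
  try {
    require.resolve(name);
    return true;
  } catch (err) {
    return false;
  }
}

// Returns the subset of `moduleNames` that cannot be resolved.
function missingConverterModules(moduleNames = OPTIONAL_CONVERTER_MODULES) {
  return moduleNames.filter((name) => !isModuleInstalled(name));
}

const missing = missingConverterModules();
if (missing.length > 0) {
  console.warn(
    `Encoding conversion unavailable; missing optional modules: ${missing.join(', ')}. ` +
      'See the README for install instructions.'
  );
}
```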
It only affects those that don't use a Unicode encoding, so it shouldn't (I think) affect e.g. Korean speakers. I was going to add a section to the README to check out the
Yes,
While slightly off-topic, here is an example of how much friction there still is in moving everything to UTF-8 on IRC: irssi/irssi#671. I was slightly surprised to see that there really is a Japanese server that serves everything in ISO-2022-JP, including channels and nicknames.
* Add a note about how to install optional charset converter dependencies
  Fixes part of #252 and #237.
* Default ircOptions.encoding to utf-8 if node-irc can convert encodings
  Also warn when started if the IRC library cannot convert between encodings, in case users rely on this behavior.
* Improve config check and warning message around encodings
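The defaulting behavior described in these commits might be sketched as below. This is not the project's actual code: `withEncodingDefault` is a hypothetical helper, and whether the library can convert encodings is passed in as a plain boolean rather than queried from the library.

```javascript
// Hypothetical helper mirroring the commit description; not the project's real code.
// `canConvert` states whether the IRC library is able to convert encodings.
function withEncodingDefault(ircOptions = {}, canConvert, warn = console.warn) {
  if (canConvert) {
    // Default to utf-8, but let an explicitly configured encoding win.
    return { encoding: 'utf-8', ...ircOptions };
  }
  // Conversion is unavailable: leave the options alone and tell the user.
  warn(
    'The IRC library cannot convert between encodings, so non-UTF-8 text ' +
      'may be displayed incorrectly. See the README for the optional dependencies.'
  );
  return { ...ircOptions };
}

console.log(withEncodingDefault({ port: 6697 }, true));
console.log(withEncodingDefault({ port: 6697 }, false));
```

Keeping the capability check outside the helper makes the defaulting logic easy to test without the optional dependencies installed.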
Some ISO 8859-1 output from old IRC clients gets pushed into Discord like this:
[3:44 PM] BOT IRC: <User> ny l�htis l�mmitt��n saunan
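The garbled line above is what you would expect when ISO 8859-1 bytes are read as UTF-8. A minimal Node.js sketch reproducing it is shown here, assuming the original words were "lähtis lämmittään", which is my guess from the garbled output.

```javascript
// "lähtis lämmittään" as ISO 8859-1 bytes: each ä is the single byte 0xE4.
const latin1Bytes = Buffer.from('l\u00e4htis l\u00e4mmitt\u00e4\u00e4n', 'latin1');

// Misread as UTF-8: 0xE4 starts a multi-byte sequence that is never
// completed, so each occurrence is replaced with U+FFFD.
const misread = latin1Bytes.toString('utf8');
console.log(misread); // l�htis l�mmitt��n

// Read with the encoding the sender actually used.
const correct = latin1Bytes.toString('latin1');
console.log(correct); // lähtis lämmittään
```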