Skip to content

Commit

Permalink
Handle 8-bit characters under LOCALE=C
Browse files Browse the repository at this point in the history
When the character set is specified as ASCII (by setting the locale to
`C`), we should handle data outside the 7-bit range gracefully by simply
copying it, even if it is technically no longer ASCII.

Cygwin, however, wants to be a lot stricter than that.

Let's be more lenient in MSYS2 by making the strict 7-bit only handling
contingent on the `STRICTLY_7BIT_ASCII` macro (which we never set
because we don't want it).

This addresses the expectations of two of Git's test cases:
t7400.108(submodule with UTF-8 name) and t9300.193(X: handling
encoding).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
  • Loading branch information
dscho committed Dec 18, 2023
1 parent 0338b1a commit 693c449
Show file tree
Hide file tree
Showing 3 changed files with 6 additions and 2 deletions.
2 changes: 1 addition & 1 deletion newlib/libc/stdlib/mbtowc_r.c
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ __ascii_mbtowc (struct _reent *r,
if (n == 0)
return -2;

#ifdef __CYGWIN__
#ifdef STRICTLY_7BIT_ASCII
if ((wchar_t)*t >= 0x80)
{
_REENT_ERRNO(r) = EILSEQ;
Expand Down
2 changes: 1 addition & 1 deletion newlib/libc/stdlib/wctomb_r.c
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ __ascii_wctomb (struct _reent *r,
if (s == NULL)
return 0;

#ifdef __CYGWIN__
#ifdef STRICTLY_7BIT_ASCII
if ((size_t)wchar >= 0x80)
#else
if ((size_t)wchar >= 0x100)
Expand Down
4 changes: 4 additions & 0 deletions winsup/cygwin/strfuncs.cc
Original file line number Diff line number Diff line change
Expand Up @@ -616,7 +616,11 @@ _sys_mbstowcs (mbtowc_p f_mbtowc, wchar_t *dst, size_t dlen, const char *src,
to store them in a symmetric way. */
bytes = 1;
if (dst)
#ifdef STRICTLY_7BIT_ASCII
*ptr = L'\xf000' | *pmbs;
#else
*ptr = *pmbs;
#endif
memset (&ps, 0, sizeof ps);
}

Expand Down

0 comments on commit 693c449

Please sign in to comment.