Invalid garbage characters showing around formatted data #1447

Closed
Crunkle opened this issue Feb 18, 2020 · 11 comments
Comments

@Crunkle
Contributor

Crunkle commented Feb 18, 2020

I am seeing invalid characters appear around formatted numbers on (some) systems using multilingual locales. Attached is an example image - the left side is the output I see (en-US) and the right side is from a friend's system (fr-CA):

[image: bad_format]

Where the output is written for all values like this:

logger->info("Tested, OK: {}", -128);

I am using the v1.5.0 tagged commit, but the same issue occurs on the latest branch. As far as I have tested, using a regular printf gives the correct output without any garbage.

@tt4g
Contributor

tt4g commented Feb 19, 2020

My guess is that this is a character encoding or data type problem.

spdlog formats log messages with the fmt library.
fmt chooses how to format each argument based on its type.

Look at the output of the code below, built on my Windows machine.
The console code page is set to Windows-1252, which agrees with ISO-8859-1 for the byte value used here.

#include "spdlog/spdlog.h"
#include "spdlog/sinks/stdout_sinks.h"

#include <locale>
#include <memory>

int main(int, char**)
{
    auto console = std::make_shared<spdlog::sinks::stdout_sink_mt>();
    auto logger =
            std::make_shared<spdlog::logger>("console", console);

    spdlog::register_logger(logger);

    // A default-constructed std::locale is a copy of the global locale.
    std::locale locale;
    logger->info("my locale={}", locale.name());

    // Formatted as a character: byte 192 is printed as a glyph.
    char ch = static_cast<char>(192);
    logger->info("Tested, OK: {}", ch);

    // Formatted as a number: "192" is printed.
    int num = 192;
    logger->info("Tested, OK: {}", num);

    spdlog::shutdown();

    return 0;
}

Console:

$ chcp 1252
$ spdlog_issue_1050.exe
[2020-02-19 10:05:27.146] [console] [info] my locale=C
[2020-02-19 10:05:27.148] [console] [info] Tested, OK: À
[2020-02-19 10:05:27.149] [console] [info] Tested, OK: 192

When 192 is formatted as a char, "À" is printed.
This is because byte 192 corresponds to "À" in the ISO-8859-1 code page.
fmt treats the argument as a character rather than a number, so "192" is never printed.
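
The distinction above can be sketched without spdlog or fmt. The example below uses `std::ostringstream` as a stand-in, on the assumption that it dispatches on the argument type in the same way for `char` and `int`; `render`, `as_number`, and `check_example` are hypothetical helper names, not part of either library.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Render a value through a type-dispatching formatter: the static type of
// the argument, not its numeric value, selects the output.
template <typename T>
std::string render(T value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Widen a char to int via unsigned char so that byte 0xC0 becomes 192.
// Converting a signed char directly to int would give -64 instead.
inline int as_number(char ch) {
    return static_cast<int>(static_cast<unsigned char>(ch));
}

// Self-check that can be called from main().
inline void check_example() {
    const char ch = static_cast<char>(192);
    assert(render(ch).size() == 1);          // one raw byte, shown as a glyph
    assert(render(as_number(ch)) == "192");  // three decimal digits
}
```

If a byte-sized value should be logged as a number, converting it to an integer type before passing it to the logger avoids the glyph output.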

Check the data types that are formatted in your code, and provide code that can reproduce the issue.

@Crunkle
Contributor Author

Crunkle commented Feb 19, 2020

The types that are formatted are std::uint8_t, std::uint32_t, etc., as well as const char * immutable strings. I do not think this is a code page issue: running chcp 437 beforehand does not help, nor does any other code page.

To make things worse, the values that are output change every time the program is run, which leads me to conclude that it must be referencing garbage memory somewhere.

See here for a second run of the same binary file to compare to the first image:

[image: bad_format_2]

This does not sound like the issue you describe to me personally. I am running an identical binary file on my side and have correct formatting. The only alternate idea I have is to run a debugger remotely and see where these values are coming from. I can try directly using the fmt library too?

@tt4g
Contributor

tt4g commented Feb 19, 2020

Can you isolate the problem?
If the problem is reproduced even if you output the log with logger->info(fmt::format("{}", ...));, spdlog is probably not the cause.
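
One way to do that isolation is to compare a formatter against the C library, which the original report says prints correctly. This is a minimal sketch under that assumption; `reference_format` and `find_mismatches` are hypothetical helpers written for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Reference rendering through std::snprintf.
inline std::string reference_format(long long value) {
    char buffer[32];
    const int written = std::snprintf(buffer, sizeof buffer, "%lld", value);
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer, static_cast<std::size_t>(written));
}

// Returns every value for which `formatter` disagrees with the reference.
template <typename Formatter>
std::vector<long long> find_mismatches(Formatter formatter,
                                       const std::vector<long long>& values) {
    std::vector<long long> mismatches;
    for (long long value : values) {
        if (formatter(value) != reference_format(value)) {
            mismatches.push_back(value);
        }
    }
    return mismatches;
}
```

Passing a lambda such as `[](long long v) { return fmt::format("{}", v); }` checks fmt on its own, with no spdlog code involved. An empty result on one machine and a non-empty one on another points at the formatting layer or the toolchain rather than the logger.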

@Crunkle
Contributor Author

Crunkle commented Feb 19, 2020

I have now narrowed it down and reproduced it using a printf(fmt::format...). I will have a final look tomorrow and close this if needed, but it is looking like spdlog may not be the culprit.

@tt4g
Contributor

tt4g commented Feb 19, 2020

If you determine that it is not related to spdlog, close this issue.

@Crunkle
Contributor Author

Crunkle commented Feb 19, 2020

Upon double-checking, fmt is indeed the cause. It uses __builtin_clz to count digits, which is not reliable on Windows/Clang toolchains. I fixed it by manually patching fmt with further exclusions.

See fmtlib/fmt#519.
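
For readers unfamiliar with the technique, here is a rough sketch of how a digit count can be derived from a leading-zero count, next to a fallback that avoids bit tricks. It is an illustration, not fmt's actual source: the function names are made up, and a portable loop stands in for `__builtin_clz` so the example compiles on any toolchain.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for __builtin_clz: leading zero bits of a non-zero value.
inline int leading_zeros32(std::uint32_t n) {
    assert(n != 0);
    int zeros = 0;
    for (std::uint32_t mask = 0x80000000u; (n & mask) == 0; mask >>= 1) {
        ++zeros;
    }
    return zeros;
}

// Digit count from the bit length; 1233 / 4096 approximates log10(2).
inline int count_digits_clz(std::uint32_t n) {
    static const std::uint32_t zero_or_powers_of_10[] = {
        0u, 10u, 100u, 1000u, 10000u, 100000u,
        1000000u, 10000000u, 100000000u, 1000000000u};
    const int bits = 32 - leading_zeros32(n | 1u);  // n | 1 keeps the input non-zero
    const int t = (bits * 1233) >> 12;              // estimate of digits - 1, may be one too high
    return t - (n < zero_or_powers_of_10[t] ? 1 : 0) + 1;
}

// Fallback without bit tricks: peel off four digits per iteration.
inline int count_digits_fallback(std::uint32_t n) {
    int count = 1;
    for (;;) {
        if (n < 10u) return count;
        if (n < 100u) return count + 1;
        if (n < 1000u) return count + 2;
        if (n < 10000u) return count + 3;
        n /= 10000u;
        count += 4;
    }
}
```

The two functions should agree for every input. Excluding a toolchain from the clz path amounts to always taking the fallback, which trades a little speed for independence from the intrinsic.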

@Crunkle Crunkle closed this as completed Feb 19, 2020
@gabime
Owner

gabime commented Feb 19, 2020

Which fmt version are you using? This seems like an old issue that was fixed a long time ago.

@Crunkle
Contributor Author

Crunkle commented Feb 19, 2020

I am using the latest. It looks like some compilers compile __builtin_clz to an lzcnt instruction, which is unreliable on certain older CPUs. This has nothing to do with locales, as I originally thought; my friend's system just had an older Intel CPU without lzcnt support.

Other examples: https://stackoverflow.com/a/40621975.
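
A plausible mechanism, sketched here as a simulation: lzcnt is encoded as bsr with an extra prefix, so a CPU without LZCNT support executes it as bsr and returns the index of the highest set bit instead of the number of leading zeros. The code below feeds both results into a clz-based digit count to show the effect. Whether this is exactly what happened on the affected machine is an assumption, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// What lzcnt returns for a non-zero value: the number of leading zero bits.
inline int lzcnt_result(std::uint32_t n) {
    assert(n != 0);
    int zeros = 0;
    for (std::uint32_t mask = 0x80000000u; (n & mask) == 0; mask >>= 1) {
        ++zeros;
    }
    return zeros;
}

// What the same instruction yields when it runs as bsr: the index of the
// highest set bit, which equals 31 minus the leading-zero count.
inline int bsr_result(std::uint32_t n) {
    return 31 - lzcnt_result(n);
}

// Digit count computed from whatever value the "clz" instruction produced.
// Expects a non-zero n and a clz value in the range 0..31.
inline int digits_from_clz(std::uint32_t n, int clz) {
    static const std::uint32_t zero_or_powers_of_10[] = {
        0u, 10u, 100u, 1000u, 10000u, 100000u,
        1000000u, 10000000u, 100000000u, 1000000000u};
    const int bits = 32 - clz;
    const int t = (bits * 1233) >> 12;
    return t - (n < zero_or_powers_of_10[t] ? 1 : 0) + 1;
}
```

With the correct count, `digits_from_clz(128u, lzcnt_result(128u))` yields 3. With the bsr value, `digits_from_clz(128u, bsr_result(128u))` yields 7, so a formatter would emit seven characters for a three-digit number. Extra characters of that kind are consistent with the garbage in the screenshots.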

@gabime
Owner

gabime commented Feb 19, 2020

@Crunkle As a reference for users who might encounter this issue, could you please provide the spec of the bad build env (compiler version, os version, cpu)?

@Crunkle
Contributor Author

Crunkle commented Feb 19, 2020

Sure, although do note that I was using unstable versions of LLVM and MinGW. Specifically, I am using an older version of Clang 9.0 targeting x86_64-w64-windows-gnu. This was built using a bleeding edge version of MinGW w64 GCC 9.2 (SEH) under Windows.

The affected machine was running a Haswell i7 4790k on the latest Windows 10. I am only able to reproduce it on Haswell and earlier processors.

I do not think that many users will have trouble with this. The toolchain I am using is very unstable and only suitable for development, but I hope this helps if any future issues arise!

@gabime
Owner

gabime commented Feb 19, 2020

Thanks for the info!

bachittle pushed a commit to bachittle/spdlog that referenced this issue Dec 22, 2022