Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All newlines in short messages are replaced with whitespaces #100

Closed
fupgang opened this issue Feb 21, 2024 · 10 comments · Fixed by #102
Closed

All newlines in short messages are replaced with whitespaces #100

fupgang opened this issue Feb 21, 2024 · 10 comments · Fixed by #102

Comments

@fupgang
Copy link
Contributor

fupgang commented Feb 21, 2024

Describe the bug

Since the graylog server does not accept empty ("") short messages, GelfEncoder class replaces them with "Empty message replaced by logback-gelf". A blank ("\n") short message is replaced, too.

    protected String normalizeShortMessage(final String shortMessage) {
        // Graylog doesn't like newlines in short messages: https://github.com/Graylog2/graylog2-server/issues/4842
        final String sanitizedShortMessage = sanitizeShortMessage(shortMessage);

        // Short message is mandatory per GELF spec
        if (sanitizedShortMessage.isEmpty()) {
            addWarn("Log message was empty - replaced to prevent Graylog error");
            return "Empty message replaced by logback-gelf";
        }

        return sanitizedShortMessage;
    }

But the current implementation of sanitizeShortMessage replaces every newline with a whitespace.
Thus a non-blank short message with linebreaks gets squashed and is finally less readable in graylog ui.

The above mentioned issue 4842 for graylog2-server states:

Graylog is not accepting a newline "\n" as a valid GELF short_message

It refers to a single newline, so the current implementation maybe too strict.

To me it looks like the GELF Payload Specification does not totally deny newlines for short messages.

What do you think?

To Reproduce

import de.siegmar.logbackgelf.GelfEncoder;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

class GELFEncoderTest {

	@Test
	void newlines() {
		/*
		 * unknown operand:
		 * x=1 § 1
		 *     ^
		 */
		String short_message = "unknown operand:\nx=1 § 1\n    ^";

		MyGelfEncoder gelfEncoder = new MyGelfEncoder();
		String normalized = gelfEncoder.normalizeShortMessage(short_message);

		assertEquals(short_message, normalized);  // fails
	}

	static class MyGelfEncoder extends GelfEncoder {
		// for public access
		@Override
		public String normalizeShortMessage(String shortMessage) {
			return super.normalizeShortMessage(shortMessage);
		}
	}

}

Additional context

  • using logback-gelf 5.0.1, same results with 5.0.0
  • the implementation was different in 4.x
@osiegmar
Copy link
Owner

The current implementation does indeed seem to be doing too much. Would you like to send a pull request that changes that without breaking the class’s interface?

@fupgang
Copy link
Contributor Author

fupgang commented Feb 22, 2024

Yes, I would send a pull request, but I need additional information.

What is the expected behavior of normalizeShortMessage()?

  • It should replace empty strings with a something like "Empty message replaced by logback-gelf".
  • It should replace blank strings alike.
    See GelfCodec in graylog2-server
if (StringUtils.isBlank(shortMessageNode.asText()) && StringUtils.isBlank(messageNode.asText())) {
    throw new IllegalArgumentException(prefix + "has empty mandatory \"short_message\" field.");
}

This could simply be implemented like this:

@Override
public String normalizeShortMessage(String shortMessage) {
    if (shortMessage.isBlank()) {
        addWarn("Log message was blank - replaced to prevent Graylog error");
        return "Blank message replaced by logback-gelf";
    }
    return shortMessage;
}
  • Should it still trim leading and trailing whitespaces like the current implementation does? (Simply return shortMessage.trim();)
    This can be helpful in some situations, but I don't think this is enforced by the GELF specification. Therefore we should not always enforce this. It could be an additional configurable feature (later).

  • It should probably not replace any whitespace char with ' ' like the current implementation does? This is the nature of this issue. But I don't know the initial motivation for implementing it this way.

@fupgang
Copy link
Contributor Author

fupgang commented Feb 22, 2024

  • It should shorten the string to maxShortMessageLength.

@osiegmar
Copy link
Owner

I'd say, it should:

  • Strip leading whitespaces (using String#stripLeading)
  • Shorten the string to maxShortMessageLength, if longer than that
  • Strip trailing whitespaces (using String#stripTrailing)
  • Replace blank strings with something like "Empty message replaced by logback-gelf"

Stripping whitespaces could be optional (configurable).

@fupgang
Copy link
Contributor Author

fupgang commented Mar 1, 2024

I just created the PR #101, which does not include the option to strip inner whitespaces.

But I wonder, if we should rethink the design:

  • The PatternLayout formats the ILoggingEvent to a string.
    • This could include stripping or squashing of whitespaces.
    • This could include trimming to a max length.
    • Some of this is already possible with PatternLayout and an appropriate pattern (trimming like %.-250(%m) ).
    • Allowing custom implementations of Layout provides further flexibility.
  • The GelfEncoder encodes that string and additional information to a json message, conforming to the GELF spec.
    • The already formatted string remains unchanged.
    • Except the case of blank strings, that are replaced with "Empty message replaced by logback-gelf".

@fupgang
Copy link
Contributor Author

fupgang commented Mar 7, 2024

PR #101 is now mergeable.

What do you think of my previous comment?

@osiegmar
Copy link
Owner

osiegmar commented Mar 8, 2024

What do you think of my previous comment?

I think, you're absolutely right! I built the whole sanitising logic based on the misunderstanding of Graylog2/graylog2-server#4842

Formatting the message itself (including shortening, trimming, etc.) should be the sole responsibility of the pattern layout. Only the blank string handling has to be addresses – as you pointed out.

Could you create another PR for that?

@fupgang
Copy link
Contributor Author

fupgang commented Mar 14, 2024

PR #102 is mergeable.

Will there be a patch or minor release soon?

osiegmar pushed a commit that referenced this issue Mar 16, 2024
#102)

encoder uses layout for all message formating, but still prevents blank messages (#100)
@osiegmar osiegmar reopened this Mar 16, 2024
@osiegmar
Copy link
Owner

Will there be a patch or minor release soon?

A major release (due to breaking changes) is planned within the next two weeks.

@osiegmar
Copy link
Owner

osiegmar commented Apr 4, 2024

Just released 6.0.0 - thanks for your contribution!

@osiegmar osiegmar closed this as completed Apr 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants