Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cat -n doesn't handle CRLF newline delimiters correctly #6248

Closed
pfmoore opened this issue Apr 17, 2024 · 3 comments · Fixed by #6763
Closed

cat -n doesn't handle CRLF newline delimiters correctly #6248

pfmoore opened this issue Apr 17, 2024 · 3 comments · Fixed by #6763
Labels

Comments

@pfmoore
Copy link

pfmoore commented Apr 17, 2024

This is with the latest version from github, built locally via cargo build --release --features windows

❯ coreutils printf "hello\r\nworld\r\nthere\r\n" | coreutils cat -n
     1  hello     2
world
there

Weirdly, looking at the output via xxd, the CRLF sequence is switched to LFCR...

❯ target\release\coreutils printf "hello\r\nworld\r\nthere\r\n" | target\release\coreutils cat -n | xxd
00000000: 2020 2020 2031 0968 656c 6c6f 2020 2020       1.hello
00000010: 2032 090a 0d77 6f72 6c64 0a0d 7468 6572   2...world..ther
00000020: 650a 0d                                  e..
❯ target\release\coreutils printf "hello\r\nworld\r\nthere\r\n" | xxd
00000000: 6865 6c6c 6f0d 0a77 6f72 6c64 0d0a 7468  hello..world..th
00000010: 6572 650d 0a                             ere..
@OshinoShinobu-Chan
Copy link
Contributor

I'm trying to fix this issue. And I find out that it is caused by src/cat.rs:write_new_line in Line537. The code is as follows:

fn write_new_line<W: Write>(
    writer: &mut W,
    options: &OutputOptions,
    state: &mut OutputState,
    is_interactive: bool,
) -> CatResult<()> {
    if state.skipped_carriage_return && options.show_ends {
        writer.write_all(b"^M")?;
        state.skipped_carriage_return = false;
    }
    if !state.at_line_start || !options.squeeze_blank || !state.one_blank_kept {
        state.one_blank_kept = true;
        if state.at_line_start && options.number == NumberingMode::All {
            write!(writer, "{0:6}\t", state.line_number)?;
            state.line_number += 1;
        }
        writer.write_all(options.end_of_line().as_bytes())?;
        if is_interactive {
            writer.flush()?;
        }
    }
    Ok(())
}

I think there's two problems in this function:

  1. If cat is not in the show_end mode, the \r will be ingnored in function. This will happen every time it meet \r\n.
  2. Suppose we correctly print the \r/^M, the rest of the code will treat the following \n as a single line itself. Here's an example:
./target/debug/coreutils cat -nE ../crlf.txt
     1  Hello^M     2   $
     3  world^M     4   $
     5  there$

I think these problems can be fixed by modiying the branch statement in function write_new_line as follows;

    if state.skipped_carriage_return {
        if options.show_ends {
            writer.write_all(b"^M")?;
        } else {
            writer.write_all(b"\r")?;
        }
        state.skipped_carriage_return = false;

        write_end_of_line(writer, options.end_of_line().as_bytes(), is_interactive)?;
        return Ok(());
    }

I'll be glad to submit a PR if you agree to my proposal.

Besides, I think tests for this kind of case should be added. But I have no idea where should I add the new test. I'm new in here so please give some guidance, thanks.

@sylvestre
Copy link
Contributor

Sure, don't hesitate to submit the PR

@cakebaker
Copy link
Contributor

Besides, I think tests for this kind of case should be added. But I have no idea where should I add the new test.

You can add the test to /tests/by-util/test_cat.rs.

OshinoShinobu-Chan added a commit to OshinoShinobu-Chan/coreutils that referenced this issue Oct 2, 2024
OshinoShinobu-Chan added a commit to OshinoShinobu-Chan/coreutils that referenced this issue Oct 2, 2024
@cakebaker cakebaker linked a pull request Oct 2, 2024 that will close this issue
cakebaker pushed a commit that referenced this issue Oct 2, 2024
* fix issue #6248

* add test to cat for the case of issue #6248
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants