Crash when first line is empty and -L is specified with a one character separator. #38

Closed
ghuls opened this issue Sep 21, 2021 · 5 comments · Fixed by #39

Comments

@ghuls
Contributor

ghuls commented Sep 21, 2021

Crash when first line is empty and -L is specified with a one character separator.

# File with empty first line.
$ printf '\n1\t2\t3\n4\t5\t6\n'

1       2       3
4       5       6

# Get first 2 columns with regex "\t".
$ printf '\n1\t2\t3\n4\t5\t6\n' | hck -d '\t' -f 1,2 -

1       2
4       5

# Get first 2 columns with the regex delimiter given as a real TAB character (created by bash).
$ printf '\n1\t2\t3\n4\t5\t6\n' | hck -d $'\t' -f 1,2 -

1       2
4       5


# Get first 2 columns with literal separator option using a real TAB character (created by bash).
$ printf '\n1\t2\t3\n4\t5\t6\n' | hck -L -d $'\t' -f 1,2 -
thread 'main' panicked at 'attempted to index slice up to maximum usize', /software/hck/src/lib/core.rs:528:59
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

# Will print all columns as "\t" is not treated as a regex.
$ printf '\n1\t2\t3\n4\t5\t6\n' | hck -L -d '\t' -f 1,2 -

1       2       3
4       5       6

# With one space as delimiter, we get a crash too when using the literal separator option.
$ printf '\n1\t2\t3\n4\t5\t6\n' | hck -L -d ' ' -f 1,2 -
thread 'main' panicked at 'attempted to index slice up to maximum usize', /software/hck/src/lib/core.rs:528:59
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

# With two spaces as delimiter, it works.
$ printf '\n1\t2\t3\n4\t5\t6\n' | hck -L -d '  ' -f 1,2 -

1       2       3
4       5       6
@ghuls
Contributor Author

ghuls commented Sep 21, 2021

Something like this will probably fix it.

diff --git a/src/lib/core.rs b/src/lib/core.rs
index dc45f43..476dfdc 100644
--- a/src/lib/core.rs
+++ b/src/lib/core.rs
@@ -520,7 +520,11 @@ where
                     line.push((start, index - 1));
                     start = index + 1;
                 } else if bytes[index] == newline {
-                    line.push((start, index - 1));
+                    if index != 0 {
+                        line.push((start, index - 1));
+                    } else {
+                        line.push((0, 0));
+                    }
                     let items = self.fields.iter().flat_map(|f| {
                         let slice = line
                             .get(f.low..=min(f.high, line.len().saturating_sub(1)))

For performance reasons, though, it might be better to check whether the first byte is a newline character before entering the `for index in iter` loop, so that this condition does not have to be evaluated on every iteration.
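As an illustration of the reasoning above: the panic message appears consistent with `index - 1` wrapping around to `usize::MAX` when a newline sits at index 0 (in a release build, where overflow checks are off by default) and that value then being used as an inclusive slice end. The sketch below is not hck's actual code. `field_ranges` and `field` are hypothetical helpers, and it swaps in a different technique from the diff: fields are recorded as half-open `(start, end)` ranges, so no `index - 1` is computed and no per-iteration special case is needed.

```rust
/// Split `bytes` into lines on `newline` and each line into fields on a
/// single-byte `sep`, recording every field as a half-open `(start, end)`
/// byte range. Hypothetical helper for illustration, not hck's code.
fn field_ranges(bytes: &[u8], sep: u8, newline: u8) -> Vec<Vec<(usize, usize)>> {
    let mut lines = Vec::new();
    let mut line = Vec::new();
    let mut start = 0;
    for (index, &byte) in bytes.iter().enumerate() {
        if byte == sep {
            // `index` is the exclusive end, so an empty field is simply
            // `(start, start)` and nothing is subtracted.
            line.push((start, index));
            start = index + 1;
        } else if byte == newline {
            line.push((start, index));
            lines.push(std::mem::take(&mut line));
            start = index + 1;
        }
    }
    // Trailing data without a final newline still forms a line.
    if start < bytes.len() || !line.is_empty() {
        line.push((start, bytes.len()));
        lines.push(line);
    }
    lines
}

/// Borrow the bytes of one field from a half-open range (hypothetical helper).
fn field(bytes: &[u8], range: (usize, usize)) -> &[u8] {
    &bytes[range.0..range.1]
}
```

For the input from the report (`"\n1\t2\t3\n4\t5\t6\n"` with a TAB separator), this yields three lines, the first holding a single empty field `(0, 0)`. Where inclusive ranges have to be kept, `index.checked_sub(1)` is another way to make the index-0 case explicit instead of relying on wrapping arithmetic.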

@sstadick
Owner

sstadick commented Sep 21, 2021

I closed it too soon. v0.6.5-alpha is adding an extra line when there is a blank line at the start.

@sstadick sstadick reopened this Sep 21, 2021
@sstadick
Owner

Okay, see the v0.6.5 release, which should be building now.

@sstadick
Owner

@ghuls also, again, thanks for taking the time to make a detailed issue! You make these really easy to fix 👍

@ghuls
Contributor Author

ghuls commented Sep 21, 2021

Thanks for the fast fix.
