Output delimiter should be same as input #46

Closed
mbhall88 opened this issue Nov 17, 2021 · 8 comments

Comments

@mbhall88

mbhall88 commented Nov 17, 2021

cut automatically uses the input delimiter as the output delimiter. With hck I find it quite annoying to have to specify the delimiter twice. Would you consider changing the default behaviour to match cut? I appreciate this is a breaking change, but it might be better to do this before hitting v1.
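
For reference, the cut behaviour described here can be checked with standard tools. This is a minimal sketch; the sample line and the variable names are made up for illustration.

```shell
# cut joins the selected fields with the same delimiter it split on.
line='root:x:0:'
cut_out=$(printf '%s\n' "$line" | cut -d ':' -f 1,3)
echo "$cut_out"   # root:0
```

The colon passed to -d is the only delimiter mentioned, and it appears again in the output.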

@sstadick
Owner

Hi! I would lean toward no because a regex can be used as the delimiter and the behavior in that scenario could be surprising. But I'll think about it a bit more and leave this issue open for now.

The other bit of my reasoning is that, while iterating on a one-liner, tabs make the output easier to read. Then, once you've settled on the command, it's not too onerous to switch to a different output delimiter.

@sstadick
Owner

@ghuls, as the only other active user of hck that I know of, do you have any thoughts on this?

@ghuls
Contributor

ghuls commented Nov 17, 2021

@sstadick I generally work with TAB delimited files and not often with files with other delimiters. I use it for processing of multi gigabyte files and not for small text files, so typing a few extra characters does not bother me. TABs by default make everything more readable imho. awk also uses \s+ as field separator and as output separator.

The need to add a dollar sign before '\t' and '\n' when they are used as the output separator annoys me more. It would be nice if those were recognized by default.

❯  hck -d ':' -D $'\t' /etc/group | head
root	x	0	
bin	x	1	
daemon	x	2	
sys	x	3	
adm	x	4	
tty	x	5	
disk	x	6	
lp	x	7	
mem	x	8	
kmem	x	9	

❯  hck -d ':' -D '\t' /etc/group | head
root\tx\t0\t
bin\tx\t1\t
daemon\tx\t2\t
sys\tx\t3\t
adm\tx\t4\t
tty\tx\t5\t
disk\tx\t6\t
lp\tx\t7\t
mem\tx\t8\t
kmem\tx\t9\t

❯  hck -d ':' -D '\n' /etc/group | head
root\nx\n0\n
bin\nx\n1\n
daemon\nx\n2\n
sys\nx\n3\n
adm\nx\n4\n
tty\nx\n5\n
disk\nx\n6\n
lp\nx\n7\n
mem\nx\n8\n
kmem\nx\n9\n

❯  hck -d ':' -D $'\n' /etc/group | head
root
x
0

bin
x
1

daemon
x
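
The difference in the transcript above comes from shell quoting: in single quotes the shell passes a backslash and a letter through unchanged, while bash's $'...' form converts the escape before the program sees it. The sketch below shows this with plain bash and printf. It does not call hck, and the variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Single quotes keep the backslash and the letter as two separate characters.
literal='\t'
# ANSI-C quoting ($'...') makes the shell convert the escape to one tab character.
ansi=$'\t'

echo "${#literal}"   # 2
echo "${#ansi}"      # 1

# printf's %b expands backslash escapes in its argument; this is the kind of
# unescaping a tool would have to do itself to accept '\t' directly.
unescaped=$(printf '%b' "$literal")
[ "$unescaped" = "$ansi" ] && echo "same"   # same
```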

@sstadick
Owner

That's a good point, the $ is pretty annoying.

@sstadick
Owner

#47 will fix the unescaping for -d and -D.

@mbhall88 I've compromised and added a -I option that will reuse the input delimiter as the output delimiter if the input delimiter is a literal and no other output delimiter is specified. So to replicate cut you could now do something like: hck -LId' ' <file> and the space will be reused for the output.
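
The rule described for -I can be sketched as a small shell function. This is a hypothetical illustration, not hck's actual code: the function name and its arguments are invented here, and the tab fallback is an assumption based on the thread's description of tabs as the default output.

```shell
# Hypothetical helper (not part of hck): pick the output delimiter when -I is given.
# $1 = input delimiter, $2 = "literal" or "regex", $3 = explicit output delimiter (may be empty)
choose_output_delim() {
  in_delim=$1 kind=$2 out_delim=$3
  if [ -n "$out_delim" ]; then
    printf '%s' "$out_delim"   # an explicitly given output delimiter wins
  elif [ "$kind" = literal ]; then
    printf '%s' "$in_delim"    # reuse the literal input delimiter
  else
    printf '\t'                # regex input: assumed fallback to the tab default
  fi
}

# Show the chosen delimiter between brackets so whitespace stays visible.
printf '[%s]\n' "$(choose_output_delim ' ' literal '')"   # [ ]
printf '[%s]\n' "$(choose_output_delim ':' literal ',')"  # [,]
```

A regex input delimiter is never reused, which matches the concern raised earlier in the thread that reusing a regex as the output delimiter could be surprising.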

@ghuls
Contributor

ghuls commented Nov 23, 2021

@sstadick Thanks for fixing the unescaping. It is much more convenient to use now.

@sstadick
Owner

@ghuls see #49 / v0.7.1 for an incoming fix that applies the same unescaping to the header line parsing.

@sstadick
Owner

sstadick commented Jan 28, 2022

Closing for now, please reopen if needed.
