Should STDOUT always set flush_on_newline by default? #7431

Closed
michael-selig opened this issue Feb 14, 2019 · 31 comments · Fixed by #7470 or #8935

Comments

@michael-selig

I am just starting with Crystal, being a Ruby programmer for many years. I am impressed with its performance, but playing around, I found one thing that badly affects performance when writing to standard output: that flush_on_newline is always set by default on STDOUT. My trivial test:

s = "a" * 60

10000000.times do
  puts s
end

takes real: 10.4s user: 2.4s sys: 8.0s when its output is redirected to a file. (FYI the same thing run by the Ruby 2.5.1p57 interpreter takes real: 3.8s user: 3.5s sys: 0.3s)

Yet if I add the line:

STDOUT.flush_on_newline = false

at the top, the time drops dramatically to: real: 0.5s user: 0.3s sys: 0.2s, which is more what I was expecting.

I suggest that the definition of STDOUT in kernel.cr be changed to:

STDOUT = IO::FileDescriptor.new(1).tap { |f| f.flush_on_newline = true if f.tty? }

so that flush_on_newline is only set when STDOUT is a TTY. I am fairly sure that this is how Unix & Linux C programs normally behave. Typically I believe that standard error is completely unbuffered, but appears to be line buffered in Crystal. This, however, probably isn't a big deal.

@ysbaddaden
Contributor

STDOUT is more and more used to log things (systemd and Docker expect that). If we don't flush on newline, we can easily end up with out-of-sync logs when not logging furiously, or even truncated logs if the Crystal program crashes.

I believe the flush on newline default is good, even when it's not a tty. One can always fix this for some cases with STDOUT.flush_on_newline = false.

If we change the default to depend on whether it's a TTY, then we should also fix Logger to flush after each message.
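
For illustration, "flush after each message" could be as small as this (just a sketch, not the actual Logger code):

# Sketch only: write one log line and flush it immediately, so systemd/Docker
# see it right away even when STDOUT is otherwise buffered.
def log_line(io : IO, message : String)
  io.puts message
  io.flush
end

log_line(STDOUT, "service started")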

@michael-selig
Author

That is true, but I would have thought that developers of loggers would have to be conscious of the fact that each log message needs to be flushed. Should a regular person writing a program that writes to STDOUT (like I did) have to worry about whether STDOUT needs to be set to flush or not to get decent performance (one of Crystal's claims to fame)? Also consider code like:

log_file = File.new("/var/spool/myapp.log", "a")

which returns a file with flush_on_newline off, requiring a logger developer to either turn it on, or flush themselves after writing. Why should STDOUT be different?

So I would argue that the logger should be the one with the onus of "fixing" the issue by turning on flush_on_newline.

Also having STDOUT set to flush by default seems to run contrary to all other languages I know, including Ruby.

@asterite
Member

asterite commented Feb 14, 2019

I agree with @michael-selig

I originally added flush_on_newline = true to STDOUT and STDERR because I noticed that when you do STDOUT.puts in Ruby then you would immediately see the output in the console. But I didn't know that if the output is not a TTY then it wouldn't flush right away.

The interesting thing is that if you check the value of STDOUT.sync it's always false, regardless of whether it's a TTY or not. So I guess Ruby flushes whenever it sees that the device is a TTY (I see they store this info in the file pointer itself at the beginning of the program, at least from some C code I've seen).

We should probably do the same. And we could also remove flush_on_newline: we should always flush on TTY and only flush when a buffer page is full (similar to Ruby). This will also simplify our code a bit.

To do that, we can probably check for TTY at the beginning of the program (which we do now) and set sync to true/false (or use another flag, like Ruby, but that seems a bit magical to me; plus I'd like to see that if it's a TTY sync becomes true, and if not it becomes false).

Then loggers should flush after each line. For loggers it makes sense.
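
In code, the proposed default boils down to something like this at program start (a sketch of the idea, not the actual kernel.cr change):

# Sketch: decide buffering once at startup. On a terminal every write is
# flushed; on a pipe or file, output stays buffered.
STDOUT.sync = STDOUT.tty?
STDERR.sync = STDERR.tty?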

@asterite
Member

(side topic: should we rename topic:stdlib:files to topic:stdlib:io?)

@SlayerShadow

I was surprised when Kemal's logs suddenly stopped after an ordinary build. 😄
Interesting point: changing the default behaviour forces all third-party logging systems to be patched (to work properly in Docker, for example).

@asterite
Member

Logger does flush, so that might not be the issue.

@asterite asterite reopened this Aug 30, 2019
@asterite
Member

I think this might have been a mistake.

If you run a Crystal program through Process.run(...) and want to redirect its output to a file or something else and have a live stream, it won't work unless you change that Crystal program to do STDOUT.sync = true. It's very unintuitive.

I think I prefer a slower output that one can still make faster by doing STDOUT.sync = STDOUT.tty? but have the default work in an intuitive way.

@asterite
Member

Should a regular person writing a program that writes to STDOUT (like I did) have to worry about whether STDOUT needs to be set to flush or not to get decent performance (one of Crystal's claims to fame)?

Yes. Crystal's philosophy is that you can get really good performance if you want. But we prefer intuitive behavior over performance when possible, and here we broke that rule.

@michael-selig
Author

michael-selig commented Aug 30, 2019

How default STDIN/STDOUT/STDERR buffering should work has always been controversial - POSIX only says that it is "implementation dependent".

we prefer intuitive behavior over performance when possible

I guess this comes down to what you interpret "intuitive" to mean. A C programmer would expect the default buffering behaviour to work like the C library, which (on Linux anyway) is:

- stdin: line buffered
- stdout: buffered, unless it is a TTY (interactive device), in which case it is line buffered
- stderr: unbuffered

(https://eklitzke.org/stdout-buffering has a good explanation.)

I was coming to Crystal with this mindset, so "intuitive" to me means this behaviour. Of course to others it may mean the opposite. In my experience most programming languages buffer STDOUT when it is directed to a file.

@asterite
Member

Interesting article, thank you! I guess I don't know what's best to do.

@sam0x17
Contributor

sam0x17 commented Aug 31, 2019

I'm not super familiar with the default behaviors for this in different environments, but I've always wanted it to be time based, as in flush if data sits in the buffer unflushed for more than X milliseconds. This could be implemented by having a timestamp for the last flush and knowledge that there is data in the buffer.
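
A rough sketch of one way to approximate that (the background fiber and the 50ms interval are assumptions, not existing stdlib behavior):

# Background fiber: if more than FLUSH_AFTER has passed since the last flush,
# flush STDOUT (a no-op when the buffer is already empty).
FLUSH_AFTER = 50.milliseconds

spawn do
  last_flush = Time.monotonic
  loop do
    sleep 10.milliseconds
    if Time.monotonic - last_flush >= FLUSH_AFTER
      STDOUT.flush
      last_flush = Time.monotonic
    end
  end
end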

@michael-selig
Author

@asterite I noticed that unlike Crystal, Ruby appears to flush STDOUT & STDERR before forking a new process. E.g. try running

STDOUT << "1 "
system "echo -n 2"
STDOUT << " 3\n"

redirected to a file using both Ruby & Crystal (ensuring STDOUT is buffered). I think Ruby's behaviour addresses your concern & is more intuitive.

@asterite
Member

asterite commented Sep 2, 2019

@michael-selig No, it doesn't solve my problem.

The problem is this. You have one program that outputs to STDOUT. For example:

# subprogram.cr
100.times do |i|
  puts i
  sleep 1
end

Then you have another program that streams the output of that program to a file:

# program.cr
file = File.open("output.txt", "w")
Process.run("subprogram", output: file)

Now you execute program.cr and want to see if it works. You do tail -f output.txt and you don't get any output. You only get output when the buffer is flushed, which might take a while.

How do you fix it? There's no way to fix this without recompiling subprogram.cr to do STDOUT.sync = true or by making it configurable.
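
The "making it configurable" part could be something as small as this inside subprogram.cr (the environment variable name is made up, just a sketch):

# Hypothetical opt-in: let a caller request unbuffered output at run time.
STDOUT.sync = true if ENV["CRYSTAL_UNBUFFERED"]? == "1"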

I just tried this program in Go:

package main

import (
	"fmt"
)

func main() {
	for i := 0; i < 100; i++ {
		fmt.Println(i)
	}
}

When you run it with a TTY it takes 2.5 seconds. When you run it and redirect to a file it takes 4 seconds (I don't know why). And I can tail -f just fine the file that I'm piping to.

I also tried putting a sleep in between, using Println or Print, and I can always tail the program's output.

It seems Go cares more about being able to tail a file and get immediate output than to win a few extra seconds.

I think that if you do care about those extra seconds you can always enable C's behaviour by doing STDOUT.sync = STDOUT.tty?. It's just one line that you need to add to your program. For any other file you'll get expected (but possibly slower) behavior.

I think we should revert the PR that changed this.

@asterite
Member

asterite commented Sep 2, 2019

A bit more: now I understand that fmt.Println in Go just outputs to os.Stdout and that's unbuffered. You can also do fmt.Fprintln(os.Stdout, "...") with the same result. If you want it buffered then you must create a buffered wrapper around that and use that.

Well, in our case we can make STDOUT.sync default to true and let users change it if they want to.

@michael-selig
Author

@asterite Sorry, I did misunderstand your problem, which I thought was to show that the output of the subprogram could be written to the file before stuff written previously by the calling program. This is what my example tried to show.

I have never used Go, so I can't really comment on your example. However, I tested the code in your "subprogram.cr" in Ruby, and equivalent code in Python and C (using buffered STDIO), and all 3 behave like Crystal does using buffered output - nothing is written to the output file until the program finishes.

Well, in our case we can make STDOUT.sync default to true and let users change it if they want to.

Or we can do the opposite :-)

The only way I can think of to make it "intuitive" to everyone and also maintain good performance is to implement what @sam0x17 suggested: when data is written into an empty buffer, set a timer to flush the buffer. It would also need to force a flush of the buffer before passing the file descriptor to another process. However, I am not sure if this "automatic timed flushing" is really worth the effort of implementing, given that I am not aware of other languages that do it.

The usage cases where buffered behaviour is unintuitive seem to be things like logging or writing status to STDOUT in long-running programs, or using STDOUT for debug messages, where the order and timing of those messages is important. In short-running programs that write data (text, HTML, XML, JSON etc) to STDOUT, speed should win.

So assuming that "automatic timed flushing" isn't implemented, for which types of programs should coders have to worry about buffering? I would suggest the former.

@RX14
Contributor

RX14 commented Sep 9, 2019

At the very least flush_on_newline should be true for STDOUT and STDERR; sync would be fine too. Especially for STDERR.

I've been bitten by this, when writing a program which outputs newline-delimited JSON to the parent process via stdout. Behaviour shouldn't differ between TTY and pipe. It makes debugging a nightmare.

If your application depends on stdout performance (very rare) it can always change the mode.

@michael-selig
Author

Behaviour shouldn't differ between TTY and pipe. It makes debugging a nightmare.

Unix & Linux programmers have come to expect that STDOUT works like that in many if not most languages. Also when you want to write a large amount of data to STDOUT via a pipe, why should performance suffer? Sure, if you're just writing a small amount of JSON or whatever to STDOUT you won't notice the difference.

If your application depends on stdout performance (very rare) it can always change the mode.

Not sure if I agree that this is rare. I would argue that if you want to debug you can always change the mode. That way you get performance by default. If it's the other way around, programmers who aren't aware of the need to set buffering on STDOUT will get poorer performance and likely not even be aware of it. At least when you are debugging, you'll soon find out that it is being buffered!

@RX14
Contributor

RX14 commented Sep 17, 2019

Poor performance is far, far, far less frustrating to a user than their application changing semantics for unclear reasons. The fact that some people are used to it already is not a good argument for this.

@asterite
Member

The fact that some people are used to it already is not a good argument for this

Exactly my thoughts!

However, I tried the original program with STDOUT.sync = true and it takes like 1 minute vs half a second to finish. I think flush_on_newline can improve the situation a bit, making it take 10 seconds instead of 1 minute, but still a lot more than half a second... but maybe it's acceptable for outputting 10000000 times "a" * 60. And one can also make it faster by just doing STDOUT.flush_on_newline = false.

So I'm thinking maybe it's better to revert the whole thing to how it was before this entire discussion started. It was actually pretty good back then.

@asterite
Member

But I'll try to discuss this today with Juan and Brian if I have time.

@asterite
Member

We talked about this today.

On one hand we think not flushing on newline when writing to a file is good because you just want to write it as fast as possible. This matches what C, Ruby, Python and current Crystal do.

On the other hand we'd like there to be output from time to time so a user watching that file, for example with tail -f, or via another process, can see (not necessarily real-time) feedback.

So maybe the idea of having it fully buffered but flushing from time to time, say, once per second, could be ideal: prevents too many syscalls but you get feedback.

But so far we didn't agree on any approach, plus we don't really know how to do the "flush every X time unit".

Right now the behavior matches C, Python and Ruby so it's good enough. But let's leave this open and come back to this later.

@michael-selig
Author

Tough decision over 1 line of code!

One more suggestion if I may:
If we are sticking with buffering, flush STDOUT before forking another process (which inherits STDOUT). This at least ensures that the parent's output appears before that of the subprocess. I believe that Ruby does this.
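
For example (a sketch only; today the flush has to be done by hand, mirroring the earlier example):

STDOUT << "1 "
STDOUT.flush       # make sure "1 " reaches the file before the child runs
system "echo -n 2" # the child process inherits STDOUT
STDOUT << " 3\n"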

@sam0x17
Contributor

sam0x17 commented Sep 18, 2019

To reiterate my thoughts, I think flushing every X bytes or X milliseconds, whichever happens first, is the way to go eventually, as this will perform well and have the expected behavior in both use cases.

@RX14
Contributor

RX14 commented Sep 18, 2019

plus we don't really know how to do the "flush every X time unit".

spawn do
  loop do
    sleep 100.milliseconds
    STDOUT.flush
    STDERR.flush
  end
end

But, I still prefer flush_on_newline.

Printing "aaaaaa" 10 million times is a microbenchmark and the results are hardly representitive of real-world applications.

@michael-selig
Author

A possible compromise:
There is a method Kernel#print which writes to STDOUT & flushes. Perhaps the other Kernel print-to-STDOUT methods should flush too (puts, p, printf)? You will then get line buffered output if you use these simple methods (eg: debugging, logging) but fully buffered otherwise. Yes, making Kernel#puts flush would make the original example slower, but if it is clearly documented, you would know to use STDOUT.puts if you want buffered output.

I am actually uncertain if this is a great idea, but I think it does make behaviour more obvious than having to specifically turn buffering on or off. It won't help if you pipe buffered output to tail -f. But it does make the IO method buffering behaviour of STDOUT consistent with other files (except when it's a TTY).
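
Sketched out, the compromise might look like this (the helper name is made up; it is not an existing Kernel method):

# Illustrative only: a top-level print helper that always flushes, while
# STDOUT.puts stays fully buffered for bulk output.
def puts_flushed(*objects)
  STDOUT.puts(*objects)
  STDOUT.flush
end

puts_flushed "progress: 50%" # appears immediately, even when redirected
STDOUT.puts "bulk data line" # buffered as usual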

@asterite
Member

The behavior you describe is how it was done in the very beginning of Crystal. But it wasn't very nice that puts vs STDOUT.puts behaved differently.

@sam0x17
Contributor

sam0x17 commented Sep 20, 2019

I'll revise my suggestion -- we should flush on:

  1. newline
  2. buffer filled
  3. 50ms has passed

Whichever occurs first

@RX14
Contributor

RX14 commented Sep 20, 2019

If you flush on newline and every 50ms, then you might as well just flush on newline and lose the complexity.

@sam0x17
Contributor

sam0x17 commented Sep 21, 2019

@RX14 I disagree. Suppose you print a few characters and then sit for 100ms. It would be odd for that not to get flushed, but neither a newline detector nor a buffer-full flush is going to account for that.

@asterite
Member

I really think we should bring back the old functionality.

https://gitter.im/crystal-lang/crystal?at=5e7a32ff65bc2a6fa687efd6

just lost 2 hours because doing output in systemd services doesn't flush stdout anymore :P
is there some way to get IO#flush_on_newline back?

It's really unintuitive that flushing is disabled in one case or the other.

I very much prefer things to be a bit slower than unexpected.

Thoughts?

@oprypin
Member

oprypin commented Mar 25, 2020

Another earlier reference on gitter
https://gitter.im/crystal-lang/crystal?at=5e597ebb376d882250bfde5e
Yes, please. Thanks for the consideration.
