Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.26.1 i386 deb package completely broke the compiler #6624

Closed
stronny opened this issue Aug 30, 2018 · 13 comments
Closed

0.26.1 i386 deb package completely broke the compiler #6624

stronny opened this issue Aug 30, 2018 · 13 comments
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib

Comments

@stronny
Copy link

stronny commented Aug 30, 2018

$ dpkg -S /usr/bin/crystal
crystal: /usr/bin/crystal
$ dpkg -l | grep crystal
ii  crystal     0.26.1-1       i386         no description given
$ /usr/bin/crystal -h
Failed to raise an exception: END_OF_STACK
[0xb5ca8813] ???
[0xb535e383] ???
[0xb535f411] ???
[0xb53645ec] ???
[0xb539c469] ???
[0xb5362991] ???
[0xb722b816] ???

as a side note, maybe add a description to the package?

@bcardiff
Copy link
Member

@stronny I am checking the installer using a vagrant xenial 32 bits and I was able to run the installer as documented here without troubles.

vagrant@ubuntu-xenial:~$ dpkg -S /usr/bin/crystal
crystal: /usr/bin/crystal
vagrant@ubuntu-xenial:~$ dpkg -l | grep crystal
ii  crystal                          0.26.1-1                                   i386         no description given
vagrant@ubuntu-xenial:~$ /usr/bin/crystal -h
Usage: crystal [command] [switches] [program file] [--] [arguments]

Command:
   ....

@stronny
Copy link
Author

stronny commented Aug 30, 2018

My setup is a debian stable virtualbox guest. I've just noticed that the bug is reproduced only as a particular user, but not as another one. I think it has to do with fd reopening in the update, here's the tail of an strace:

[pid 12842] clock_gettime(CLOCK_REALTIME, {tv_sec=1535657416, tv_nsec=132236618}) = 0
[pid 12842] epoll_create1(EPOLL_CLOEXEC) = 4
[pid 12842] pipe2([5, 6], O_NONBLOCK|O_CLOEXEC) = 0
[pid 12842] getcpu(0x9afd8, 0x10, NULL) = -1 EFAULT (Bad address)
[pid 12842] open("/dev/urandom", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 7
[pid 12842] fcntl64(7, F_SETFD, FD_CLOEXEC) = 0
[pid 12842] fstat64(7, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0
[pid 12842] fcntl64(7, F_SETFD, FD_CLOEXEC) = 0
[pid 12842] read(7, "\271\245K\203\374\322<\270\20^\245\315l\302\35=\360\263dm$\373\326\251uV\334\325\274\352\2316"..., 8192) = 8192
[pid 12842] ioctl(0, TIOCGWINSZ, {ws_row=38, ws_col=151, ws_xpixel=0, ws_ypixel=0}) = 0
[pid 12842] readlink("/proc/self/fd/0", "/dev/pts/4", 256) = 10
[pid 12842] stat64("/dev/pts/4", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] open("/dev/pts/4", O_RDWR|O_LARGEFILE) = -1 EACCES (Permission denied)
[pid 12842] fcntl64(-1, F_GETFL)        = -1 EBADF (Bad file descriptor)
[pid 12842] mmap2(NULL, 155648, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb51b4000
[pid 12842] mmap2(NULL, 155648, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb518e000
[pid 12851] futex(0xb51f4c08, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 12851] futex(0xb51f4c08, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 12842] munmap(0xb518e000, 155648)  = 0
[pid 12842] ioctl(2, TIOCGWINSZ, {ws_row=38, ws_col=151, ws_xpixel=0, ws_ypixel=0}) = 0
[pid 12842] readlink("/proc/self/fd/2", "/dev/pts/4", 256) = 10
[pid 12842] stat64("/dev/pts/4", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] fstat64(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] open("/dev/pts/4", O_RDWR|O_LARGEFILE) = -1 EACCES (Permission denied)
[pid 12842] fcntl64(-1, F_GETFL)        = -1 EBADF (Bad file descriptor)
[pid 12842] writev(2, [{iov_base="Failed to raise an exception: EN"..., iov_len=43}, {iov_base=NULL, iov_len=0}], 2Failed to raise an exception: END_OF_STACK
) = 43

Do you need any additional details from me to narrow down the fault?

@bcardiff
Copy link
Member

@stronny since you are using a vagrant machine and it happens with certain user but not all it would be create to have steps to reproduce the issues consistently. From there it will be easier to narrow the cause. Otherwise we would need to guess in the dark what are the difference between these two users.

@stronny
Copy link
Author

stronny commented Aug 30, 2018

I'm not using a vagrant machine though. Comparing straces I can point to this:

This is a normal run:

[pid 13305] readlink("/proc/self/fd/0", "/dev/pts/5", 256) = 10
[pid 13305] stat64("/dev/pts/5", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 5), ...}) = 0
[pid 13305] fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 5), ...}) = 0
[pid 13305] open("/dev/pts/5", O_RDWR|O_LARGEFILE) = 8
[pid 13305] fcntl64(8, F_GETFL)         = 0x8002 (flags O_RDWR|O_LARGEFILE)

This is a faulty one:

[pid 12842] readlink("/proc/self/fd/0", "/dev/pts/4", 256) = 10
[pid 12842] stat64("/dev/pts/4", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
[pid 12842] open("/dev/pts/4", O_RDWR|O_LARGEFILE) = -1 EACCES (Permission denied)
[pid 12842] fcntl64(-1, F_GETFL)        = -1 EBADF (Bad file descriptor)

I think there is a lack of an error check somewhere and an error sentinel of -1 goes into fcntl64() which promptly explodes.

@Sija
Copy link
Contributor

Sija commented Aug 30, 2018

I think there is a lack of an error check somewhere and an error sentinel of -1 goes into fcntl64() which promptly explodes.

Like in 2530bb6#diff-97b852ed0df274031d263857a85db195R30 maybe?

@Timbus
Copy link
Contributor

Timbus commented Sep 2, 2018

Oof, ok. Checking the error is one thing, but what to do after a failure.
Can the tty be reopened with less permissions? Or is this tty somehow unopenable? If so, what's the correct fallback?

@RX14
Copy link
Contributor

RX14 commented Sep 3, 2018

@Timbus no, just act as if ttyname_r failed, and return new(fd, blocking: true)

@stronny
Copy link
Author

stronny commented Sep 5, 2018

So the problem is that the second user (the one that's failing) runs under sudo -i. As a result it can't access /dev/pts/*, because files there are crw--w---- and owned by the first user and tty group, which none are members of.

I don't think I've seen any problems with this fairly standard setup with any other nonblocking IO, so either I'm just ignorant of other programs that can't use read_nonblock(), or perhaps they do something differently?

@Timbus
Copy link
Contributor

Timbus commented Sep 6, 2018

The problem is that there is no such thing as read_nonblock(), which is a huge flaw in posix imo.

So, in order for crystal to read a handle without blocking, you need to set the handle into nonblocking mode. This would be fine except it is the same handle as the current interactive terminal's handle, and setting nonblock on it will force anything else sharing the handle to perform nonblocking reads (this includes the shell, piped/sibling processes, and forked subprocesses).

So yes, it can be done differently without reopening the TTY. It just breaks many other things. Perhaps in the future there may be a threaded fallback mode or something, but for now it just goes back to blocking on a TTY read.

@stronny
Copy link
Author

stronny commented Sep 18, 2018

Just a quick clarification if I may. IIUC what you are doing is something like 1) new_fd = reopen(STDIN); 2) set_nonblock(new_fd) which makes perfect sense, except what is done is fcntl(char_device), which is not a handle, it's a device file itself. So what's the point of reopening the tty if at the end you operate on the same tty? If your goal is to set nonblocking mode separately for this process, how is it achieved by mutating global state?

Edit: I'm wrong, fcntl indeed accepts an fd, not a device. Please disregard the above comment.

@stronny
Copy link
Author

stronny commented Sep 18, 2018

Another thing I wanted to ask. It seems that Crystal doesn't reopen the tty, it just opens 3 new fds, and leaves the old ones hanging. Is that intentional behaviour? I've checked with this:

slice = "real stdout\n"
LibC.write(1, slice, slice.size)

puts "stdout?"

which gives me:

[pid  1066] write(1, "real stdout\n", 12real stdout
) = 12
[pid  1066] write(8, "stdout?\n", 8stdout?
)    = 8

@RX14
Copy link
Contributor

RX14 commented Sep 18, 2018

@stronny yes it's intentional, please read #6518

@RX14
Copy link
Contributor

RX14 commented Oct 22, 2018

Fix PR was merged.

@RX14 RX14 added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib labels Oct 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib
Projects
None yet
Development

No branches or pull requests

5 participants