Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_pty: test_basic() fails randomly on Travis CI #75341

Closed
vstinner opened this issue Aug 9, 2017 · 21 comments
Closed

test_pty: test_basic() fails randomly on Travis CI #75341

vstinner opened this issue Aug 9, 2017 · 21 comments
Labels
3.7 (EOL) end of life tests Tests in the Lib/test dir

Comments

@vstinner
Copy link
Member

vstinner commented Aug 9, 2017

BPO 31158
Nosy @vstinner, @xdegaye, @vadmium, @stratakis, @diekmann
PRs
  • bpo-31158: test_pty now handles partial writes #3037
  • bpo-31158: Fix test_pty.test_basic() #3802
  • bpo-31158: Fix nondeterministic read in test_pty #3808
  • [3.6] bpo-31158: Fix nondeterministic read in test_pty (GH-3808) #3852
  • [2.7] bpo-31158: Fix nondeterministic read in test_pty (#3808) #3853
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-10-02.09:59:15.484>
    created_at = <Date 2017-08-09.10:52:13.951>
    labels = ['3.7', 'tests']
    title = 'test_pty: test_basic() fails randomly on Travis CI'
    updated_at = <Date 2017-10-02.09:59:15.483>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2017-10-02.09:59:15.483>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-10-02.09:59:15.484>
    closer = 'vstinner'
    components = ['Tests']
    creation = <Date 2017-08-09.10:52:13.951>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31158
    keywords = ['patch']
    message_count = 21.0
    messages = ['299987', '299988', '299991', '299995', '300067', '303199', '303241', '303249', '303252', '303257', '303259', '303261', '303275', '303276', '303286', '303397', '303518', '303519', '303522', '303523', '303524']
    nosy_count = 5.0
    nosy_names = ['vstinner', 'xdegaye', 'martin.panter', 'cstratak', 'Cornelius Diekmann']
    pr_nums = ['3037', '3802', '3808', '3852', '3853']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue31158'
    versions = ['Python 2.7', 'Python 3.6', 'Python 3.7']

    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 9, 2017

    https://travis-ci.org/python/cpython/jobs/262606831

    0:13:26 load avg: 134.81 [256/403] test_pty
    test test_pty failed -- Traceback (most recent call last):
      File "/home/travis/build/python/cpython/Lib/test/test_pty.py", line 109, in test_basic
        self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
    AssertionError: b'For my pet fish, Eric.\n' != b'For m'

    @vstinner vstinner added 3.7 (EOL) end of life tests Tests in the Lib/test dir labels Aug 9, 2017
    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 9, 2017

    I don't know why test_pty fails.

    #3037 makes sure that all data is written into the master fd using a new write_all() helper function which loops until all data is written: handle os.write() partial write.

    @vadmium
    Copy link
    Member

    vadmium commented Aug 9, 2017

    Checking for short writes is worthwhile, but in bpo-29070 it looks like Cornelius identified the main problem was short _reads_. See the parts of his patch to do with “_os_read_exactly” and related functions.

    @stratakis
    Copy link
    Mannequin

    stratakis mannequin commented Aug 9, 2017

    I've observed the exact same issue on Fedora at random times.

    @diekmann
    Copy link
    Mannequin

    diekmann mannequin commented Aug 10, 2017

    I observed the same issue, but the problem occurs mainly when reading data. In my proposed patch in bpo-29070, I use the existing pty._writen() to make sure all data is written. As Martin mentioned, reading is a problem. My patch proposes _os_timeout_read(), _os_readline(), _os_read_exactly(), _os_read_exhaust_exactly() to give fine-grained, deterministic control. Cheeky advertisement: Anybody cares to review bpo-29070 or cherry pick the pty._writen() and _os_read* parts? :)

    @vstinner
    Copy link
    Member Author

    Failure on x86 Gentoo Refleaks 3.x too:

    http://buildbot.python.org/all/builders/x86%20Gentoo%20Refleaks%203.x/builds/104/steps/test/logs/stdio

    ======================================================================
    FAIL: test_basic (test.test_pty.PtyTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/test_pty.py", line 105, in test_basic
        self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
    AssertionError: b'For my pet fish, Eric.\n' != b'For m'

    @vstinner
    Copy link
    Member Author

    Oh, in fact, it's not that hard to reproduce the failure on Linux:

    haypo@selma$ ./python -m test test_pty -m test_basic -F 
    Run tests sequentially
    0:00:00 load avg: 2.17 [  1] test_pty
    0:00:00 load avg: 2.17 [  2] test_pty
    0:00:00 load avg: 2.17 [  3] test_pty
    (...)
    0:00:00 load avg: 2.17 [ 15] test_pty
    0:00:00 load avg: 2.17 [ 16] test_pty
    test test_pty failed -- Traceback (most recent call last):
      File "/home/haypo/prog/python/master/Lib/test/test_pty.py", line 105, in test_basic
        self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
    AssertionError: b'For my pet fish, Eric.\n' != b'For m'

    test_pty failed
    15 tests OK.

    1 test failed:
    test_pty

    Total duration: 486 ms
    Tests result: FAILURE

    @diekmann
    Copy link
    Mannequin

    diekmann mannequin commented Sep 28, 2017

    Regarding PR 3802:

    • Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?
    • Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?

    I also fixed these issues in one of my (unmerged) pull requests (bpo bpo-29070). Unfortunately, the patch I propose does way too many things at the same time and I don't have time currently to untangle it :( Feel free to cherry pick the _os_readline function:
    https://github.com/python/cpython/pull/2932/files#diff-6c897ae50dce4ffa73bde779206a2059R60
    Don't hesitate to put your name on it, if you take the time to review and cherry pick :)

    Cheers!

    @vstinner
    Copy link
    Member Author

    Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?

    This string TEST_STRING_1 is used for a single os.write() call, whereas TEST_STRING_2 is splitted and written in two parts with two os.write() calls.

    I prefer minimal change, since I don't know well the pty module.

    Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?

    Yeah, I looked at this function. normalize_output() can return a string shorter than the input: len(normalize_output(s2)) <= len(s2).

    So I think that len(s2) < len(expected) is correct.

    @diekmann
    Copy link
    Mannequin

    diekmann mannequin commented Sep 28, 2017

    > Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?

    This string TEST_STRING_1 is used for a single os.write() call, whereas TEST_STRING_2 is splitted and written in two parts with two os.write() calls.

    I prefer minimal change, since I don't know well the pty module.

    I like the idea of minimal change, too :D

    Yet, I think your patch does not solve the core problem of read/write being nondeterministic. In theory, a pty is similar to a pipe (with termios processing in the middle). So any os.write to a pty fd is nondeterministic and may put less bytes into the pty buffer than given to the write call (see the return value of os.write, which test_pty.py does not check). Multiple writes are buffered by the kernel, until the buffer is full. So the kernel already accumulates the chunked writing for us. Usually, this works fine.

    Similarly, a os.read may also be nondeterministic, depending on how many bytes are ready in the pty buffer. This may have nothing to do with the chunked writing of the test_pty.py module because the kernel is doing the read/write syscalls and handling the pty buffer.

    Here is a PoC: I checked out your code and stressed my system with it. I have 2 physical and 2 virtual cores and started 8 instances of the test to stress my kernel with scheduling, locking of kernel buffers (the pty buffer), and making read/write more nondeterministic.

    ./python -b -m test -j 8 test_pty -m test_basic -F

    Here is what I got:
    0:00:13 load avg: 3.25 [119/1] test_pty failed
    test test_pty failed -- Traceback (most recent call last):
      File "XXX/cpython/Lib/test/test_pty.py", line 99, in test_basic
        normalize_output(s1))
    AssertionError: b'I wish to buy a fish license.\n' != b'I wish to buy a fish license.'

    > Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?

    Yeah, I looked at this function. normalize_output() can return a string shorter than the input: len(normalize_output(s2)) <= len(s2).

    So I think that len(s2) < len(expected) is correct.

    Warning, obscure corner cases ahead.
    In theory, given read is completely nondeterministic and we are on one of the strange systems which need normalize_output, the following could happen:
    We have b'For my pet fish, Eric.\r\n' in the pty buffer. We read b'For my pet fish, Eric.\r' from the buffer into s2. Now len(s2) == len(expected) but a b'\n' is still unread in the buffer. This would make the test fail. I admit, this is a corner case, but also an argument that a clean test case may want to have a readline function.

    @diekmann
    Copy link
    Mannequin

    diekmann mannequin commented Sep 28, 2017

    Oh, one more thing, since it is unclear whether read or write is causing the error. I replaced
    os.write(slave_fd, TEST_STRING_1)
    by
    pty._writen(slave_fd, TEST_STRING_1)
    which makes sure all bytes are written.
    After some time, when the load gets high and the machine gets noisy, I get the same error:

    0:01:09 load avg: 6.17 [597/1] test_pty failed
    test test_pty failed -- Traceback (most recent call last):
      File "XXX/cpython/Lib/test/test_pty.py", line 99, in test_basic
        normalize_output(s1))
    AssertionError: b'I wish to buy a fish license.\n' != b'I wish to buy a fish license.'

    So the main problem is a nondeterministic read. But I'm only testing on Linux. Other systems with different kernel may behave differently. And the documentation allows a kernel to make both short reads and writes.

    @diekmann
    Copy link
    Mannequin

    diekmann mannequin commented Sep 28, 2017

    I also added a pull request which I cannot get to fail on my system. What do you think, haypo?

    Sorry for the noise!

    @vstinner
    Copy link
    Member Author

    Cornelius:Here is a PoC: I checked out your code and stressed my system with it. (...) AssertionError: ..."

    Did you get this failure with my PR 3802?

    Cornelius: "We have b'For my pet fish, Eric.\r\n' in the pty buffer. We read b'For my pet fish, Eric.\r' from the buffer into s2. Now len(s2) == len(expected) but a b'\n' is still unread in the buffer. This would make the test fail. I admit, this is a corner case, but also an argument that a clean test case may want to have a readline function."

    Right. I can just modify my condition from "len(s2) < len(expected)" to "b'\n' not in s2".

    Cornelius: "I also added a pull request which I cannot get to fail on my system. What do you think, haypo?"

    Your code calls read(1) in a loop until it gets the newline character b'\n'.

    Is it better to os.read(1024) in a loop until the output string contains b'\n'?

    I have no strong preference.

    @vstinner
    Copy link
    Member Author

    Hum, let me clarify: my main goal is to get a stable CI. I don't really care of the exact implementation of one specific unit test in test_pty ;-)

    @diekmann
    Copy link
    Mannequin

    diekmann mannequin commented Sep 28, 2017

    Yes, I get the AssertionError with the latest version of PR 3802. From the high load avg of my system, you can see that the error occurs very rarely and that I need to stress my system to trigger it. With PR 3802, the error occurs way less frequently than without it. So we are definitely moving into the right direction. With PR 3808, I could not trigger any error (on my system).

    Changing the conditions to "b'\n' not in s2" should work. But we actually mean to read a line, so this should be better reflected in the code. I prefer calling a readline() function directly, which is almost self-documenting code.

    Your code calls read(1) in a loop until it gets the newline character b'\n'.
    Is it better to os.read(1024) in a loop until the output string contains b'\n'?
    Behavior may be different if there are multiple short lines in the buffer (which should not be the case in the unit test, but this may be a problem if somebody copies the code and uses it somewhere else).
    pitrou in the discussion of PR 3808 suggests the ultimate answer to the question: Just use an existing readline function from the library :-)
    I added this to PR 3808.

    Personal thought: I care about good code in the unit tests because people might look at this as reference how to use a module and copy&paste from it. I want the tests to be deterministic, which---as long as the tests pass---implies a stable CI ;-)

    @vadmium
    Copy link
    Member

    vadmium commented Sep 30, 2017

    I prefer Cornelius’s current proposal (rev 4f8137b) because it fixes both sites, rather than just patching the immediate problem site.

    I don’t think read(1) is a big problem, just less efficient. But if you prefer to do larger reads, that should be fine too. You could even use os.fdopen(...).readline(), which would use BufferedReader. It is not documented, but BufferedReader should do large reads and return the line without waiting to fill its internal buffer.

    FWIW I think it would be okay to remove the “chunked output” test. It isn’t exercising the “pty” module any more than the “Writing to slave_fd” test above it. All we need to do is verify that the “master_open” function returns a PTY master and that “slave_open” returns the corresponding slave terminal file descriptor. Checking “isatty” and sending one line of data through seems sufficient.

    @vstinner
    Copy link
    Member Author

    vstinner commented Oct 2, 2017

    New changeset e6f62f6 by Victor Stinner (Cornelius Diekmann) in branch 'master':
    bpo-31158: Fix nondeterministic read in test_pty (bpo-3808)
    e6f62f6

    @vstinner
    Copy link
    Member Author

    vstinner commented Oct 2, 2017

    I prefer Cornelius’s current proposal (rev 4f8137b)

    Ok, fine. I merged his PR 3852. I created backports to 2.7 and 3.6.

    @vstinner
    Copy link
    Member Author

    vstinner commented Oct 2, 2017

    New changeset 66fb5ef by Victor Stinner (Miss Islington (bot)) in branch '3.6':
    [3.6] bpo-31158: Fix nondeterministic read in test_pty (GH-3808) (GH-3852)
    66fb5ef

    @vstinner
    Copy link
    Member Author

    vstinner commented Oct 2, 2017

    New changeset 20cbc1d by Victor Stinner in branch '2.7':
    bpo-31158: Fix nondeterministic read in test_pty (bpo-3808) (bpo-3853)
    20cbc1d

    @vstinner
    Copy link
    Member Author

    vstinner commented Oct 2, 2017

    Thanks Cornelius Diekmann for your fix ;-)

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life tests Tests in the Lib/test dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants