Python federated execution sometimes segfaults #703

Closed
housengw opened this issue Oct 31, 2021 · 1 comment · Fixed by #675

housengw commented Oct 31, 2021

Minimal program to reproduce the problem:

target Python {
    timeout: 10 msec
}

reactor a {
    timer t(0, 1 msec)
    output o;
    reaction(t) -> o {=
        o.set(1);
        print("a: sent a message to b")
    =}
}

reactor b {
    input i;
    reaction(i) {=
        print("b: received a message from a")
    =}
    
}

federated reactor {
    a = new a();
    b = new b();
    a.o -> b.i;
}

Program output:

oysteryio-2:scratch wonghouseng$ bash /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld
Federate HelloWorld in Federation ID 'd773bb9a0ec81be80fb99144eb7d010a10cf5ce4ac8259ad'
#### Launching the runtime infrastructure (RTI).
RTI: Federation ID: d773bb9a0ec81be80fb99144eb7d010a10cf5ce4ac8259ad
RTI: Number of federates: 2
RTI: Clock sync: init
RTI: Clock sync exchanges per interval: 10
Starting RTI for 2 federates in federation ID d773bb9a0ec81be80fb99144eb7d010a10cf5ce4ac8259ad
RTI using TCP port 15045 for federation d773bb9a0ec81be80fb99144eb7d010a10cf5ce4ac8259ad.
RTI: Listening for federates.
#### Launching the federate a.
#### Launching the federate b.
#### Bringing the RTI back to foreground so it can receive Control-C.
RTI -i ${FEDERATION_ID} -n 2 -c initial exchanges-per-interval 10
Federation ID for executable /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/src-gen/HelloWorld/b/HelloWorld_b.py: d773bb9a0ec81be80fb99144eb7d010a10cf5ce4ac8259ad
Python(1330,0x10e12de00) malloc: Incorrect checksum for freed object 0x7ff04b70cb60: probably modified after being freed.
Corrupt value: 0x1094176b0
Python(1330,0x10e12de00) malloc: *** set a breakpoint in malloc_error_break to debug
Federate 1: Connected to RTI at localhost:15045.
Federate 1: ---- Start execution at time Sat Oct 30 22:14:29 2021
---- plus 453717000 nanoseconds.

The program seems to loop infinitely there. If I press CTRL-C, I get:

^CFederate 1: ERROR: Read 0 bytes, but expected 9.
Federate 1: FATAL ERROR: Failed to read MSG_TYPE_TIMESTAMP message from RTI.
Federate 1: ---- Elapsed logical time (in nsec): 0
Federate 1: ---- Elapsed physical time (in nsec): 74,308,920,000
/Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld: line 68:  1330 Abort trap: 6           python3 /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/src-gen/HelloWorld/a/HelloWorld_a.py -i $FEDERATION_ID
#### Received ERR signal on line 68. Invoking cleanup().
Killing federate 1330.
Killing federate 1331.

It happens roughly once every 4 to 5 times I run the program. Also, the error message is sometimes different. Here's another error message I got from running the same program:

oysteryio-2:scratch wonghouseng$ bash /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld
Federate HelloWorld in Federation ID '0b536f4d41d915e7ad3a7f8c9ef7d18930048adae8b7db68'
#### Launching the runtime infrastructure (RTI).
RTI: Federation ID: 0b536f4d41d915e7ad3a7f8c9ef7d18930048adae8b7db68
RTI: Number of federates: 2
RTI: Clock sync: init
RTI: Clock sync exchanges per interval: 10
Starting RTI for 2 federates in federation ID 0b536f4d41d915e7ad3a7f8c9ef7d18930048adae8b7db68
RTI using TCP port 15045 for federation 0b536f4d41d915e7ad3a7f8c9ef7d18930048adae8b7db68.
RTI: Listening for federates.
#### Launching the federate a.
#### Launching the federate b.
#### Bringing the RTI back to foreground so it can receive Control-C.
RTI -i ${FEDERATION_ID} -n 2 -c initial exchanges-per-interval 10
Federation ID for executable /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/src-gen/HelloWorld/b/HelloWorld_b.py: 0b536f4d41d915e7ad3a7f8c9ef7d18930048adae8b7db68
Federation ID for executable /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/src-gen/HelloWorld/a/HelloWorld_a.py: 0b536f4d41d915e7ad3a7f8c9ef7d18930048adae8b7db68
Python(1351,0x111f81e00) malloc: Incorrect checksum for freed object 0x7fcc86d10ab0: probably modified after being freed.
Corrupt value: 0x10febd6b0
Python(1350,0x11240be00) malloc: Incorrect checksum for freed object 0x7fd7e35263a0: probably modified after being freed.
Corrupt value: 0x10e3fd6b0
Python(1350,0x11240be00) malloc: *** set a breakpoint in malloc_error_break to debug
Python(1351,0x111f81e00) malloc: *** set a breakpoint in malloc_error_break to debug

It loops infinitely there. If I press CTRL-C, I get:

^C/Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld: line 68:  1350 Abort trap: 6           python3 /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/src-gen/HelloWorld/a/HelloWorld_a.py -i $FEDERATION_ID
/Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld: line 68:  1351 Abort trap: 6           python3 /Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/src-gen/HelloWorld/b/HelloWorld_b.py -i $FEDERATION_ID
#### Received ERR signal on line 68. Invoking cleanup().
Killing federate 1350.
Killing federate 1351.
/Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld: line 15: kill: (1350) - No such process
/Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld: line 15: kill: (1351) - No such process
#### Killing RTI 1348.
/Users/wonghouseng/lingua-franca-master/runtime-EclipseXtext/scratch/bin/HelloWorld: line 17: kill: (1348) - No such process
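
Because the failure is intermittent (about one run in four or five), a small harness can help estimate how often it occurs, for example before and after a candidate fix. The following is a minimal sketch and not part of the original report: `run_once` and `tally` are hypothetical helper names, the 30-second timeout is an arbitrary guess at what counts as a hang, and the code assumes a POSIX system. Whether the generated launcher script reports a federate crash as a nonzero exit status is also an assumption.

```python
import os
import signal
import subprocess


def run_once(cmd, timeout_s):
    """Run cmd once; return its exit code, or None if it hit the timeout."""
    # start_new_session puts the launcher in its own process group, so any
    # processes it spawned can be killed together with it (POSIX only).
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # The group exited between the timeout and the kill.
        proc.wait()
        return None


def tally(cmd, runs=20, timeout_s=30.0):
    """Run cmd `runs` times and count clean exits, failures, and hangs."""
    counts = {"ok": 0, "failed": 0, "hung": 0}
    for _ in range(runs):
        code = run_once(cmd, timeout_s)
        if code is None:
            counts["hung"] += 1
        elif code == 0:
            counts["ok"] += 1
        else:
            # Nonzero exit, or negative if killed by a signal such as SIGABRT.
            counts["failed"] += 1
    return counts
```

A call such as `tally(["bash", "bin/HelloWorld"], runs=20)` from the `scratch` directory would then return a dictionary of counts. A run that had to be killed after the timeout is counted as hung, which matches the symptom described above where the program never exits on its own.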
housengw linked a pull request Oct 31, 2021 that will close this issue
Soroosh129 commented

Wow, nice catch! I will look into it.

Soroosh129 added a commit that referenced this issue Nov 2, 2021