Various bugfixes and cleanups in the support for federated programs #323

Merged
merged 86 commits into from
Jan 22, 2024
Changes from 69 commits
Commits
5ad7f92
Clean up RTI
edwardalee Dec 20, 2023
e41eb63
Use lf_print_error_system_failure
edwardalee Dec 20, 2023
028e15a
Clean up federates
edwardalee Dec 20, 2023
fb47b2b
Point to lingua-franca/federated-cleanup
edwardalee Dec 20, 2023
c1c755d
Fix possible segfault on tracing termination
edwardalee Dec 21, 2023
548e278
Remove one more deadlock risk
edwardalee Dec 21, 2023
444ebee
Fix compile error and bogus comparison
edwardalee Dec 21, 2023
ab7605e
Prevent sending redundant reply to stop request
edwardalee Dec 21, 2023
b9b17af
Fixed compile error
edwardalee Dec 21, 2023
d451273
Treat the stop request from the RTI as if a local stop request had be…
edwardalee Dec 21, 2023
5bad8b9
Adjust port binding retries to realistic times
edwardalee Dec 22, 2023
4875564
RTI sends RESIGN on abnormal termination
edwardalee Dec 22, 2023
59ab5d2
Free environment only after all logging and debug statements
edwardalee Dec 23, 2023
e043239
Better handling of socket shutdown.
edwardalee Dec 23, 2023
24dab5a
Major refactoring of network functions
edwardalee Dec 26, 2023
6a1e313
Send all messages to stdout, not stderr
edwardalee Dec 27, 2023
a31a5d4
Allow scheduling at current time before execution starts
edwardalee Dec 27, 2023
b849176
Better handling of startup
edwardalee Dec 27, 2023
535cacb
Made execution_started an environment flag
edwardalee Dec 27, 2023
e17ee9a
Prevent spurious error at start
edwardalee Dec 27, 2023
f406b26
Comment only
edwardalee Dec 27, 2023
b571826
Pop events after wait, not before
edwardalee Dec 27, 2023
dfde25f
Ensure dummy events before start of execution
edwardalee Dec 27, 2023
8898631
Typo in comment
edwardalee Dec 27, 2023
be4a03a
Fixed modal reactors
edwardalee Dec 27, 2023
95b7dd8
Reworked socket send functions
edwardalee Dec 28, 2023
3f8ee20
Typo
edwardalee Dec 28, 2023
9f6600f
Better timing
edwardalee Dec 28, 2023
6d77482
Fix bug that can lead to deadlock on STP violation
edwardalee Dec 28, 2023
68b8c27
Merge branch 'main' into federated-cleanup
edwardalee Dec 28, 2023
3feb677
Tuning macros
edwardalee Dec 28, 2023
47cce71
Comments only
edwardalee Dec 29, 2023
2b0796e
Comments only
edwardalee Dec 29, 2023
089489f
Last known status of a port doesn't lag current tag
edwardalee Dec 29, 2023
96adc51
Resurrected port search for RTI. Set maximum number of RTIs on a host…
edwardalee Dec 29, 2023
6261541
Fixed authenticated
edwardalee Dec 29, 2023
7ddc477
Fixed debug message
edwardalee Dec 30, 2023
191a261
Fixed tracing of stop request messages
edwardalee Dec 30, 2023
c042ddd
Comment out too-verbose debug message
edwardalee Dec 30, 2023
61c310f
Fixed deadlock with race in lf_request_stop
edwardalee Dec 30, 2023
283ff04
Impose a time out for response to stop requests
edwardalee Dec 30, 2023
1d3a2e1
Tolerate socket closing during reading physical connection
edwardalee Dec 30, 2023
99edcf9
Have _lf_schedule_at_tag return trigger_handle_t
edwardalee Dec 31, 2023
3c95c2f
Allow tardy messages to unblock reactions and clarify docs
edwardalee Dec 31, 2023
0d4d997
General cleanup
edwardalee Jan 1, 2024
8de37dd
Clean up doxygen docs
edwardalee Jan 1, 2024
b901418
Probably uncessary precaution on connection failure
edwardalee Jan 1, 2024
a07f78b
Fix a bug in EIMT on microstep/after delay interaction
edwardalee Jan 1, 2024
9de19ab
Comments and formatting only
edwardalee Jan 2, 2024
2f1c9df
Removed message_record, replace with pqueue_tag
edwardalee Jan 2, 2024
0ba93da
Fixed RTI compile errors
edwardalee Jan 2, 2024
3c5a96b
Update test for void return value
edwardalee Jan 3, 2024
c374f31
Make resign messages backward compatible
edwardalee Jan 3, 2024
3763535
Fixed tracing for FAILED message
edwardalee Jan 3, 2024
67db29f
Fixed tracing of RESIGN
edwardalee Jan 3, 2024
3d0e8de
make clean removes executables
edwardalee Jan 3, 2024
a0e1c22
Tolerate incomplete message reads for decentralized
edwardalee Jan 6, 2024
78970c8
Formatting only
edwardalee Jan 7, 2024
8ba8a78
Removed noisy debug message
edwardalee Jan 7, 2024
48bccef
Exit RTI immediately if federate fails
edwardalee Jan 9, 2024
0106dfe
Improve the handling of tardy messages.
edwardalee Jan 10, 2024
6712510
Update last known status also for centralized coordination
edwardalee Jan 10, 2024
5de7c5e
Do not wait for tag to advance if MLAA is finite
edwardalee Jan 10, 2024
a8fa18f
Added includes (why weren't these needed before?)
edwardalee Jan 10, 2024
9da5094
Check that ports are in fact unknown before looping
edwardalee Jan 10, 2024
a961d9c
Fixed use of write_to_socket
edwardalee Jan 11, 2024
9a797bb
Fix deadlock caused by STP violation
petervdonovan Jan 13, 2024
08309a9
Merge branch 'main' into federated-cleanup
edwardalee Jan 13, 2024
5e1d9bb
Removed outdated comments
edwardalee Jan 13, 2024
f6e090d
Update core/federated/RTI/main.c
edwardalee Jan 14, 2024
f6e685e
Update core/federated/RTI/rti_common.c
edwardalee Jan 14, 2024
edfde09
Absorb delay functionality into lf_tag_add()
edwardalee Jan 14, 2024
161f00a
Clarify comments for eimt_strict()
edwardalee Jan 14, 2024
fd05ada
Print error on failure to write trace file
edwardalee Jan 14, 2024
f4ab3d8
Comment only
edwardalee Jan 14, 2024
e1783f1
Update core/federated/RTI/rti_remote.c
edwardalee Jan 15, 2024
4b7c940
Comment only
edwardalee Jan 14, 2024
5ff00a2
Comment only
edwardalee Jan 14, 2024
193bd66
Move freeing of local RTI to termination function
edwardalee Jan 15, 2024
7f84a33
Don't exit immediately on federate failure
edwardalee Jan 15, 2024
753d79c
Clean up error handling in receive_and_check_fed_id_message
edwardalee Jan 15, 2024
e44d284
Do not overwrite NET with message tag unless less
edwardalee Jan 15, 2024
c37968d
Comments only
edwardalee Jan 18, 2024
30601a8
Comments only
edwardalee Jan 20, 2024
ea398c7
Trace before write and after read
edwardalee Jan 21, 2024
6e4af8e
Do not acquire mutex during abnormal termination
edwardalee Jan 21, 2024
1 change: 0 additions & 1 deletion core/federated/RTI/CMakeLists.txt
@@ -73,7 +73,6 @@ add_executable(
     ${CoreLib}/utils/pqueue_base.c
     ${CoreLib}/utils/pqueue_tag.c
     ${CoreLib}/utils/pqueue.c
-    message_record/message_record.c
 )

 IF(CMAKE_BUILD_TYPE MATCHES DEBUG)
79 changes: 68 additions & 11 deletions core/federated/RTI/main.c
@@ -48,6 +48,7 @@ THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

 #include "rti_remote.h"
+#include "net_util.h"
 #include <signal.h> // To trap ctrl-c and invoke a clean stop to save the trace file, if needed.
 #include <string.h>

@@ -67,16 +68,50 @@ static rti_remote_t rti;
  */
 const char *rti_trace_file_name = "rti.lft";

+/** Indicator that normal termination of the RTI has occurred. */
+bool normal_termination = false;
+
 /**
- * @brief A clean termination of the RTI will write the trace file, if tracing is
- * enabled, before exiting.
+ * Send a failed signal to the specified federate.
  */
-void termination() {
+static void send_failed_signal(federate_info_t* fed) {
+    size_t bytes_to_write = 1;
+    unsigned char buffer[bytes_to_write];
+    buffer[0] = MSG_TYPE_FAILED;
+    int failed = write_to_socket(fed->socket, bytes_to_write, &(buffer[0]));
+    if (failed == 0) {
+        LF_PRINT_LOG("RTI has sent failed signal to federate %d due to abnormal termination.", fed->enclave.id);
+    } else {
+        LF_PRINT_LOG("RTI failed to send failed signal to federate %d on socket ID %d.", fed->enclave.id, fed->socket);
+    }
     if (rti.base.tracing_enabled) {
-        stop_trace(rti.base.trace);
-        lf_print("RTI trace file saved.");
+        tracepoint_rti_to_federate(rti.base.trace, send_FAILED, fed->enclave.id, NULL);
     }
+}
+
+/**
+ * @brief Function to run upon termination.
+ * This function will be invoked both after main() returns and when a signal
+ * that results in terminating the process, such as SIGINT, is received. In the
+ * former case, it should do nothing. In the latter case, it will send a
+ * MSG_TYPE_FAILED signal to each federate and attempt to write the trace file,
+ * but without acquiring a mutex lock, so the resulting files may be incomplete
+ * or even corrupted. But this is better than just failing to write the data we
+ * have collected so far.
+ */
+void termination() {
+    if (!normal_termination) {
+        for (int i = 0; i < rti.base.number_of_scheduling_nodes; i++) {
+            federate_info_t *f = (federate_info_t*)rti.base.scheduling_nodes[i];
+            if (!f || f->enclave.state == NOT_CONNECTED) continue;
+            send_failed_signal(f);
+        }
+        if (rti.base.tracing_enabled) {
+            stop_trace_locked(rti.base.trace);
+            lf_print("RTI trace file saved.");
+        }
+        lf_print("RTI is exiting abnormally.");
+    }
-    lf_print("RTI is exiting.");
 }

 void usage(int argc, const char* argv[]) {
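The doc comment on termination() describes an exit handler that does nothing after a normal run and otherwise notifies each connected federate. A minimal sketch of that pattern follows. It assumes a fixed table of peers, and every name except normal_termination is hypothetical. The real handler sends MSG_TYPE_FAILED over a socket, whereas this sketch only counts notifications.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_PEERS 3

/* Hypothetical stand-ins for RTI state; only the flag name comes from the diff. */
bool normal_termination = false;
bool peer_connected[NUM_PEERS] = { true, false, true };
int peers_notified = 0;

/* Stand-in for sending MSG_TYPE_FAILED to one federate. */
void notify_peer_of_failure(int peer_id) {
    printf("Notifying peer %d of abnormal termination.\n", peer_id);
    peers_notified++;
}

/* Exit handler: a no-op after a normal run, otherwise notify connected peers. */
void on_process_exit(void) {
    if (normal_termination) return;
    for (int i = 0; i < NUM_PEERS; i++) {
        if (!peer_connected[i]) continue; /* Skip peers that never connected. */
        notify_peer_of_failure(i);
    }
    puts("Exiting abnormally.");
}

/* Registers the handler; returns 0 on success, as atexit() does. */
int register_exit_handler(void) {
    return atexit(on_process_exit);
}
```

A program would call register_exit_handler() early in main() and set normal_termination to true just before returning, so that the handler acts only when exit is reached by another path.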
@@ -86,7 +121,7 @@ void usage(int argc, const char* argv[]) {
     lf_print(" -n, --number_of_federates <n>");
     lf_print(" The number of federates in the federation that this RTI will control.\n");
     lf_print(" -p, --port <n>");
-    lf_print(" The port number to use for the RTI. Must be larger than 0 and smaller than %d. Default is %d.\n", UINT16_MAX, STARTING_PORT);
+    lf_print(" The port number to use for the RTI. Must be larger than 0 and smaller than %d. Default is %d.\n", UINT16_MAX, DEFAULT_PORT);
     lf_print(" -c, --clock_sync [off|init|on] [period <n>] [exchanges-per-interval <n>]");
     lf_print(" The status of clock synchronization for this federate.");
     lf_print(" - off: Clock synchronization is off.");
@@ -254,6 +289,16 @@ int main(int argc, const char* argv[]) {

     // Catch the Ctrl-C signal, for a clean exit that does not lose the trace information
     signal(SIGINT, exit);
+#ifdef SIGPIPE
+    // Ignore SIGPIPE errors, which terminate the entire application if
+    // socket write() fails because the reader has closed the socket.
+    // Instead, cause an EPIPE error to be set when write() fails.
+    // NOTE: The reason for a broken socket causing a SIGPIPE signal
+    // instead of just having write() return an error is to robustly
+    // handle a foo | bar pipeline where bar crashes. The default behavior
+    // is for foo to also exit.
+    signal(SIGPIPE, SIG_IGN);
+#endif // SIGPIPE
     if (atexit(termination) != 0) {
         lf_print_warning("Failed to register termination function!");
     }
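The comment in this hunk describes the effect of ignoring SIGPIPE: a write to a connection whose reader has gone away fails with errno set to EPIPE instead of terminating the process. A minimal sketch of that behavior follows, assuming a POSIX system. It uses a pipe rather than a socket to stay self-contained, and the function name is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Writes one byte to a pipe whose read end has been closed.
 * Returns the errno from the failed write, 0 if the write
 * unexpectedly succeeds, or -1 if the pipe cannot be created.
 */
int errno_from_write_to_closed_pipe(void) {
#ifdef SIGPIPE
    /* Without this, the write below would raise SIGPIPE and kill the process. */
    signal(SIGPIPE, SIG_IGN);
#endif
    int fds[2];
    if (pipe(fds) != 0) return -1;
    close(fds[0]); /* No reader remains. */

    char byte = 'x';
    errno = 0;
    ssize_t written = write(fds[1], &byte, 1);
    int saved_errno = errno; /* Save before close() can overwrite it. */
    close(fds[1]);
    return (written < 0) ? saved_errno : 0;
}
```

On a POSIX system the function is expected to return EPIPE, which the caller can handle like any other write error.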
@@ -277,16 +322,28 @@ int main(int argc, const char* argv[]) {
     // Allocate memory for the federates
     rti.base.scheduling_nodes = (scheduling_node_t**)calloc(rti.base.number_of_scheduling_nodes, sizeof(scheduling_node_t*));
     for (uint16_t i = 0; i < rti.base.number_of_scheduling_nodes; i++) {
-        federate_info_t *fed_info = (federate_info_t *) malloc(sizeof(federate_info_t));
+        federate_info_t *fed_info = (federate_info_t *) calloc(1, sizeof(federate_info_t));
         initialize_federate(fed_info, i);
         rti.base.scheduling_nodes[i] = (scheduling_node_t *) fed_info;
     }

     int socket_descriptor = start_rti_server(rti.user_specified_port);
-    wait_for_federates(socket_descriptor);
+    if (socket_descriptor >= 0) {
+        wait_for_federates(socket_descriptor);
+        normal_termination = true;
+        if (rti.base.tracing_enabled) {
+            // No need for a mutex lock because all threads have exited.
+            stop_trace_locked(rti.base.trace);
+            lf_print("RTI trace file saved.");
+        }
+    }

+    lf_print("RTI is exiting."); // Do this before freeing scheduling nodes.
     free_scheduling_nodes(rti.base.scheduling_nodes, rti.base.number_of_scheduling_nodes);
-    lf_print("RTI is exiting.");
-    return 0;
+
+    // Even if the RTI is exiting normally, it should report an error code if one of the
+    // federates has reported an error.
+    return (int)_lf_federate_reports_error;
 }
 #endif // STANDALONE_RTI
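The final comment in main() says the RTI should report an error code if any federate reported one, even when the RTI itself exits normally. One way to sketch that convention is a sticky flag that is set once and never cleared. All names below are hypothetical, and the sketch returns EXIT_FAILURE where the diff casts _lf_federate_reports_error to int.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical sticky flag, playing the role of _lf_federate_reports_error. */
bool any_federate_reported_error = false;

/* Record one federate's reported status; any nonzero status sets the flag. */
void record_federate_status(int status) {
    if (status != 0) any_federate_reported_error = true;
}

/* Value for main() to return: failure if any federate reported an error. */
int process_exit_status(void) {
    return any_federate_reported_error ? EXIT_FAILURE : EXIT_SUCCESS;
}
```

Because the flag is never reset, a later successful report does not mask an earlier failure.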

176 changes: 0 additions & 176 deletions core/federated/RTI/message_record/message_record.c

This file was deleted.

86 changes: 0 additions & 86 deletions core/federated/RTI/message_record/message_record.h

This file was deleted.
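Commit 2f1c9df states that message_record was replaced with pqueue_tag, but the page shows only the deletions. As a rough illustration of what ordering entries by tag involves, the sketch below assumes a tag is a (time, microstep) pair compared lexicographically. The type and function names are hypothetical and are not the pqueue_tag API, and a linear scan stands in for a priority queue's head lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag: a logical time plus a microstep to order events at the same time. */
typedef struct {
    int64_t time;
    uint32_t microstep;
} sketch_tag_t;

/* Returns -1, 0, or 1 if a is earlier than, equal to, or later than b. */
int sketch_tag_compare(sketch_tag_t a, sketch_tag_t b) {
    if (a.time < b.time) return -1;
    if (a.time > b.time) return 1;
    if (a.microstep < b.microstep) return -1;
    if (a.microstep > b.microstep) return 1;
    return 0;
}

/* Returns the index of the earliest tag, or -1 if the array is empty. */
int sketch_earliest_tag_index(const sketch_tag_t* tags, size_t count) {
    if (count == 0) return -1;
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (sketch_tag_compare(tags[i], tags[best]) < 0) best = i;
    }
    return (int)best;
}
```

A real priority queue keeps the earliest tag at the head so that this lookup does not need to scan every entry.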
