Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplifying P25 call handling process #791

Closed
rosecitytransit opened this issue Mar 25, 2023 · 69 comments
Closed

Simplifying P25 call handling process #791

rosecitytransit opened this issue Mar 25, 2023 · 69 comments

Comments

@rosecitytransit
Copy link
Contributor

rosecitytransit commented Mar 25, 2023

First of all, for those who don't know you can set verbosity=10 here to see raw P25 voice channel metadata; this should ideally be exposed as a configuration option and may be best to only do it with one recorder set as the data isn't tagged.

But reading #326, #544, #674, #701, #731, #760, #773 and maybe others and the complicated call handling diagram, I'm wondering if the process could be simplified. Specifically:

  • Treat control channel grant and update messages the same, and base recording logic on the voice channel and its meta data

  • Eliminate the record_more_transmissions and termination flags and inactive and idle states

  • Optionally let the decoding of the voice channel continue until the stream stops (e.g. there's been no more frames seen) or squelch is activated; this would help in cases of poor control channel decoding or where there aren't update messages

  • Base new transmissions on seeing TDUs in the voice stream, or a change of unit or talk group ID in the voice meta data

  • If the talk group in the voice meta data changes, drop frames and start a new call (if the subroutine can't do that, at least have a voice talk group variable and have the main thread start a new call on next check)

  • Base unit IDs on the voice meta data, except at the start where it could come from the control channel message or when it is 0 or empty

  • Make transmission mode a configuration option; don't change to a new file on TDU or unit ID change if set to off

SO:

Control channel grant/update message -> start voice channel decoding -> TDUs seen or unit ID changes -> start new file if transmission mode enabled -> data flow on voice channel ceases -> end call

@rosecitytransit
Copy link
Contributor Author

Just realized that there is a message on the voice channel that marks the end of a call:
image

Example:
NAC 0xe27 TDU15: LCW: ec=0, pb=0, sf=0, lco=15 : 00 00 05 00 06 e9 00 00 00, gly_errs=0

So instead of creating a recorder->last_update process, it may be as simple as removing the two process_TTDU(); calls here https://github.com/robotastic/trunk-recorder/blob/672df88ba69516cef3b928617059a9f14cbe80d9/lib/op25_repeater/lib/p25p1_fdma.cc#L401 and here https://github.com/robotastic/trunk-recorder/blob/672df88ba69516cef3b928617059a9f14cbe80d9/lib/op25_repeater/lib/p25p1_fdma.cc#L413

and instead calling it calling it inside process_LCW with I think if ((framer->duid == 0x0f) && (lco == 15)) process_TTDU();

@robotastic
Copy link
Collaborator

I think this is pretty close to how things are handled right now. The tricky bit is that there isn't perfect alignment between the Control Channel and the Voice Channels. The processing chain on the voice channel side can sometimes be slower than the control channel, esp on long transmissions. A new Grant call can come in on on the same TG / Freq while the Voice channel is still processing the previous transmission.

The TDU messages are used to mark the end of a transmission... but tracking the last write time is there as a fallback just in case some TDUs are missed. The P25 processing code put a tag on the stream of samples when it decodes a TDU and the wav_sink gets it and ends the current transmission it is tracking.

Of course non of this works with Analog. You can generally use the Squelch as a TDU.... I can't remember if that is being done now.

So instead of creating a recorder->last_update process, it may be as simple as removing the two process_TTDU(); calls here

trunk-recorder/lib/op25_repeater/lib/p25p1_fdma.cc
Line 401 in 672df88
process_TTDU();
and here
trunk-recorder/lib/op25_repeater/lib/p25p1_fdma.cc
Line 413 in 672df88
process_TTDU();
and instead calling it calling it inside process_LCW with I think if ((framer->duid == 0x0f) && (lco == 15)) process_TTDU();

I am not sure what you mean here. Is this a different approach for capturing the TDUs? Are some being missed with the current approach?

@rosecitytransit
Copy link
Contributor Author

I think this is pretty close to how things are handled right now.

Now, if the control channel messages stop, it kills the call. This sets the state to INACTIVE, which results it being ended once config.call_timeout has been reached: https://github.com/robotastic/trunk-recorder/blob/9d9def8ce7afbbd324a36ccd7cb746f7d749d73d/trunk-recorder/main.cc#L855-L859

A new Grant call can come in on on the same TG / Freq while the Voice channel is still processing the previous transmission.

With my proposal, as long as the call is still active, additional control channel messages about the call would essentially be ignored since the call already exists. The processing would all be based on the voice channel messages.

Is this a different approach for capturing the TDUs?

Correct. This switches it to only consider the TDUs that mark the end of the call, and not just the transmission. I know you'd still have to handle the other ones (or maybe watch for source ID to change) in order to know when to change to a new transmission.

@robotastic
Copy link
Collaborator

With my proposal, as long as the call is still active, additional control channel messages about the call would essentially be ignored since the call already exists. The processing would all be based on the voice channel messages.

One challenge I have seen is having a Grant message come in on the Control Channel before the recorder has finished recording the current call. That is what the record_more_transmissions flag helps with. Without the prompt from the Control Channel, it wouldn't know if the next transmission is from the same talkgroup or not. The talkgroup info isn't always available on the voice channel at the start of the call.

Is part of the issue that unit-to-unit calls do not always get update messages on the Control Channel?

Correct. This switches it to only consider the TDUs that mark the end of the call, and not just the transmission. I know you'd still have to handle the other ones (or maybe watch for source ID to change) in order to know when to change to a new transmission.

Oh wait - are there 2 types of TDUs? The ones at the beginning and the ones at the End? If there is a way to only get the ones at the end, that would help clean up the codes a little

@rosecitytransit
Copy link
Contributor Author

The talkgroup info isn't always available on the voice channel at the start of the call.

That's not good. If you can't confirm that the next transmission is part of the same call or not, that does complicate things. But at least my logging (sample attached) shows it always available. I don't ever see info for unit-unit calls, but I think that's just a lack of handling. Do you have samples from systems that don't show it?

recorder2-20230403-003307.log

Is part of the issue that unit-to-unit calls do not always get update messages on the Control Channel?

Yes, but I also thought simpler might be better and avoid some of the issues that others have reported, and maybe if control channel decoding goes bad but voice channel stays good.

Oh wait - are there 2 types of TDUs?

This is a summary of what I see:

NAC 0xe27 LDU1: LCW: ec=0, pb=0, sf=0, lco=0 : 00 00 05 00 06 e9 00 89 a4, srcaddr=35236, grpaddr=1769
IMBE ...
NAC 0xe27 LDU2: ESS: algid=80, keyid=0, mi=00 00 00 00 00 00 00 00 00, rs_errs=0
IMBE ...
NAC 0xe27 TDU3:  
NAC 0xe27 TDU15:  LCW: ec=0, pb=0, sf=0, lco=0 : 00 00 05 00 06 e9 00 89 a4, srcaddr=35236, grpaddr=1769, gly_errs=0
NAC 0xe27 TDU15:  LCW: ec=0, pb=0, sf=0, lco=0 : 00 00 05 00 06 e9 00 00 00, srcaddr=0, grpaddr=1769, gly_errs=0
...
NAC 0xe27 TDU15:  LCW: ec=0, pb=0, sf=1, lco=15 : 4f 00 00 00 00 00 ff ff ff, gly_errs=0
...

There's lots of TDU15s with lco=0 which are silence after the end of a transmission, and then a few TDU15s with lco=15 that mark the end of the call.

The ones at the beginning and the ones at the End?

Have you seen them at the beginning? I think I've only seen them on unit-unit calls, normally I'd expect that there should be voice right away.

@robotastic
Copy link
Collaborator

Oh wait - I think you are right. The Group is always there before voice frames some - the Src ID isn't always. I am seeing TDUs at the beginning of transmissions but it could be that I am not processing things correctly.

I would be game to create a branch and see if things could be simplified a bit.

On the Voice Channel how would you know when the transmissions for a group are done and things should be uploaded? An obvious sign would be that a different talkgroup starts transmitting on the freq. you might need to have a timer in each recorder. If it has been x seconds since a TDU the recorder would close things or if it has been x sec since the last voice frame or squelch.

On trunk systems though the ground truth for what talkgroup has which freq is the control channel... but there maybe enough control info on the voice channels that we don't need to rely on it.

@rosecitytransit
Copy link
Contributor Author

rosecitytransit commented Apr 4, 2023

Meant to (also) send this last night

I've created a draft of my proposal of how to differentiate between transmission end and call end here: c48502b My thought is that transmission_sink.cc (or whatever) should have the option on transmission end to either a) close the wav file/transmission and start a new one, b) keep the wav file open (enable conversation mode) or maybe c) always end the call (keeping every transmission in a separate call and maybe rely on new grant message for next transmission; would also need to process every grant message ignoring any current call)

I also coded a recorder->since_last_update process that depends on the last voice channel message rather than just the last file write at c7bc0bb With the change to only terminate on call end (TDU15/lco=15) I think even the troublesome unit-unit calls don't need that.

@rosecitytransit
Copy link
Contributor Author

On the Voice Channel how would you know when the transmissions for a group are done and things should be uploaded?

In order of correctness (but any apply): 1) call termination (TDU15/lco=15) is seen; 2) different talk group appears on voice channel (in which case really should start a new call or notify main thread to) 3) no more voice frames seen 3) it's been a while since last control channel message and wav file last write 4) squelch is activated

@tadscottsmith
Copy link
Contributor

On trunk systems though the ground truth for what talkgroup has which freq is the control channel... but there maybe enough control info on the voice channels that we don't need to rely on it.

If you think of it from a Subscriber Unit (radio) perspective, the control channel information is only used to tell the radio to retune to the traffic channel when a talk group it's programmed to monitor receives a grant. After that, the radio needs to get all of its information from the traffic channel and has no visibility into what's happening on the control channel.

According to the specs, on a P25 Trunked System the radio should remain on the traffic channel until one of the following occurs:

  • The radio receives the call termination message (TDU15/lco=15).
  • The radio receives a packet addressed to a different group address.
  • The radio fails to detect any data on the traffic channel for a predetermined amount of time.
  • The radio receives data related to a different service than it moved to the traffic channel for (e.g. unit-to-unit call information on a channel it tuned for a group call).

After that, the radio should return to the control channel and continue monitoring.

My only concern with this approach is that what the P25 specs consider a "call" and what you've tried to logically define as a call (e.g. a series of transmissions on the same talk group within a specified period of time), might be different. In my opinion, what sets trunk-recorder apart from other programs is the ability to provide a listening experience where short conversations are recorded as a single call. Listening "transmission by transmission" style just doesn't work on busy systems.

That being said, I'm always open to help write or test new code!

@robotastic
Copy link
Collaborator

That is a great point @tadscottsmith!!

and overall I am probably confusing things by trying to bunch together common transmissions, vs just keeping them in separate files.

The terminology I invented (which may not 100% match P25 specs) is:
Transmission - goes from a GRANT to a TDU
Calls - are multiple Transmission recordings that a merged together into a single file. A Call last from the first GRANT for a TG on a freq and stops either when another TG has a GRANT on that Freq or after a timeout period where another GRANT is given for that TG/Freq.

I think all of the fundamentals are there to easily change things around and see how they work. Right now, recording termination is based on Receiving a TDU on the Voice Channel. Right now it can get overridden if there hasn't been and Update for that TG/Freq assignment in a while, but that can be turned off. Each of the transmissions is a separate file.

@rosecitytransit I guess I am still a little confused on what you think should change and how it would simplify things.

@robotastic
Copy link
Collaborator

Meant to (also) send this last night

I've created a draft of my proposal of how to differentiate between transmission end and call end here: c48502b

Ah - so there are multiple types of TTDUs? Only some of which are at the end of the call? I do a check in Transmission_sink.cc to ignore the ones at the beginning - but I can make this change and get rid of that check

@robotastic
Copy link
Collaborator

I also coded a recorder->since_last_update process that depends on the last voice channel message rather than just the last file write at c7bc0bb With the change to only terminate on call end (TDU15/lco=15) I think even the troublesome unit-unit calls don't need that.

With the current approach - call->last_update() is based on the last time it received an UPDATE message on the Control Channel. This is how you know the freq is still reserved for that TG. You could switch it to the last time the Transmission Sink received a voice frame.

With this switch, the Calls would instead terminate when there has been activity for a TG on a Freq for X period of time.... which I think could work? If there is a GRANT on that Freq for a different TG, you can still set the record_more flag to false for the recorder and it would immediately stop when it receives the TDU or gets the flag for a different talkgroup on the freq.

It might be worth tracking the last UPDATE on the Control Channel still, as well as the last Sample at the Transmission Sink...

I can go make a branch and give this a try....

@rosecitytransit
Copy link
Contributor Author

no visibility into what's happening on the control channel.

On Smart Net systems; the sub-audible data stream provides certain GRANT messages to enable priority listening. I think P25 is actually a down grade there.

what the P25 specs consider a "call" and what you've tried to logically define as a call (e.g. a series of transmissions on the same talk group within a specified period of time), might be different.

Only technically so. In the past, t-r did attempt to retune frequency if the next transmission was on a different one; this could be desirable on Smart Net systems where there's very little hang time and conversations do rapidly switch. Otherwise I think it would only be different if the time out on the system is different than the time out set in the code or though callTimeout

timeout period where another GRANT

I think that should actually be "timeout period after the last GRANT or UPDATE message"

more...

@natecarlson
Copy link
Contributor

natecarlson commented Apr 5, 2023

and overall I am probably confusing things by trying to bunch together common transmissions, vs just keeping them in separate files.

For the longer term, might it make sense to not bunch together the common transmissions, but instead add logic at the playback side to optionally group them? It seems like the logic would be pretty simple; group calls by talkgroup where there is less than X seconds between the end of one call and the start of another?

@rosecitytransit
Copy link
Contributor Author

I guess I am still a little confused on what you think should change and how it would simplify things.

Check the last 2 commits on https://github.com/rosecitytransit/trunk-recorder/tree/trimet4.5 (c48502b and 84e0dd6)

Ah - so there are multiple types of TTDUs? Only some of which are at the end of the call?

Don't you have a copy of the P25 specs? But as I said, I'm seeing a TDU3 and (TDU15's w/lco!=15) that mark the end of a transmission/are silence after the end of a transmission and then there's TDU15's w/lco==15 that mark the end of a call. I've only seen a very few TDUs at the start of unit-unit calls, and those are all less than 1 second from start.

If there is a GRANT on that Freq for a different TG,

the recorder should soon gets the flag for a different talkgroup on the freq. and immediately stop

@robotastic
Copy link
Collaborator

@rosecitytransit - sorry about that, I just surfaced online little before lunch but then had to dive back to work. I think I have some ideas based on those commits. Let me see how much I can drive from the Voice Channel vs the Control Channel and we can see if that makes things cleaner / better.

I will also print out the different TDU values, I may have been been lump them all together - I will go checkout what I am seeing on my local systems and on SmartNet system that have digital voice channels.

@rosecitytransit
Copy link
Contributor Author

So I've been playing around with things (see https://github.com/rosecitytransit/trunk-recorder/tree/trimet4.5 ) and I'm thinking I might just go with the recorder->since_last_update option ( rosecitytransit@c7bc0bb ). Is there any downside to that? Seeing how you're now using GNU Radio stream tags to send data, would it be possible/better to add timestamp tags? It would not rely on successfully receiving a call termination message.

I'd be willing to open a PR with this method and maybe work on eliminating the record_more_transmissions, etc if you're interested.

@robotastic
Copy link
Collaborator

robotastic commented Apr 16, 2023 via email

@rosecitytransit
Copy link
Contributor Author

rosecitytransit commented Apr 17, 2023

it is probably cleaner to keep track of the last time a voice sample came in inside the transmission sink

Can you clarify how? The transmission_sink.cc is a part that's less clear to me. I'm just seeing that somehow, manage_calls() in main.cc needs to know when the flow of samples has stopped so the call can be closed.wond

Also, making record_more_transmissions always be true results in even the problematic unit-unit calls successfully getting recorded by transmission; I've attached the JSON details if you're interested in how it looks. (I was going to deal with things by changing the termination flag in p25p1_fdma.cc to only get set on TDU15/lco 15 and then do a work around to handle the source ID info)

102-1681690624_772356250-call_22.json.txt

Edit: Well, it appears the rx_status().last_update may indeed by Phase 1 only. But I'm wondering if you could check if either d_sample_count has stopped increasing or if there's been 0 input_items or output_items

@robotastic
Copy link
Collaborator

robotastic commented Apr 17, 2023 via email

@rosecitytransit
Copy link
Contributor Author

I tried monitoring noutput_items, output_items.size() and input_items.size() and as I suspected, transmission_sink only gets the actual audio, while you'd need (at least I want) to know if there's anything on the voice channel, audio or TDUs/silence.

@robotastic
Copy link
Collaborator

robotastic commented Apr 19, 2023 via email

@rosecitytransit
Copy link
Contributor Author

Depending on what you want to do with that info would help determine the best way to route it.

Just want to know when there is nothing more coming in on the voice channel and therefore the call can be closed, whether that be by checking the timestamp of the latest frame, if there's zero in the queue, or some other method. Don't need any actual data. I can do some more playing around when I have time.

@rosecitytransit
Copy link
Contributor Author

This is what I've decided to go with; I've checked and it works for both Phase 1 and 2. rosecitytransit@c9a6e07

If you want to do things differently, I'd be willing to video chat with you sometime, or maybe you could submit a PR that I could work from. I've spent WAYYYYY too much time trying to figure things out and how I could pass things from the frame assembler to the transmission sink is not something I really understand.

@robotastic
Copy link
Collaborator

Cool - Let me try taking a swing at it this week. I will create a branch and do something similar except move it into the Transmission Sink.

@rosecitytransit
Copy link
Contributor Author

I decided to change from using a counter to checking the last time (check trimet4.5). Also, FWIW the unit-unit call example I posted may not be normal as there's never see trunking messages from the remote side. And when you have time, tell me your thoughts on the log file in 482.

@robotastic
Copy link
Collaborator

I haven't forgotten!! I am going to try and get something coded up this weekend. Sorry!

How can I get to that log file?

@rosecitytransit
Copy link
Contributor Author

It's OK. I know you've said you have kids and apparently a newish job. I got things figured out for me. And I can be available this weekend if you wanted to discuss things.

By log file, I meant issue 482 and how a daily log file should be laid out.

@robotastic
Copy link
Collaborator

ok - I am doing a little experimentation. From the docs, it looks like both TDUs with a Data ID of 3 and 15 are the same. Both should be treated as terminators.

In my testing I am seeing terminators show up at the beginning of Transmissions... but I am not sure if that is an artifact of some of the Samples being stuck in a buffer some where

image

@rosecitytransit
Copy link
Contributor Author

In the second call (the one that begins at 2023-04-03 00:33:26) in the log file I attached in a previous comment above does show a single TDU15 and then 3 TDU3s before the HDU and first LDU. I thought I read TDU3 is supposed to mark the end of a transmission, TDU15s w/LCO0 silence, and TDU15s w/LCO15 call end.

But my thought is to avoid being dependent on receiving those and instead simply keep decoding things until there's been a second or two with no new frames (or of course a different talk group appears). I can be available in about an hour or sometime later if you want to discuss.

@rosecitytransit
Copy link
Contributor Author

rosecitytransit commented May 8, 2023

I believe Luke does have a copy of the docs. But legally or not, someone uploaded (an older version of?) them to the Internet Archive: https://archive.org/details/TIA-102_Series_Documents/ the same user also uploaded MOTOTRBO documents and other stuff: https://archive.org/details/MototrboDocuments

Also, I have my call handling proposal here: https://github.com/rosecitytransit/trunk-recorder/tree/tdus-to-tsink particularly rosecitytransit@a3ae26e

With that, transmission_sink knows when the call is active and not just a transmission, so the recorder states could be replaced by tracking the timestamp of the last sample (they could be tracked in main.cc instead, and be just recording or stopped). record_more_transmissions could also be eliminated since it should now keep going until the flow of samples ends, as well as possibly the call IDLE state. Also, if there's not, there needs to be a way to halt recording if the talk group changes on the voice channel; I think by doing a comparison here and similar for other recorders: https://github.com/robotastic/trunk-recorder/blob/f7ece07fea315df95e86fca2269c3a8c1c6b96a3/lib/op25_repeater/lib/p25_frame_assembler_impl.cc#L191

@tadscottsmith
Copy link
Contributor

I believe Luke does have a copy of the docs. But legally or not, someone uploaded (an older version of?) them to the Internet Archive: https://archive.org/details/TIA-102_Series_Documents/ the same user also uploaded MOTOTRBO documents and other stuff: https://archive.org/details/MototrboDocuments

Also, I have my proposal here: https://github.com/rosecitytransit/trunk-recorder/tree/tdus-to-tsink particularly rosecitytransit@a3ae26e

I asked because TIA-102.AABD-B is not in that location, and is the document that thoroughly explains the handoff from the control channel to the traffic channels and how radios should switch back and forth.

@rosecitytransit
Copy link
Contributor Author

Check out the new_call_mgmt branch

Looks nice, but I can't get it to pick up the control channel, even though I verified it's there using heatmap.py, had been testing my tdus-to-tsink branch and even tried deleting everything in the working folder and starting from scratch.

@robotastic
Copy link
Collaborator

@tadscottsmith Thanks - that is a good doc, just paged through it.

Is Unit to Group the standard type of call? Is Group the same thing as a Talkgroup? The weird thing about this diagram is that the Call Terminator comes from the RFSS, but that might just be because it is relaying a voice channel terminator from an SU?

image

Potentially the default timeout is 3 seconds - this list the Max, Default & Min. Based on the references, this may just be for System ID on the Control Channel.
image

I have definitely noticed that if you remove a Call after there has been a gap of only 1 or 2 secs of voice transmissions on the traffic channel, you will still get UPDATE messages for that group on the Control Channel.

It does look like GRANTS and UPDATES should sort of be treated the same... I think the trick will be not removing a Call when a Recorder has timed out, while the TG is still getting updates on the Control Channel:

image

the solution for this could just be to wait until both the Recorder has timed out based on the Traffic Channel and the Call has timed out based on UPDATEs on the Control Channel.

@robotastic
Copy link
Collaborator

@rosecitytransit I just pulled in the changes from the Master branch - it will work for P25 systems now. I think this has a lot of what you are proposing. Calls timeout based on the last Voice Frame that was received. If a different Talkgroup appears on a Voice Channel, the Call will also stop.

I haven't seen any special TDUs or DATA frames on the Voice Channel that designate when a single SU has stop transmitting vs when the Control Channel has stopped allocating a Voice Channel for a Talkgroup. I have been using a SmartNet system for testing though... I will also look at my local P25 system today.

@rosecitytransit
Copy link
Contributor Author

Is Unit to Group the standard type of call? Is Group the same thing as a Talkgroup? The weird thing about this diagram is that the Call Terminator comes from the RFSS, but that might just be because it is relaying a voice channel terminator from an SU?

Without having the document, my guess would be "yes" to all those questions.

It does look like GRANTS and UPDATES should sort of be treated the same... I think the trick will be not removing a Call when a Recorder has timed out, while the TG is still getting updates on the Control Channel:

This is on the list of things I want to propose or fix, and I thought was mentioned in an issue or PR a long time ago but can't find it now.

the solution for this could just be to wait until both the Recorder has timed out based on the Traffic Channel and the Call has timed out based on UPDATEs on the Control Channel.

I would think no more data on the traffic channel would be the definitive guide, especially with the decoding being somewhat delayed vs the control channel.

I have been using a SmartNet system for testing though

I've never played with (low-level) SmartNet stuff as the systems around here have long been since replaced, but I thought I've read it's pretty messy to deal with. In any case, as I've said, I think you can reference allocation based on there being something (anything) coming in on the voice channel and monitoring for that stopping.

@robotastic
Copy link
Collaborator

It does look like GRANTS and UPDATES should sort of be treated the same... I think the trick will be not removing a Call when a Recorder has timed out, while the TG is still getting updates on the Control Channel:

This is on the list of things I want to propose or fix, and I thought was mentioned in an issue or PR a long time ago but can't find it now.

Yep - I have switched back and forth from having it this.

the solution for this could just be to wait until both the Recorder has timed out based on the Traffic Channel and the Call has timed out based on UPDATEs on the Control Channel.

I would think no more data on the traffic channel would be the definitive guide, especially with the decoding being somewhat delayed vs the control channel.

The potential challenge here is that I am seeing the Control Channel continue to send UPDATEs after the Voice Channel has timed out. This would cause some thrashing, because the initial Call for the TG would be closed and then a new, additional Call would be started up based on the UPDATE.

One solution, as mentioned above, would be to only remove a Call when both the Voice and Control Channels have timed out from not receiving a message for the Call.

I have been using a SmartNet system for testing though

I've never played with (low-level) SmartNet stuff as the systems around here have long been since replaced, but I thought I've read it's pretty messy to deal with. In any case, as I've said, I think you can reference allocation based on there being something (anything) coming in on the voice channel and monitoring for that stopping.

yep - that is what it is doing now. Actually, it is based on the last Voice Frame coming in on the channel... but I could switch that to include any data packets too.

@rosecitytransit
Copy link
Contributor Author

rosecitytransit commented May 8, 2023

Also, I'm experiencing weirdness using my trimet4.5 branch. I'm ending calls based on last_voice_frame https://github.com/rosecitytransit/trunk-recorder/blob/6beb1fb508e5955c66257b751095c2d3af1a5f01/lib/op25_repeater/lib/p25_frame_assembler_impl.cc#L188-L190 or when (recorder->since_last_write() > config.call_timeout) and 1 of the 4 systems I'm recording from is working fine, on another one, on another, last_voice_frame doesn't seem to be working but recorder->since_last_write() is and on another neither are and I think lots of calls are never closing, only closing when there's a frequency overlap. Check out the log excerpt: http://www.rosecitytransit.org/files/05-08-2023_0047_00.log.gz

I may try switching to either my tdus-to-tsink approach or try your new_call_mgmt again.

@rosecitytransit
Copy link
Contributor Author

One solution, as mentioned above, would be to only remove a Call when both the Voice and Control Channels have timed out from not receiving a message for the Call.

Or just wait until a few seconds after the voice channel time out. But wouldn't the delay in voice processing help here?

Actually, it is based on the last Voice Frame coming in on the channel... but I could switch that to include any data packets too.

By "data packets" do you mean TDUs, etc in the hang time following transmissions? Because you'd need to watch for those to know whether the call has ended or not, to wait for any further transmissions. If you're only looking for audio, there could be late update messages for the hang time.

@robotastic
Copy link
Collaborator

Give the branch I am working on a try - It is measuring the last_voice_frame in the Transmission Sink. It generally seems to be working.

By "data packets" do you mean TDUs, etc in the hang time following transmissions? Because you'd need to watch for those to know whether the call has ended or not, to wait for any further transmissions. If you're only looking for audio, there could be late update messages for the hang time.

yea - TDUs or any of the other messages that show up there. Maybe measuring from the end of those transmission would be more accurate. I have to measure and monitor the time since last UPDATE on the Trunk channel anyhow, so it might make the most sense to take advantage of both. I could also just up the call_timeout value to 5 secs or something like that, but good to end calls as soon as possible to free up Recorders and distribute the Calls

@tadscottsmith
Copy link
Contributor

tadscottsmith commented May 8, 2023

Is Unit to Group the standard type of call? Is Group the same thing as a Talkgroup? The weird thing about this diagram is that the Call Terminator comes from the RFSS, but that might just be because it is relaying a voice channel terminator from an SU?

My guess is that the the transmitting radio doesn't send the TDU, the FNE adds the TDUs when it stops relaying audio.

It does look like GRANTS and UPDATES should sort of be treated the same... I think the trick will be not removing a Call when a Recorder has timed out, while the TG is still getting updates on the Control Channel:

From a receiver (trunk-recorder) standpoint, they should be able to be treated similarly. The caveat would be that trunk-recorder would need to start gleaning things like the SRC from the traffic channel on an update. A transmitter needs the full grant message before transmitting.

The requesting SU shall receive the non-update version of the grant (which includes the requesting unit’s ID) before it may begin initial transmission on the traffic channel. The “update” form of the grant is used only for late entry listening SUs.

yea - TDUs or any of the other messages that show up there. Maybe measuring from the end of those transmission would be > > more accurate. I have to measure and monitor the time since last UPDATE on the Trunk channel anyhow, so it might make the > >most sense to take advantage of both. I could also just up the call_timeout value to 5 secs or something like that, but good to >end calls as soon as possible to free up Recorders and distribute the Calls

Is there any reason to track the control channels last message time? Couldn't you cleanup calls based on the recorders' last activity?

EDIT: It does appear you need to track the last update for monitored calls. But for calls that have been recorded, I'm wondering if you could just go off the last recorded time rather than the last control channel update.

@robotastic
Copy link
Collaborator

EDIT: It does appear you need to track the last update for monitored calls. But for calls that have been recorded, I'm wondering if you could just go off the last recorded time rather than the last control channel update.

Totally - that is how I am doing right now for the Branch. Calls are getting cleaned up based on the last time a Voice Frame was received. The tricky part is that I was seeing UPDATE messages still arriving for that TG after it had timed out, based on the Voice Frame.

From a receiver (trunk-recorder) standpoint, they should be able to be treated similarly. The caveat would be that trunk-recorder would need to start gleaning things like the SRC from the traffic channel on an update. A transmitter needs the full grant message before transmitting.

Yea - it is nice being able to get the SRC ID from the GRANT message. I would be very curious to see how many Calls are missed by just looking at the GRANT to start a call.

It could be helpful for a resource constrained system because if a Recorder free up halfway through a transmission, a call could switch from Monitoring to Recording. It would be a little awkward though because if it works as intended, you only get half a call.

For now, I will just leave GRANT/UPDATE message stuff the way it is. The works fine as an independent change. Better to make small incremental changes.

@rosecitytransit
Copy link
Contributor Author

The tricky part is that I was seeing UPDATE messages still arriving for that TG after it had timed out, based on the Voice Frame.

As I said, I think it's just a matter of checking to see if it's been a few seconds past the last voice channel activity to close the call (and benefiting from the voice processing delay). I've been doing so in case there's a momentary decoding issue or processing delay and then activity starts coming in again. Or of course waiting for a few seconds since a message from either place.

Yea - it is nice being able to get the SRC ID from the GRANT message. I would be very curious to see how many Calls are missed by just looking at the GRANT to start a call.

I think this could log when that happens https://github.com/robotastic/trunk-recorder/blob/f7ece07fea315df95e86fca2269c3a8c1c6b96a3/trunk-recorder/main.cc#L1224

My thought is to combine the handle_call_grant and handle_call_update but include a parameter tells the type. But I'm fine with/would like to address the issue separately.

It could be helpful for a resource constrained system because if a Recorder free up halfway through a transmission, a call could switch from Monitoring to Recording. It would be a little awkward though because if it works as intended, you only get half a call.

Better than no call at all?

@robotastic
Copy link
Collaborator

To try things out - I switched it around so both UPDATES & GRANTS messages get handle as a GRANT. I also have it only timing out Calls based on the last time the Recorder got a voice frame.

One thing is clear - it looks like the SmartNet parsing is generating a lot of incorrect UPDATE messages. I am seeing a lot of Calls with no transmissions in them.

Give the branch a try and let me know what you are seeing.

@rosecitytransit
Copy link
Contributor Author

rosecitytransit commented May 10, 2023

I've been a little hesitant because I've been confused how it keeps calls open but I think I finally get it: terminate_call gets set and reset for each terminator seen, getting flipped back and forth during the hang time string of TDUs.* (Plus, I went ahead and added the pdx2 system so don't have the spare SDRs that I did. But I think I can add yet another one.)

If so, what I'd do instead is leave it true, and essentially have it be the equivalent of the recorder IDLE state: In p25_frame_assembler_impl replace if (get_call_terminated()) { terminate_call = true; reset_call_terminated(); } with if (!get_call_terminated()) terminate_call = false; and eliminate reset_call_terminated(). For p25p1_fdma move the terminate_call = false to within process_voice() and if it was true reset_rx_status() there, somewhere before line 711 where rx_status first gets added to. This would help with yet another change I'd like to make: counting the errors and spikes (and at least for me total_len) by call instead of by transmission.

Other then that, I think I'm willing to switch to your method. I'll update my trimet4.5 branch when I do.

I think yours is essentially the same as my tdus-to-tsink (ignore the switch to sending status_string, I was going to undo that*) and I'm wondering if you could just eliminate these 2 lines https://github.com/robotastic/trunk-recorder/blob/5b7f2c668570044bbdd41ecf2456ec9a7c37ad49/lib/op25_repeater/lib/p25_frame_assembler_impl.cc#L217-L218 and instead always send the spike_count and error_count when amt_produce = 0, and in transmission_sink treat seeing those as the terminator flag.

Also, good job on adding the talkgroup checking and stopping the recorder if there's a mismatch. If you saw my now-deleted comment, it's confusing having the commit list being sorted newest first while compare changes is oldest first.

*I've been believing that terminate_call fires, and the terminate and possibly error_count and spike_count tags get added, only on the first terminator seen, but if I'm correct, it happens every terminator because it's reset at the start of processing every frame in void p25p1_fdma::rx_sym and in transmission_sink the later tags get ignored because the recorder state will then be IDLE or STOPPED and state != RECORDING. If so, what's the point of the reset_call_terminated()?

@robotastic
Copy link
Collaborator

Oh yea... definitely don't test on a production system! :)

The Terminate Flag automatically gets reset when a voice frame comes in... rx_sym doesn't seem to get called when TDUs come in:
https://github.com/robotastic/trunk-recorder/blob/093c9a01c32b6ec5410155ea2030cb2d56e98134/lib/op25_repeater/lib/p25p1_fdma.cc#L851

I recently added the reset_call_terminated() because otherwise the "terminated" tag will kept being placed on the flowgraph down to the transmission_sink until the recorder is stopped, because each time that block is called it will see that the termiantion flag is true... even if a new TDU has not been received.

I am sort of confused on the change you are looking to do with terminate_call though.

I'm wondering if you could just eliminate these 2 lines

trunk-recorder/lib/op25_repeater/lib/p25_frame_assembler_impl.cc
Lines 217 to 218 in 5b7f2c6
if (terminate_call) {
add_item_tag(0, nitems_written(0), pmt::intern("terminate"), pmt::from_long(1), d_tag_src );
and instead always send the spike_count and error_count when amt_produce = 0, and in transmission_sink treat seeing those as the terminator flag.

You might be able to get away with making that switch... but you can have multiple tags on a sample, so it seems clearer to just have an explicit tag for each. I might be missing what you are trying to do though.

The Call Concluder function tallies up the Spikes and Errors from all the transmissions and gives you call level info. I think that should be in the JSON and it is in the call_info for Plugins

@rosecitytransit
Copy link
Contributor Author

Oh yea... definitely don't test on a production system! :)

I've been up late some nights trying to figure out why changes I made weren't working. I also missed some calls because I mixed up time and time_t.

rx_sym doesn't seem to get called when TDUs come in:

Didn't know that! I'd love to add a ton of logging statements to see what happens when/where.

even if a new TDU has not been received.

That's not good. I thought p25_frame_assembler_impl::general_work only ran when there is actually something coming in. And for one of the systems I'm recording, that seems to be the case (the "cc" one in the log file extract I linked to but apparently not for others.

so it seems clearer to just have an explicit tag for each.

That's fine.

The Call Concluder function tallies up the Spikes and Errors from all the transmissions and gives you call level info. I think that should be in the JSON and it is in the call_info for Plugins

Right. I was just thinking of not storing it by transmission. I was also wanting to include counts for non-voice frames, but looking at the code again I'm not sure it does that. So lets ignore that idea for at least now.

@rosecitytransit
Copy link
Contributor Author

So, I finally have new_call_mgmt merged into my branch (I'm just using a text editor and command line git plus am not familiar with things so progress takes awhile) and am running it with some added logging. The only issues I see are that 1) this needs to be modified so terminators/other non-voice data count towards a call being active: https://github.com/robotastic/trunk-recorder/blob/5b7f2c668570044bbdd41ecf2456ec9a7c37ad49/trunk-recorder/gr_blocks/transmission_sink.cc#L431 and 2) the timeout for monitored calls could be set back to the fixed 1 or 2 seconds since nothing is being done with them.

Log files if you're interested

@robotastic
Copy link
Collaborator

VS Code does make merging a bit easier. If you are on a Mac, the FileMerge tool that comes with XCode is great.

For 1) - I tried it both ways. Did you notice much of a difference one way or another?
2) ideally this should be the same value as the trunking system, which should be what call_timeout is set to.

@robotastic
Copy link
Collaborator

I am probably going to switch it only allow for GRANT messages to start a new call. On both my P25 and Smartnet system, I am seeing a lot of extra "No Transmissions were recorded!" and these seem to be coming from calls that were started with an UPDATE. You can spot this with the "This was an UPDATE". The current approach only uses GRANT messages, so this would just be keeping that approach.

(info) [dcfd] 57C TG: 153 Freq: 860.987500 MHz Starting P25 Recorder Num [0] TDMA: false Slot: 0 QPSK
(info) [dcfd] 57C TG: 153 Freq: 860.987500 MHz This was an UPDATE
(info) [dcfd] 57C TG: 153 Freq: 860.987500 MHz Stopping Call because of Recorder Rec last write: 3.9654 State: idle
(info) [dcfd] 57C TG: 153 Freq: 860.987500 MHz Concluding Recorded Call - Last Update: 1s Call Elapsed: 4
(info) [dcfd] 57C TG: 153 Freq: 860.987500 MHz Stopping P25 Recorder Num [0] TDMA: false Slot: 0 Hz E>
(error) [dcfd] 57C TG: 153 Freq: 860.987500 MHz No Transmissions were recorded!

@rosecitytransit
Copy link
Contributor Author

I use Linux. I may ask on reddit or do some searching. Besides a GUI to git and something more focused on C than Bluefish, I'd love a program that turns functions and function calls into links so one can easily navigate code path, and a interactive flow chart diagram of the code. I have ran doxygen on the code and played with sourcetrail. It also helps to understand what git can do--I had stared exporting the patches I had made to my branch, resetting it back to your upstream and then manually go through and reapply the patches I still wanted when mostly all I had to do was "merge upstream (master and new_call_mgmt) -X theirs".

Anyways, for 1) calls should be considered active during terminators. This ensures that calls don't get concluded too early (which is especially helpful for my unit calls which don't fully show up on the control channel, or when there's a CC decoding issue) and allows the time out to be set low since it's known that the call is actually completed. 2) isn't really an issue now that I feel confident that the timeout can be set low.

@rosecitytransit
Copy link
Contributor Author

On both my P25 and Smartnet system, I am seeing a lot of extra "No Transmissions were recorded!"

I was going to offer to look at your logs some time, but it looks like I'm seeing that too in the ones I linked to. I'll try to see what's happening when I have time (not today, as I'm about to head out of town in an hour).

@natecarlson
Copy link
Contributor

I use Linux. I may ask on reddit or do some searching. Besides a GUI to git and something more focused on C than Bluefish, I'd love a program that turns functions and function calls into links so one can easily navigate code path, and a interactive flow chart diagram of the code.

VScode gets you a lot of that. It's a first class citizen on Linux too. No flow chart that I'm aware of though.

@Dygear
Copy link
Contributor

Dygear commented May 14, 2023

Sorry for the off topic, but yeah VSCode on Mac and Linux is great. I use Pop_OS on my desktop and macOS on my laptop. It's pretty seamless for both.

@rosecitytransit
Copy link
Contributor Author

An update: I've been using the new_call_mgmt branch (haven't tried the debug or release ones) with my customizations and things are working pretty good. The only issue I've come across is that sometimes I'll get a GROUP MISMATCH (in transmission_sink) and the call will be ended, but then the call will be started right back up (same TG, upon seeing the next message on the control channel). I think it might be seeing a tag that was from the last call; I'm pretty sure it only happens at the start of the call.

For now, I've changed transmission_sink to only actually set the recorder to stop if if's recording, as it seemed the recorder was always idle when the errant mismatch happened. When I have time and get my test bed set up, I can do some further debugging.

But Robo (Mr. Tastic?), regarding the work you've been doing on this and other recent additions:
You are awesome
@robotastic

@robotastic
Copy link
Collaborator

🤙 Teamwork!! Thanks for the suggestions and the help debugging.

Check out the master branch. I merged in the new_call_mgmt branch and there are some updates on top of that. One of them should take care of the problem with group mismatch at the start of the call. I was seeing the same. If it hasn't started recording and it gets a group ID it will just ignore things until it gets a termination and then reset itself.

@rosecitytransit
Copy link
Contributor Author

rosecitytransit commented Jun 11, 2023 via email

@robotastic
Copy link
Collaborator

Sounds good - I am going to close this thread out because the changes have been merged in... and this is super long.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants