Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sound sync regression #1638

Closed
totaam opened this issue Sep 11, 2017 · 20 comments
Closed

Sound sync regression #1638

totaam opened this issue Sep 11, 2017 · 20 comments
Labels

Comments

@totaam
Copy link
Collaborator

totaam commented Sep 11, 2017

Issue migrated from trac ticket # 1638

component: sound | priority: major | resolution: fixed

2017-09-11 23:44:52: maxmylyn created the issue


After testing #1637 and #1595 I've noticed that the sound sync is getting wildly out of sync over time.

My server and client are both trunk r16825 client/server running Fedora 25. I start with the usual defaults xpra start :13 --bind-tcp=0.0.0.0: --start-child=firefox, but also I double checked using a local video in VLC and it also gets out of sync. Surprisingly, the sound sync gets out sync within a matter of a few short minutes (usually 5-7 minutes).

Also, upon connecting, the following errors print out on the server:

[max@Vorfuehreffekt ~]$ E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3367711738
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] ratelimit.c: 4 events suppressed
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket
E: [pulseaudio] pstream.c: Cannot send block reference with non-registered memfd ID = 3419524942
E: [pulseaudio] pstream.c: Fallig back to copying full block data over socket

I walked through all the other codecs, and I'm noting similar behavior - sound gets pretty far out of sync (up to several seconds), in addition to that, sound is also fairly choppy at times. With MP3, I notice that after a server has been running for a while, when letting the server go silent then start playing sound the sound plays a rather loud tick / noise before sound comes back.

Unfortunately I ran out of time writing this, so I'm putting it up as an FYI. I still need to check sound levels and get more useful logs, so I'll assign it to me for now. I'll be back in office on Wednesday PST.

@totaam
Copy link
Collaborator Author

totaam commented Sep 12, 2017

2017-09-12 03:02:30: maxmylyn commented


NOTE:

XPRA_CUTTER_THRESHOLD=0 will disable the sound packets coming in empty to test to see if that's the cause of the misbehaving sound.

edit: defaults to 0.02 or 2%

@totaam
Copy link
Collaborator Author

totaam commented Sep 12, 2017

2017-09-12 12:06:56: antoine commented


Could also be the switch from opus / vorbis to mp3 which was also needed for cutter (see #1617#comment:5).

@totaam
Copy link
Collaborator Author

totaam commented Sep 12, 2017

2017-09-12 15:51:57: antoine commented


Please see #1617#comment:7 : opus now supports cutter and is the default again, but we probably still want to know if the out-of-sync problem was caused by switching to mp3 or by adding "cutter".

The Fallig back to copying full block data over socket is unrelated, please use a separate ticket and make sure to attach the full server log. Did pulseaudio start OK? I'm not seeing that with my VM.

@totaam
Copy link
Collaborator Author

totaam commented Sep 13, 2017

2017-09-13 23:01:21: maxmylyn commented


Okay I'm gonna walk through the available codecs on my machine, while watching Office Space on VLC, hopefully I only have to watch it the one time. On the bright side this is an excuse to take a closer look at the state of the sound codecs as of r16839.

|| Codec || Sync? || Notes ||
|| opus || Works fine until video is paused || New default - works fine until video is paused then sound may or may not come back properly. Turning the speaker forwarding off and on fixes it..until you pause it again ||
|| opus+ogg || no ||Seems fine until you pause video for a few minutes, then it's off - Sound buffer seems to hold steady at the minimum - graph is a "teeth" pattern of +-20ms (looks roughly 1 second per level)||
|| vorbis+mka || no - very off even from the start || <- ||
|| mp3 || yes? || very scratchy when sound does come through, it was silent most of the time with the sound buffer level being below the minimum. Seemed in sync though ||
|| flac+ogg || yes || actually seems to work fine (need to follow up with a longer test) ||
|| aac+mpeg4 || yes || seems to work fine ||
|| wav+lz4 || no || gst-stream-error-quark ||
|| wav+lzo || no || sound totally cut out with buffer level at 0 ||
|| wav || no || gst-stream-error-quark ||
|| wavpack || no || not scratchy or anything, just out of sync ||
|| speex+ogg || slightly off || slightly off but sometimes come back into sync - stable, doesn't become scratchy or cut off ||
|| opus+mka || no || sound buffer level holds steady at 0 - scratchy and sounds "starved" (correct term?) like the buffer is underrun ||
|| flac || no || same as opus+mka but not nearly as out of sync ||
|| mp3 + mpeg4 || no || starts out okay but gets further and further behind ||

And again with XPRA_CUTTER_THRESHOLD=0:

|| Codec || Sync? || Notes ||
|| opus || yes || sound cuts out randomly ||
|| opus+ogg || yes || same as opus (aren't they the same?) ||
|| vorbis+mka || no || very scratchy ||
|| mp3 || yes? || very scratchy when sound does come through, it was silent most of the time with the sound buffer level being below the minimum ||
|| flac+ogg || yes || same as with sound cutter set to 0.02 ||
|| aac+mpeg4 || yes || seems to work fine ||
|| wav+lz4 || no || no sound at all - level looks fine, but no sound ||
|| wav+lzo || no || very little sound (constantly cut out) - level looks fine (after speaker restart, first connection got gst-quark) ||
|| wav || no || gst-stream-error-quark ||
|| wavpack || no || not scratchy just out of sync ||
|| speex+ogg || yes || seems fine ||
|| opus+mka || no || same as with cutter at the default, but less scratchy, but consistently at least a few seconds behind ||
|| flac || no || scratchy and cut out a lot; sound buffer stayed at 0 ||
|| mp3 + mpeg4 || no || starts out okay but after a minute sound gets way behind, buffer level looks okay ||

NOTE: I am not nearly qualified to give a definite 'is absolutely in sync', I'm just going off what my ear tells me. When I say yes, I mean it's about the same as it was before the recent changes to sound.

That being said, flac+ogg and aac+mpeg4 seemed to actually work okay, even after a while. In addition, the error that I put in the OP didn't come back, so that was a one-time thing.


gst quark error for wav and the others:

sound output pipeline error: gst-stream-error-quark: The stream is of a different type than handled by this element. (5)
2017-09-13 14:01:51,669 sound output  appsrc name=src emit-signals=0 block=0 is-live=0 stream-type=0 format=4 ! wavparse ! audioconvert ! audioresample ! queue name=queue min-threshold-time=0 max-size-buffers=0 max-size-bytes=0 max-size-time=450000000 leaky=2 ! volume name=volume volume=0 ! pulsesink client-name="Xpra" async=True device="alsa_output.pci-0000_00_1b.0.analog-stereo" qos=True sync=False
2017-09-13 14:01:51,669 sound output  riff-read.c(262)
2017-09-13 14:01:51,669 sound output  gst_riff_parse_file_header ()
2017-09-13 14:01:51,669 sound output  /GstPipeline:pipeline0/GstWavParse:wavparse0:
2017-09-13 14:01:51,670 sound output  Stream is no RIFF stream
2017-09-13 14:01:51,670 sound output  0x15aff77

@totaam
Copy link
Collaborator Author

totaam commented Sep 14, 2017

2017-09-14 18:53:28: antoine commented


You can now use the env var XPRA_SOUND_LOG_CUTTER=1 to see when the audio level passes the threshold (see #1617#comment:8 for details).

Note: any codec option that uses a container format will not be using cutter (ie: anything with "+ogg", "+mka" or "+mpeg4"), and cutter is also disabled for "vorbis", "wav", "wavpack" and one of the "aac" encoders. So for some of the cases where you noticed a difference, then the two test runs were different in some other way since cutter was not even being used. Maybe just natural variation.

Generally speaking, some scratchiness can be expected when we use cutter as the sound queue will be completely empty when we resume the stream. The usual queue level management logic (#849) should take care of that relatively quickly.
If this turns out to be too much of a problem, we could easily send a control packet to the client telling it to prepare for the stream re-start by temporarily raising the min level. (this would belong in #1617)

I'm assuming that you're still using Fedora 25 as both client and server.
I've re-tested with a Fedora 26 server, and used for clients: Fedora 26 and MS Windows 7 32-bit.

  • opus: I can stop and start it again any number of times, the sound comes back on in the client. (win32 builds needed r16846 to fix the packaging)
  • opus+ogg: doesn't use cutter, it is not the same as plain opus since we use an ogg container (aka muxer)
  • vorbis+mka: lack of sound sync is likely to be related to the container format chunking things more coarsely, plain vorbis (which you seem to have skipped) works much better (but it doesn't support cutter)
  • mp3: nothing should have changed since you tested it in the OP, no scratchiness here
  • flac+ogg: would be a decent option if it didn't use so much bandwidth
  • aac+mpeg4: meant for the HTML5 client, but could be used or at least moved up the priority list
  • wav+lz4, wav+lzo, wav: (edited) now fixed, see detect silence and avoid sending empty packets #1617#comment:9 (not sure how you managed to get anything out of wav+lzo, as all wav decoding was broken when cutter was enabled)
  • wavpack: no sync since it buffers a lot, and also doesn't use cutter
  • speex+ogg: this is a speech codec, so I'm not sure we should be using it for such a general purpose
  • opus+mka: meh
  • flac: meh (as above)
  • mp3+mpeg4: meant for the HTML5 client

So, what do I need to do to reproduce the scratchiness that you're hearing?
Since you're even having problems with cutter disabled, let's first figure out why that is. Surely this must be a regression since you had tested this same list of codecs a while back?
This could also be related to r16831.

@totaam
Copy link
Collaborator Author

totaam commented Sep 16, 2017

2017-09-16 00:32:25: maxmylyn commented


Okay, I agree we should focus on why I'm seeing so much scratchiness. Since you aren't seeing it I'm not sure what to tell you - pretty much any sound source I can think of is very very scratchy where noted - be it a music file (Rhythmbox/VLC/whatever), a video file (VLC / mplayer), sound via a browser (Pandora / Spotify (needs Widevine)), or video via a browser (Netflix / YouTube). Sound on my system is very scratchy.

I vote we start with MP3 and Opus since Opus is the new default and MP3 was the default for a short while, and MP3 the most reliable one to act up.

When running with MP3 and -d gstreamer I get these prints to flood the output constantly:

2017-09-15 15:02:16,778 sound output queue_underrun(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,807 sound output queue_running(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,807 sound output queue_pushing(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,807 sound output queue_underrun(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,819 sound output queue_running(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,819 sound output queue_pushing(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)
2017-09-15 15:02:16,819 sound output queue_underrun(<__gi__.GstQueue object at 0x7f3edfd7e640 (GstQueue at 0x10070f160b0)>,)

And, based on what I'm seeing on the graphs, the sound queue on my system is constantly being underrun, which would explain the scratchiness (I am pretty sure).

Now, with opus, I get the exact same prints, but they come and they go - when they stop the sound isn't scratchy. In addition, the sound did stop completely a couple of times and come back after a few seconds - and pausing the video for a period of time then coming back to it causes the sound to completely stop(during this time the underrun prints stop). Also, the sound did completely stop a few times.

I did notice while it was playing clearly that when I switched virtual desktops (I keep the Xpra windows open on the first one, and Firefox on the second one) it would produce a moment of scratchiness when I switched back to the desktop with the audio source playing. I wonder if that may have an effect since we paint windows slower when the desktop isn't in view.

@totaam
Copy link
Collaborator Author

totaam commented Sep 22, 2017

2017-09-22 22:35:43: maxmylyn changed status from assigned to new

@totaam
Copy link
Collaborator Author

totaam commented Sep 22, 2017

2017-09-22 22:35:43: maxmylyn changed owner from maxmylyn to antoine

@totaam
Copy link
Collaborator Author

totaam commented Sep 22, 2017

2017-09-22 22:35:43: maxmylyn commented


Upped my server to r16954:

I noticed that when using opus sound seems to behave fine, but -d gstreamer on the server side, I see it printing:

2017-09-22 14:34:43,575 sound source cutter: skipping buffer with pts=121133500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,576 sound source cutter: skipping buffer with pts=121153500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,576 sound source cutter: skipping buffer with pts=121173500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,576 sound source cutter: skipping buffer with pts=121193500000 (min-timestamp=121230000000)
2017-09-22 14:34:43,588 sound source cutter: skipping buffer with pts=121213500000 (min-timestamp=121230000000)

Even while it should be playing sound. Could that be causing the underruns?

EDIT: (accidentally pasted over this part)

And when I see it print that sound gets scratchy on the client, with the client printing that the sound queue is getting underrun - and that's right about when the sound sync goes out the window.

@totaam
Copy link
Collaborator Author

totaam commented Sep 22, 2017

2017-09-22 22:37:41: maxmylyn commented


For reference, I'm watching this video [https://www.youtube.com/watch?v=OTh4f7ye1IQ&index=3&list=PLOQZmjD6P2Hl3KrDBs6Vcc_5XDZbqPABC] (the guy runs a YouTube channel about popular guns in video games) in Chrome/Firefox (checked both).

@totaam
Copy link
Collaborator Author

totaam commented Oct 17, 2017

2017-10-17 04:57:05: antoine changed owner from antoine to maxmylyn

@totaam
Copy link
Collaborator Author

totaam commented Oct 17, 2017

2017-10-17 04:57:05: antoine commented


Even while it should be playing sound.
The buffers shown there are only 20ms apart, isn't it possible for your audio to have gone silent for 5 buffers? (1/10th of a second)

Could that be causing the underruns?
Yes, that's very likely.

And when I see it print that sound gets scratchy on the client
Underruns would do that.
Please include the sound buffer level graphs at that point.

and that's right about when the sound sync goes out the window.
"out of the window" - ahead or behind?
I assume that the sound ends up being ahead?


Related changes:

  • r17208 lowers the default threshold to just 0.001 so only real silence should trigger the cutoff and not quiet sounds
  • r17209 tries to deal with the scratchiness (which I have observed, but not persistently - only when cutter fires): we temporarily raise the max-level to ensure we have enough room to also raise the min-level. I don't like changing this tricky sound buffer code (sound improvements: refactorings, cleanups, #849), as it was hard enough making it work as well as it does and it has served us well until now, but it looks like this change shouldn't be able to make things worse.

If you still encounter problems:

  • please provide as many details as possible (ie "-d gstreamer,sound")
  • verify that the problem is a regression: test with v2.1, test with v2.2 and XPRA_CUTTER_THRESHOLD=0 to disable cutter
  • test different bandwidth conditions to make sure we understand what circumstances trigger the problem. (so we can target those and not penalize the others)

Other things worth trying on the client:

  • XPRA_SOUND_MARGIN=200 (defaults to 50)
  • XPRA_SOUND_UNDERRUN_MIN_LEVEL=100 (defaults to 50 - can be lower when max-level is low)
    On the server:
  • XPRA_CUTTER_THRESHOLD=0.05 can be useful to trigger "cutter" much more aggressively, to trigger the underruns on the client

@totaam
Copy link
Collaborator Author

totaam commented Nov 1, 2017

2017-11-01 19:34:45: maxmylyn commented


So I upped my server and client to 17281:

  • Lowering XPRA_CUTTER_THRESHOLD to 0.000001 makes all the scratchiness go away as far as I can tell. In addition to this, the sound sync seems to be notably better.

I think I'll hold on to this ticket a bit longer to double check it.

@totaam
Copy link
Collaborator Author

totaam commented Nov 2, 2017

2017-11-02 07:47:29: antoine commented


Lowering XPRA_CUTTER_THRESHOLD to 0.000001 makes ...
A value this small effectively disables cutter.
With debugging enabled, you should have noticed that cutter was never firing.

Audio is recorded at 16-bit per channel, so the maximum precision in decimal is:

$ python -c "print('{0:.8f}'.format(1.0/2**16))"
0.00001526

Have you tried any of the other tunables from comment:9?

Also, these should help:

  • fine tuning in r17285
  • more correct underrun refilling for longer pauses: r17286
  • r17287 + r17288: rework buffer refilling

If not, please see #849 for the kind of details needed (graphs, info, tested with tc to cause latency, etc..)

@totaam
Copy link
Collaborator Author

totaam commented Dec 11, 2017

2017-12-11 23:06:53: maxmylyn commented


As of trunk r17606 - sound is much better - the sync is about perfect, and all the scratchiness I noted isn't apparent any more. I'm tempted to close this, but I'll see if I can test it a bit more thoroughly than watching a couple video files and some YouTube videos.

@totaam
Copy link
Collaborator Author

totaam commented Dec 12, 2017

2017-12-12 22:30:57: maxmylyn changed status from new to closed

@totaam
Copy link
Collaborator Author

totaam commented Dec 12, 2017

2017-12-12 22:30:57: maxmylyn set resolution to fixed

@totaam
Copy link
Collaborator Author

totaam commented Dec 12, 2017

2017-12-12 22:30:57: maxmylyn commented


Upped my server and client to 17607 and done some more thorough testing today and this seems to have all but gone away - the scratchiness is gone, and sound sync is much better (as good as it was before).

I'm gonna close this.

@totaam totaam closed this as completed Dec 12, 2017
@totaam
Copy link
Collaborator Author

totaam commented Jan 22, 2018

2018-01-22 09:22:27: antoine commented


Note: I have now encountered the Cannot send block reference with non-registered memfd ID ... error with r18103 (for #1266) so now we also disable memfd by adding --enable-memfd=no to the pulseaudio command line.

@totaam
Copy link
Collaborator Author

totaam commented Mar 3, 2019

2019-03-03 12:50:59: antoine commented


Follow up in #2186.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant