Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Universal SoundSource for FFmpeg 4.x #1356

Merged
merged 48 commits into from
Sep 14, 2019
Merged

Universal SoundSource for FFmpeg 4.x #1356

merged 48 commits into from
Sep 14, 2019

Conversation

uklotzde
Copy link
Contributor

@uklotzde uklotzde commented Sep 28, 2017

Open Issues

  • The duration of MP3 files is imprecise as reported by FFmpeg in AVStream. This value might simply be calculated from the average bitrate and the length of the file. Workaround: Parse the whole MP3 file upfront for correctly determining the exact duration. Parsing should at least be faster than decoding. A similar strategy is used in SoundSourceMP3 already. Duration of MP3 files is reported as a multiple of the MP3 frame size, i.e. 1152 samples. If the last MP3 frame is incomplete then we will simply add silence. This is acceptable for DJing, no action needed.
  • Reduce priority from HIGHER to LOWER before merging. This will ensure that the existing SoundSources with priority DEFAULT are still favoured.
  • Fix SoundSourceProxyTest for file cover-test-itunes-12.3.0-aac.m4a
  • Analyze and fix warning for MP3 files: Warning [AnalyzerThread 0 #1]: SoundSourceFFmpeg4 - Overlapping sample frames in the stream: [-1105 -> 0)

Preamble

  • Based on PR AudioSource v2 API #1317 (AudioSource v2 API) -> will be rebased frequently on this branch
  • Enabled by compiling with ffmpeg31=1ffmpeg4=1
  • NOT supported by Ubuntu Trusty -> disabled in CI builds

This one was really a challenge!! But I hope it was worth it.

The immediate reward is reliable M4A and ALAC decoding for (almost) everyone. In the long term I expect this to become the main SoundSource of Mixxx, replacing most of the custom implementations. The first decoders that could (or should?) be declared as legacy are SoundSourceM4A, SoundSourceFFmpeg, SoundSourceMP3 (still needed for Windows), and SoundSourceWV (still needed for Windows).

I've tested it with all the corrupt files that I have collected over time. None of them is able to crash Mixxx or trigger a debug assertion. Even better: Some of them no longer need to be considered as corrupt!

Notes:

  • Uses whitelisting: Only registered for file types that pass all our tests
  • Chosen with HIGHER priority than DEFAULT to select it before any existing SoundSources -> safe because of whitelisting
  • OggVorbis decoding is disabled because of failing tests: FFmpeg #3825: Wrong PTS in Ogg Vorbis file
  • FLAC decoding is disabled because FFmpeg sometimes fails after seeking -> need to be discussed with the FFmpeg developers

Fixes the following bugs:
https://bugs.launchpad.net/mixxx/+bug/1336982
https://bugs.launchpad.net/mixxx/+bug/1665369

Essentially finished, but this piece of code should be tested thoroughly!

[Update 2017-10-02]
With my revised implementation it should now be much easier to follow and understand the code. It even reveals known bugs in FFmpeg that prevent us from enabling OggVorbis and FLAC support.

[Update 2017-10-02]
Added a workaround for decoding of AAC files that have been encoded with iTunes 12.3.0! I discovered this issue after fixing my calculation of the frame index range that didn't take the start time of a stream into account. The workaround is effective for all AAC files with a start time > 0, because we are not able to distinguish them properly. The only drawback is that some samples at the end might be cut off.
[Libav-user] Wrong duration of AAC files encoded by iTunes 12.3.0

[Update 2018-02-08]
The handling of start_time and duration in AVStream seems to have changed for FFmpeg 4.x. I adjusted the implementation and all tests now pass. The workaround for AAC files encoded with iTunes 12.3.0 is no longer necessary, but we should keep the test files.

@uklotzde uklotzde changed the title [WiP] Universal SoundSource for FFmpeg 3.1 (and beyond) [WiP] Universal SoundSource based on FFmpeg 3.1 (and beyond) Oct 1, 2017
@illuusio
Copy link
Contributor

illuusio commented Oct 6, 2017

Without reading code and noting that integration ain't working. What are main benefits to create new FFlMpeg soundsource and not to just refactor old? What are main things that should be tested?

@uklotzde
Copy link
Contributor Author

uklotzde commented Oct 6, 2017

There is great demand for M4A/ALAC playback from our user base. Furthermore it is only a question of time until ancient libraries like FAAD2 and MAD will no longer be maintained.

SoundSourceFFmpeg does not pass our rigorous tests and also shows audible artefacts when playing and seeking randomly in the test files. I was not able to understand how the existing code works and how to fix those issues.

These were the decisions for adding a new SoundSourceFFmpeg31 implementation:

  • No more backward compatibility with older FFmpeg versions up to 3.0
  • Whole new send/receive/consume loop without any internal caching except buffering of received but not consumed samples
  • Leverages FFmpeg's resampling/reformatting capabilities instead of using a home brew solution

At least the code for opening files was partially reusable. Reimplementing the decoding loop seemed to be a task with moderate complexity, but I must admit that it was much harder than what I expected.

@illuusio
Copy link
Contributor

illuusio commented Oct 8, 2017

Ok. There is much room for improving. Making FFMpeg work for every codec and container is harder than it should be. All other soundsources should be dropped soon if this should be the one because people tends to keep using old ones and not move to this one there is no reason to port them to new soundsource code.
There should be something like CuRL easy to have easier interface for decoding.

@daschuer
Copy link
Member

daschuer commented Oct 8, 2017

Do we have CPU load measures comparing MAD with FFMPEG mp3 encoding?
If there are only negligible differences, it sounds reasonable to switch to FFMPEG.

@illuusio
Copy link
Contributor

illuusio commented Oct 9, 2017

@daschuer even if CPU load doubles for encoding moving to solid encoder interface provided by FFMpeg is worth it. I haven't found difference. They use libmp3lame so you can test with lame and see how they manage as they don't seem to have native version.

@daschuer
Copy link
Member

daschuer commented Oct 9, 2017

I have found some benchmark results:
https://multimedia.cx/eggs/gcc-of-multimedia/
Here ffmpeg is slightly faster than libmad.
In most other cases ffmpeg wins with a gap.

Can we adopt these results to us?

@illuusio
Copy link
Contributor

illuusio commented Oct 10, 2017

I've little bit looked code. @uklotzde are you trusting FFMpeg for current timestamp and length? As my experience why things are done like they are old FFMpeg soundsource for most cases (like MP3 VBR) DTS/PTS is just good guess like you say in Vorbis.
Only way I could get it working was starting from beginning and seek correct point and then start reading from there. It was whole point of caching because people tends to just seek back and forward like 5 secs in audio to seek correct place and not to try find messages playing backwards whole song.

@illuusio
Copy link
Contributor

@daschuer it depends how extreme with building options you want to go with FFMpeg.

@uklotzde
Copy link
Contributor Author

@illuusio FFmpeg might use dts internally for determining the correct seek position. The SoundSource uses pts from the decoded frames for determining the correct position. We are seeking with the flag AVSEEK_FLAG_BACKWARD which should guarantee that we always land before the target position. Of course, we might need to skip some samples before reaching our target position.

I added a VBR test file and noticed that there was a bug when the audio file is actually shorter when decoded than initially reported by FFmpeg. Fixed.

@illuusio
Copy link
Contributor

Yes @uklotzde I know difference between PTS and DTS. Older version they are mostly same and I don't see why they are not that in newer versions.
Problem that I originally faced with AVSEEK_FLAG_BACKWARD (or other flag) is when seeking MP3 you get bogus PTS which ain't same if you start reading from start. As MP3 has fixed frame size it should be but at leas sub 3.1 world it doesn't. If it's fixed then everything is fine.
This is also problem containers like WMV(2) where FFMpeg sucks hardly. Frame size ain't fixed it can be something up to 64 KB (divided by 1024) and seeking is just guess and doesn't every time drop in same byte.

@illuusio
Copy link
Contributor

Trying this out. Is there option to disable all the other SoundSources to make FFMpeg rule them all?

@uklotzde
Copy link
Contributor Author

The priority is already set to HIGHER, that's sufficient. The whitelisting is controlled by getSupportedFileExtensions(), OggVorbis and FLAC are still disabled.

@illuusio
Copy link
Contributor

I got this often. First rows are normal FFMpeg warning and rest are something that is wrong?

Estimating duration from bitrate, this may be inaccurate
[New Thread 0x7ffee63ec700 (LWP 8710)]
[mp3 @ 0x7ffef40ece00] Estimating duration from bitrate, this may be inaccurate
[mp3 @ 0x7ffef428a3a0] Header missing
Warning [AnalyzerQueue 1]: SoundSourceFFmpeg31 - avcodec_send_packet() failed: No description for error code (-1094995529) found 
Warning [AnalyzerQueue 1]: AnalyzerQueue - Aborting analysis after failed to read sample data from "some.mp3" : expected frames = [21143552 -> 21147648) , actual frames = [21143552 -> 21143808)

Or it this the bug report on FFMpeg mailinglist about?

@ronso0
Copy link
Member

ronso0 commented Oct 13, 2017

I was curious if this PR would fix Bug 1669500 (scratching backwards over loop_in disables loop), so I built with ffmpeg sources for Trusty (3.2.4) from here.

And indeed it does fix it, at least for some wav and mp3 files.
m4a files that failed with master, fail with this branch as well.
no ffmpeg-related entries in log for those files.

@uklotzde
Copy link
Contributor Author

@illuusio Decoding of that mp3 files needs to be analyzed in detail by tracing all packets and frames. It might either be FFmpeg itself or our usage of FFmpeg that causes those failures. Even if the file is corrupt, decoding should not produce unexpected errors.

It is also crucial to test with a clean build. I experienced strange errors with partial builds after switching branches and don't trust scons.

@illuusio
Copy link
Contributor

Is there something that needs particular testing. This works as expected with simple playing patterns.

@uklotzde
Copy link
Contributor Author

@illuusio Known issue: The duration for MP3 is imprecise.

Carl Eugen from FFmpeg recommended that we should not rely on the duration reported by the stream! If we need to know the exact duration upfront we should instead parse the file from beginning to end, at least for MP3. They are mostly dealing with infinite streams and don't need to know the exact length even if it may exist.

Parsing should be faster than decoding, but still needs to read the whole file once. And I haven't figured out yet how to do it correctly and efficiently.

@illuusio
Copy link
Contributor

@uklotzde fastest way I figured out is in soundsourceffmpeg. One just reads frames but not decode to packages (Which can also lead incorect place with VBR).
After that you can rely on dts/pts it was also one point which lead me not to use seeking functions. Many people doesn't use WMA2 but estimations of length of WMA2 is just very rough estimation and you can rely on that.
As wonderful FFMpeg is sometimes I wonder should we use VLC library for reading and let them do the FFMpeg stuff.

@uklotzde uklotzde added this to the 2.2.0 milestone Nov 25, 2017
@illuusio
Copy link
Contributor

illuusio commented Mar 5, 2018

What is current status with this? I have used this few times and it worked ok it there something to test on?

@uklotzde
Copy link
Contributor Author

uklotzde commented Mar 6, 2018

There is still one open issue that needs to be solved before integrating this PR: We need a quick scan through encoded files to reliably determine the exact number of sample frames = sum of all encoded frame lengths.

@Be-ing Be-ing modified the milestones: 2.2.0, 2.3.0 Jun 23, 2018
@illuusio
Copy link
Contributor

Sometimes I feel we need mixxx-next which contains experimental stuff like this and they don't get stuck


SoundSourceProviderPriority SoundSourceProviderFFmpeg31::getPriorityHint(
const QString& /*supportedFileExtension*/) const {
// TODO: Increase priority to HIGHER if FFmpeg should be used as the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rephrase this a bit more weak? I know there are different opinions around we have to discuss, but this should not block this PR.

How about this:
// FFmpeg has the LOWER to be used as a fallback in case Mixxx has no own implementation. Increase priority to HIGER if you wish that ffmpeg decodes all files.

IMHO this should stick on LOWER forever. If we wish that ffmpeg is used we can remove the other implementation.

In case of mp3 and libmad, I am satisfied with the current implementation. While libmad is rather old, we have ironed out many cases and are rejecting files that are played with heavy sound artefacts in ffmpeg.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

Unfortunately the API of FFmpeg changes constantly in combination with an intransparent and almost erratic behavior, it's a nightmare! I guess maintenance could become a never ending story and I'm not willing to tweak this piece of code constantly.

@WaylonR
Copy link
Contributor

WaylonR commented Dec 19, 2018

need to update build/depends.py and build/features.py
depends.py: needs soundsourceffmpeg31.cpp added in sources.
features.py: needs path fixed.

@uklotzde
Copy link
Contributor Author

"Enabled by compiling with ffmpeg31=1"

@WaylonR
Copy link
Contributor

WaylonR commented Dec 19, 2018

.. okay, don't need in depends.py, but without the path fix in features.py, will error.

@illuusio
Copy link
Contributor

illuusio commented Sep 5, 2019

I did first read through the code. There is some FFMpeg rough points which I have been hitting my head while I have write code for FFmpeg 4.x series. If they bug me enough I'll report them If not they work and let them be as code is solid as is now.

Copy link
Contributor

@illuusio illuusio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Over all bigger problem was that there is too long methods. They should be cut down to more easily understandable blocks

avcodec_register_all();
#endif
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these just be removed if this is targeted 4.x which in time means these are not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In features.py we only require a minimum major version:

  • libavcodec 58
  • libavformat 58
  • libswresample 3.1
    Any differences that depend on minor versions must be handled by conditional compilation.

// Those 9 frames should at least drain the bit reservoir.
DEBUG_ASSERT(avStream.codecpar->channels <= 2);
const SINT mp3SeekPrerollFrameCount =
9 * (kSamplesPerMP3Frame / avStream.codecpar->channels);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 9?

QStringList list;

// Collect all supported formats (whitelist)
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(58, 9, 100)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we drop all together old cruft and support only newest version which is available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. Requiring the most recent version would be too restrictive and it is often not available, e.g. Fedora 30 stays on FFmpeg 4.1.x.

We define minimum major versions in features.py and use version guards in the code only if required.

list.append("m4v");
continue;
} else if (!strcmp(pavInputFormat->name, "mov,mp4,m4a,3gp,3g2,mj2")) {
list.append("mov");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some difference in these? mp4 container is same with m4v and m4a?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@illuusio I didn't understand your comment. What shall we do here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a mapping from format string to individual file extensions.

#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(58, 18, 100)
// Align the time base of the context with that of the selected stream
av_codec_set_pkt_timebase(pavCodecContext, pavStream->time_base);
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again should this just been dropped rather than #if?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, see above.

<< formatErrorMessage(av_seek_frame_result).toLocal8Bit().constData();
m_curFrameIndex = kFrameIndexInvalid;
return ReadableSampleFrames();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs some comments what this stuff does

// This is expected behavior and will be compensated during 'preskip'
// (see below).

if (m_pSwrContext) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would hope that resampling is in own method or own file in here it makes this too long.

clearSampleCount);
readFrameIndex += missingFrameCount;
}
if (!decodedFrameRange.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment?

m_sampleBuffer.adjustCapacity(sampleBufferCapacity);
}
DEBUG_ASSERT(m_sampleBuffer.writableLength() >= sampleBufferWriteLength);
if (missingFrameCount > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment?

@@ -304,5 +192,3 @@ class SoundSourceProviderFFmpeg : public SoundSourceProvider {
};

} // namespace mixxx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda understand that everything is in one file and there is just Inner classes but it makes understanding this class very very difficult.

@uklotzde
Copy link
Contributor Author

uklotzde commented Sep 8, 2019

@illuusio Thank you for the comprehensive review, greatly appreciated.

I will primarily update the code or add more comments to preserve this information for the future. I will only add comments here for clarification.

@uklotzde
Copy link
Contributor Author

uklotzde commented Sep 8, 2019

Done

@uklotzde
Copy link
Contributor Author

uklotzde commented Sep 8, 2019

MAD killer sample: libmad_decode_errors.zip

FFmpeg: ✔️
mpg123: ✔️
MAD: ⚠️ Peak error

@illuusio
Copy link
Contributor

illuusio commented Sep 9, 2019

@illuusio Thank you for the comprehensive review, greatly appreciated.

I don't know if they are any good small things basicly. I'll do last reading when you feel it's 'ready-to-merge' with sharper eye.

@Be-ing
Copy link
Contributor

Be-ing commented Sep 10, 2019

What is the state of this? What remains to be done before merging? @illuusio did you want to do a more thorough review?

@illuusio
Copy link
Contributor

For me code is ready to get in. One thing that bothers me is why there is changes in all SoundSources? Should they first get in @uklotzde?

@daschuer
Copy link
Member

MAD killer sample: libmad_decode_errors.zip

Ah I see. this tracks exceeds mad signed int with the range of +- ~8 and overflows.
A proper clamping would be better. I guess the mad developer have decided against clamping in favor of saving CPU. In normal Music Players the samples will already clamped at 1 and 8 is far above that.
Can we consider this as a faulty track?

@uklotzde
Copy link
Contributor Author

uklotzde commented Sep 12, 2019

@illuusio There are no relevant changes in other SoundSources. I only removed a constant from SoundSourceOpus that was used for testing.

@Be-ing
Copy link
Contributor

Be-ing commented Sep 14, 2019

Thank you for the review @illuusio! I'll go ahead and merge this now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants