-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Universal SoundSource for FFmpeg 4.x #1356
Conversation
Without reading code and noting that integration ain't working. What are main benefits to create new FFlMpeg soundsource and not to just refactor old? What are main things that should be tested? |
There is great demand for M4A/ALAC playback from our user base. Furthermore it is only a question of time until ancient libraries like FAAD2 and MAD will no longer be maintained. SoundSourceFFmpeg does not pass our rigorous tests and also shows audible artefacts when playing and seeking randomly in the test files. I was not able to understand how the existing code works and how to fix those issues. These were the decisions for adding a new SoundSourceFFmpeg31 implementation:
At least the code for opening files was partially reusable. Reimplementing the decoding loop seemed to be a task with moderate complexity, but I must admit that it was much harder than what I expected. |
Ok. There is much room for improving. Making FFMpeg work for every codec and container is harder than it should be. All other soundsources should be dropped soon if this should be the one because people tends to keep using old ones and not move to this one there is no reason to port them to new soundsource code. |
Do we have CPU load measures comparing MAD with FFMPEG mp3 encoding? |
@daschuer even if CPU load doubles for encoding moving to solid encoder interface provided by FFMpeg is worth it. I haven't found difference. They use libmp3lame so you can test with lame and see how they manage as they don't seem to have native version. |
I have found some benchmark results: Can we adopt these results to us? |
I've little bit looked code. @uklotzde are you trusting FFMpeg for current timestamp and length? As my experience why things are done like they are old FFMpeg soundsource for most cases (like MP3 VBR) DTS/PTS is just good guess like you say in Vorbis. |
@daschuer it depends how extreme with building options you want to go with FFMpeg. |
@illuusio FFmpeg might use dts internally for determining the correct seek position. The SoundSource uses pts from the decoded frames for determining the correct position. We are seeking with the flag AVSEEK_FLAG_BACKWARD which should guarantee that we always land before the target position. Of course, we might need to skip some samples before reaching our target position. I added a VBR test file and noticed that there was a bug when the audio file is actually shorter when decoded than initially reported by FFmpeg. Fixed. |
Yes @uklotzde I know difference between PTS and DTS. Older version they are mostly same and I don't see why they are not that in newer versions. |
Trying this out. Is there option to disable all the other SoundSources to make FFMpeg rule them all? |
The priority is already set to HIGHER, that's sufficient. The whitelisting is controlled by getSupportedFileExtensions(), OggVorbis and FLAC are still disabled. |
I got this often. First rows are normal FFMpeg warning and rest are something that is wrong?
Or it this the bug report on FFMpeg mailinglist about? |
I was curious if this PR would fix Bug 1669500 (scratching backwards over loop_in disables loop), so I built with ffmpeg sources for Trusty (3.2.4) from here. And indeed it does fix it, at least for some wav and mp3 files. |
@illuusio Decoding of that mp3 files needs to be analyzed in detail by tracing all packets and frames. It might either be FFmpeg itself or our usage of FFmpeg that causes those failures. Even if the file is corrupt, decoding should not produce unexpected errors. It is also crucial to test with a clean build. I experienced strange errors with partial builds after switching branches and don't trust scons. |
Is there something that needs particular testing. This works as expected with simple playing patterns. |
@illuusio Known issue: The duration for MP3 is imprecise. Carl Eugen from FFmpeg recommended that we should not rely on the duration reported by the stream! If we need to know the exact duration upfront we should instead parse the file from beginning to end, at least for MP3. They are mostly dealing with infinite streams and don't need to know the exact length even if it may exist. Parsing should be faster than decoding, but still needs to read the whole file once. And I haven't figured out yet how to do it correctly and efficiently. |
@uklotzde fastest way I figured out is in soundsourceffmpeg. One just reads frames but not decode to packages (Which can also lead incorect place with VBR). |
What is current status with this? I have used this few times and it worked ok it there something to test on? |
There is still one open issue that needs to be solved before integrating this PR: We need a quick scan through encoded files to reliably determine the exact number of sample frames = sum of all encoded frame lengths. |
Sometimes I feel we need mixxx-next which contains experimental stuff like this and they don't get stuck |
src/sources/soundsourceffmpeg31.cpp
Outdated
|
||
SoundSourceProviderPriority SoundSourceProviderFFmpeg31::getPriorityHint( | ||
const QString& /*supportedFileExtension*/) const { | ||
// TODO: Increase priority to HIGHER if FFmpeg should be used as the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rephrase this a bit more weak? I know there are different opinions around we have to discuss, but this should not block this PR.
How about this:
// FFmpeg has the LOWER to be used as a fallback in case Mixxx has no own implementation. Increase priority to HIGER if you wish that ffmpeg decodes all files.
IMHO this should stick on LOWER forever. If we wish that ffmpeg is used we can remove the other implementation.
In case of mp3 and libmad, I am satisfied with the current implementation. While libmad is rather old, we have ironed out many cases and are rejecting files that are played with heavy sound artefacts in ffmpeg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
Unfortunately the API of FFmpeg changes constantly in combination with an intransparent and almost erratic behavior, it's a nightmare! I guess maintenance could become a never ending story and I'm not willing to tweak this piece of code constantly.
need to update build/depends.py and build/features.py |
"Enabled by compiling with ffmpeg31=1" |
.. okay, don't need in depends.py, but without the path fix in features.py, will error. |
I did first read through the code. There is some FFMpeg rough points which I have been hitting my head while I have write code for FFmpeg 4.x series. If they bug me enough I'll report them If not they work and let them be as code is solid as is now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over all bigger problem was that there is too long methods. They should be cut down to more easily understandable blocks
avcodec_register_all(); | ||
#endif | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these just be removed if this is targeted 4.x which in time means these are not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In features.py we only require a minimum major version:
- libavcodec 58
- libavformat 58
- libswresample 3.1
Any differences that depend on minor versions must be handled by conditional compilation.
// Those 9 frames should at least drain the bit reservoir. | ||
DEBUG_ASSERT(avStream.codecpar->channels <= 2); | ||
const SINT mp3SeekPrerollFrameCount = | ||
9 * (kSamplesPerMP3Frame / avStream.codecpar->channels); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 9?
QStringList list; | ||
|
||
// Collect all supported formats (whitelist) | ||
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(58, 9, 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we drop all together old cruft and support only newest version which is available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above. Requiring the most recent version would be too restrictive and it is often not available, e.g. Fedora 30 stays on FFmpeg 4.1.x.
We define minimum major versions in features.py and use version guards in the code only if required.
list.append("m4v"); | ||
continue; | ||
} else if (!strcmp(pavInputFormat->name, "mov,mp4,m4a,3gp,3g2,mj2")) { | ||
list.append("mov"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some difference in these? mp4 container is same with m4v and m4a?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@illuusio I didn't understand your comment. What shall we do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a mapping from format string to individual file extensions.
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(58, 18, 100) | ||
// Align the time base of the context with that of the selected stream | ||
av_codec_set_pkt_timebase(pavCodecContext, pavStream->time_base); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again should this just been dropped rather than #if?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, see above.
src/sources/soundsourceffmpeg.cpp
Outdated
<< formatErrorMessage(av_seek_frame_result).toLocal8Bit().constData(); | ||
m_curFrameIndex = kFrameIndexInvalid; | ||
return ReadableSampleFrames(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs some comments what this stuff does
src/sources/soundsourceffmpeg.cpp
Outdated
// This is expected behavior and will be compensated during 'preskip' | ||
// (see below). | ||
|
||
if (m_pSwrContext) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would hope that resampling is in own method or own file in here it makes this too long.
clearSampleCount); | ||
readFrameIndex += missingFrameCount; | ||
} | ||
if (!decodedFrameRange.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment?
m_sampleBuffer.adjustCapacity(sampleBufferCapacity); | ||
} | ||
DEBUG_ASSERT(m_sampleBuffer.writableLength() >= sampleBufferWriteLength); | ||
if (missingFrameCount > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment?
@@ -304,5 +192,3 @@ class SoundSourceProviderFFmpeg : public SoundSourceProvider { | |||
}; | |||
|
|||
} // namespace mixxx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda understand that everything is in one file and there is just Inner classes but it makes understanding this class very very difficult.
@illuusio Thank you for the comprehensive review, greatly appreciated. I will primarily update the code or add more comments to preserve this information for the future. I will only add comments here for clarification. |
Done |
MAD killer sample: libmad_decode_errors.zip FFmpeg: ✔️ |
I don't know if they are any good small things basicly. I'll do last reading when you feel it's 'ready-to-merge' with sharper eye. |
What is the state of this? What remains to be done before merging? @illuusio did you want to do a more thorough review? |
For me code is ready to get in. One thing that bothers me is why there is changes in all SoundSources? Should they first get in @uklotzde? |
Ah I see. this tracks exceeds mad signed int with the range of +- ~8 and overflows. |
@illuusio There are no relevant changes in other SoundSources. I only removed a constant from SoundSourceOpus that was used for testing. |
Thank you for the review @illuusio! I'll go ahead and merge this now. |
Open Issues
The duration of MP3 files is imprecise as reported by FFmpeg in AVStream. This value might simply be calculated from the average bitrate and the length of the file. Workaround: Parse the whole MP3 file upfront for correctly determining the exact duration. Parsing should at least be faster than decoding. A similar strategy is used in SoundSourceMP3 already.Duration of MP3 files is reported as a multiple of the MP3 frame size, i.e. 1152 samples. If the last MP3 frame is incomplete then we will simply add silence. This is acceptable for DJing, no action needed.Warning [AnalyzerThread 0 #1]: SoundSourceFFmpeg4 - Overlapping sample frames in the stream: [-1105 -> 0)
Preamble
Based on PR AudioSource v2 API #1317 (AudioSource v2 API) -> will be rebased frequently on this branchffmpeg31=1ffmpeg4=1NOT supported by Ubuntu Trusty -> disabled in CI buildsThis one was really a challenge!! But I hope it was worth it.
The immediate reward is reliable M4A and ALAC decoding for (almost) everyone. In the long term I expect this to become the main SoundSource of Mixxx, replacing most of the custom implementations. The first decoders that could (or should?) be declared as legacy are SoundSourceM4A, SoundSourceFFmpeg,
SoundSourceMP3(still needed for Windows), andSoundSourceWV(still needed for Windows).I've tested it with all the corrupt files that I have collected over time. None of them is able to crash Mixxx or trigger a debug assertion. Even better: Some of them no longer need to be considered as corrupt!
Notes:
Fixes the following bugs:
https://bugs.launchpad.net/mixxx/+bug/1336982
https://bugs.launchpad.net/mixxx/+bug/1665369
Essentially finished, but this piece of code should be tested thoroughly!
[Update 2017-10-02]
With my revised implementation it should now be much easier to follow and understand the code. It even reveals known bugs in FFmpeg that prevent us from enabling OggVorbis and FLAC support.
[Update 2017-10-02]
Added a workaround for decoding of AAC files that have been encoded with iTunes 12.3.0! I discovered this issue after fixing my calculation of the frame index range that didn't take the start time of a stream into account. The workaround is effective for all AAC files with a start time > 0, because we are not able to distinguish them properly. The only drawback is that some samples at the end might be cut off.
[Libav-user] Wrong duration of AAC files encoded by iTunes 12.3.0
[Update 2018-02-08]
The handling of
start_time
andduration
inAVStream
seems to have changed for FFmpeg 4.x. I adjusted the implementation and all tests now pass. The workaround for AAC files encoded with iTunes 12.3.0 is no longer necessary, but we should keep the test files.