-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RingDelayBuffer: ring buffer for delay handling #4852
Conversation
This commit adds a new ring buffer. This ring buffer serves for handling a delay. For the first use case was specially created for the EngineEffectsDelay, which handles the delay, of the effects (in a chain). The ring delay buffer allows moving with the reading position subject to certain rules. Another difference between the classic ring buffer is, that the ring delay buffer offers to read zero values which were not written by using the write method and write position. Both of these two specific properties are based on the cross-fading between changes of two delays and the classic ring buffer cannot be used.
The commit adds tests for the ring buffer for handling delay. The test set includes tests for testing the RingDelayBuffer::isEmpty, RingDelayBuffer::isFull, RingDelayBuffer::clear, then tests for checking the number of available items for reading and writing and at last tests for reading and writing without skipping the position with the read position and including skip with the read position on both sides and with both variants (circle around / not circle around).
The commit adds the original license for the ring buffer from the Portable Audio I/O Library. The ring buffer from the pa_ringbuffer.c was used as a template for some parts of code in RingDelayBuffer. Based on that, the original license is added to cpp file and header. In the header, the modified and added functions are described.
This commit adds clearing of the m_jumpLeftAroundMask. When the RingDelayBuffer::clear method is called, the mentioned variable for masking, when the left side jump crossed the left side of the delay buffer, is set to zero.
This commit allows, that the size of the jump with the reading position to the right can be equal to the number of reading available items.
This commit solves the maximum size of the jump to the left for the reading position. Based on that, the comments for ASSERTs are updated and the zero size jump is handled separately.
The commit adds benchmarks for testing the RingDelayBuffer. The benchmarks test the RingDelayBuffer::write, RingDelayBuffer::read and RingDelayBuffer::moveReadPositionBy for the case without skipping, the case with a jump to the left without circling and jump to the left with circling around.
Thanks. Before I go into detail reviewing this. it would make sense to first introduce the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added some thoughts
src/test/ringdelaybuffer_test.cpp
Outdated
class RingDelayBufferTest : public MixxxTest { | ||
protected: | ||
void SetUp() override { | ||
m_pRingDelayBuffer = new RingDelayBuffer(m_ringDelayBufferSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can become an unique_ptr
src/util/ringdelaybuffer.cpp
Outdated
m_ringMask(bufferSize - 1), | ||
m_jumpLeftAroundMask(0), | ||
m_buffer(bufferSize) { | ||
// TODO(davidchocholaty) Handle to allow only power of two for the size of the ring delay buffer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably already now this function:
Line 41 in 9b9fbaa
inline int roundUpToPowerOf2(int v) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO not needed, also there is std::bit_ceil
in C++20's <bit>
header. It's constexpr and also likely faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Benchmark supports that its at least equal, though I guess the performance of bit_ceil
depends very much on whats available on the target architecture (which you can't specify on Quickbench). https://quick-bench.com/q/3FOpFZ_ZaAfJOtWWl1Xx7MFC7JQ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you both for your recommendations. Anyway, after code refactorization which removes the binary operations, the "power of two" condition is no longer required.
src/util/ringdelaybuffer.cpp
Outdated
|
||
// Set the ring delay buffer items to 0. | ||
CSAMPLE* bufferData = m_buffer.data(); | ||
memset(bufferData, 0, sizeof(*bufferData) * bufferSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be not necessary. The buffer should never return uninitialized zeros. In out of memory situation the usage code needs to fade to zero to avoid clicks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initializing values on zero was created, due to it is possible to move with m_readPos
to left over the left delay buffer bound. For the current code version, it is allowed in the beginning too. Depending on that, the read position can read values, which weren't written by m_writePos
. If I'm thinking about it, not a good practice. It is possible to allow skipping to the left only if the items were written by m_writePos
with VERIFY_OR_DEBUG_ASSERT
.
src/util/ringdelaybuffer.cpp
Outdated
itemsToRead = available; | ||
} | ||
|
||
const SINT position = m_readPos & m_ringMask; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if the concept of a member variable m_readPos is still suitable.
For my understanding we have the current sample at m_writePos - 1 and the delayed samples before that so the actual read position is m_writePos - "current delay" - itemsToRead
We need to make sure this does not hit the end of the buffer.
Idea: Pass the delay along to the read function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for this idea. Sounds cool and I think it cleans up the code a lot. I have just one thing on my mind for two cases. Based on the proposed new version, the ring delay buffer will behave a little bit differently.
-
If, for example, 8 samples will be written into the ring delay buffer, and then the different amount of samples will be read with zero delays, just propose 4 samples, so, for the previous version, the ring buffer will read the first half of the written samples by index 0, however, the new version will read the second half of the written samples (8 - 0 - 4).
-
If the write method will be called two times in a row without calling the read method, the situation will be quite similar to the first one.
For our future use case for EngineEffectsDelay
, this cannot happen. Anyway, if I will think only about RingDelayBuffer
, it may be possible. What do you think about that? I don't want to throw down somehow this idea, I really like it and I would like to use it. I'm only thinking about possible cases, which can happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to re-open a discussion about this idea. I would like to discuss just one thing which I noticed. If the new version will be used, then a lot of checks of delay value have to be run for every read()
call, probably all code in moveReadPositionBy()
. Can I ask what is your view on this situation?
Another inspiration for a singlethreaded Ringbuffer: https://hg.sr.ht/~breakfastquay/rubberband/browse/src/common/SingleThreadRingBuffer.h?rev=tip |
This commit removes unnecessary explicitly created virtual destructor.
The commit removes the variable for storing the size of the delay buffer. Instead of the mentioned variable, the SampleBuffer::size function is used, due to the type of the delay buffer, which is mixxx::SampleBuffer.
This commit improves const correctness a few functions and parameters. The newly const is added to functions: RingDelayBuffer::isFull, RingDelayBuffer::getReadAvailable, RingDelayBuffer::getWriteAvailable. Then the numItems parameter for RingDelayBuffer::read and RingDelayBuffer::write is const and the jumpSize parameter for RingDelayBuffer::moveReadPositionBy.
The commit removes the information, that the ring buffer is safe for single-thread only. This information is a rewrite before the class.
This commit replaces the use of RingDelayBuffer* with unique_ptr.
Thank you for this tip. |
I would like to post a little info about the possible choice of using only memcpy:
SampleUtil::copy:
|
I think, that I should just summarize some of your reviews and the current code look. I would like to discuss, the problem, for most of the review's indirectly points. The problem, why the current code look is in some cases a little bit over-complicated is, that to follow the same behaviour as in EngineEffectsDelay, read of uninitialized (zero values) has to be enabled. The situation may arise for example for the following workflow: Precondition: ring delay buffer is empty with uninitialized (zero) values and the read and the write positions are zero.
In the shown situation, the uninitialized (zero) values have to be read. In Now, here are two options:
Now, I would like to open a small discussion about this behaviour problem. |
I don't understand how it can happen that the read position is ahead of the write position. That would mean that you are trying to read more samples than given. When used in the filter delay, that would mean that there was a negative delay. Can you elaborate on the exact circumstances that would lead to this? Irregardless of this, I still don't understand why the read position calculation has to as complicated as it currently is. |
The read position calculation can be of course simplified by using modulo operation instead of AND operation. That's of course possible. I didn't mention it that strongly with the phrase _" but still the current code look can be refactorized a lot." _. The reason, why I dwell so much on the mentioned situation is, that with simplification this situation must be taken into account. Based on your's and daschuer's reviews, the code can be simplified of course a lot, but IMO it makes sense to dig through simplification with the final decision for the mentioned situation because it is not at all common for a ring buffer as known and doesn't make sense to reimplement something without clear vision what it can and can't do. |
Yeah, I understand that it is not easy to understand this situation without all the cases around. I think, that the best will be an example with calculations from the Just for info, the For the following examples, I will assume kMaxDelay = 192000, but it doesn't matter now. So, the first
The second
This is the situation, which I have in mind because at the start with a clear buffer, there isn't written data on 191996 index. Please, let me now if it is a little bit more clearer. I would like to know, "if we are on the same page" with this problem. |
Ah, I think understand. Yes in that case we should just have the buffer pre-filled with zeros IMO. |
Okay, perfect. Thank you. So now when we agreed to allow the mentioned special case, I will simplify all the calculations asap. |
When it happens that you read out the initial zeros, you will hear a click sound when the real samples starts. This can be avoided by fading in the fist input buffer. |
Good point. Thank you for this tip. |
This commit removes invasive manipulation with handling sizes and memsetting. Instead, the fill method is used.
This commit replaces manual handling with memcpy by using SampleUtil::copy. This provides using a vectorized loop for 32-bit SSE instead of memcpy (based on benchmarking).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thank you.
I have added some suggestions for improvements.
src/util/ringdelaybuffer.cpp
Outdated
m_buffer.fill(0); | ||
} | ||
|
||
void RingDelayBuffer::copy(const ReadableSlice pSourceBuffer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a free function now, we this can go to an an anonymous namespace or if we think we can use it elsewhere, it can be moved sample.cpp.
This is not a plain copy it is a copyRing()
or such.
Does it work if the source sourcePos AND destPos are not 0? I think not. We may either assert that or add the case with three copy calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mentioned problems should be fixed, to summarise:
copy()
renamed tocopyRing()
copyRing()
is moved into an anonymous namespace and serves as a helper function only
Does it work if the source sourcePos AND destPos are not 0? I think not. We may either assert that or add the case with three copy calls.
I think I understand, what you mean. IMO for the case that such a huge numItems
value will be provided, which will be many times greater than the size of the source and destination buffer too, it would be needed much more copies than just three. It cannot occur for the actual usage but may be possible. Based on that multiple copies would not be used now, I would prefer the assert-way solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an other (unused) case in mind for three copies: Copy from a ring to a ring.
Source:
123456789
Destination
123456789
Copy 8 samples
Read pointer at 5 write pointer at 7
Copy
5...7 to 7...9 (3)
7...9 to 1..3 (3)
1..2 to 4..5
This can't happen if one of the pointers points to the start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asserting that this not happens works for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I missed that you did it already.
Thank you.
src/util/ringdelaybuffer.cpp
Outdated
SINT RingDelayBuffer::read(CSAMPLE* pBuffer, const SINT itemsToRead, const SINT delayItems) { | ||
const SINT shift = itemsToRead + delayItems; | ||
|
||
if (shift > m_buffer.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (shift > m_buffer.size()) { | |
VERIFY_OR_DEBUG_ASSERT(shift <= m_buffer.size()) { |
src/util/ringdelaybuffer.cpp
Outdated
} | ||
|
||
SINT RingDelayBuffer::write(const CSAMPLE* pBuffer, const SINT itemsToWrite) { | ||
if (itemsToWrite > m_buffer.size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (itemsToWrite > m_buffer.size()) { | |
VERIFY_OR_DEBUG_ASSERT(itemsToWrite <= m_buffer.size()) { |
This commit replaces the basic if statements for testing the invalid amount of item values with VERIFY_OR_DEBUG_ASSERT.
This commit renames the RingDelayBuffer::copy function into RingDelayBuffer:copyRing.
The commit moves the RingDelayBuffer::copyRing into an anonymous namespace and removes the function from the RingDelayBuffer class. So, the function serves as a helper function only.
The commit adds handling structure for the situation when the number of items to copy causes, that both buffers have to cross their upper bounds and circle around at least once. For the current RingDelayBuffer::copyRing usage this situation is not required, so, the VERIFY_OR_DEBUG_ASSERT is added.
src/util/ringdelaybuffer.cpp
Outdated
using ReadableSlice = mixxx::SampleBuffer::ReadableSlice; | ||
using WritableSlice = mixxx::SampleBuffer::WritableSlice; | ||
|
||
namespace { | ||
SINT copyRing(const ReadableSlice pSourceBuffer, | ||
SINT sourcePos, | ||
const WritableSlice pDestBuffer, | ||
SINT destPos, | ||
const SINT numItems) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm planning to deprecate mixxx::SampleBuffer::*Slice
. Why not use std::span
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, std::span
can be used. I thought, that I should rather work with span through SampleBuffer
and forgot, that I can call the mixxx::spanutil::spanFromPtrLen()
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, you just use spanFromPtrLen
as an adapter so to speak and then try to use std::span as much as possible in any APIs that take the usual (pointer, size)
pair.
This commit adds the default value assignment for copiedItems variable. The initialization is added to avoid potentially uninitialized memory.
This commits tries to avoid the uninitialized memory due to creating the variable before the if-else statement without assigning the default value. One solution could be assigning the default value, anyway, some IDEs can warn that the variable is unused and primarily it will hide real errors like that the variable is not set in one branch. Based on that, the lambda solution is used.
This commit replaces the mixxx::SampleBuffer::ReadableSlice and mixxx::SampleBuffer::WritableSlice with std::span and the mixxx::spanutil::spanFromPtrLen helper function.
This commit renames the pSourceBuffer and pDestBuffer in copyRing function onto sourceBuffer and destBuffer without the use of the 'p' prefix, which is used for pointers.
This commit introduces the std::span in RingDelayBufferTest. The span is primarily used in the RingDelayBufferTest::AssertIdenticalBufferEquals and for all calls of this function.
Just a little sum up. To make this PR ready, the last thing from my side is to finish the documentation and descriptive comments, if the previous changes will pass your reviews. The last thing, that I have on my mind is, that the mentioned licence for |
This commit renames the m_firstInputBuffer onto m_firstInputChunk due to it better describes the nature of the situation. Actually, in a general view, the user code has not had to pass to the RingDelayBuffer::write function the input buffer at once but could break it into small chunks.
This commit adds the documentation comments for the RingDelayBuffer class and adds some description comments for the code parts for better understanding.
This commit improves the const-correctness by making the RingDelayBuffer::size function returned value as a constant expression due to it can be evaluated in the compile time.
IANAL, but I don't think its required anymore... |
This commits introduces std::span in the RingDelayBuffer API. So, the parameter for RingDelayBuffer::read is only the destination buffer (using span) and delay items. The RingDelayBuffer::write has just one parameter, the source buffer (span as well). Depended tests are upgraded using span in the function callings too.
This commit removes the licence for the mixxx/lib/portaudio/pa_ringbuffer.c. The licence is no longer required due to major previous changes and the creation of a new implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the nitpicks again....
This commit changes to avoid creating a span from the SampleBuffer's data using SampleBuffer::data and the implemented SampleBuffer::span function is used instead.
Absolutely no problem, this is how I should have written it the first time, I just did it too much automatically. Now, it should be fixed. The indent was kept to make it a little bit more readable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'll go ahead and merge now so you can build of this.
Perfect, thank you very much for merging. |
The new ring buffer is introduced. The use case for the ring buffer is
for delay handling, but it also could be used as a classic ring buffer
with a jumping option with the reading position. It is based
on the classic known ring buffer. The extensions are,
that the ring delay buffer allows moving with the reading position
subject to certain rules. Another difference between the classic
ring buffer is, that the ring delay buffer offers to read zero values,
which were not written by using the write method and write position.
Both of these two specific properties are based on the cross-fading
between changes of two delays.