
MultiThreading #628

Closed
oeai opened this issue Apr 20, 2014 · 17 comments

Comments
@oeai
Contributor

oeai commented Apr 20, 2014

Hi, I've searched the code for
QFuture
QtConcurrent
QThreadPool
QRunnable
and didn't find any of them.
http://qt-project.org/doc/qt-5/threads-technologies.html
These classes are supposed to use all available processors and manage threads, so I guess they would make things much more powerful on a 4-core CPU.
Sorry if this isn't really an issue - I don't know Qt. I see there are threads, but ZynAddSubFX is the only plugin that uses all the classes I've been searching for, so having different thread pools for different groups of synths, effects, the GUI and visualization plugins could ease a lot of things, especially basic playback.
But it would need some testing first, I guess.
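For reference, the Qt classes mentioned above all build on the same idea: a fixed pool of worker threads consuming queued jobs. A minimal sketch of that pattern in plain standard C++ (illustrative only; this is neither LMMS nor Qt code, and `ThreadPool`/`run_jobs` are hypothetical names):

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A tiny fixed-size thread pool, similar in spirit to QThreadPool:
// tasks are queued and executed by as many worker threads as the
// machine's CPU count suggests.
class ThreadPool {
public:
    explicit ThreadPool(unsigned n = std::thread::hardware_concurrency()) {
        if (n == 0) n = 1;  // hardware_concurrency() may return 0 if unknown
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (done_ && tasks_.empty()) return;  // drain, then exit
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
};

// Demo: run n trivial jobs through the pool and count completions.
int run_jobs(int n) {
    std::atomic<int> count{0};
    {
        ThreadPool pool;
        for (int i = 0; i < n; ++i)
            pool.submit([&count] { ++count; });
    }  // destructor drains the queue and joins the workers
    return count.load();
}
```

In Qt, `QThreadPool::globalInstance()` plays the same role and by default sizes itself to the number of CPU cores.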

@Sti2nd
Contributor

Sti2nd commented Apr 20, 2014

Nice find! If diiz or another developer approves of this, would you be able to help out making it come true?

@eagles051387

Does this have anything to do with making LMMS Real Time Safe?


Jonathan Aquilina

@diizy
Contributor

diizy commented Apr 21, 2014

AFAIK we already use separate threads for GUI and DSP. I'm not sure if we ever use more than two, though. Would have to ask Toby about the details.

@musikBear

And one thing to remember: how will a multithreaded architecture work on systems with 2 or even just 1 CPU?

@oeai
Contributor Author

oeai commented Apr 21, 2014

@musikBear - yes, that's why it needs testing, but you could stick with stable-1.0 for that case, and for everyone with a much more powerful CPU it would be a significant improvement. Even between 2 CPUs (2 cores), QThreadPool could balance all those things, and to make it more viable I thought of splitting even the synths into a few groups.
@Sti2nd - well, I'd like to, but I don't think I have enough skills for Qt. I haven't programmed in Qt, and @diizy is much more confident on that question, knowing how things are already done.
I'd suggest implementing a multithreading class, so that when -WANT_MORECPU is set it gets included in the compilation. I think it could list all available QThreads and dispatch to them depending on some CPU info. To make it funnier, I'd suggest putting it in the LMMS version: with 4 cores you'd get LMMS-4, or LMMS-2 for a Duo; sometimes they make 3, 12, 16 cores and 128++, so there would be a large set of versions. =)

@tobydox
Member

tobydox commented Apr 22, 2014

We have multi-threading in the Mixer. The MixerWorkerThread instances process independent play handles (i.e. rendering the sound of an instrument), effect chains etc. in parallel. That way we can make use of all available CPU cores. QtConcurrent etc. are not suited here, as they're not lightweight enough and probably not RT-safe. As a reference I suggest looking at the code in the master branch, as there have been major improvements/cleanups -> src/core/MixerWorkerThread.cpp
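To illustrate the shape of what's described here - independent jobs rendered in parallel, then combined - a rough stand-alone sketch follows. `PlayHandle` and `renderPeriod` are simplified stand-ins for illustration, not the actual LMMS API:

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a play handle: each one renders into its
// own buffer, so the jobs are independent and need no locking.
struct PlayHandle {
    std::vector<float> buffer;
    float value = 0.0f;
    void play(std::size_t frames) {
        buffer.assign(frames, value);  // pretend DSP work
    }
};

// Process every handle in parallel, then mix the results serially --
// the same overall shape: parallel worker jobs, serial mixdown.
std::vector<float> renderPeriod(std::vector<PlayHandle>& handles,
                                std::size_t frames) {
    std::vector<std::thread> workers;
    for (auto& h : handles)
        workers.emplace_back([&h, frames] { h.play(frames); });
    for (auto& w : workers) w.join();

    std::vector<float> mix(frames, 0.0f);
    for (const auto& h : handles)
        for (std::size_t i = 0; i < frames; ++i)
            mix[i] += h.buffer[i];
    return mix;
}
```

A real worker-thread design would reuse a fixed set of threads per period rather than spawning new ones, since thread creation is far too slow for RT audio; the sketch only shows the job independence.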

@tobydox tobydox closed this as completed Apr 22, 2014
@oeai
Contributor Author

oeai commented Apr 22, 2014

I didn't find CPU balancing there, and as I understand it you are using something like a WorkerScript.
QThreadPool implements serialization for QObjects, so that looks like a safer way for RT,
while I'd use QtConcurrent for the final rendering; but if it works well now, maybe it's OK.
In particular, I still don't understand the mix of C++ vs Qt, so I don't really understand what is done where.

@tobydox
Member

tobydox commented Apr 22, 2014

There's one worker thread for each CPU - that's it. They're not explicitly pinned, but that's not necessary and is a task for the OS scheduler. I also don't see why we should use different threading technologies for RT playback and rendering - that just adds unnecessary complexity and more code to maintain. Furthermore, we don't need QObject-specific features (like signals/slots) when rendering, as that would cost a lot of performance, and eventually there's no need for it.
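A minimal illustration of the "one worker per CPU, no explicit pinning" approach in standard C++ (`startWorkers` is a hypothetical name for illustration, not LMMS code):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Start one worker per logical CPU, as described above. The threads
// are not pinned to cores -- placement is left to the OS scheduler.
unsigned startWorkers() {
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 1;  // the call may return 0 when the count is unknown
    std::atomic<unsigned> started{0};
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n; ++i)
        workers.emplace_back([&started] { ++started; });  // real worker loop goes here
    for (auto& w : workers) w.join();
    return started.load();
}
```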

@oeai
Contributor Author

oeai commented Apr 22, 2014

Looks like I'm imagining using QFuture somehow, so I'm just thinking about something that isn't for this time yet.
Yes - more code; but for complex sounds 1 CPU is not enough, so the load of just 1 thread should be balanced between many CPUs (which is what QThreadPool does). Using different technologies can be the better way - just like Blender does: for a final render you can sit for 3 weeks at full CPU load, or just use the GPU to get the same result in about 2 hours. The difference between the final render and the working preview is huge. Knowing that there are different models of light rendering, it's clear to me that different render engines could in fact be applied to music as well: for example more detail per reverb, a wider room, deeper silence, more caustics or harmony for the music.
Put simply, the working mode should get optimizations, and the final render some detailed tune-ups.
So maybe QFuture would be able to do some of the work a little better, by disabling/enabling some threads while rendering. For me, for example, it would be nice to use 4 of my sound cards with different latencies for the final render - something I cannot set up now - and maybe distribute some work between them manually. It's perhaps not a standard situation, but there should at least be different optimization modes, and maybe different cards for the working output and the final render.
But really, let's say it's a feature request for more render modes; maybe not for this time yet ,' )

@tobydox
Member

tobydox commented Apr 22, 2014

Again, LMMS does already make use of all available CPU cores! Offloading rendering jobs to hardware other than the CPU would have to be done by the specific plugins themselves. Usually, exporting a project with high quality settings is still just a matter of minutes on modern computers. Blender is a completely different thing, as there's a lot more data to be processed, which at the same time can be processed through standardized computing/shading languages and thus offloaded to the GPU easily.

@diizy
Contributor

diizy commented Apr 22, 2014

I think what he's asking is: can one playHandle be divided to be processed by several cores at once...

I'd think the answer is no, since the rendering function is a single function call per period. Of course, if you use a very low period size, a single note could be divided across multiple cores, but each period of a single note is always on a single thread.

Which IMO is the way it should be. If we have to start worrying about thread safety even within the period processing loop, that makes writing plugins like 100000 times harder...

@oeai
Contributor Author

oeai commented Apr 22, 2014

A separate thread per instrument track.

@tobydox
Member

tobydox commented Apr 22, 2014

This is implicitly done. If you play e.g. 4 notes with an InstrumentTrack, the sound of these 4 notes will be processed by the available worker threads. If you have 4 CPUs, the 4 notes will be rendered in parallel. For single-streamed instruments, the individual InstrumentPlayHandles will be processed (rendered) in parallel as well, so basically there's one thread for e.g. each ZynAddSubFX instrument. The same goes for independent effect chains.

This is far more generic and less error-prone than starting to think about multi-threading in the DSP code itself. As long as things are split into enough single jobs, this scales well. When rendering projects on my QuadCore, LMMS usually has a CPU usage of 350-380%, which is quite good given that there are still other things which can't be parallelized (like encoding the output stream).

@unfa
Contributor

unfa commented Apr 27, 2014

I haven't measured it, but I always wondered why LMMS never uses 100% (or rather 400%) of my i5's 4 logical processors when rendering a track. It always leaves something like 20% of the total CPU time free.

@Sti2nd
Contributor

Sti2nd commented Apr 27, 2014

100% would maybe crash your PC, crash LMMS or make it unresponsive? My guess.

@tobydox
Member

tobydox commented Apr 27, 2014

As already said, 100% will never be possible, as there's always work which can't be parallelized and is thus computed by only one core each period, before and after doing the main work in parallel.
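This is Amdahl's law: if a fraction s of each period's work is inherently serial, the speedup on n cores is bounded by 1 / (s + (1 - s) / n). A quick sketch (the serial fractions are illustrative guesses, not measured LMMS numbers):

```cpp
// Amdahl's law: with a serial fraction s of the total work, the best
// possible speedup on n cores is 1 / (s + (1 - s) / n).
double amdahl_speedup(double s, int n) {
    return 1.0 / (s + (1.0 - s) / n);
}

// Illustrative only: a ~5% serial part (e.g. encoding the output
// stream) caps a quad core at roughly amdahl_speedup(0.05, 4) ~= 3.48,
// i.e. about 348% CPU usage rather than 400%.
```

With s around 0.05, a quad core tops out near 348%, which lines up with the 350-380% reported above.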

@diizy
Contributor

diizy commented Apr 27, 2014

Toby, since we're talking about threads, did you see the post I made with a question about the new FXmixer's thread usage... I'm wondering if it can be optimized more.
