
Multiprocess issue #468

Closed

darrencl opened this issue Mar 9, 2020 · 11 comments
Comments

@darrencl

darrencl commented Mar 9, 2020

Hi, not sure if I should open this in Distributed.jl or here. Sometimes when I train my pipeline with acceleration=MLJ.CPUProcesses(), acceleration_resampling=MLJ.CPUThreads() I get the following error (the logs are mixed up with those of other processes, so you can ignore the irrelevant lines). I am using 64 processes with 64 threads.

ERROR: LoadError: ProcessExitedException(19)
Stacktrace:[ Info: Training NodalMachine{CorrelationFeatureSelector} @ 466.

 [1] (::Base.var"#732#734")(::Task) at ./asyncmap.jl:178
 [2] foreach(::Base.var"#732#734", ::Array{Any,1}) at ./abstractarray.jl:1920
 [3] maptwice(::Function, ::Channel{Any}, ::Array{Any,1}, ::Array{MyTreePipe,1}) at ./asyncmap.jl:178
 [4] wrap_n_exec_twice(::Channel{Any}, ::Array{Any,1}, ::Distributed.var"#208#211"{WorkerPool}, ::Function, ::Array{MyTreePipe,1}) at ./asyncmap.jl:154
 [5] #async_usemap#717(::Function, ::Nothing, ::typeof(Base.async_usemap), ::Distributed.var"#192#194"{Distributed.var"#192#193#195"{WorkerPool,MLJTuning.var"#8#9"{Machine{Resampler{StratifiedCV,MyTreePipe}},Int64,Grid,Nothing}}}, ::Array{MyTreePipe,1}) at ./asyncmap.jl:103
 [6] (::Base.var"#kw##async_usemap")(::NamedTuple{(:ntasks, :batch_size),Tuple{Distributed.var"#208#211"{WorkerPool},Nothing}}, ::typeof(Base.async_usemap), ::Function, ::Array{MyTreePipe,1}) at ./none:0
 [7] #asyncmap#716 at ./asyncmap.jl:81 [inlined]
 [8] #asyncmap at ./none:0 [inlined]
 [9] #pmap#207(::Bool, ::Int64, ::Nothing, ::Array{Any,1}, ::Nothing, ::typeof(pmap), ::Function, ::WorkerPool, ::Array{MyTreePipe,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Distributed/src/pmap.jl:126
 [10] pmap(::Function, ::WorkerPool, ::Array{MyTreePipe,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Distributed/src/pmap.jl:101
 [11] #pmap#217(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(pmap), ::Function, ::Array{MyTreePipe,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Distributed/src/pmap.jl:156
 [12] pmap at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Distributed/src/pmap.jl:156 [inlined]
 [13] assemble_events(::Array{MyTreePipe,1}, ::Machine{Resampler{StratifiedCV,MyTreePipe}}, ::Int64, ::Grid, ::Nothing, ::ComputationalResources.CPUProcesses{Nothing}) at /home/tdlukas/.julia/packages/MLJTuning/qFW6q/src/tuned_models.jl:227
 [14] build(::Nothing, ::Int64, ::Grid, ::MyTreePipe, ::NamedTuple{(:models, :fields, :parameter_scales),Tuple{Array{MyTreePipe,1},NTuple{10,Expr},NTuple{10,Symbol}}}, ::Int64, ::ComputationalResources.CPUProcesses{Nothing}, ::Machine{Resampler{StratifiedCV,MyTreePipe}}) at /home/tdlukas/.julia/packages/MLJTuning/qFW6q/src/tuned_models.jl:259
 [15] fit(::MLJTuning.ProbabilisticTunedModel{Grid,MyTreePipe,ComputationalResources.CPUProcesses{Nothing},CPUThreads{Nothing}}, ::Int64, ::DataFrame, ::CategoricalArray{String,1,UInt32,String,CategoricalString{UInt32},Union{}}) at /home/tdlukas/.julia/packages/MLJTuning/qFW6q/src/tuned_models.jl:292
in expression starting at /dmf/tri_services/Facilities/Imaging/Darren/DoD_classifier/compare_approaches.jl:216
[ Info: Training NodalMachine{DecisionTreeClassifier} @ 129.
┌ Warning: Forcibly interrupting busy workers
│   exception = rmprocs: pids [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65] not terminated after 5.0 seconds.
└ @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Distributed/src/cluster.jl:1219
[ Info: Training NodalMachine{CorrelationFeatureSelector} @ 486.
┌ Warning: rmprocs: process 1 not removed
└ @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Distributed/src/cluster.jl:1015

I'm not sure what's happening here, since this doesn't always happen even though the code is the same. Might it be an issue with my HPC's CPU scheduling (I am using a PBS cluster)?
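For reference, a minimal sketch of the kind of setup described above (the pipeline, ranges, and data names are placeholders, not taken from the original script):

```julia
using Distributed
addprocs(64)              # one Julia worker per requested core/slot

@everywhere using MLJ

# Hypothetical pipeline and hyper-parameter ranges (placeholders):
# tuned = TunedModel(model=pipe, tuning=Grid(), range=ranges,
#                    resampling=StratifiedCV(nfolds=5), measure=cross_entropy,
#                    acceleration=CPUProcesses(),           # models distributed over workers
#                    acceleration_resampling=CPUThreads())  # CV folds run on threads
# mach = machine(tuned, X, y)
# fit!(mach)
```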

@OkonSamuel
Member

OkonSamuel commented Mar 12, 2020

@darrencl. I think you're right: the issue is with the PBS cluster scheduler. Some processes were interrupted by the cluster (maybe due to other users using it). I am not familiar with the PBS cluster, but you could check out the ClusterManagers package for more info.

@darrencl
Author

@OkonSamuel Thanks! My pipeline takes a long time to train because pre-processing occurs inside the cross-validation, in order to tune pre-processing hyper-parameters. This is obviously slower, especially when training models that have several hyper-parameters to tune. It can even take a few days to train the pipeline (maybe partly due to old hardware and a shared/scheduled system).

I know that in deep-learning frameworks like Keras we're able to stop and resume training, so similarly, when my script breaks in the middle, is there any workaround to let the pipeline resume gracefully?

@OkonSamuel
Member

OkonSamuel commented Mar 13, 2020

@darrencl. At the moment I don't think MLJ has the ability to pause and resume training from where it left off, but @ablaom or @tlienart can correct me on this if I'm wrong. This seems like a nice feature to have.

@ablaom
Member

ablaom commented Mar 15, 2020

The plan, which is yet to be implemented, is to provide a model wrapper, IterativeModel (say), for any iterative model, to enable its "external" control. "Control" would include things like serialization between iterations to allow restart after a crash, applying early-stopping criteria, and so forth. This wrapper would work for any iterative model that implements an appropriate update method, which includes TunedModel, so tuning would be covered.
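A purely speculative sketch of how such a wrapper might be used. IterativeModel and the control names below are hypothetical, since the feature is not implemented at the time of writing:

```julia
# Hypothetical API, not implemented in MLJ at the time of writing:
# controlled = IterativeModel(model=tuned_model,
#                             controls=[Save("checkpoint.jlso"),  # serialize between iterations
#                                       Patience(5)])             # early-stopping criterion
# mach = machine(controlled, X, y)
# fit!(mach)   # after a crash, training could resume from the saved checkpoint
```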

@ablaom
Member

ablaom commented Mar 15, 2020

Closing, as the original issue seems not to be MLJ-related. The other discussion can continue at #139.

@ablaom ablaom closed this as completed Mar 15, 2020
@darrencl
Author

Hi @ablaom @OkonSamuel, I have a question about multiprocessed logs when using Distributed's addprocs on a local machine. I am training a model using an MLJ pipeline whose grid search is accelerated with multiprocessing; below is the log. You can see that a worker sometimes runs for a few hours, while ideally it should run for ~20 minutes. Is this just because of the scheduler? Note that I have 8 cores on my local machine, so in theory I can run 8 processes at a time (here I spawn 5 additional processes, with multiple threads in each process via BLAS.set_num_threads()). To be precise, the multithreading runs over the 5 folds (i.e. each fold has its own thread), while each whole 5-fold evaluation runs in one process. As a side note, this runs in an IJulia notebook.

(base) darren@darren-Latitude-7490:~/Documents$ cat train.log | grep -A 3 Evaluating
Evaluating over 5 folds: 100%[=========================] Time: 0:27:52
┌ Info: Pre-processing done, putting into df
└   timer = 36 seconds, 340 milliseconds
      From worker 5:	(preprocessor = (skew = 0.2, t1pad = 1, pow = 2, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
--
Evaluating over 5 folds: 100%[=========================] Time: 0:29:22
      From worker 6:	(preprocessor = (skew = 0.1, t1pad = 1, pow = 1, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 6:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6951190812943034])
[ Info: Training Machine{Resampler} @ 1…90.
--
Evaluating over 5 folds: 100%[=========================] Time: 0:24:30
      From worker 5:	(preprocessor = (skew = 0.3, t1pad = 1, pow = 3, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 5:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.7095121584895735])
[ Info: Training Machine{Resampler} @ 6…64.
--
Evaluating over 5 folds: 100%[=========================] Time: 0:24:01
      From worker 6:	(preprocessor = (skew = 0.5, t1pad = 1, pow = 1, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 6:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6925936572433181])
[ Info: Training Machine{Resampler} @ 1…72.
--
Evaluating over 5 folds: 100%[=========================] Time: 0:27:01
      From worker 5:	(preprocessor = (skew = 0.4, t1pad = 1, pow = 3, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 5:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6937507751226161])
[ Info: Training Machine{Resampler} @ 9…27.
--
Evaluating over 5 folds: 100%[=========================] Time: 0:25:32
      From worker 5:	(preprocessor = (skew = 0.1, t1pad = 1, pow = 3, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 5:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6943046187258485])
[ Info: Training Machine{Resampler} @ 4…85.
--
Evaluating over 5 folds: 100%[=========================] Time: 2:14:07
      From worker 2:	(preprocessor = (skew = 0.5, t1pad = 2, pow = 2, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
┌ Info: Pre-processing done, putting into df
└   timer = 3 minutes, 56 seconds, 636 milliseconds
--
Evaluating over 5 folds: 100%[=========================] Time: 2:15:00
      From worker 3:	(preprocessor = (skew = 0.4, t1pad = 2, pow = 1, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 3:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6922140272354853])
[ Info: Training Machine{Resampler} @ 1…10.
--
Evaluating over 5 folds: 100%[=========================] Time: 0:27:51
      From worker 3:	(preprocessor = (skew = 0.3, t1pad = 1, pow = 2, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 3:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6867009488169642])
[ Info: Training Machine{Resampler} @ 1…48.
--
Evaluating over 5 folds: 100%[=========================] Time: 2:24:03
      From worker 2:	(preprocessor = (skew = 0.5, t1pad = 2, pow = 5, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 2:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6900283230710006])
[ Info: Training Machine{Resampler} @ 1…76.
--
Evaluating over 5 folds: 100%[=========================] Time: 2:24:58
      From worker 3:	(preprocessor = (skew = 0.1, t1pad = 2, pow = 2, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 3:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.691078816371036])
[ Info: Training Machine{Resampler} @ 1…33.
--
Evaluating over 5 folds: 100%[=========================] Time: 5:41:26
      From worker 4:	(preprocessor = (skew = 0.5, t1pad = 3, pow = 2, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 4:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.7106614867732746])
┌ Info: Pre-processing done, putting into df
--
Evaluating over 5 folds: 100%[=========================] Time: 6:15:00
      From worker 2:	(preprocessor = (skew = 0.1, t1pad = 3, pow = 4, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 2:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6878213435314848])
[ Info: Training Machine{Resampler} @ 3…30.
--
Evaluating over 5 folds: 100%[=========================] Time: 6:20:20
      From worker 3:	(preprocessor = (skew = 0.2, t1pad = 3, pow = 4, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 3:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6897869050018242])
[ Info: Training Machine{Resampler} @ 7…61.
--
Evaluating over 5 folds: 100%[=========================] Time: 0:33:12
      From worker 3:	(preprocessor = (skew = 0.4, t1pad = 1, pow = 2, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 3:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.688506274833862])
[ Info: Training Machine{Resampler} @ 8…40.
--
Evaluating over 5 folds: 100%[=========================] Time: 11:39:48
      From worker 6:	(preprocessor = (skew = 0.1, t1pad = 4, pow = 1, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 6:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.689900278359669])
[ Info: Training Machine{Resampler} @ 5…27.
--
Evaluating over 5 folds: 100%[=========================] Time: 0:36:10
      From worker 6:	(preprocessor = (skew = 0.5, t1pad = 1, pow = 5, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 6:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6892440460095297])
[ Info: Training Machine{Resampler} @ 1…91.
--
Evaluating over 5 folds: 100%[=========================] Time: 11:38:59
      From worker 5:	(preprocessor = (skew = 0.5, t1pad = 4, pow = 3, target_idx = 1), selector = (k = 5, threshold = 0.0, target_idx = 1), model = (method = :gevd, cov_w = StatsBase.SimpleCovariance(false), cov_b = StatsBase.SimpleCovariance(false), out_dim = 0, regcoef = 1.0e-6, dist = Distances.SqEuclidean(0.0)))
      From worker 5:	(measure = MLJBase.CrossEntropy{Float64}[cross_entropy], measurement = [0.6899755841172523])

Thanks!

@OkonSamuel
Member

OkonSamuel commented Mar 26, 2020

@darrencl. I think what's happening here is that the number of models to be evaluated during tuning is far greater than the number of processes available to Julia (which is 5 in your case; BLAS.set_num_threads(5) only sets BLAS threads, it does not add new Julia processes), so some evaluations have to wait until there is an available worker, which may take a while. To confirm that this is the case, try tuning over only a single range to reduce the number of models to be evaluated. Hope this helps.
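To see the imbalance concretely (the range lengths below are made up for illustration):

```julia
using Distributed
addprocs(5)                      # the five workers from the setup above

# A Grid search evaluates one model per combination of hyper-parameter
# values, so the model count is the product of the range lengths:
n_models  = 5 * 4 * 5 * 2        # four hypothetical ranges -> 200 models
n_workers = nworkers()           # 5

# pmap dispatches one evaluation per free worker; the remaining
# n_models - n_workers evaluations queue until a worker finishes.
println((n_models, n_workers))
```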

@darrencl
Author

darrencl commented Mar 26, 2020

@OkonSamuel Ahh, I see. Fair enough. So, for evaluating a lot of models (due to many hyper-parameters to tune), are there any alternatives to speed up the process?

Anyway, I am on Ubuntu 18.04 and tried using only acceleration_resampling=CPUThreads() with BLAS.set_num_threads(5). Looking at the system monitor, I can see that it utilizes 5 CPUs. I thought multithreading only uses 1 core, while multiprocessing uses multiple cores? Then what distinguishes multithreading from multiprocessing in Julia, if this is the case, and which is preferred?

[screenshot: system monitor showing CPU utilization]

Also, looking at the logs produced during resampling (5-fold with 5 CPUThreads), it seems the threads still run serially. If they didn't run serially, the 3-second log should have come first, then the 6-second one, and so on. So I am not sure how this helps performance?

┌ Info: Unpacked to df. Total time
└   timer = 14 seconds, 751 milliseconds
┌ Info: Unpacked to df. Total time
└   timer = 12 seconds, 305 milliseconds
┌ Info: Unpacked to df. Total time
└   timer = 9 seconds, 604 milliseconds
┌ Info: Unpacked to df. Total time
└   timer = 6 seconds, 513 milliseconds
┌ Info: Unpacked to df. Total time
└   timer = 3 seconds, 448 milliseconds

@OkonSamuel
Member

OkonSamuel commented Mar 26, 2020

@darrencl

Also, looking at the logs produced during resampling (5-fold with 5 CPUThreads), it seems the threads still run serially. If they didn't run serially, the 3-second log should have come first, then the 6-second one, and so on. So I am not sure how this helps performance?

Yes, you're right, this is done serially. To use acceleration_resampling=CPUThreads(), Julia has to be started with multiple threads ($ JULIA_NUM_THREADS=4 ./julia, then BLAS.set_num_threads(1)). Done this way, it will run in parallel. In terms of performance gains, you can test it and see. The reason I set BLAS threads to 1 is that BLAS threads tend to hurt the performance of Julia threads.
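A quick way to verify the setup, assuming Julia was launched as above:

```julia
# Launch with: JULIA_NUM_THREADS=4 julia
using LinearAlgebra

@show Threads.nthreads()   # should report 4 if the env variable took effect
BLAS.set_num_threads(1)    # keep BLAS from competing with Julia's own threads
```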

Anyway, I am on Ubuntu 18.04 and tried using only acceleration_resampling=CPUThreads() with BLAS.set_num_threads(5). Looking at the system monitor, I can see that it utilizes 5 CPUs. I thought multithreading only uses 1 core, while multiprocessing uses multiple cores? Then what distinguishes multithreading from multiprocessing in Julia, if this is the case, and which is preferred?

Both multithreading and multiprocessing are forms of parallelism. Multithreading is shared-memory parallelism (the same Julia process using multiple CPUs and sharing memory), while multiprocessing does not share memory (each distinct Julia process runs on a different CPU). There is no single best choice; both have their use cases.
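A minimal illustration of the two models (toy work only, nothing from the thread's pipeline):

```julia
using Distributed

# Shared-memory parallelism: all threads mutate the same array in place.
counts = zeros(Int, Threads.nthreads())
Threads.@threads for i in 1:100
    counts[Threads.threadid()] += 1   # threads write directly to shared memory
end

# Distributed parallelism: each worker is a separate process with its own
# memory; inputs are serialized to workers and results are shipped back.
addprocs(2)
squares = pmap(i -> i^2, 1:4)         # evaluated on the workers, order preserved
```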

@OkonSamuel Ahh, I see. Fair enough. So, for evaluating a lot of models (due to many hyper-parameters to tune), are there any alternatives to speed up the process?

I don't think there is an alternative other than having more cores. But using acceleration=CPUProcesses() should still be far faster than single-CPU acceleration=CPU1().

@darrencl
Author

darrencl commented Mar 26, 2020

@OkonSamuel If I understand correctly, BLAS threads are used for computational operations such as matrix algebra, while Julia threads are general-purpose? If I were to set this up on a cluster, is there any way to do it in code instead of going into the worker nodes and setting the JULIA_NUM_THREADS environment variable?

I see, so spawning 5 processes with 4 threads each (i.e. acceleration=CPUProcesses() and acceleration_resampling=CPUThreads()) still requires 5*4 CPUs in that sense? And if I only have 8 cores, this could overload my system and result in ineffective processing?

@OkonSamuel
Member

@darrencl

If I understand correctly, BLAS threads are used for computational operations such as matrix algebra, while Julia threads are general-purpose?

Yes. BLAS threads are used by the LinearAlgebra standard library.
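For instance (an illustrative check, not from the thread):

```julia
using LinearAlgebra

BLAS.set_num_threads(4)   # affects BLAS-backed calls only
A = rand(1000, 1000)
B = A * A                 # this dense matrix multiply uses the 4 BLAS threads
@show Threads.nthreads()  # Julia's own thread count is unaffected
```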

If I were to set this up on a cluster, is there any way to do it in code instead of going into the worker nodes and setting the JULIA_NUM_THREADS environment variable?

No. For now it has to be set up using the JULIA_NUM_THREADS environment variable. See issue.

I see, so spawning 5 processes with 4 threads each (i.e. acceleration=CPUProcesses() and acceleration_resampling=CPUThreads()) still requires 5*4 CPUs in that sense?

Yes. This spawns 4 threads per process, making a total of 20 threads. Therefore having more CPUs would be more effective.
