Currently, several spots in the workflow take a huge amount of time to run. Where are the bottlenecks, and can we easily solve them? One idea is to minimize (de)serialization overhead by keeping each partition as a single full numpy array -- but will this work? For some of the steps, serialization is still a big problem.
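One way to try the numpy-array idea is sketched below: collapse each partition into one structured array inside `mapPartitions`, so pickling happens once per partition instead of once per particle. This is only a sketch -- the dtype, field names, and `particle_rdd` are illustrative stand-ins for the real `cfof.PARTICLE` layout and RDDs.

```python
import numpy as np

# Hypothetical particle layout -- the real one would mirror cfof.PARTICLE.
p_dtype = np.dtype([('iOrder', np.int64),
                    ('x', np.float32), ('y', np.float32), ('z', np.float32),
                    ('iGroup', np.int64)])

def to_numpy_partition(records):
    """Collapse an iterator of (pid, x, y, z) records into one structured array."""
    rows = [(pid, x, y, z, 0) for (pid, x, y, z) in records]
    return [np.array(rows, dtype=p_dtype)]  # one element: the whole partition

# usage (assuming a particle_rdd of (pid, x, y, z) tuples already exists):
# array_rdd = particle_rdd.mapPartitions(to_numpy_partition, preservesPartitioning=True)
```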
Keep a list of places in need of optimization here:
- reading particles and setting particle IDs (there is still room for a small optimization in setting the actual IDs -- see the ID-assignment sketch after this list)
- the particle arrays in the Cython code should be typed memory views of the `cfof.PARTICLE` type (see the memoryview sketch after this list)
- the first part of the group merge stage computes the group mappings across domains. This currently does a full data shuffle because no partition information is provided. We could do better by taking the RDD of `PRIMARY_GHOST_PARTICLE` particles and unioning it with the RDD of `GHOST_PARTICLE_COPY` particles, only the latter of which needs to be shuffled. Since there are many more primary ghost particles than copies, this should result in a much smaller data shuffle overall (see the union sketch after this list). This step is currently pretty fast, so it will not change for the moment.
- in `count_groups_partition_cython` the for loop is probably not needed -- just concatenate the partition and run `np.unique` once to get the `(groupID, count)` tuples (see the `np.unique` sketch after this list)
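For the particle-ID item, a minimal sketch of one way to set IDs without a shuffle, assuming the numpy-array partitions from the sketch above: collect the per-partition lengths once, turn them into starting offsets, and write sequential IDs in place with `mapPartitionsWithIndex`. The `iOrder` field and `array_rdd` name are carried over from that sketch and are assumptions, not the actual code.

```python
import numpy as np

def assign_particle_ids(array_rdd):
    """Assign globally sequential particle IDs, one np.arange write per partition."""
    # one small job to learn how many particles each partition holds
    lengths = array_rdd.mapPartitions(
        lambda arrays: [sum(len(a) for a in arrays)]).collect()
    offsets = [0]
    for n in lengths[:-1]:
        offsets.append(offsets[-1] + n)

    def set_ids(index, arrays):
        start = offsets[index]
        for arr in arrays:
            arr['iOrder'] = np.arange(start, start + len(arr), dtype=np.int64)
            start += len(arr)
            yield arr

    return array_rdd.mapPartitionsWithIndex(set_ids, preservesPartitioning=True)
```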
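For the memory-view item, a Cython sketch of what a typed `PARTICLE[:]` view could look like. The packed struct below is a placeholder -- the real definition would come from cfof (e.g. a `cfof.pxd`) -- and `relabel_groups` is just an example of operating on the view without Python-level copies.

```cython
# cython: boundscheck=False, wraparound=False
cimport numpy as np

# Placeholder for the real cfof.PARTICLE struct; the matching numpy dtype would be
# [('iOrder', np.int64), ('x', np.float32), ('y', np.float32), ('z', np.float32),
#  ('iGroup', np.int64)].
cdef packed struct PARTICLE:
    np.int64_t iOrder
    np.float32_t x
    np.float32_t y
    np.float32_t z
    np.int64_t iGroup

def relabel_groups(PARTICLE[:] particles, dict mapping):
    """Rewrite group IDs in place through the memoryview, given a groupID -> groupID dict."""
    cdef Py_ssize_t i
    cdef PARTICLE* p
    for i in range(particles.shape[0]):
        p = &particles[i]
        if p.iGroup in mapping:
            p.iGroup = mapping[p.iGroup]
```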
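For the group-merge item, a rough sketch of the smaller shuffle described above. It assumes the ghost particles are carried as `(domain_id, particle)` pairs and that the two kinds can be told apart with predicates; `is_primary_ghost`, `is_ghost_copy`, and `n_domains` are placeholders for whatever the real code uses.

```python
def merge_ghost_rdds(tagged_rdd, n_domains, is_primary_ghost, is_ghost_copy):
    # PRIMARY_GHOST_PARTICLE records are assumed to already sit in the
    # partition of their owning domain, so they are not moved at all
    primaries = tagged_rdd.filter(lambda kv: is_primary_ghost(kv[1]))
    # only the (much smaller) GHOST_PARTICLE_COPY set is shuffled to its domain
    copies = (tagged_rdd.filter(lambda kv: is_ghost_copy(kv[1]))
                        .partitionBy(n_domains))
    # union concatenates partitions without another shuffle
    return primaries.union(copies)
```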
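For the last item, a minimal sketch of the single-pass replacement: concatenate the group-ID column across the partition's arrays and let one `np.unique` call (with `return_counts=True`) produce the `(groupID, count)` pairs. The `iGroup` field name is an assumption about the structured dtype.

```python
import numpy as np

def count_groups_partition(arrays):
    """Return (groupID, count) pairs for one partition with a single np.unique call."""
    arrays = list(arrays)
    if not arrays:
        return iter([])
    group_ids = np.concatenate([a['iGroup'] for a in arrays])
    ids, counts = np.unique(group_ids, return_counts=True)
    return zip(ids.tolist(), counts.tolist())

# usage: group_counts = array_rdd.mapPartitions(count_groups_partition)
```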