Picking 'random' numbers which will always be the same for a given event? #682

Closed
patrickbryant opened this issue Jun 3, 2022 · 6 comments
Labels
question Further information is requested

Comments

@patrickbryant

In the C++ version of my analysis code there are a few spots where we use random numbers. To ensure the results are always the same for a given event, we set the generator seed using the event number. Is there a fast, efficient way to do this with numpy arrays? It seems bad to generate the array one event at a time in a for loop.

Something like:

np.random.uniform(0, 1, size=len(events), seed=events.event)

From what I can find this functionality does not exist in numpy. Perhaps what I actually want is a hash of the event number where the hashing function samples the uniform distribution.
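
A minimal sketch of the hashing idea, assuming the event numbers are non-negative integers that convert cleanly to a flat NumPy array. The function name `event_uniform` and the `salt` parameter are hypothetical, and the SplitMix64-style mixing step is one possible choice of hash, not something numpy or coffea provides:

```python
import numpy as np

_MASK64 = 0xFFFFFFFFFFFFFFFF
_GAMMA = 0x9E3779B97F4A7C15  # SplitMix64 increment


def event_uniform(event, salt=0):
    """Hash integer event numbers to floats in [0, 1), one value per event.

    `salt` (hypothetical) selects a different stream, e.g. one per place
    in the analysis where a random number is needed.
    """
    # Use a >=1-d uint64 array so integer overflow wraps modulo 2**64.
    z = np.atleast_1d(np.asarray(event)).astype(np.uint64)
    # Compute the offset with Python ints to avoid NumPy scalar overflow warnings.
    offset = ((int(salt) + 1) * _GAMMA) & _MASK64
    z = z + np.uint64(offset)
    # SplitMix64 finalizer: xor-shift and multiply rounds.
    z = (z ^ (z >> np.uint64(30))) * np.uint64(0xBF58476D1CE4E5B9)
    z = (z ^ (z >> np.uint64(27))) * np.uint64(0x94D049BB133111EB)
    z = z ^ (z >> np.uint64(31))
    # Keep the top 53 bits: exactly representable in float64, result < 1.
    return (z >> np.uint64(11)).astype(np.float64) * 2.0**-53


event = np.array([1001, 1002, 1003, 1004], dtype=np.uint64)
u = event_uniform(event)
# The value for an event does not depend on which other events are in the array.
assert np.array_equal(u[2:], event_uniform(event[2:]))
```

Because each value depends only on the event number (and the salt), the result is the same however the file is chunked or skimmed. Whether the statistical quality of this particular hash is adequate for a given analysis is not something this sketch establishes.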

patrickbryant added the "question" (Further information is requested) label Jun 3, 2022
@andrzejnovak
Member

Generate a hash from filename:start:stop and use that for the seed?
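
A rough sketch of that suggestion, assuming the chunk is identified by a file name and entry range. The helper name `chunk_rng` and the file name are made up for illustration; `hashlib.sha256` and `np.random.default_rng` are standard calls:

```python
import hashlib

import numpy as np


def chunk_rng(filename, start, stop):
    """Return a NumPy Generator seeded from a hash of 'filename:start:stop'."""
    key = f"{filename}:{start}:{stop}".encode()
    # Take 8 bytes of the digest as a non-negative integer seed.
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "little")
    return np.random.default_rng(seed)


# Hypothetical chunk: entries 0..1000 of "events.root".
rng = chunk_rng("events.root", 0, 1000)
u = rng.uniform(0, 1, size=1000)
```

The same chunk always yields the same numbers, but the seed, and so every value, changes as soon as `start` or `stop` changes.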

@patrickbryant
Author

Yeah, I was thinking that, but I can imagine scenarios where I end up running the same file with different chunk sizes, which would change the seed.

@NJManganelli
Collaborator

Perhaps the performance cost would not be too bad if you used, e.g. in CMS, the luminosity block as the seed for the random generator and filled an array via masking/where. I don't know if I ever checked whether the lumi varies in CMS MC, though; the run is always 1, IIRC. Is there anything else that might be common across thousands of events but not dependent on chunking, skimming, etc.?

Nick M
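
One way the masking idea could look, as a sketch only. The function name `uniform_by_lumi` is hypothetical, and the luminosity block numbers are assumed to be available as a flat array of non-negative integers:

```python
import numpy as np


def uniform_by_lumi(lumi):
    """Fill one uniform value per event, seeding a generator per lumi block."""
    lumi = np.asarray(lumi)
    out = np.empty(len(lumi), dtype=np.float64)
    # One Python-level iteration per distinct block, not per event.
    for block in np.unique(lumi):
        mask = lumi == block
        rng = np.random.default_rng(int(block))
        out[mask] = rng.uniform(0, 1, size=int(mask.sum()))
    return out


lumi = np.array([3, 3, 7, 3, 7])
u = uniform_by_lumi(lumi)
```

Note that an event's value here depends on its position among the events of the same block present in the array, so the result still changes if a block is split across chunks or events are skimmed away.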

@nsmith-
Member

nsmith- commented Jun 22, 2022

A similar issue came up with JER smearing as discussed in #454.
The approach is still sensitive to chunk splitting. We'll need to add a feature to correctionlib to do this anyway, so perhaps that can be stolen/reused: cms-nanoAOD/correctionlib#130

@lgray
Collaborator

lgray commented Dec 6, 2023

This appears answered. Re-open if not.

lgray closed this as completed Dec 6, 2023
@nsmith-
Member

nsmith- commented Dec 19, 2023

Yes, correctionlib includes a facility to generate deterministic pseudorandom data. For an example, see https://cms-nanoaod.github.io/correctionlib/correctionlib_tutorial.html#Resolution-models
