add acceleration option to JointPrimaryMarginalizedModel likelihood #4688
Conversation
bin/inference/pycbc_inference
@@ -141,7 +141,12 @@ with ctx:
    if pool.is_main_process():
        for fn in [sampler.checkpoint_file, sampler.backup_file]:
            with loadfile(fn, 'a') as fp:
                fp.write_config_file(cp)
                # some models will internally modify the original cp for sampling,
This shouldn't be an option as one should always save the original configuration. Why isn't your internal sampler modifying a copy? The version saved here then doesn't have to know (and really shouldn't) that you may have modified it internally.
@ahnitz My understanding is that `cp` is saved when it runs `logging.info("Loading joint_primary_marginalized model")` followed by `return super(HierarchicalModel, cls).from_config(cp, submodels=submodels, **kwargs)`, see https://github.com/WuShichao/pycbc/blob/accelerate_multiband/pycbc/inference/models/hierarchical.py#L1002. The `cp` there is the modified config used for sampling, so it can't serve as the initial config. How can I get PyCBC to save the original one here?
@WuShichao I don't see anything related to saving the config file where you've pointed. The place is in pycbc_inference, where I'm commenting. Take a look at my first comment; I say how to do it there. Don't modify the configparser you are passed in-place. Make a copy so you aren't editing the original. Then you don't need to save a separate copy, and this particular line will just work to begin with.
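A minimal sketch of the suggestion above, assuming a standard `configparser.ConfigParser`-like object (PyCBC's own config class and the model's `from_config` are not shown): copy the parser before making any sampling-specific changes, so the original `cp` stays untouched and `fp.write_config_file(cp)` saves the user's configuration verbatim.

```python
import io
import configparser

def copy_config(cp):
    """Return an independent copy of a ConfigParser via a write/read round trip."""
    buf = io.StringIO()
    cp.write(buf)
    buf.seek(0)
    cp_copy = configparser.ConfigParser()
    cp_copy.read_file(buf)
    return cp_copy

# inside a hypothetical from_config(): only the copy is modified for sampling
# sampling_cp = copy_config(cp)
# sampling_cp.set('model', 'internal_option', 'value')  # illustrative only
```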
@@ -96,7 +96,10 @@ model.sampling_transforms = None
def callmodel(arg):
    iteration, paramvals = arg
    # calculate the logposterior to get all stats populated
    model.update(**{p: paramvals[p] for p in model.variable_params})
    try:
Again, this is not correct. Your top-level model should just have an update method. That method should do whatever is needed to prepare for the log* methods to actually work. You shouldn't be requiring anyone to know about a new method 'update_all_models'. It's fine if you want to have a new method for internal use, but not for the top-level API.
So do I need to rename my `update_all_models` to `update` (so it overrides the base one), or just use the original `update` in `callmodel`?
@WuShichao A sampler is just using 'update', so why are you doing it differently here? Think through how to do this best, but it should be clear that making this change in this program seems very inconsistent. If it is required, that indicates something is wrong with your model, and if so fix that. Otherwise maybe you made this change in error.
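A minimal sketch of the point being made, assuming a hierarchical model class with a hypothetical internal helper: keep the standard `update()` API that samplers and `callmodel()` already use, and do any per-submodel work inside it rather than exposing a separate `update_all_models()` to callers.

```python
class JointModelSketch:
    def __init__(self, submodels):
        self.submodels = submodels
        self.current_params = {}

    def _update_all_models(self, **params):
        # internal-only helper; external code never needs to know about it
        for sub in self.submodels:
            sub.update(**params)

    def update(self, **params):
        # standard top-level API: prepare everything the log* methods need
        self.current_params.update(params)
        self._update_all_models(**params)
```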
marginalized_params += list(primary_model.static_params.keys())
marginalized_params = list(marginalized_params.keys())
# add distance or phase if they are marginalized
if primary_model.distance_marginalization:
Why do you need to hardcode this? This will be brittle and will break easily with changes to the marginalization code, for example, which we don't want. If you need a list of marginalized parameters, why not add it to the class in inference/tools so that it is kept up to date?
OK, I will save it in the `tools` module and just use it here.
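An illustrative sketch only (not the actual pycbc.inference.models.tools API): keep the list of marginalized parameter names on the marginalization helper itself, so the hierarchical model can query it instead of hardcoding 'distance' or 'coa_phase' checks that would break when the marginalization code changes.

```python
class MarginalizationInfoSketch:
    def __init__(self, distance_marginalization=False,
                 phase_marginalization=False):
        # the helper, not the hierarchical model, owns this list
        self.marginalized_params = []
        if distance_marginalization:
            self.marginalized_params.append('distance')
        if phase_marginalization:
            self.marginalized_params.append('coa_phase')

# downstream code then simply extends its own list:
# marginalized_params += marginalization_info.marginalized_params
```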
@@ -596,10 +596,11 @@ def _loglr(self):
    filt += filter_i
    norm += norm_i

loglr = self.marginalize_loglr(filt, norm)
if self.return_sh_hh:
Why do we need this if statement? Shouldn't the existing flag already used for demarginalization take care of this? E.g. why not use the reconstruct_phase flag?
https://github.com/gwastro/pycbc/blob/master/pycbc/inference/models/tools.py#L241
@ahnitz This is used here https://github.com/WuShichao/pycbc/blob/accelerate_multiband/pycbc/inference/models/hierarchical.py#L752 We need to get LISA's sh and hh.
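A sketch of the flag being discussed (`return_sh_hh` follows the PR; the rest is assumed scaffolding): when the joint model needs the raw inner products from a submodel, `_loglr` returns (sh, hh) instead of the marginalized loglr, so the primary model can combine the contributions from all detectors before marginalizing once.

```python
class SubmodelSketch:
    def __init__(self):
        self.return_sh_hh = False

    def marginalize_loglr(self, filt, norm):
        # stand-in for the real marginalization in pycbc's tools module
        return filt.real - 0.5 * norm

    def _loglr(self, filt, norm):
        if self.return_sh_hh:
            # hand back raw <s|h> and <h|h> for the joint model to combine
            return filt, norm
        return self.marginalize_loglr(filt, norm)
```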
if isinstance(value, numpy.ndarray):
    nums = len(value)
else:
    nums = 1
# add distance if it has been marginalized,
Again, avoid needing to know explicitly about distance here. Think instead about the format that you require, and how to convert to it generically if the format differs.
    hh_others[i] += hh_other
other_model.return_sh_hh = False
# set loglr, otherwise it will store (sh, hh)
setattr(other_model._current_stats, 'loglr',
Do you need to store this? It's not necessarily a problem, but it will slow down the code slightly (maybe not important at the moment). Why not think about why it was being set to a vector (and from where)? Do you even want this stored in the case of a submodel? Maybe the solution was simply not to store this when it's not actually a marginalized loglr anyway, no?
My understanding is that when `pycbc_inference_model_stats` runs `for pi, p in enumerate(model.default_stats):` it will try to access the submodel's `loglr`, no? If so, I need to store it. When I don't reset it, I found that `other_model._current_stats` also contains (sh, hh) for each point.
@WuShichao OK, good. So now the question is what is the right thing to do in this case?
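A sketch of the pattern under discussion (attribute names follow the PR, the numbers are placeholders): once the joint model has combined a submodel's per-point (sh, hh) values, store a single scalar loglr back into that submodel's stats so code looping over model.default_stats reads a number rather than a tuple of arrays.

```python
import numpy

class CurrentStatsSketch:
    pass

stats = CurrentStatsSketch()
sh = numpy.array([1.0 + 0.5j, 2.0 + 0.1j])   # placeholder <s|h> per point
hh = numpy.array([0.8, 1.5])                 # placeholder <h|h> per point
loglr_points = sh.real - 0.5 * hh            # per-point log likelihood ratio
# one possible reduction: keep the value at the best extrinsic point
setattr(stats, 'loglr', float(loglr_points.max()))
```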
other_model.return_sh_hh = False
# set loglr, otherwise it will store (sh, hh)
setattr(other_model._current_stats, 'loglr',
Same as above. It's not necessarily a problem as it might be useful to have the separate loglrs, but it's not clear that it will always make sense.
margin_params[p] = \
    self.primary_model.current_params[p][i_max_extrinsic]
nums = len(self.primary_model.current_params[p])
else:
@WuShichao This logic should already take care of the distance case, except that later on you assume that if any parameter is a scalar they all are. That's the part you should stop assuming. Don't assume they are any particular mix of scalar or vector.
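A sketch of this suggestion, with a hypothetical helper and parameter names: rather than assuming parameters are all scalars or all arrays, broadcast every value to a common length before indexing with i_max_extrinsic.

```python
import numpy

def broadcast_params(params):
    """Return params with every value as a 1-d array of the same length."""
    lengths = [len(v) for v in params.values() if isinstance(v, numpy.ndarray)]
    nums = max(lengths) if lengths else 1
    return {p: (v if isinstance(v, numpy.ndarray) else numpy.full(nums, v))
            for p, v in params.items()}, nums

# example: mixed scalar and vector inputs all become length-3 arrays
params, nums = broadcast_params({'distance': numpy.array([100., 200., 300.]),
                                 'inclination': 0.5})
```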
…wastro#4688)
* Update hierarchical.py
* Update hierarchical.py
* Update hierarchical.py
* Update hierarchical.py
* Update hierarchical.py
* fix cc issues
* Update hierarchical.py
* Update relbin.py
* add complex phase correction for sh_others
* Update hierarchical.py
* Update relbin.py
* fix cc issues
* make code more general
* update
* fix
* rename
* update
* WIP
* fix a bug in frame transform
* fix overwritten issues
* update
* update
* fix reconstruct
* make this PR general
* update
* update
* fix cc issues
* rename
* rename
* add multiband description
* fix
* add comments
* fix hdf's config
* fix
* fix
* fix
* fix
* remove print
* update for Alex's comments
* wip
* update
* fix
* update
* seems work
* fix CC issue
* fix
* fix demargin
@ahnitz This PR adds an acceleration option to JointPrimaryMarginalizedModel likelihood by assuming all extrinsic parameters can be fixed.