Distribution to sample from when simulating from custom predict #17
Comments
Hi, Thank you for the question. Yes, once the custom predict function returns its predicted values, the simulation code uses them directly, without any further sampling. If the predictor has one of the pre-defined distributions, e.g., "binary" or "normal", the simulated values are drawn from that distribution. If the predictor has a custom distribution, users are expected to estimate the distribution in the custom fit function and draw from it in the custom predict function, so that the custom predict function returns sampled values. If there is no distributional assumption on the predictor, for example when using a random forest model, no sampling is needed in the custom predict function and the predicted values are used directly. I'll make this clear in the documentation. Feel free to let me know if you have any further questions. Best,
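As an illustration of the custom-distribution case described above, here is a minimal sketch in which the fit step stores the distribution parameters and the predict step returns sampled values. The function names `custom_fit` and `custom_predict`, their signatures, and the log-normal model are assumptions made for this example; they are not the package's actual interface.

```python
import numpy as np


def custom_fit(X, y):
    """Hypothetical custom fit: least squares on log(y), keeping the residual scale.

    Assumes a strictly positive covariate y that is log-normal given X.
    """
    design = np.column_stack([np.ones(len(X)), X])
    log_y = np.log(y)
    coef, *_ = np.linalg.lstsq(design, log_y, rcond=None)
    resid = log_y - design @ coef
    # Residual standard deviation on the log scale, adjusted for fitted coefficients.
    sigma = np.sqrt(np.sum(resid ** 2) / (len(y) - design.shape[1]))
    return {"coef": coef, "sigma": sigma}


def custom_predict(fit, X, rng):
    """Hypothetical custom predict: returns sampled values, not the conditional mean."""
    design = np.column_stack([np.ones(len(X)), X])
    mu = design @ fit["coef"]  # mean of log(y) for each row
    return rng.lognormal(mean=mu, sigma=fit["sigma"])


# Toy usage with synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.exp(0.5 + X @ np.array([0.3, -0.2]) + rng.normal(scale=0.1, size=200))

fit = custom_fit(X, y)
draws = custom_predict(fit, X, rng)
```

Because the sampling happens inside the predict step, two calls with the same inputs generally return different values, which is what carries the covariate's variability into the simulation.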
Thank you for your answer. In the case of a random forest model, where there is no assumption about the underlying distribution of the predictor, does it make sense to use the mean (predicted) value directly? Wouldn't the new data simulated for the n_simul patients then all be the same, with no variance? Perhaps I am misunderstanding something here. I also have another query. My application requires working with end-of-follow-up outcomes where the follow-up length varies by patient. Is there a way to handle this in the package? Right now, the package requires all patients to have the same follow-up length.
Hi, From a quick look, there should be a way to implement the end-of-follow-up (EOF) outcome with variable time length. You are welcome to add this feature and submit a pull request. Alternatively, you can send me more details about your question and the required data structure; I'll review its compatibility with the current version and update the code when I have a moment. Best,
Hi Jing, Thank you for your prompt replies. I have coded up the feature for EOF outcomes with variable time length and will submit it as a pull request. Please review it when you have time so that it can be merged. Also, could you add the 'Discussions' tab to your project? I would like to discuss some assumptions I made when developing the feature, along with some conceptual questions. I feel they are more discussions than issues. Thanks again for a very useful package! Regards,
Hi,
Very nice work. Thank you for your contribution.
At the step where we feed in "Custom" covariate models, I am interested in providing custom fit and predict functions. I see from the source code that once the custom predict function returns its predicted values on the simulated data (mean values for each patient at a given time point), there is no code that samples from a distribution, and it is not clear which distribution would be used. Or are you expecting the user to supply the distribution as well, based on the assumption the custom model makes about the underlying distribution of the target variable under consideration? If so, this is not clear from the documentation. For example, in the "normal" case, the predicted mean and the variance are used to characterize a normal distribution, and patient data (counterfactual covariates) are drawn by sampling from this distribution.
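To make the "normal" case concrete, here is a rough sketch of drawing a covariate from a normal distribution characterized by a predicted mean and a residual variance. It uses plain NumPy on synthetic data; the variable names and the simple linear model are assumptions for illustration and do not reproduce the package's code.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic observed data: one covariate y that depends linearly on x.
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

# Fit a linear model by least squares and estimate the residual standard deviation.
design = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
resid_sd = np.sqrt(np.sum((y - design @ coef) ** 2) / (n - design.shape[1]))

# Simulation step: the predicted mean alone is deterministic given x_new,
# so each simulated value is drawn from Normal(predicted mean, resid_sd).
x_new = rng.normal(size=n)
mean_new = coef[0] + coef[1] * x_new
simulated = rng.normal(loc=mean_new, scale=resid_sd)
```

A custom predict function that returns only `mean_new` would skip the last line, which is the gap this question is asking about.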