Is your feature request related to a problem? Please describe.
This is a follow-up on the introduction of 10% of missing data.
Thanks @joshwlambert for the clarification about the usage of NA in some columns. As you mentioned in the discussion of PR #187, it boils down to representing unavailable data for cases where it is not collected (Ct values for non-confirmed cases).
I have run the function several times to see which columns contain NA values introduced by the sim_linelist() function. It came out to be five columns - see the outcome below.
Thanks for the informative description. I have addressed this request in PR #199. I've added a new internal .add_missing() function that adds what you've requested.
If missing_value is NA, the positions of the newly inserted missing values are not sampled from the existing NA elements in the <data.frame>, which avoids overwriting existing missing values. If missing_value is changed by the user, for example to "N/A", then the .add_missing() function samples from all <data.frame> elements.
.add_missing() also handles type coercion explicitly, to avoid unwanted implicit coercion when the user specifies a custom missing_value.
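For readers following along, the behaviour described above can be illustrated with a rough sketch. This is not the code from PR #199: `add_missing_sketch()`, its `prop` argument, the toy data and the choice to take the proportion over all cells are assumptions made purely for illustration.

```r
# Hypothetical sketch only; NOT the simulist implementation of .add_missing().
# Assumption: `prop` is the share of all cells to set to `missing_value`.
add_missing_sketch <- function(x, prop = 0.1, missing_value = NA) {
  stopifnot(is.data.frame(x), prop >= 0, prop <= 1, length(missing_value) == 1)
  n_cell <- nrow(x) * ncol(x)
  # Logical matrix flagging the cells that are already NA
  already_na <- is.na(x)
  # When inserting NA, only non-NA cells are candidates, so existing NAs are
  # never "overwritten"; with a custom marker every cell is a candidate
  candidates <- if (is.na(missing_value)) which(!already_na) else seq_len(n_cell)
  n_new <- min(length(candidates), round(prop * n_cell))
  if (n_new == 0) return(x)
  picked <- candidates[sample.int(length(candidates), n_new)]
  # Convert linear (column-major) cell indices to row/column pairs
  idx <- arrayInd(picked, c(nrow(x), ncol(x)))
  for (j in unique(idx[, 2])) {
    rows <- idx[idx[, 2] == j, 1]
    # Explicit coercion: a string marker such as "N/A" cannot be stored in a
    # numeric, Date or factor column, so convert the column to character first
    if (is.character(missing_value) && !is.character(x[[j]])) {
      x[[j]] <- as.character(x[[j]])
    }
    x[[j]][rows] <- missing_value
  }
  x
}

# Toy line list (made-up values) with structural NAs in ct_value
set.seed(1)
toy <- data.frame(
  age = c(34, 51, 27, 60, 45, 38, 22, 70, 19, 55),
  sex = c("f", "m", "f", "m", "f", "m", "f", "m", "f", "m"),
  ct_value = c(24.1, NA, NA, 25.3, NA, 26.0, NA, 23.8, NA, NA)
)
out_na <- add_missing_sketch(toy, prop = 0.1)
out_str <- add_missing_sketch(toy, prop = 0.1, missing_value = "N/A")
```

With the default `missing_value = NA`, the existing NAs in `ct_value` stay in place and the new NAs land only in cells that previously held a value. With `"N/A"`, any cell can be picked and only the columns that receive the marker are converted to character.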
I suggest that the 10% NA be introduced in the remaining columns, i.e. in columns other than these five.
The approach taken in .add_missing() still allows missing values to be introduced into <data.frame> columns that already contain NAs when missing_value = NA (the default). This retains the feature of random missingness without overwriting existing NA values.
Created on 2025-02-19 with reprex v2.1.0