Got error all the time #18

Open
chungkham opened this issue Dec 24, 2024 · 1 comment
@chungkham

Dear Jing Li,

Thanks a lot for the gfoRmula package in both R and Python. I am learning to use it in R. I have longitudinal data with 3 time points, and the outcome is measured in 4 waves (time: 0, 1, 2, 3). The outcome is binary (Y: 0/1). There are three time-varying covariates, L1, L2, and L3 (L1 and L2 are categorical with 3+ categories; L3 is binary), and five baseline covariates, W1, W2, W3, W4, and W5, all categorical except W2, which is binary. There are multiple treatments, A1, A2, A3, A4, A5, A6, A7, and A8, each coded as tertiles (0: lowest, 1: middle, 2: highest).

I would like to estimate the risk ratio of the outcome under a hypothetical improvement in each treatment separately (8 models) and under a combined improvement in all treatments simultaneously. In the gfoRmula documentation, I could not find how to use the threshold to improve a categorical exposure. I tried modified code based on the documentation, but I get an error every time.

My intervention is: stay at the lowest tertile if the natural/observed value is 0, move to 0 if the natural value is 1, and move to 1 if the natural value is 2. Basically, I would like to see the risk ratio of moving one position in the tertile. Below is the code for the combined improvement and for a single treatment, say A1, along with the error. Could you please help me find the problem?

  1. Combined intervention

# Define covariates

outcome_type <- 'binary_eof'
id <- 'ID'
time_name <- 'time'
outcome_name <- 'Y'

covnames <- c(
  # Time-varying covariates
  "L1", "L2", "L3",
  # Treatments
  "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"
)

# Define covariate types

covtypes <- c(
  "categorical", "categorical", "binary", # Time-varying
  rep("categorical", 8) # Treatments
)

# Define the history variables

histories <- c(lagged) # I could not put cumavg, it says categorical is not suitable for cumulative
histvars <- list(
  c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8",
    "L1", "L2", "L3"))

covparams <- list(covmodels = c(L1 ~ lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 +
lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
L2 ~ lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 +
lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
L1 +
lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
L3 ~ lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 +
lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
L1 + L2 +
lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A1 ~ lag1_A1 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A2 ~ lag1_A2 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A3 ~ lag1_A3 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A4 ~ lag1_A4 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A5 ~ lag1_A5 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A6 ~ lag1_A6 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A7 ~ lag1_A7 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A8 ~ lag1_A8 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time))
ymodel <- Y ~ A1 + A2 + A3 + A4 + A5 + A6 + A7 + A8 +
L1 + L2 + L3 + W1 +
lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 + lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
lag1_L1 + lag1_L2 + lag1_L3 + time

# Define the intervention

intvars <- c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8")

# Define interventions using the threshold function

interventions <- list(
  list(c(threshold, 2, 2)), # A1: Set to 1 for values exactly 2
  list(c(threshold, 2, 2)), # A2: Set to 1 for values exactly 2
  list(c(threshold, 2, 2)), # A3: Set to 1 for values exactly 2
  list(c(threshold, 2, 2)), # A4: Set to 1 for values exactly 2
  list(c(threshold, 2, 2)), # A5: Set to 1 for values exactly 2
  list(c(threshold, 2, 2)), # A6: Set to 1 for values exactly 2
  list(c(threshold, 2, 2)), # A7: Set to 1 for values exactly 2
  list(c(threshold, 2, 2)) # A8: Set to 1 for values exactly 2
)

# Define intervention descriptions for clarity

int_descript <- paste("Set", intvars, "to 1 if 2, else 0")

nsimul <- 10000
ncores <- 2

# Run gformula_binary_eof

gform_bin_eof <- gformula_binary_eof(
obs_data = data_table,
id = id,
time_name = time_name,
covnames = covnames,
outcome_name = outcome_name,
covtypes = covtypes,
covparams = covparams,
ymodel = ymodel,
intvars = intvars,
interventions = interventions,
int_descript = int_descript,
histories = histories,
histvars = histvars,
basecovs = c("W1", "W2", "W3", "W4", "W5"),
seed = 1234,
parallel = TRUE,
nsamples = 5,
nsimul = nsimul,
ncores = ncores
)

Error: Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: 'from' must be a finite number

  2. Intervention on a single treatment

# Define covariates

outcome_type <- 'binary_eof'
id <- 'ID'
time_name <- 'time'
outcome_name <- 'Y'

covnames <- c(
  # Time-varying covariates
  "L1", "L2", "L3",
  # Treatments
  "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"
)

# Define covariate types

covtypes <- c(
  "categorical", "categorical", "binary", # Time-varying
  rep("categorical", 8) # Treatments
)

# Define the history variables

histories <- c(lagged) # I could not put cumavg, it says categorical is not suitable for cumulative
histvars <- list(
  c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8",
    "L1", "L2", "L3"))

covparams <- list(covmodels = c(L1 ~ lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 +
lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
L2 ~ lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 +
lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
L1 +
lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
L3 ~ lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 +
lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
L1 + L2 +
lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A1 ~ lag1_A1 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A2 ~ lag1_A2 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A3 ~ lag1_A3 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A4 ~ lag1_A4 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A5 ~ lag1_A5 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A6 ~ lag1_A6 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A7 ~ lag1_A7 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time,
A8 ~ lag1_A8 + lag1_L1 + lag1_L2 + lag1_L3 +
W1 + W2 + W3 + W4 + W5 + time))
ymodel <- Y ~ A1 + A2 + A3 + A4 + A5 + A6 + A7 + A8 +
L1 + L2 + L3 + W1 +
lag1_A1 + lag1_A2 + lag1_A3 + lag1_A4 + lag1_A5 + lag1_A6 + lag1_A7 + lag1_A8 +
lag1_L1 + lag1_L2 + lag1_L3 + time

# Define the intervention

intvars <- c("A1")

# Define interventions using the threshold function

interventions <- list(
  list(c(threshold, 2, 2))
)

# Define intervention descriptions for clarity

int_descript <- paste("Set", intvars, "to 1 if 2, else 0")

nsimul <- 10000
ncores <- 2

# Run gformula_binary_eof

gform_bin_eof <- gformula_binary_eof(
obs_data = data_table,
id = id,
time_name = time_name,
covnames = covnames,
outcome_name = outcome_name,
covtypes = covtypes,
covparams = covparams,
ymodel = ymodel,
intvars = intvars,
interventions = interventions,
int_descript = int_descript,
histories = histories,
histvars = histvars,
basecovs = c("W1", "W2", "W3", "W4", "W5"),
seed = 1234,
parallel = TRUE,
nsamples = 5,
nsimul = nsimul,
ncores = ncores
)
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: 'from' must be a finite number

I could not use “cumavg”. It gave the message that cumavg cannot be used for categorical variables. Is this right?

Thanks a lot.

Best,

Holendro

@LilJing
Collaborator

LilJing commented Jan 4, 2025

Hi Holendro,

Regarding this issue, here are some points for troubleshooting:

  1. Check the type of each variable in your input data and see whether any categorical variable is being read as the wrong type, e.g. "numeric" instead of "factor" in R.
  2. Specify the intervention with a custom intervention function for your treatment strategy, since the threshold function applies only to continuous treatments, not categorical ones. Please refer to the gfoRmula documentation for custom intervention specification.
  3. Remove the "time" variable from your ymodel; the time term should not be included in the ymodel statement.
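
For point 2, a minimal sketch of a custom intervention that moves each tertile down one position (2 to 1, 1 to 0, 0 stays 0) might look like the following. The names shift_down_one and shift_tertile are hypothetical, and the wrapper's argument list is an assumption: it is taken to mirror the package's built-in intervention functions such as static and threshold, so please check it against the documentation for your installed gfoRmula version.

```r
# Hypothetical helper: move a tertile code down by one, with a floor at 0.
# Works on a factor with levels "0", "1", "2" or on a plain numeric vector.
shift_down_one <- function(x) {
  lv <- levels(x)                      # NULL when x is not a factor
  vals <- as.integer(as.character(x))  # recover the 0/1/2 codes
  shifted <- pmax(vals - 1L, 0L)       # 2 -> 1, 1 -> 0, 0 -> 0
  if (is.null(lv)) shifted else factor(shifted, levels = lv)
}

# Hypothetical custom intervention. Assumption: gfoRmula calls it with the
# same arguments as built-ins like static() and threshold(), and expects the
# simulated data.table `newdf` to be modified in place.
shift_tertile <- function(newdf, pool, intvar, intvals, time_name, t) {
  data.table::set(newdf, j = intvar, value = shift_down_one(newdf[[intvar]]))
}

# Quick check of the helper on toy values (no gfoRmula needed):
a1 <- factor(c(0, 1, 2, 2, 0), levels = 0:2)
as.character(shift_down_one(a1))  # "0" "0" "1" "1" "0"
```

With the intvars/interventions interface used in this thread, the function would presumably take the place of threshold in the interventions list, e.g. list(c(shift_tertile)); treat that as an assumption to verify rather than confirmed usage.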

P.S. Yes, "cumavg" is only applicable to continuous variables, not categorical ones.

If the above points do not resolve this issue, providing a minimal working example with data would allow a more in-depth investigation.
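
Points 1 and 3 can be checked quickly before rerunning, or while preparing a minimal example. The sketch below uses a small made-up data frame (toy, its columns, and ymodel_toy are hypothetical, not the data or model from this thread) to show one way to inspect column classes, convert a tertile column to a factor, and drop time from an outcome formula.

```r
# Toy data, invented purely for illustration.
toy <- data.frame(
  ID   = rep(1:2, each = 2),
  time = rep(0:1, times = 2),
  A1   = c(0, 1, 2, 1),   # tertile stored as numeric
  L3   = c(0, 1, 1, 0),   # binary covariate, left numeric here
  Y    = c(NA, 0, NA, 1)
)

# Point 1: see how R currently stores each column.
sapply(toy, class)

# Convert the columns meant to be categorical into factors.
cat_vars <- c("A1")
toy[cat_vars] <- lapply(toy[cat_vars], factor)
class(toy$A1)  # "factor"

# Point 3: remove the time term from an outcome formula.
ymodel_toy <- Y ~ A1 + L3 + time
ymodel_no_time <- update(ymodel_toy, . ~ . - time)
all.vars(ymodel_no_time)  # "Y" "A1" "L3"
```

If the input is a data.table rather than a data frame, the same conversion is usually written with := and .SDcols.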

Best,
Jing
