☝️The image above is what we're trying to achieve here.
To determine the effects of various factors on health outcomes, we currently apply pharmacokinetic modeling across a range of onset delay and duration of action hyper-parameters, and combine the results with additional parameters for each of Hill's criteria for causality.
The distributions in this type of data are far from normal, and the effect of a factor is spread over its onset delay and duration of action, so regular Pearson correlations don't work well. Instead, we mainly focus on change from baseline. There's a ton of room for improvement, for example by controlling for confounders using instrumental variables, or by using convolutional recursive neural networks.
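As a rough illustration of what "change from baseline" can mean, here is a minimal sketch. It is an assumption for illustration, not the method this project actually uses: it splits the outcome measurements by whether the (already aggregated) factor value was above its own mean, and reports the percent difference in the mean outcome between the two groups. The function name and the mean-split rule are hypothetical.

```python
from statistics import mean


def percent_change_from_baseline(factor_values, outcome_values):
    """Percent change in mean outcome when the factor is above its own mean.

    Assumption: "baseline" is the mean outcome over rows where the factor
    is at or below its mean. Returns None when the split is degenerate.
    """
    if len(factor_values) != len(outcome_values):
        raise ValueError("factor and outcome series must have the same length")
    threshold = mean(factor_values)
    pairs = list(zip(factor_values, outcome_values))
    high = [outcome for factor, outcome in pairs if factor > threshold]
    low = [outcome for factor, outcome in pairs if factor <= threshold]
    if not high or not low:
        return None  # factor never varies around its mean
    baseline = mean(low)
    if baseline == 0:
        return None  # percent change is undefined for a zero baseline
    return (mean(high) - baseline) / baseline * 100.0


# Made-up numbers, for illustration only.
change = percent_change_from_baseline([0.0, 0.0, 1.0, 1.0], [2.0, 2.0, 3.0, 3.0])
```

Unlike a Pearson correlation, this comparison does not assume a linear relationship or normally distributed values, but it also does nothing to control for confounders.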
Hybrid Predictive Control Black Box Models seem most appropriate. This repository is for storing potential alternative implementations.
See the data folder.
The best file is probably data/arthritis-factor-measurements-matrix-zeros-unixtime.csv. It's a matrix of years of self-reported Arthritis Severity Rating measurements and hundreds of potential factors over time.
The first row contains the variable names. The first column is a Unix timestamp (seconds since 1970-01-01 00:00:00 UTC).
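A minimal sketch of reading a file with this layout using only the Python standard library. It assumes every cell after the header is numeric, which has not been checked against the real file; the `load_matrix` helper and the sample column names are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone


def load_matrix(handle):
    """Read a header row of variable names, then one row per timestamp."""
    reader = csv.reader(handle)
    header = next(reader)        # first row: variable names
    variable_names = header[1:]  # first column is the timestamp, not a factor
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        # First column: seconds since 1970-01-01 00:00:00 UTC.
        timestamp = datetime.fromtimestamp(int(float(record[0])), tz=timezone.utc)
        values = [float(cell) for cell in record[1:]]  # assumes numeric cells
        rows.append((timestamp, dict(zip(variable_names, values))))
    return variable_names, rows


# Tiny made-up sample standing in for the real file.
sample = io.StringIO(
    "timestamp,Arthritis Severity Rating,Hypothetical Factor\n"
    "0,3,1.5\n"
    "86400,2,0\n"
)
names, rows = load_matrix(sample)
```

To read the real file, pass an open handle instead of the sample, for example `open(path, newline="")` where `path` points at the CSV in the data folder.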
To make the data easier to analyze, some preprocessing has already been done. This includes zero-filling where appropriate. Also, each factor value is an aggregate of the factor measurements that preceded the corresponding Arthritis measurement, with the aggregation window determined by the factor's onset delay and duration of action.
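To make the aggregation step concrete, here is a minimal sketch under stated assumptions. It assumes the window for an outcome measured at time t runs from t - onset delay - duration of action to t - onset delay, with both ends inclusive, and that an empty window is zero-filled. The actual window boundaries, aggregation method, and filling rules used to build the file may differ per variable; `aggregate_preceding` is a hypothetical helper.

```python
def aggregate_preceding(factor_points, outcome_time, onset_delay, duration, method="mean"):
    """Aggregate factor measurements that could have influenced one outcome.

    factor_points: iterable of (unix_time, value) pairs.
    outcome_time, onset_delay, duration: all in seconds.
    Assumed window: [outcome_time - onset_delay - duration,
                     outcome_time - onset_delay], inclusive.
    """
    window_end = outcome_time - onset_delay
    window_start = window_end - duration
    values = [value for time, value in factor_points
              if window_start <= time <= window_end]
    if not values:
        return 0.0  # zero-fill; only appropriate for some variables
    if method == "sum":
        return sum(values)
    if method == "mean":
        return sum(values) / len(values)
    raise ValueError(f"unknown aggregation method: {method}")


# Made-up measurements: (unix_time, value).
points = [(0, 10.0), (3600, 20.0), (7200, 30.0), (90000, 99.0)]
# Outcome at 3 h, 30 min onset delay, 2 h duration: window is 1800..9000 s.
mean_value = aggregate_preceding(points, 10800, 1800, 7200)
sum_value = aggregate_preceding(points, 10800, 1800, 7200, method="sum")
```

A sum is the more natural choice for quantities that accumulate, such as a dose, while a mean suits ratings; which one applies to a given variable is one of the hyper-parameters mentioned here.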
The aggregation method and other hyper-parameters for a variable can be found by entering its variable name in either
- the API Explorer or
- the URL https://studies.curedao.org/VARIABLE_NAME_HERE
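Variable names can contain spaces, so they need encoding before being placed in a URL. A minimal sketch of building the study URL, assuming the site accepts standard percent-encoding in the path (not verified here); `study_url` is a hypothetical helper.

```python
from urllib.parse import quote

STUDIES_BASE_URL = "https://studies.curedao.org/"


def study_url(variable_name):
    """Build the study URL for a variable name taken from the CSV header."""
    # safe="" also encodes "/" so the name stays a single path segment.
    return STUDIES_BASE_URL + quote(variable_name, safe="")


url = study_url("Arthritis Severity Rating")
# url == "https://studies.curedao.org/Arthritis%20Severity%20Rating"
```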