-
Major restructure of
lift_adat()
functionality (@stufield, #81, #78)lift_adat()
now takes abridge =
argument, replacing theanno.tbl =
argument. Lifting is now performed internally for a better (and safer) user experience, without the necessity of an external annotations (Excel) file.- the majority of this refactoring was internal and the user should not experience a major disruption to the API.
- much improved lifting/bridging documentation (#82)
-
Added a new lifting and bridging vignette (@stufield, #77)
- in addition to the improved lifting documentation this new vignette provides additional context, explanation, clear examples, and lifting guidance.
-
is_lifted()
is new and returns a boolean according to whether the signal space (RFU) has been previously lifted -
Lifting accessor function for Lin's CCC values (#88)
getSomaScanLiftCCC()
accesses the lifting correlations between SomaScan versions for each analyte- returns a
tibble
split by sample matrix (serum or plasma)
-
merge_clin()
is newly exported (#80)- a thin wrapper that allows users to merge
clinical variables to
soma_adat
objects easily - previously users had to either use the CLI merge tool
or merge in clinical variables themselves with
dplyr
- a thin wrapper that allows users to merge
clinical variables to
-
Newly exported ADAT "get**" helpers (#83)
- functions to access properties of ADATs
getAdatVersion()
getSomaScanVersion()
getSignalSpace()
checkSomaScanVersion()
getAdatVersion()
gets a new S3 method (#92)- this enables passing of different objects
- namely
soma_adat
orlist
depending on the situation
- functions to access properties of ADATs
-
Newly exported functions that were previously internal only:
addAttributes()
addClass()
cleanNames()
-
The package
README
is now simplified (#35)- example analysis workflows are now split out
into their own vignettes/articles
and cross-linked in the
README
- example analysis workflows are now split out
into their own vignettes/articles
and cross-linked in the
-
Reorganization and expansion of statistical vignettes (#35, #47)
- moved 3 existing statistical examples from
README
into their own vignettes - resulting in four new "Statistical Workflow" vignettes/articles:
- Binary classification via logistic regression
- Linear regression for continuous variables
- Two-group comparison via t-test
- Three-group analysis ANOVA
- moved 3 existing statistical examples from
-
Added new general analysis workflow vignettes
- articles for the pkgdown website have been built out
- new articles on:
- safely mapping values among variables
- safely renaming a data frame
- loading-and-wrangling
- typical train and test data splits
- beginning the FAQs and/or Coming Soon pages
-
Added a new vignette describing how to use the command-line interface merge tool (#45)
- the new CLI merge tool used to add new clinical data to existing ADAT file
-
collapseAdats()
better combinesHEADER
information (#86)- certain information, e.g.
PlateScale
andCal*
, are better maintained in the final collapsed ADAT - other entries are combined by pasting into a single string
- should result in less duplication of superfluous entries and
retention of more "useful"
HEADER
information in the resulting (collapsed)soma_adat
- certain information, e.g.
-
Update
read_annotations()
with11k
content (#85) -
Update
transform()
andscaleAnalytes()
scaleAnalytes()
(internal) now skips missing references and is much more like a "step" in therecipes
packagetransform()
gets edge case protection withdrop = FALSE
in case a single-analytesoma_adat
is scaled.
-
New
row.names()
S3 method support forsoma_adat
class- dispatched on calls to
rownmaes()
- rather than calling
NextMethod()
which normally would invokedata.frame
, we now force thedata.frame
method in case there aretbl_df
orgrouped_df
classes present that would be dispatched. Those are bypassed in favor of thedata.frame
becausetbl_df
1) can nuke the attributes, 2) triggers a warning about adding rownames to atibble
.
- dispatched on calls to
-
New
grouped_df
S3 print support for the groupedsoma_adat
- now displays Grouping information from a call to
the S3 print method for
soma_adat
class
- now displays Grouping information from a call to
the S3 print method for
-
New
grouped_df
S3 method support forsoma_adat
class (#66)grouped_df
data objects previously unsupported and were interfering with downstream S3 methods fordplyr
verbs onceNextMethod()
was called- this support now ensures that the group
methods are maintained, as well as the
soma_adat
class itself (and most importantly, with its attributes intact)
-
tidyr::separate.soma_adat()
S3 method was simplified (#72)- now uses
%||%
helper internally - expanded error messages inside
stopifnot()
to be more informative
- now uses
-
is_intact_attr()
is now much quieter, signaling only when called indirectly (#71)- new conditional logic to silences signaling messages when called from within another function (indirectly)
- these previously lead to confusing messages
when they appear in wrappers, where
is_intact_attr()
can be, sometimes deeply, nested in the call stack
-
Development and improvements to the
pkgdown
website- added new links and improved clarity in YAML
- added new logo at footer
- restyled side bar for easier hyperlinking and getting help
- clicking on the SomaLogic logo in the GitHub
README
now links to thepkgdown
website - new "Coming Soon" drop-down section in the website header to let users know about active progress (but not yet ready for external publication)
-
SomaDataIO
no longer depends ondesc
package- to generate the
README.md
- to generate the
-
Internal rowname helpers were upgraded
- they now use internal cross-functions as originally intended to avoid redundancy, efficiency, and improved debugging
-
sysdata.rda
no longer contains non-exported functions (#59)- new internal helper functions:
convertColMeta()
genRowNames()
parseCheck()
syncColMeta()
scaleAnalytes()
- new internal helper functions:
-
Bug-fix for corner-case writing a single-analyte ADAT (#51)
- RFU values are rounded to 1 decimal place when written by
write_adat()
, via a call toapply()
, which expects a 2-dim object when replacing those values. write_adat()
no longer usesapply()
and instead converts the entire RFU data frame to a matrix (maintains original dimensions), and use vectorized format conversion viasprintf()
- in theory this should be faster because
sprintf()
is only called once on a long vector, rather than 1000s of times on shorter vectors (insideapply()
).
- RFU values are rounded to 1 decimal place when written by
-
Fixed missing closing parenthesis in
SomaScanObjects.R
(thanks @Hijinx725!, #40)
- We are now on CRAN! 🥳
-
New clinical data merge CLI tool (@stufield, #25)
Rscript --vanilla merge_clin.R
for merging clinical variables into existing*.adat
SomaScan data files- added 2 new example
meta.csv
andmeta2.csv
files to run examples with random data but with valid index keys - see
dir(system.file("cli", "merge", package = "SomaDataIO"))
-
Package data objects (@stufield, #32)
example_data.adat
was reduced in size ton = 10
samples (from 192) to conform to CRAN size requirements (< 5MB)- the current file was renamed
example_data10.adat
to reflect this change - this likely has far-reaching consequences for users who access
this flat file via
system.file()
- the
example_data
object itself however remains true to its original file (https://github.com/SomaLogic/SomaLogic-Data/blob/master/example_data.adat
) - the directory location
inst/example/
was renamedinst/extdata/
to conform to CRAN package standard naming conventions - the file
single_sample.adat
was removed from package data as it is now redundant (however still used in unit testing) SomaDataObjects
was renamed and is nowSomaScanObjects
-
Gradual deprecation (@stufield)
read.adat()
is now soft-deprecated; please useread_adat() instead
- lifecycle for soft-deprecated
warn()
->stop()
for functions that have been been soft deprecated sincev5.0.0
getSomamers()
getSomamerData()
meltExpressionSet()
-
New S3 print method default (@stufield)
tibble
has newmax_extra_cols =
argument, which is set to6
for theprint.soma_adat
method
-
New S3 merge method (@stufield, #31)
- calling
base::merge()
on asoma_adat
is strongly discouraged - we now redirect users to use
dplyr::*_join()
alternatives which are designed to preservesoma_adat
attributes
- calling
-
Code hardening for
prepHeaderMeta()
(@stufield)- some ADATs do not have
CreatedDate
andCreatedBy
in the HEADER entry. This currently breaks the writer - simplified to make more robust but also refactor to be more convenient (for abnormal ADATs not generated by standard SomaScan processing)
CreatedDateHistory
was removed as an entry from written ADATsCreatedByHistory
was combined and dated for written ADATsNULL
behavior remains if keys are missingCreatedBy
andCreatedDate
will be generated either as new entries or over-written as appropriate
- some ADATs do not have
-
Numerous non-user-facing (API) changes internal package maintenance, efficiency, and structural upgrades were included
-
Bug-fix release related to
write_adat()
:- fixed bug in
write_adat()
that resulted from adding/removing clinical (non-SomaScan) variables to an ADAT. Export viawrite_adat()
resulted in a broken ADAT file (@stufield, #18) write_adat()
now has much higher fidelity to original text file (*.adat
) in full-cycle read-write-read operations; particularly in presence of bangs (!
) in the Header section and in floating point decimals in the?Col.Meta
sectionwrite_adat()
no longer converts commas (,
) to semi-colons (;
) in the?Col.Meta
block (originally introduced to avoid cell alignment issues in*.csv
formats)write_adat()
no longer concatenates written ADATs, when writing to the same file. Data is over-written to file to avoid mangled ADATs resulting from re-writing to the same connection and to match the default behavior ofwrite.table()
,write.csv()
, etc.
- fixed bug in
-
read_adat()
now has more consistent character type theBarcode2
variable in standard ADATs, now forcescharacter
class, does not allow R'sread.delim()
to "guess" the type -
Decreased dependency of
magrittr
pipes (%>%
) in favor of the native R pipe (|>
). As a result the package now depends onR >= 4.1.0
SomaDataIO
will continue to re-exportmagrittr
pipes for backward compatibility, but this should not be considered permanent. Please code accordingly
-
Migration to the default branch in GitHub from
master
->main
(@stufield, #19) -
Numerous non-user-facing (API) changes internal package maintenance, efficiency, and structural upgrades were included
-
Upgrades primarily from improvements to SomaLogic internal code base, including: (@stufield)
- general reduction on external package dependency to improve code stability
- internal usage of base R alternatives to the
readr
package for parsing and importing ADATs (e.g.read.delim()
overreadr::read_delim()
). This is mostly for code simplification, but can often result in marked speed improvements. As the SomaScanplex
size increases, this speed improvement will become more important. parseHeader()
was dramatically simplified, now reading in lines 20L at a time until the RFU block is reached. In addition, once the block is reached, all header lines are read-in once and indexed (as opposed to line-by-line).read_adat()
now specifies column types viacolClasses =
which for the majority of the ADAT is typedouble
for the RFU columns. This should dramatically improve speed of ingest.write_adat()
was simplified internally, with fewer nestedapply
and for-loops.- encoding for all input/output (I/O) is assumed to be
UTF-8
.
-
New
getAnalytes()
S3 method for classrecipe
from therecipes
package. -
New
loadAdatsAsList()
to load multiple ADAT files in a single call and optionally collapse them into a single data frame (@stufield, #8). -
New
getTargetNames()
function to map ADATseq.XXXX.XX
names to corresponding protein targets from the annotations table
-
SomaLogic Inc. is now SomaLogic Operating Co. Inc.
-
Added new documentation regarding
Col.Meta
(@stufield, #12).- documentation around column meta data, row meta data, where they are found in an ADAT, and how to access them.
-
Research Use Only ("RUO") language was added to the README (@stufield, #10).
-
Numerous internal code improvements from SomaLogic code-base (@stufield)
- the consisted of reducing usage of external dependencies,
e.g. using
stop()
overui_stop()
andwarning()
overui_warn()
, usingusethis
,cli
, andcrayon
shims aliases. - package uses
purrr
very selectively and no longer usesstringr
. - using base R alternatives in favor of increased stability for underlying, non-user-facing code.
- the consisted of reducing usage of external dependencies,
e.g. using
-
New
lift_adat()
was added to provided 'lifting' functionality (@stufield, #11)- provides mechanism to convert RFU space between SomaScan versions (e.g. v4.1 -> v4.0).
- added new S3
transform.soma_adat()
method which simplifies linear scaling ofsoma_adat
columns (analytes). - uses an "Annotations file" (Excel) as source of scalars for transformation.
-
Minor improvements and updates to the
README.Rmd
(@stufield, #7)- fixed a broken
adat2eSet()
link in README (#5). - clearer text to the
README
regardingBiobase
installation. - added new links to external Bioconductor website in installation section of README.
- new
pkgdown
and links to Issues (#4). - SomaLogic logo was added to README.
- a lifecycle ("maturing") badge was added.
- fixed a broken
-
Startup message was improved with dynamic width (@stufield).
-
New
locateSeqId()
function to pull outSeqId
regex. (@stufield). -
New
read_annotations()
function (@stufield, #2)- new function to parse/import SomaLogic annotations files (
*.xlsx
).
- new function to parse/import SomaLogic annotations files (
-
New
set_rn()
drop-in replacement formagrittr::set_rownames()
-
getFeatures()
was renamed to be less ambiguous and better align with internal SomaLogic code usage. Now usegetAnalytes()
(@stufield) -
getFeatureData()
was also renamed togetAnalyteInfo()
(@stufield) -
various upgrades as required by code changes in external package dependencies, e.g.
tidyverse
. -
new alias for
read_adat()
,read.adat()
, for backward compatibility to previous versions ofSomaDataIO
(@stufield)
- Initial public release to GitHub!