This module provides a Python library for interacting with a collection of frictionless datapackages. Each datapackage consists of a CSV (data) file annotated with a JSON file. The JSON file stores additional information, such as the units of the CSV columns, and metadata describing the underlying data. Example datapackages can be found here, and the JSON file could be structured as follows:
{
    "resources": [
        {
            "name": "demo_package",
            "type": "table",
            "path": "demo_package.csv",
            "scheme": "file",
            "format": "csv",
            "mediatype": "text/csv",
            "encoding": "utf-8",
            "schema": {
                "fields": [
                    {
                        "name": "t",
                        "type": "number",
                        "unit": "s"
                    },
                    {
                        "name": "j",
                        "type": "number",
                        "unit": "A / m2"
                    }
                ]
            },
            "metadata": {
                "echemdb": {
                    "description": "Sample data for the unitpackage module.",
                    "curation": {
                        "process": [
                            {
                                "role": "experimentalist",
                                "name": "John Doe",
                                "laboratory": "Institute of Good Scientific Practice",
                                "date": "2021-07-09"
                            }
                        ]
                    }
                }
            }
        }
    ]
}
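Because the descriptor is plain JSON, the units it declares can also be inspected without unitpackage. The following is a minimal sketch using only the Python standard library, assuming a descriptor shaped like the one above. The helper name `field_units` is hypothetical and not part of the unitpackage API.

```python
import json

# Trimmed copy of the descriptor shown above (layout assumed from that example).
DESCRIPTOR = """
{
  "resources": [
    {
      "name": "demo_package",
      "schema": {
        "fields": [
          {"name": "t", "type": "number", "unit": "s"},
          {"name": "j", "type": "number", "unit": "A / m2"}
        ]
      }
    }
  ]
}
"""


def field_units(descriptor, resource_index=0):
    """Map each field name of one resource to its unit (None if no unit is given).

    Hypothetical helper for illustration; not part of unitpackage.
    """
    fields = descriptor["resources"][resource_index]["schema"]["fields"]
    return {field["name"]: field.get("unit") for field in fields}


units = field_units(json.loads(DESCRIPTOR))
print(units)
```

For real work, prefer the unitpackage API described below, which reads the descriptor and the CSV together.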
The metadata of an entry's resource in a collection is accessible from the Python API.
>>> from unitpackage.collection import Collection
>>> db = Collection.from_local('./doc/files')
>>> entry = db['demo_package_cv']
>>> entry.description
'Sample data for the unitpackage module.'
The API can also draw a simple 2D plot.
>>> entry.plot()
Finally, unitpackage allows for simple transformation of the data within a resource into different units.
>>> entry.get_unit('j')
'A / m2'
>>> entry.df
          t         E         j
0  0.000000 -0.196962  0.043009
1  0.011368 -0.196393  0.051408
...
>>> entry.rescale({'E' : 'mV', 'j' : 'uA / m2'}).df
          t           E             j
0  0.000000 -196.961730  43008.842162
1  0.011368 -196.393321  51408.199892
...
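The arithmetic behind the rescaling above is a multiplication by a constant factor per column. The sketch below reproduces it with the standard library only. It does not show how unitpackage implements `rescale`: the `FACTORS` table and the `rescale_column` helper are hypothetical, and the source unit of `E` is assumed to be `V`, since it is not shown in the example but the values scale by 1000 when converted to `mV`.

```python
# Hypothetical conversion factors, hard-coded for illustration only.
# A real unit library derives these from the unit strings themselves.
FACTORS = {
    ("V", "mV"): 1e3,               # assumes E is stored in volts
    ("A / m2", "uA / m2"): 1e6,
}


def rescale_column(values, source_unit, target_unit):
    """Return a new list with values converted from source_unit to target_unit."""
    if source_unit == target_unit:
        return list(values)
    factor = FACTORS[(source_unit, target_unit)]
    return [value * factor for value in values]


# Rounded values taken from the first two rows of the table above.
E_mV = rescale_column([-0.196962, -0.196393], "V", "mV")
j_uA = rescale_column([0.043009, 0.051408], "A / m2", "uA / m2")
print(E_mV, j_uA)
```

Storing the unit next to each column in the descriptor is what makes such a conversion possible without guessing what the numbers mean.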
Collections for specific measurement types can be created. They provide additional access to the metadata or simplify the representation of such data in common plot types. An example of such a collection can be found on echemdb.org, which shows cyclic voltammetry data annotated following echemdb's metadata schema. These data can be stored in a CVCollection and are retrieved from the echemdb data repository.
Detailed installation instructions, descriptions of the modules, and advanced usage examples, including local collection creation, are provided in our documentation.
This package is available on PyPI and can be installed with pip:
pip install unitpackage
The package is also available on conda-forge and can be installed with conda:
conda install -c conda-forge unitpackage
or mamba:
mamba install -c conda-forge unitpackage
Please consult our documentation for more detailed installation instructions.
The contents of this repository are licensed under the GNU General Public License v3.0 or, at your option, any later version.