update pyscicat ingestion documentation #54
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,12 +7,12 @@ To begin with: | |
from datetime import datetime | ||
from pathlib import Path | ||
|
||
from pyscicat.client import encode_thumbnail, ScicatClient | ||
from pyscicat.client import encode_thumbnail, ScicatClient,CreateDatasetOrigDatablockDto | ||
from pyscicat.model import ( | ||
Attachment, | ||
Datablock, | ||
DataFile, | ||
Dataset, | ||
RawDataset, | ||
Sample, | ||
Ownable | ||
) | ||
|
@@ -38,13 +38,13 @@ Now we setup an `Ownable` instance. This is a model class that several other mod | |
```python | ||
# Create a RawDataset object with settings of your choosing. Notice how | ||
# we pass the `ownable` instance. | ||
dataset = Dataset( | ||
dataset = RawDataset( | ||
path="/foo/bar", | ||
size=42, | ||
owner="slartibartfast", | ||
contactEmail="slartibartfast@magrathea.org", | ||
creationLocation="magrathea", | ||
creationTime=str(datetime.now()), | ||
creationTime=str(datetime.now().isoformat()), | ||
type="raw", | ||
instrumentId="earth", | ||
proposalId="deepthought", | ||
|
@@ -53,8 +53,12 @@ dataset = Dataset( | |
sourceFolder="/foo/bar", | ||
scientificMetadata={"a": "field"}, | ||
sampleId="gargleblaster", | ||
**ownable.dict()) | ||
dataset_id = scicat.upload_raw_dataset(dataset) | ||
**ownable.model_dump()) | ||
|
||
# Required arguments: `contactEmail`, `creationTime`, `owner`, `sourceFolder`, and `type` (raw or derived) | ||
|
||
dataset_id = scicat.datasets_create(dataset) | ||
|
||
``` | ||
Now we can create a Dataset instance and upload it! Notice how we passed the fields of the `ownable` instance there at the end. | ||
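
The hunk above switches `creationTime` from `str(datetime.now())` to `datetime.now().isoformat()`. A minimal sketch of the difference, using only the standard library and a fixed timestamp chosen for illustration; that the SciCat backend prefers the ISO 8601 form is an assumption here, not something the diff states:

```python
from datetime import datetime

# A fixed timestamp (illustrative value) so both string forms are deterministic.
stamp = datetime(2024, 1, 2, 3, 4, 5)

space_separated = str(stamp)   # date and time separated by a space
iso_8601 = stamp.isoformat()   # date and time separated by "T"

# isoformat() already returns a str, so the extra str() wrapper in the diff
# is harmless but redundant.
assert str(iso_8601) == iso_8601

# The ISO form round-trips through the standard-library parser.
assert datetime.fromisoformat(iso_8601) == stamp
```

The two strings differ only in the separator between date and time, which is enough for a strict date-time validator to reject the first form.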
|
||
|
@@ -63,29 +67,34 @@ Note that we store the provided dataset_id in a variable for later use. | |
Also note the `sourceFolder`. This is a folder on the file system that SciCat has access to, and will contain the files for this `Dataset`. | ||
|
||
Proposals and instruments have to be created by an administrator. A sample with `sampleId="gargleblaster"` can be created like this: | ||
|
||
```python | ||
sample = Sample( | ||
sampleId="gargleblaster", | ||
owner="Chamber of Commerce", | ||
description="A legendary drink.", | ||
sampleCharacteristics={"Flavour": "Unknown, but potent"}, | ||
isPublished=False, | ||
**ownable.dict() | ||
**ownable.model_dump() | ||
) | ||
sample_id = client.upload_sample(sample) # sample_id == "gargleblaster" | ||
sample_id = scicat.samples_create(sample) # sample_id == "gargleblaster" | ||
|
||
# Required arguments: `isPublished` | ||
|
||
``` | ||
|
||
## Upload a Datablock | ||
|
||
```python | ||
# Create Datablock with DataFiles | ||
data_file = DataFile(path="file.h5", size=42) | ||
data_block = Datablock(size=42, | ||
version=1, | ||
datasetId=dataset_id, | ||
dataFileList=[data_file], | ||
**ownable.dict()) | ||
scicat.upload_datablock(data_block) | ||
data_file = DataFile(path="file.h5", size=42, time = datetime.now().isoformat()) | ||
|
||
# Required arguments: `path`, `size`, `time` | ||
|
||
data_block = CreateDatasetOrigDatablockDto(size=42, | ||
dataFileList=[data_file]) | ||
|
||
scicat.datasets_origdatablock_create(dataset_id, data_block) | ||
``` | ||
The `Datablock` is a container for `DataFile` instances. We are not loading the files, rather we are creating references that are used (and displayed) in SciCat. | ||
|
||
|
@@ -99,23 +108,28 @@ attachment = Attachment( | |
datasetId=dataset_id, | ||
thumbnail=encode_thumbnail(thumb_path), | ||
caption="scattering image", | ||
**ownable.dict() | ||
**ownable.model_dump() | ||
) | ||
scicat.upload_attachment(attachment) | ||
|
||
# If your image is larger than 760kB you may get an error that the request entity is too large. You can resize the image before calling the encode thumbnail function. | ||
|
||
``` | ||
Now we upload an `Attachment`. This is often used in SciCat to display thumbnails for a `Dataset`. Here, we are loading the actual content of a file (stored in SciCat's database). | ||
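
The comment in the block above warns that images larger than 760kB may be rejected as too large. A minimal sketch of a pre-flight check that could run before `encode_thumbnail`; the helper name `thumbnail_fits` and the constant are hypothetical and not part of pyscicat, and the limit is taken from the comment rather than from any server setting:

```python
from pathlib import Path

# Limit quoted in the documentation comment. Whether "kB" means 1000 or 1024
# bytes is not stated, so the smaller value is assumed to stay on the safe side.
MAX_THUMBNAIL_BYTES = 760 * 1000


def thumbnail_fits(image_path: Path, limit: int = MAX_THUMBNAIL_BYTES) -> bool:
    """Return True when the file on disk is no larger than `limit` bytes."""
    return image_path.stat().st_size <= limit
```

If the check fails, the image would need to be resized with an imaging library of your choice before it is encoded and attached.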
|
||
So, to put it all together: | ||
|
||
```python | ||
from datetime import datetime | ||
from pathlib import Path | ||
|
||
from pyscicat.client import encode_thumbnail, ScicatClient | ||
from pyscicat.client import encode_thumbnail, ScicatClient,CreateDatasetOrigDatablockDto | ||
from pyscicat.model import ( | ||
Attachment, | ||
Datablock, | ||
DataFile, | ||
Dataset, | ||
RawDataset, | ||
Sample, | ||
Ownable | ||
) | ||
|
||
|
@@ -131,13 +145,13 @@ thumb_path = Path(__file__).parent.parent / "test/data/SciCatLogo.png" | |
|
||
# Create a RawDataset object with settings of your choosing. Notice how | ||
# we pass the `ownable` instance. | ||
dataset = Dataset( | ||
dataset = RawDataset( | ||
path="/foo/bar", | ||
size=42, | ||
owner="slartibartfast", | ||
contactEmail="slartibartfast@magrathea.org", | ||
creationLocation="magrathea", | ||
creationTime=str(datetime.now()), | ||
creationTime=str(datetime.now().isoformat()), | ||
type="raw", | ||
instrumentId="earth", | ||
proposalId="deepthought", | ||
|
@@ -146,24 +160,24 @@ dataset = Dataset( | |
sourceFolder="/foo/bar", | ||
scientificMetadata={"a": "field"}, | ||
sampleId="gargleblaster", | ||
**ownable.dict()) | ||
You probably did this because of a deprecation warning, but pydantic has been having growing pains. If you do this, pin the version of pydantic in setup.cfg to >= 2.0 |
||
dataset_id = scicat.upload_raw_dataset(dataset) | ||
**ownable.model_dump()) | ||
|
||
dataset_id = scicat.datasets_create(dataset) | ||
|
||
|
||
# Create Datablock with DataFiles | ||
data_file = DataFile(path="file.h5", size=42) | ||
data_block = Datablock(size=42, | ||
version=1, | ||
datasetId=dataset_id, | ||
dataFileList=[data_file], | ||
**ownable.dict()) | ||
scicat.upload_datablock(data_block) | ||
data_file = DataFile(path="file.h5", size=42, time = datetime.now().isoformat()) | ||
data_block = CreateDatasetOrigDatablockDto(size=42, | ||
dataFileList=[data_file]) | ||
|
||
#Create Attachment | ||
scicat.datasets_origdatablock_create(dataset_id, data_block) | ||
|
||
# Create Attachment | ||
attachment = Attachment( | ||
datasetId=dataset_id, | ||
thumbnail=encode_thumbnail(thumb_path), | ||
caption="scattering image", | ||
**ownable.dict() | ||
**ownable.model_dump() | ||
) | ||
scicat.upload_attachment(attachment) | ||
|
||
|
unrelated change, but probably OK here. |
nit:
,<space>>...
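
On the review note about `model_dump()` and pinning pydantic to >= 2.0: if pinning turned out not to be an option, one possible alternative is a small compatibility shim. This is only a sketch; `dump_model` is a hypothetical helper that is not part of pyscicat or of this pull request. It relies on `model_dump()` being the pydantic 2.x method and `dict()` the 1.x one:

```python
from typing import Any, Dict


def dump_model(model: Any) -> Dict[str, Any]:
    """Hypothetical helper: serialize a pydantic model under 1.x or 2.x."""
    if hasattr(model, "model_dump"):
        # pydantic >= 2.0 exposes model_dump(); dict() is deprecated there.
        return model.model_dump()
    # pydantic 1.x only provides dict().
    return model.dict()
```

With such a helper the examples would read `**dump_model(ownable)` instead of `**ownable.model_dump()`. Pinning the dependency, as the reviewer suggests, remains the simpler choice.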