Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating a collection #116

Merged
merged 5 commits into from
Aug 12, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
342 changes: 342 additions & 0 deletions examples/jupyter_nb/changing_datafile_with_tags_example.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,342 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Changing datafile using Tags"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Table of Contents (TOC) <a class=\"anchor\" id=\"toc\"></a>\n",
"\n",
"- [1. Imports](#first-bullet)\n",
"- [2. Zegami Client, Workspace, and Collection](#second-bullet)\n",
"- [3. Chaging Datafile Value of One Item](#third-bullet)\n",
" - [3.1. Creating a Tag](#creating-tag)\n",
" - [3.2. Converting Fahrenheit values to Celsius](#fah_to_cel)\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Imports <a class=\"anchor\" id=\"first-bullet\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"First we'll start by importing all of the needed libraries."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from zegami_sdk.client import ZegamiClient"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Zegami client, workspace, and collection <a class=\"anchor\" id=\"second-bullet\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"Here we'll initialize the Zegami client so that we can have access to our collections.\n",
"\n",
"If you haven't intialized your client before, you'll to have to provide your username and password, in order to create a security token. You only have to do this once. After that, you can initialize the client without providing login details, as the security token will be used instead!"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Used token from '/home/martim-zegami/zegami_com.zegami.token'.\n",
"Client initialized successfully, welcome .\n",
"\n"
]
}
],
"source": [
"zc = ZegamiClient()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After initializing the client, we can retrieve our workspace and collection."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[<Workspace id=GUDe4kRY name=Martim Chaves>]\n"
]
}
],
"source": [
"workspaces_lst = zc.workspaces\n",
"print(workspaces_lst)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# Get workspace using the ID\n",
"WORKSPACE_ID = zc.workspaces[0].id # id=GUDe4kRY\n",
"workspace = zc.get_workspace_by_id(WORKSPACE_ID)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Collections in 'Martim Chaves' (3):\n",
"6271698f81e4bccb640d6e24 : Flags of the world\n",
"62792f0488e4d3d7181d8256 : nations_of_the_world\n",
"627e12a31bdd62bb88d6afb8 : X-ray-analysis\n"
]
}
],
"source": [
"workspace.show_collections()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<CollectionV1 id=627e12a31bdd62bb88d6afb8 name=X-ray-analysis>\n"
]
}
],
"source": [
"# Let's get the X-ray-analysis collection, which is the one we're currently working with\n",
"collection = workspace.get_collection_by_name('X-ray-analysis')\n",
"print(collection)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Chaging datafile value of an item using Tags <a class=\"anchor\" id=\"third-bullet\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"When we were investigating the data using the Zegami platform, we noticed that one datapoint didn't have the temperature data in the right unit (Fahrenheit, instead of Celsius). This was causing an issue where the temperature axis didn't have an appropriate scale, as there was an anomaly. We can change the values in the datafile using the SDK!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/weird_axis.png\" width=\"1000\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1. Creating a Tag <a class=\"anchor\" id=\"creating-tag\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"In this case, we're only looking at one sample, so we could manually change the value in the datafile and reupload it. But, for the sake of demonstrating something that is more scalable, let's imagine that there are several samples that have this issue. Using the platform, we can easily create a **Tag** associated with these samples, using the selection tool."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/tag_wrong_unit.png\" width=\"1000\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now change the values of the samples belonging to that Tag."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2. Converting Fahrenheit values to Celsius <a class=\"anchor\" id=\"fah_to_cel\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"First, we'll start by creating the function that we'll use to modify the values. Then, we'll get a copy of the dataframe containing the samples (rows) belonging to the Tag that we created. We'll alter that copy, and then we'll use it to update the datafile."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def fahrenheit_to_celsius(temp: float) -> float:\n",
" temp -= 32\n",
" temp *= 5/9.\n",
" return temp"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"collec_temp_unit_tag = collection.get_rows_by_tags(['wrong_temp_unit']).copy()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"collec_temp_unit_tag['temperature'] = collec_temp_unit_tag['temperature'].apply(fahrenheit_to_celsius)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"collection.rows.update(collec_temp_unit_tag)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"collection.replace_data(collection.rows)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When doing these sorts of operations (uploading data), we should check collection.status. Changes may take some time."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'changed_at': 'Mon, 16 May 2022 19:13:47 GMT',\n",
" 'progress': 1.0,\n",
" 'status': 'completed'}"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"collection.status"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"108 35.0\n",
"Name: temperature, dtype: float64"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"collection.get_rows_by_tags(['wrong_temp_unit']).temperature"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that the value has been updated, let's check Zegami's platform!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/fixed_temp_axis.png\" width=\"1000\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Much better!"
]
}
],
"metadata": {
"interpreter": {
"hash": "17361232fb9bcb0ba5939612f19d0269bf9334e24dd707504afe9bf985e11e05"
},
"kernelspec": {
"display_name": "Python 3.9.12 ('env': venv)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading