diff --git a/site/content/en/docs/manual/basics/attach-cloud-storage.md b/site/content/en/docs/manual/basics/attach-cloud-storage.md index 78e8c2c1e800..4f2a490bd848 100644 --- a/site/content/en/docs/manual/basics/attach-cloud-storage.md +++ b/site/content/en/docs/manual/basics/attach-cloud-storage.md @@ -8,376 +8,418 @@ description: 'Instructions on how to attach cloud storage using UI' In CVAT you can use [AWS-S3](using-aws-s3), [Azure Blob Container](#using-azure-blob-container) and [Google cloud](#using-google-cloud-storage) storages to store image datasets for your tasks. -## Using AWS-S3 +See: + +- [AWS S3](#aws-s3) + - [Create a bucket](#create-a-bucket) + - [Upload data](#upload-data) + - [Access permissions](#access-permissions) + - [Authorized access](#authorized-access) + - [Anonymous access](#anonymous-access) + - [Attach AWS S3 storage](#attach-aws-s3-storage) + - [AWS manifest file](#aws-manifest-file) +- [Google Cloud](#google-cloud) + - [Create a bucket](#create-a-bucket-1) + - [Upload data](#upload-data-1) + - [Access permissions](#access-permissions-1) + - [Authorized access](#authorized-access-1) + - [Anonymous access](#anonymous-access-1) + - [Attach Google Cloud storage](#attach-google-cloud-storage) +- [Microsoft Azure](#microsoft-azure) + - [Create a bucket](#create-a-bucket-2) + - [Create a container](#create-a-container) + - [Upload data](#upload-data-2) + - [SAS token](#sas-token) + - [Personal use](#personal-use) + - [Attach Azure Blob Container](#attach-azure-blob-container) +- [Prepare the dataset](#prepare-the-dataset) + +## AWS S3 -### Create AWS account +### Create a bucket -First, you need to create an AWS account, to do this, [register of 5 steps](https://portal.aws.amazon.com/billing/signup#/start) -following the instructions -(even if you plan to use a free basic account you may need to link a credit card to verify your identity). 
+To create a bucket, do the following:
-To learn more about the operation and benefits of AWS cloud,
-take a free [AWS Cloud Practitioner Essentials](https://www.aws.training/Details/eLearning?id=60697) course,
-which will be available after registration.
+1. Create an [AWS account](https://portal.aws.amazon.com/billing/signup#/start).
+2. Go to the [AWS S3 console](https://s3.console.aws.amazon.com/s3/home) and click **Create bucket**.
-### Create a bucket
+   ![](/images/aws-s3_tutorial_1.jpg)
-After the account is created, go to [console AWS-S3](https://s3.console.aws.amazon.com/s3/home)
-and click `Create bucket`.
+3. Specify the name and region of the bucket. You can also
+   copy the settings of another bucket by clicking the **Choose bucket** button.
+4. Enable **Block all public access**. For access, you will use an **access key ID** and a **secret access key**.
+5. Click **Create bucket**.
-![](/images/aws-s3_tutorial_1.jpg)
+A new bucket will appear in the list of buckets.
-You'll be taken to the bucket creation page. Here you have to specify the name of the bucket, region,
-optionally you can copy the settings of another bucket by clicking on the `choose bucket` button.
-Checkbox block all public access can be enabled as we will use `access key ID` and `secret access key` to gain access.
-In the following sections, you can leave the default settings and click `create bucket`.
-After you create the bucket it will appear in the list of buckets.
+### Upload data
-### Create user and configure permissions
+You need to upload data for annotation and the `manifest.jsonl` file.
-To access bucket you will need to create a user, to do this, go [IAM](https://console.aws.amazon.com/iamv2/home#/users)
-and click `add users`. You need to choose AWS access type, have an access key ID and secret access key.
+1. Prepare data.
+   For more information,
+   see [prepare the dataset](#prepare-the-dataset).
+2. Open the bucket and click **Upload**. 
-![](/images/aws-s3_tutorial_2.jpg) + ![](/images/aws-s3_tutorial_5.jpg) -After pressing `next` button to configure permissions, you need to create a user group. -To do this click `create a group`, input the `group name` and select permission policies add `AmazonS3ReadOnlyAccess` -using the search (if you want the user you create to have write rights to bucket select `AmazonS3FullAccess`). +3. Drag the manifest file and image folder on the page and click **Upload**: -![](/images/aws-s3_tutorial_3.jpg) +![](/images/aws-s3_tutorial_1.gif) -You can also add tags for the user (optional), and look again at the entered data. In the last step of creating a user, -you will be provided with `access key ID` and `secret access key`, -they will need to be used in CVAT when adding cloud storage. +### Access permissions -![](/images/aws-s3_tutorial_4.jpg) +#### Authorized access -### Upload dataset +To add access permissions, do the following: -#### Prepare dataset +1. Go to [IAM](https://console.aws.amazon.com/iamv2/home#/users) and click **Add users**. +2. Set **User name** and enable **Access key - programmatic access**. -For example, let's take [The Oxford-IIIT Pet Dataset](https://www.robots.ox.ac.uk/~vgg/data/pets/): + ![](/images/aws-s3_tutorial_2.jpg) -- Download the [archive with images](https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz). -- Unpack the archive into the prepared folder - and create a manifest file as described in [prepare manifest file section](/docs/manual/advanced/dataset_manifest/): +3. Click **Next: Permissions**. +4. Click **Create group**, enter the group name. +5. Use search to find and select: - ```bash - python /utils/dataset_manifest/create.py --output-dir - ``` + - For read-only access: **AmazonS3ReadOnlyAccess**. + - For full access: **AmazonS3FullAccess**. -#### Upload + ![](/images/aws-s3_tutorial_3.jpg) -- When the manifest file is ready, open the previously prepared bucket and click `Upload`: +6. 
(Optional) Add tags for the user and go to the next page.
+7. Save the **Access key ID** and **Secret access key**.
-  ![](/images/aws-s3_tutorial_5.jpg)
+![](/images/aws-s3_tutorial_4.jpg)
-- Drag the manifest file and image folder on the page and click `Upload`:
+For more information,
+see [Creating an IAM user in your AWS account](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html).
-  ![](/images/aws-s3_tutorial_1.gif)
+#### Anonymous access
-Now you can [attach new cloud storage into CVAT](#attach-new-cloud-storage).
+For information on how to grant public access to the
+bucket, see
+[Configuring block public access settings for your S3 buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/configuring-block-public-access-bucket.html).
-## Using Azure Blob Container
+### Attach AWS S3 storage
-### Create Microsoft account
+To attach the storage, do the following:
-First, create a Microsoft account by [registering](https://signup.live.com/signup?ru=https://login.live.com/),
-or you can use your GitHub account to log in. After signing up for Azure, you'll need to choose a subscription plan,
-you can choose a free 12-month subscription, but you'll need to enter your credit card details to verify your identity.
-To learn more about Azure, read [documentation](https://docs.microsoft.com/en-us/azure/).
+1. Log into CVAT and in a separate tab
+   open your bucket page.
+2. In CVAT, on the top menu select **Cloud storages** and on the opened page click **+**.
-### Create a storage account
+Fill in the following fields:
-After registration, go to [Azure portal](https://portal.azure.com/#home).
-Hover over the resource groups and click `create` in the window that appears. 
+ -![](/images/azure_blob_container_tutorial1.jpg) +| CVAT | AWS S3 | +| ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Display name** | Preferred display name for your storage. | +| **Description** | (Optional) Add description of storage. | +| **Provider** | From drop-down list select **AWS S3**. | +| **Bucket name** | Name of the [Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket). | +| **Authorization type** | Depends on the bucket setup:
  • **Key id and secret access key pair**: available on [IAM](https://console.aws.amazon.com/iamv2/home?#/users).
  • **Anonymous access**: public access to the bucket must be enabled. |
+| **Region**             | (Optional) Choose a region from the list or add a new one. For more information, see [**Available locations**](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions). |
+| **Manifests**          | Click **+ Add manifest** and enter the name of the manifest file with an extension. For example: `manifest.jsonl`. |
-Enter a name for the group and click `review + create`, check the entered data and click `create`.
-After the resource group is created,
-go to the [resource groups page](https://portal.azure.com/#blade/HubsExtension/BrowseResourceGroups)
-and navigate to the resource group that you created.
-Click `create` for create a storage account.
+
-![](/images/azure_blob_container_tutorial2.jpg)
+After filling in all the fields, click **Submit**.
-- **Basics**
+### AWS manifest file
-  Enter `storage account name` (will be used in CVAT to access your container), select a `region`,
-  select `performance` in our case will be `standard` enough, select `redundancy` enough `LRS`
-  [more about redundancy](https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy).
-  Click `next` to go to the advanced section.
+To prepare the manifest file, do the following:
-  ![](/images/azure_blob_container_tutorial4.jpg)
+1. Install the [**AWS CLI**](https://aws.amazon.com/cli/) and get the
+   [manifest file preparation script](https://github.com/cvat-ai/cvat/tree/develop/utils/dataset_manifest).
+2. Perform the installation following the [**aws-shell manual**](https://github.com/awslabs/aws-shell).
    You can configure credentials by running `aws configure`. +
   You will need to enter the `Access Key ID` and `Secret Access Key` as well as the region.
-- **Advanced**
+```bash
+aws configure
+Access Key ID: <your Access Key ID>
+Secret Access Key: <your Secret Access Key>
+```
-  In the advanced section, you can change public access by disabling `enable blob public access`
-  to deny anonymous access to the container.
-  If you want to change public access you can find this switch in the `configuration` section of your storage account.
+3. Copy the content of the bucket to a folder on your computer:
-  After that, go to the review section, check the entered data and click `create`.
+```bash
+aws s3 cp <s3://bucket> <folder> --recursive
+```
-  ![](/images/azure_blob_container_tutorial5.jpg)
+4. After copying the files, you can create a manifest file as described in the [prepare manifest file section](/docs/manual/advanced/dataset_manifest/):
-You will be reached to the deployment page after the finished,
-navigate to the resource by clicking on `go to resource`.
+```bash
+python <cvat repository>/utils/dataset_manifest/create.py --output-dir <your_folder>
+```
-![](/images/azure_blob_container_tutorial6.jpg)
+5. When the manifest file is ready, upload it to the AWS S3 bucket:
-### Create a container
+- If you granted the user read and write permissions, run:
-Go to the containers section and create a new container. Enter the `name` of the container
-(will be used in CVAT to access your container) and select `container` in `public access level`.
+```bash
+aws s3 cp <path>/manifest.jsonl <s3://bucket/>
+```
-![](/images/azure_blob_container_tutorial7.jpg)
+- For read-only permissions, upload the manifest file through the browser: open the bucket, click **Upload**,
+  drag the manifest file to the page, and click **Upload**.
-### SAS token
+![](/images/aws-s3_tutorial_5.jpg)
-Using the `SAS token`, you can securely transfer access to the container to other people by preconfiguring rights,
-as well as the date/time of the starting and expiration of the token.
-To generate a SAS token, go to `Shared access signature` section of your storage account. 
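+The steps above can be put together into one short session. This is only a sketch: it assumes the
+AWS CLI is already configured, CVAT is cloned to `~/cvat`, and `my-cvat-bucket` and `dataset` are
+example names to replace with your own.
+
+```bash
+# Copy the bucket content locally, build the manifest, and upload it back.
+aws s3 cp s3://my-cvat-bucket dataset --recursive
+python ~/cvat/utils/dataset_manifest/create.py --output-dir dataset
+aws s3 cp dataset/manifest.jsonl s3://my-cvat-bucket/
+```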
-Here you should enable `Blob` in the `Allowed services`, `Container` and `Object` in the `Allowed resource types`,
-`Read` and `List` in the `Allowed permissions`, `HTTPS and HTTP` in the `Allowed protocols`,
-also here you can set the date/time of the starting and expiration for the token. Click `Generation SAS token`.
-and copy `SAS token` (will be used in CVAT to access your container).
+## Google Cloud
-![](/images/azure_blob_container_tutorial3.jpg)
+### Create a bucket
-For personal use, you can enter the `Access Key` from the your storage account in the `SAS Token` field,
-`access key` can be found in the `security + networking` section.
-Click `show keys` to show the key.
+To create a bucket, do the following:
-![](/images/azure_blob_container_tutorial8.jpg)
+1. Create a [Google account](https://support.google.com/accounts/answer/27441?hl=en) and log into it.
+2. On the [Google Cloud](https://cloud.google.com/) page, click **Start Free**, then enter the required
+   data and accept the terms of service.
+   > **Note:** Google requires you to add a payment method; you will need a bank card to complete step 2.
+3. [Create a Bucket](https://cloud.google.com/storage/docs/creating-buckets) with the following parameters:
+   - **Name your bucket**: Unique name.
+   - **Choose where to store your data**: Set up a location nearest to you.
+   - **Choose a storage class for your data**: `Set a default class` > `Standard`.
+   - **Choose how to control access to objects**: `Enforce public access prevention on this bucket` >
+     `Uniform` (default).
+   - **How to protect data**: `None`.
-### Upload dataset
+![GB](/images/google_bucket.png)
-Prepare the dataset as in the point [prepare dataset](#prepare-dataset).
+You will be forwarded to the bucket.
-- When the dataset is ready, go to your container and click `upload`. 
-- Click `select a files` and select all images from the images folder
-  in the `upload to folder` item write the name of the folder in which you want to upload images in this case "images".
+### Upload data
-  ![](/images/azure_blob_container_tutorial9.jpg)
+You need to upload data for annotation and the `manifest.jsonl` file.
-- Click `upload`, when the images are loaded you will need to upload a manifest file. When loading a manifest, you
-  need to make sure that the relative paths specified in the manifest file match the paths
-  to the files in the container. Click `select a file` and select manifest file, in order to upload file to the root
-  of the container leave blank `upload to folder` field.
+1. Prepare data.
+   For more information,
+   see [prepare the dataset](#prepare-the-dataset).
+2. Open the bucket and from the top menu
+   select **Upload files** or **Upload folder**
+   (depending on how your files are organized).
-Now you can [attach new cloud storage into CVAT](#attach-new-cloud-storage).
+### Access permissions
-## Using Google Cloud Storage
+To access Google Cloud Storage, get a **Project ID**
+from the [cloud resource manager page](https://console.cloud.google.com/cloud-resource-manager).
-### Create Google account
![](/images/google_cloud_storage_tutorial5.jpg)
-First, create a Google account, go to [account login page](https://accounts.google.com/) and click `Create account`.
-After, go to the [Google Cloud page](https://cloud.google.com), click `Get started`, enter the required data
-and accept the terms of service (you'll need credit card information to register).
+Then follow the instructions below based on the preferred type of access.
-### Create a bucket
+#### Authorized access
-Your first project will be created automatically, you can see it on the [cloud resource manager page](https://console.cloud.google.com/cloud-resource-manager). 
-To create a bucket, go to the [cloud storage page](https://console.cloud.google.com/storage/browser)
-and press `Create bucket`. Next, enter the name of the bucket, add labels if necessary, select the type of location
-for example region and the location nearest to you, select storage class, when selecting access control
-you can enable `Enforce public access prevention on this bucket` (if you plan to have anonymous access to your bucket,
-it should be disabled) you can select `Uniform` or `Fine-grained` access control, if you need protection of your
-object data select protect object data type. When all the information is entered click `Create` to create the bucket.
+For authorized access, you need to create a service account and a key file.
-![](/images/google_cloud_storage_tutorial1.jpg)
+To create a service account:
-### Upload
+1. In the Google Cloud platform, go to **IAM & Admin** > **Service Accounts** and click **+Create Service Account**.
+2. Enter your account name and click **Create And Continue**.
+3. Select a role, for example **Basic** > **Viewer**, and click **Continue**.
+4. (Optional) Give access rights to the service account.
+5. Click **Done**.
-Prepare the dataset as in the point [prepare dataset](#prepare-dataset).
+![](/images/google_cloud_storage_tutorial2.jpg)
-To upload files, you can simply drag and drop files and folders into a browser window
-or use the `upload folder` and/or `upload files`.
+To create a key:
-### Access permissions
+1. Go to **IAM & Admin** > **Service Accounts**, click the account name, and open the **Keys** tab.
+2. Click **Add key** and select **Create new key** > **JSON**.
+3. Click **Create**. The key file will be downloaded automatically. 
-To access Google Cloud Storage from CVAT you will need a `Project ID`
-you can find it by going to [cloud resource manager page](https://console.cloud.google.com/cloud-resource-manager)
+![](/images/google_cloud_storage_tutorial3.jpg)
-![](/images/google_cloud_storage_tutorial5.jpg)
+For more information about keys, see
+[Learn more about creating keys](https://cloud.google.com/docs/authentication/getting-started).
-#### Create a service account and key file
+#### Anonymous access
-To access your bucket you need a key file and a service account. To create a service account,
-go to `IAM & Admin`/`Service Accounts` and press `Create Service Account`. Enter your account
-name and click `Create And Continue`. Select a role for example `Basic`/`Viewer`.
-Next, you can give access rights to the service account, to complete click `Done`.
+To configure anonymous access:
-![](/images/google_cloud_storage_tutorial2.jpg)
+1. Open the bucket and go to the **Permissions** tab.
+2. Click **+ Grant access** to add new principals.
+3. In the **New principals** field specify `allUsers`,
+   select roles: `Cloud Storage Legacy` > `Storage Legacy Bucket Reader`.
+4. Click **Save**.
-The account you created will appear in the service accounts list, open it and go to the `Keys` tab.
-To create a key, click `ADD` and select `Create new key`, next you need to choose the key type `JSON` and select `Create`.
-The key file will be downloaded automatically.
+![](/images/google_cloud_storage_tutorial4.jpg)
-![](/images/google_cloud_storage_tutorial3.jpg)
+Now you can attach the new Google Cloud storage to CVAT.
-[Learn more about creating keys](https://cloud.google.com/docs/authentication/getting-started).
+### Attach Google Cloud storage
-#### Anonymous access
+To attach the storage, do the following:
-To configure anonymous access, open your bucket and go to the permissions tab click `ADD` to add new principals. 
-In `new principals` field specify `allUsers`, select role for example `Cloud Storage Legacy`/`Storage Legacy Bucket Reader` -and press `SAVE`. +1. Log into CVAT and in the separate tab + open your [bucket](https://console.cloud.google.com/storage/browser) + page. +2. In the CVAT, on the top menu select **Cloud storages** > on the opened page click **+**. -![](/images/google_cloud_storage_tutorial4.jpg) +Fill in the following fields: -Now you can attach new cloud storage into CVAT. + -## Attach new cloud storage +| CVAT | Google Cloud | +| ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Display name** | Preferred display name for your storage. | +| **Description** | (Optional) Add description of storage. | +| **Provider** | From drop-down list select **Google Cloud Storage**. | +| **Bucket name** | Name of the bucket. You can find it on the [storage browser page](https://console.cloud.google.com/storage/browser). | +| **Authorization type** | Depends on the bucket setup:
  • **Authorized access**: Click the **Key file** field and upload the key file from your computer.
    **Advanced**: For a self-hosted solution, if the key file is not attached, the `GOOGLE_APPLICATION_CREDENTIALS` environment variable specified for the environment will be used. For more information, see [Authenticate to Cloud services using client libraries](https://cloud.google.com/docs/authentication/client-libraries#setting_the_environment_variable).
  • **Anonymous access**: public access to the bucket must be enabled. |
+| **Prefix**             | (Optional) Used to filter data from the bucket. |
+| **Project ID**         | [Project ID](#authorized-access).
    For more information, see [projects page](https://cloud.google.com/resource-manager/docs/creating-managing-projects) and [cloud resource manager page](https://console.cloud.google.com/cloud-resource-manager).
    **Note:** The project name does not match the project ID. |
+| **Location**           | (Optional) Choose a region from the list or add a new one. For more information, see [**Available locations**](https://cloud.google.com/storage/docs/locations#available-locations). |
+| **Manifests**          | Click **+ Add manifest** and enter the name of the manifest file with an extension. For example: `manifest.jsonl`. |
-After you upload the dataset and manifest file to AWS-S3, Azure Blob Container or Google Cloud Storage
-you will be able to attach a cloud storage. To do this, press the `+` button on the `Cloud storages` page
-and fill out the following form:
+
-![](/images/image228.jpg)
+After filling in all the fields, click **Submit**.
-- `Display name` - the display name of the cloud storage.
-- `Description` (optional) - description of the cloud storage, appears when you click on the `?` button
-  of an item on cloud storages page.
-- `Provider` - choose provider of the cloud storage:
+## Microsoft Azure
-  - [**AWS-S3**](#using-aws-s3):
+### Create a bucket
-    - [`Bucket`](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket) - cloud storage bucket name.
+To create a bucket, do the following:
-    - [`Authorization type`](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-best-practices.html):
+1. Create a [Microsoft Azure](https://azure.microsoft.com/en-us/free/) account and log into it.
-      - `Key id and secret access key pair` - available on [IAM](https://console.aws.amazon.com/iamv2/home?#/users)
-        to obtain an access key and a secret key, create a user using IAM and grant the appropriate rights [learn more](#create-user-and-configure-permissions).
+2. Go to the [Azure portal](https://portal.azure.com/#home), hover over **Resource groups**,
-        - `ACCESS KEY ID`
-        - `SECRET ACCESS KEY ID`
+   and in the pop-up window click **Create**.
-      - `Anonymous access` - for anonymous access, you need to enable public access to bucket.
+   ![](/images/azure_blob_container_tutorial1.jpg)
-      - `Region` - here you can choose a region from the list or add a new one. To get more information click
-        on [`?`](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions).
+3. 
Enter a name for the group and click **Review + create**, check the entered data and click **Create**. +4. Go to the [resource groups page](https://portal.azure.com/#view/HubsExtension/BrowseResourceGroups), + navigate to the group that you created and click **Create resources**. +5. On the marketplace page, use search to find **Storage account**. - - `Anonymous access` - for anonymous access, you need to enable public access to bucket. + ![](/images/azure_blob_container_tutorial2.png) - - `Region` - here you can choose a region from the list or add a new one. To get more information click - on [`?`](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions). +6. Click on **Storage account** and on the next page click **Create**. +7. On the **Basics** tab, fill in the following fields: -
+   - **Storage account name**: will be used in CVAT to access the container.
+   - Select a region closest to you.
+   - Select **Performance** > **Standard**.
+   - Select **Locally-redundant storage (LRS)**.
+   - Click **Next: Advanced >**.
-  - [**Azure Blob Container**](https://docs.microsoft.com/en-us/azure/storage/blobs/):
   ![](/images/azure_blob_container_tutorial4.png)
-    - `Container name` - name of the cloud storage container.
+8. On the **Advanced** page, fill in the following fields:
+   - (Optional) Disable **Allow enabling public access on containers** to prohibit anonymous access to the container.
+   - Click **Next > Networking**.
-    - `Authorization type`:
![](/images/azure_blob_container_tutorial5.png)
-      - [`Account name and SAS token`](https://docs.microsoft.com/en-us/azure/cognitive-services/translator/document-translation/create-sas-tokens?tabs=blobs):
+9. On the **Networking** tab, fill in the following fields:
-        - `Account name` - storage account name.
-        - `SAS token` - is located in the `Shared access signature` section of your `Storage account` [learn more](#sas-token).
+   - If you want to change public access, enable **Public access from all networks**.
+   - Click **Next > Data protection**.
-      - [`Anonymous access`](https://docs.microsoft.com/en-us/azure/storage/blobs/anonymous-read-access-configure?tabs=portal)
-        - for anonymous access `enable blob public access` in the `configuration` section of your storage account.
-        in this case, you only need the storage account name to gain anonymous access.
-        - `Account name` - storage account name.
+   > You do not need to change anything in the other tabs unless you need a specific setup. 
+
+10. Click **Review** and wait for the data to load.
+11. Click **Create**. Deployment will start.
+12. After deployment is over, click **Go to resource**.
-  - [**Google Cloud**](https://cloud.google.com/docs):
![](/images/azure_blob_container_tutorial6.jpg)
-    - [`Bucket name`](https://cloud.google.com/storage/docs/creating-buckets) - cloud storage bucket name,
-      you can find the created bucket on the [storage browser page](https://console.cloud.google.com/storage/browser).
+### Create a container
-    - `Authorization type`:
+To create a container, do the following:
-      - [`Key file`](#create-a-service-account-and-key-file) - you can drag a key file to the area `attach a file`
-        or click on the area to select the key file through the explorer. If the environment variable
-        `GOOGLE_APPLICATION_CREDENTIALS` is specified for an environment with a deployed CVAT instance, then it will
-        be used if you do not attach the key file
-        ([more about `GOOGLE_APPLICATION_CREDENTIALS`](https://cloud.google.com/docs/authentication/getting-started#setting_the_environment_variable)).
+1. Go to the containers section and on the top menu click **+Container**.
-      - [`Anonymous access`](#anonymous-access) - for anonymous access, you need to enable public access to bucket.
+![](/images/azure_blob_container_tutorial7.jpg)
-      - `Prefix` - used to filter data from the bucket.
+2. Enter the name of the container.
+3. (Optional) In the **Public access level** drop-down, select the type of access.
+
    **Note:** This field will be inactive if you disabled **Allow enabling public access on containers**.
+4. Click **Create**.
-    - [`Project ID`](https://cloud.google.com/resource-manager/docs/creating-managing-projects) - you can find
-      the created project on the [cloud resource manager page](https://console.cloud.google.com/cloud-resource-manager),
-      note that the project name does not match the project ID.
+### Upload data
-    - `Location` - here you can choose a region from the list or add a new one. To get more information click
-      on [`?`](https://cloud.google.com/storage/docs/locations#available-locations).
+You need to upload data for annotation and the `manifest.jsonl` file.
-
+
+1. Prepare data.
+   For more information,
+   see [prepare the dataset](#prepare-the-dataset).
+2. Go to the container and click **Upload**.
+3. Click **Browse for files** and select images.
+   > Note: If the images are in a folder, specify the folder in **Advanced settings** > **Upload to folder**.
+4. Click **Upload**.
-- `Manifest` - the path to the manifest file on your cloud storage.
-  You can add multiple manifest files using the `Add manifest` button.
-  You can find on how to prepare dataset manifest [`here`](/docs/manual/advanced/dataset_manifest/).
-  If you have data on the cloud storage and don't want to download content locally, you can mount your
-  cloud storage as a share point according to [`that guide`](/docs/administration/advanced/mounting_cloud_storages/)
-  and prepare manifest for the data.
+![](/images/azure_blob_container_tutorial9.jpg)
-To publish the cloud storage, click `submit`, after which it will be available on
-the [Cloud storages page](/docs/manual/basics/cloud-storages/).
+### SAS token
-## Using AWS Data Exchange
+Use the SAS token to grant secure access to the container.
-### Subscribe to data set
+To configure the SAS token:
-You can use AWS Data Exchange to add image datasets.
-For example, consider adding a set of datasets `500 Image & Metadata Free Sample`.
-Go to [browse catalog](https://console.aws.amazon.com/dataexchange) and use the search to find
-`500 Image & Metadata Free Sample`, open the dataset page and click `continue to subscribe`,
-you will be taken to the page complete subscription request, read the information provided
-and click send subscription request to provider.
+1. Go to **Home** > **Resource groups** > your resource name > your storage account.
+2. On the left menu, click **Shared access signature**.
+3. Change the following fields:
+   - **Allowed services**: Enable **Blob**. Disable all other fields.
+   - **Allowed resource types**: Enable **Container** and **Object**. Disable all other fields. 
+   - **Allowed permissions**: Enable **Read**, **Write**, and **List**. Disable all other fields.
+   - **Start and expiry date**: Set up the start and expiry dates.
+   - **Allowed protocols**: Select **HTTPS and HTTP**.
+   - Leave all other fields with default parameters.
+4. Click **Generate SAS token** and copy the **SAS token**.
-![](/images/aws-s3_tutorial_6.jpg)
+![](/images/azure_blob_container_tutorial3.jpg)
-### Export to bucket
+### Personal use
-After that, this dataset will appear in the
-[list subscriptions](https://console.aws.amazon.com/dataexchange/home/subscriptions#/subscriptions).
-Now you need to export the dataset to `Amazon S3`.
-First, let's create a new one bucket similar to [described above](#create-a-bucket).
-To export one of the datasets to a new bucket open it `entitled data` select one of the datasets,
-select the corresponding revision and click export to Amazon S3
-(please note that if bucket and dataset are located in different regions, export fees may apply).
-In the window that appears, select the created bucket and click export.
+For personal use, you can use the **Access Key**
+from your storage account in the CVAT **SAS Token** field.
-![](/images/aws-s3_tutorial_7.jpg)
+To get the **Access Key**:
-### Prepare manifest file
+1. In the Azure Portal, go to **Security + networking** > **Access keys**.
+2. Click **Show** and copy the key.
-Now you need to prepare a manifest file. I used [AWS cli](https://aws.amazon.com/cli/) and
-[script for prepare manifest file](https://github.com/cvat-ai/cvat/tree/develop/utils/dataset_manifest).
-Perform the installation using the manual [aws-shell](https://github.com/awslabs/aws-shell),
-I used `aws-cli 1.20.49` `Python 3.7.9` `Windows 10`.
-You can configure credentials by running `aws configure`.
-You will need to enter `Access Key ID` and `Secret Access Key` as well as region. 
+![](/images/azure_blob_container_tutorial8.jpg)

-```bash
-aws configure
-Access Key ID:
-Secret Access Key:
-```

+### Attach Azure Blob Container

-Copy the content of the bucket to a folder on your computer:

+To attach storage, do the following:

-```bash
-aws s3 cp --recursive
-```

+1. Log into CVAT and in a separate tab
+   open your container page.
+2. In CVAT, on the top menu, select **Cloud storages**, and on the opened page click **+**.

-After copying the files, you can create a manifest file as described in [preapair manifest file section](/docs/manual/advanced/dataset_manifest/):

+Fill in the following fields:

-```bash
-python /utils/dataset_manifest/create.py --output-dir
-```

+

-When the manifest file is ready, you can upload it to aws s3 bucket. If you gave full write permissions
-when you created the user, run:

+| CVAT                   | Azure                                                     |
+| ---------------------- | --------------------------------------------------------- |
+| **Display name**       | Preferred display name for your storage.                  |
+| **Description**        | (Optional) Add a description of the storage.              |
+| **Provider**           | From the drop-down list, select **Azure Blob Container**. |
+| **Container name**     | Name of the cloud storage container.                      |
+| **Authorization type** | Depends on the container setup. <br>**[Account name and SAS token](https://docs.microsoft.com/en-us/azure/cognitive-services/translator/document-translation/create-sas-tokens?tabs=blobs)**: <br>• **Account name**: enter the storage account name. <br>• **SAS token**: located in the **Shared access signature** section of your [Storage account](#sas-token). <br>**[Anonymous access](https://docs.microsoft.com/en-us/azure/storage/blobs/anonymous-read-access-configure?tabs=portal)**: for anonymous access, **Allow enabling public access on containers** must be enabled. |
+| **Manifests**          | Click **+ Add manifest** and enter the name of the manifest file with an extension. For example: `manifest.jsonl`. |

-```bash
-aws s3 cp /manifest.jsonl
-```

+

-If you have given read-only permissions, use the download through the browser, click upload,
-drag the manifest file to the page and click upload.

+After filling in all the fields, click **Submit**.

-![](/images/aws-s3_tutorial_5.jpg)

-Now you can [attach new cloud storage](#attach-new-cloud-storage) using the dataset `500 Image & Metadata Free Sample`.

+

+## Prepare the dataset

+For example, the dataset is [The Oxford-IIIT Pet Dataset](https://www.robots.ox.ac.uk/~vgg/data/pets/):

+1. Download the [archive with images](https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz).
+2. Unpack the archive into the prepared folder.
+3. Create a manifest.
+   For more information, see [**Dataset manifest**](/docs/manual/advanced/dataset_manifest/):

+```bash
+python <cvat repository>/utils/dataset_manifest/create.py --output-dir <output directory> <data path>
+```
diff --git a/site/content/en/images/azure_blob_container_tutorial2.jpg b/site/content/en/images/azure_blob_container_tutorial2.jpg
deleted file mode 100644
index ed7ffc8f2075..000000000000
Binary files a/site/content/en/images/azure_blob_container_tutorial2.jpg and /dev/null differ
diff --git a/site/content/en/images/azure_blob_container_tutorial2.png b/site/content/en/images/azure_blob_container_tutorial2.png
new file mode 100644
index 000000000000..6a769a88cf65
Binary files /dev/null and b/site/content/en/images/azure_blob_container_tutorial2.png differ
diff --git a/site/content/en/images/azure_blob_container_tutorial5.jpg b/site/content/en/images/azure_blob_container_tutorial5.jpg
deleted file mode 100644
index db9606458f31..000000000000
Binary files a/site/content/en/images/azure_blob_container_tutorial5.jpg and /dev/null differ
diff --git a/site/content/en/images/azure_blob_container_tutorial5.png b/site/content/en/images/azure_blob_container_tutorial5.png
new file mode 100644
index 000000000000..cf40f9b71aeb
Binary files /dev/null and b/site/content/en/images/azure_blob_container_tutorial5.png differ
diff --git a/site/content/en/images/azure_blob_container_tutorial8.jpg b/site/content/en/images/azure_blob_container_tutorial8.jpg
index 3cebb4a45348..27f94305aa1b 100644
Binary files a/site/content/en/images/azure_blob_container_tutorial8.jpg and b/site/content/en/images/azure_blob_container_tutorial8.jpg differ
diff --git a/site/content/en/images/google_bucket.png b/site/content/en/images/google_bucket.png
new file mode 100644
index 000000000000..c6dd2c62c427
Binary files /dev/null and b/site/content/en/images/google_bucket.png differ
diff --git a/site/content/en/images/google_cloud_storage_tutorial1.jpg b/site/content/en/images/google_cloud_storage_tutorial1.jpg
deleted file mode 100644
index 83d3aa57e561..000000000000
Binary files a/site/content/en/images/google_cloud_storage_tutorial1.jpg and /dev/null differ