
2. Design and Planning

heng2j edited this page Oct 15, 2018 · 3 revisions

Initial Design and Planning for Deep Images Hub (DIH)

Requirements from Image Suppliers:

  • As an Image Supplier, I am able to upload a batch of images with just a single label.

Tasks:

  • Simulate batch image submissions by copying source images from S3 buckets to DIH's S3 buckets

  • DIH shall have a mechanism to verify whether user-submitted labels exist in DIH's database.

    • If the label doesn't exist, inform the user.
  • DIH shall have a mechanism to link the user-supplied label into its own branch of the category hierarchy

    • Create backend data relationships to organize and allocate data
  • If users opt in to share the location from which they uploaded the images, DIH shall be able to record it.

    • Simulate user geolocation info
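
The supplier-side tasks above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database in place of DIH's real database, and the `labels` table, its columns, and the function names (`make_demo_db`, `label_exists`, `simulate_geolocation`, `accept_batch`) are all hypothetical. The S3 copy step is left out.

```python
import random
import sqlite3


def make_demo_db():
    """Build a tiny in-memory label table (hypothetical schema, not DIH's real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE labels (label_name TEXT PRIMARY KEY, parent_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO labels VALUES (?, ?)",
        [("food", None), ("soft_drink", "food")],
    )
    return conn


def label_exists(conn, label_name):
    """Return True if the submitted label is already known."""
    row = conn.execute(
        "SELECT 1 FROM labels WHERE label_name = ?", (label_name,)
    ).fetchone()
    return row is not None


def simulate_geolocation(rng):
    """Fake a user location as a random latitude/longitude pair."""
    return {"lat": rng.uniform(-90.0, 90.0), "lon": rng.uniform(-180.0, 180.0)}


def accept_batch(conn, label_name, share_location, rng):
    """Verify the label, then record a simulated location only if the user opted in."""
    if not label_exists(conn, label_name):
        return {
            "accepted": False,
            "message": f"Label '{label_name}' does not exist in DIH.",
        }
    location = simulate_geolocation(rng) if share_location else None
    return {"accepted": True, "label": label_name, "location": location}
```

A call such as `accept_batch(make_demo_db(), "soft_drink", True, random.Random(0))` would accept the batch and attach a simulated location, while an unknown label returns a message for the user instead.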

Requirements from Data Scientists:

  • As a Data Scientist, when I visit Deep Images Hub, I should see how many categories (labels) of images I can download.
  • As a Data Scientist, when I visit Deep Images Hub, I should see the latest batches of images that were just uploaded.
  • As a Data Scientist, I should be able to select image labels and download the images by category.
    • For example, if I choose the category Food, I should be able to download all the images about Food.
  • As a Data Scientist, I should be able to request a new label if it is not already in DIH.
  • As a Data Scientist, I should be able to request training of a baseline model on the image labels of my choice.
    • And if there are not enough images (more than 500 are needed) for certain labels, I can still enqueue my training request.
  • As a Data Scientist, I should be able to get the download link for my model once it is trained.
    • The model training summary, including the final accuracy scores and losses, should be reported as well.
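
The enqueueing rule for training requests might look like the sketch below. It assumes that "above 500" means a label needs strictly more than 500 images before training can start. The constant `MIN_IMAGES_PER_LABEL` and the function `triage_training_request` are hypothetical names introduced for illustration.

```python
# Assumption: a label is trainable only once it has MORE than 500 images.
MIN_IMAGES_PER_LABEL = 500


def triage_training_request(requested_labels, image_counts):
    """Decide whether a training request can run now or must wait in the queue.

    Returns a (status, short_labels) pair, where short_labels maps each label
    that is still below the threshold to its current image count.
    """
    short_labels = {}
    for label in requested_labels:
        count = image_counts.get(label, 0)  # unknown labels count as zero images
        if count <= MIN_IMAGES_PER_LABEL:
            short_labels[label] = count
    status = "queued" if short_labels else "ready"
    return status, short_labels
```

Either way the request is accepted; a "queued" result only means training waits until the short labels have collected enough images.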

Tasks:

  • The DIH webpage shall display all the labels that have images, and the labels should be groupable by their categories
    • The label name and the number of images under the label shall be displayed
  • The DIH webpage shall continuously display the latest image batch submissions with the label name, the number of images, and where they came from
  • DIH shall allow users to download a batch of images by the images' parent category
    • Build label relationships and use hierarchical, recursive SQL queries to fulfill this request
  • DIH shall allow users to add new labels, provided they also supply the immediate parent label
    • For example, LaCorix's immediate parent label will be soft_drink
  • DIH shall have a user request watch list to keep track of user requests
    • A scheduled workflow will be needed to regularly check whether the requirements are fulfilled
  • DIH should keep track of the model training results and display them on the model list web page
    • An email with download links and a brief summary of the model training report should be sent to the user once model training is done.
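
As a rough illustration of the label hierarchy tasks above, the sketch below stores each label with its immediate parent and uses a recursive common table expression to collect every image under a parent category. It runs on an in-memory SQLite database purely for demonstration; the `labels` and `images` tables, their columns, and the function names are assumptions, not DIH's actual schema.

```python
import sqlite3

# Recursive CTE: start at the chosen label, then repeatedly add its children.
SUBTREE_IMAGES_QUERY = """
WITH RECURSIVE subtree(label_name) AS (
    SELECT label_name FROM labels WHERE label_name = ?
    UNION ALL
    SELECT l.label_name
    FROM labels AS l
    JOIN subtree AS s ON l.parent_name = s.label_name
)
SELECT i.image_key
FROM images AS i
JOIN subtree AS s ON i.label_name = s.label_name
ORDER BY i.image_key
"""


def build_demo_db():
    """Create hypothetical labels/images tables with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE labels (
            label_name  TEXT PRIMARY KEY,
            parent_name TEXT REFERENCES labels(label_name)
        );
        CREATE TABLE images (
            image_key  TEXT PRIMARY KEY,
            label_name TEXT NOT NULL REFERENCES labels(label_name)
        );
        """
    )
    conn.executemany(
        "INSERT INTO labels VALUES (?, ?)",
        [("food", None), ("soft_drink", "food"), ("vehicle", None)],
    )
    conn.executemany(
        "INSERT INTO images VALUES (?, ?)",
        [
            ("food/001.jpg", "food"),
            ("soft_drink/001.jpg", "soft_drink"),
            ("vehicle/001.jpg", "vehicle"),
        ],
    )
    return conn


def add_label(conn, label_name, parent_name):
    """Add a new label only if its immediate parent label already exists."""
    parent = conn.execute(
        "SELECT 1 FROM labels WHERE label_name = ?", (parent_name,)
    ).fetchone()
    if parent is None:
        raise ValueError(f"Parent label '{parent_name}' does not exist")
    conn.execute("INSERT INTO labels VALUES (?, ?)", (label_name, parent_name))


def images_under(conn, root_label):
    """Return the keys of all images labeled root_label or any of its descendants."""
    rows = conn.execute(SUBTREE_IMAGES_QUERY, (root_label,)).fetchall()
    return [row[0] for row in rows]
```

With this layout, adding LaCorix under soft_drink and then requesting the food category would return images from food, soft_drink, and LaCorix, while images under vehicle stay out of the result.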