You need to agree to share your contact information to access the SceneDAPR dataset. The Dataset is licensed for non-commercial scientific research purposes. Any action that violates the license agreement is prosecutable by law. Please send an email to jiwonkang@g.skku.edu with the following form. If you do not receive a feedback in 7 days, please contact jinyounghan@skku.edu.
- Name:
- Email:
- Affiliation:
- Short description of your research:
git clone https://github.com/DSAIL-SKKU/SceneDAPR.git
cd SceneDAPR
pip install -r requirements.txt
- Data description of scene sketches across the age groups
Age Group (age) | Children & Adolescents (8-18) | Adults (19-59) | Seniors (60-96) | All |
---|---|---|---|---|
Total | 462 (33.02%) | 650 (46.46%) | 287 (20.51%) | 1,399 (100%) |
- Data description of objects appeared in the collected scene sketches across the six DAPR object categories among the total number of 148 object categories
DAPR object | Person | Rain | Umbrella | Cloud | Puddle | Lightning | All |
---|---|---|---|---|---|---|---|
Total | 1,723 | 1,498 | 826 | 621 | 468 | 105 | 5,239 |
SceneDAPR consists of 1,399 scene-level free-hand drawings along with segmentation labels of 6,420 object-level sketches. Furthermore, our dataset provides the sequential stroke order of the sketches and metadata about the drawing (e.g., drawing time) and its drawer (e.g., age and gender)
dataset/collected
: Scene-level free-hand drawings of the diverse age groupsdataset/synthesized
: This dataset is used for data augmentation. The synthesized sketches are created based on layout template, including the locations of the objects from the collected scene sketches. We replace images with other alternative drawings for the given objects.
The SceneDAPR dataset is structured as follows. All the files are provided with a canvas size of 757x535 (hxw).
- Data Desctipion (
dataset/final-dataset-dist.csv
)- image ID
- age group (children & adolescents, adults, seniors)
- age
- dataset (train & test split)
- gender
- self-reported description of the drawing
- mainCharacter: Who is the main character in the picture? (Optional)
- emotion: How does the main character in the picture feel?
- activity: What is the person in the picture doing?
- situation: Why is the person in the picture standing in the rain?
- expectation: What will happen to the person in the picture in the future?
- need: What does the person in the picture need?
- drawing time
- stroke count
- stroke length
- Pixel-based: a image file (
.png
) of the scene-level sketchdataset/collected/png
dataset/synthesized/creation/png
- Stroke-based: a vector file (
.svg
) of the scene-level and object-level sketch. The Ramer–Douglas–Peucker algorithm with an epsilon value of 0.5 used to simplify all strokes-
Scene-level:
dataset/collected/svg-scene
-
Object-level:
dataset/collected/svg-object
-
(Tip) To convert
.svg
to an array of the coordinate (x,y), you can use asvg_to_coordinate
function insrc/preprocessing.py
-
The proposed SceneDAPR supports the object detection task for coarse-grained analysis.
pip install -r ultralytics/requirements.txt
- Complete the root path in the
dataset/yolo-exp-All.yaml
...
path: /path/to/dataset # dataset root dir
...
# Example with YOLOv8-nano
yolo train model=yolov8n.pt data="dataset/yolo-exp-All.yaml" epochs=300 batch=16
- Other Options: If you want to train the model to investigate the effect on age-biased datasets, you can use as following options instead of
-data dataset/yolo-exp-All.yaml
.-data dataset/yolo-exp-children-Adolescents.yaml
(Children&Adolescent-biased)-data dataset/yolo-exp-adults.yaml
(Adult-biased)-data dataset/yolo-exp-seniors.yaml
(Senior-biased)
yolo predict model=weights/yolov8-all.pt source=inference_test.jpg
- Other Options: You can use as following options to inference with your data instead of
-source inference_test.jpg
.-source path/
# for directory-source path/*.jpg
# for glob
BIBTEX
@inproceedings{kang2024scenedapr,
title={SceneDAPR: A Scene-Level Free-Hand Drawing Dataset for Web-based Psychological Drawing Assessment},
author={Kang, Jiwon and Kim, Jiwon and Yang, Migyeong and Park, Chaehee and Kim, Taeeun and Song, Hayeon and Han, Jinyoung},
booktitle={Proceedings of the ACM on Web Conference 2024},
pages={4630--4641},
year={2024}
}
-
This work is made available under Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0)
-
If you want to use the SceneDAPR for psychological tasks, we intend to use the dataset to develop an automated system with a human-in-the-loop feedback system. We do not intend to substitute and exclude human experts in psychological tasks.
This repository is maintained by Jiwon Kang and Jiwon Kim. Any feedback, extensions & suggestions are welcome! Please send an email to jiwonkang@g.skku.edu.
Synthesizing for more drawings
- Download external datasets for the objects (rain, person, umbrella, lightning, pool, and cloud). The external drawings and their download links the SceneDAPR dataset used as follows:
- Person, Umbrella, Cloud: TU-Berlin in a
.png
format [download] - Rain, Umbrella, Puddle, Lightning, Cloud: QuickDraw! in a
.ndjson
format [download]
- Complete the cofiguration of
dataset_paths
andcategory_sample_paths
inconfig.py
dataset_paths = {
"quickdraw": "/path/to/ndjson/",
"tu-berlin": "/path/to/png/",
}
category_sample_paths = {
"rain": [{
"format": "ndjson", # Default: ndjson (Available format: png, jpg, and ndjson)
"dataset": "quickdraw", # Default: quickdraw
"path": # Default: dataset_paths["quickdraw"] + "rain.ndjson"
}
],
# ...Complete the configuration of other categories...
}
- Run
scene_augmentator.py
for synthesizing more drawings based on the collected drawings. Then, you can obtain (i) the synthesized images (.png
) and (ii) the corresponding COCO format file (.json
)
python scene_augmentator.py -N 10 \
--save_dir dataset/augmentation/png/ \
--mask_info_dir dataset/230530-train_6_classes-1399.json \
--save_info_dir dataset/augmentation/230530-train_6_classes-1399-aug.json \
--png_save_dir dataset/augmentation/object_png/
- (Optional) If you want to conduct experiments with a YOLO model, then run
coco2yolo.py
to convert the COCO format file for compatibility
python coco2yolo.py --img_folder dataset/png/ \
--yolo_folder dataset/yolo/ \
--coco_file dataset/230530-train_6_classes-1399.json \
--split_file dataset/final-dataset-230530-dist-1399.csv
Create your own dataset
- This repository contains a web-based drawing collection system used in the paper
- Run a client
cd collection_system/client
npm i # install dependencies
npm start # dev
- Run a server
cd collection_system/server
prisma db push # init database
uvicorn main:app --reload