Create CARES.md

AIResponsibly · Jun 13, 2024 · a8eedda
1 parent 558d7ad
Showing 1 changed file with 36 additions and 0 deletions.
diff --git a/summaries/safety/CARES.md b/summaries/safety/CARES.md
@@ -0,0 +1,36 @@
+# CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
+- **Published**: arXiv, 2024
+- **Link**: [arXiv:2406.06007v1](https://arxiv.org/abs/2406.06007v1)
+- **Summary**: CARES evaluates the trustworthiness of medical large vision language models (Med-LVLMs) across five dimensions: trustfulness, fairness, safety, privacy, and robustness.
+
+### Problem
+- The trustworthiness of Med-LVLMs is largely unverified, which poses risks in medical applications:
+  - Factual inaccuracies in medical diagnoses.
+  - Overconfidence in generated diagnoses.
+  - Privacy breaches.
+  - Health disparities across demographic groups.
+
+### Contributions
+- Introduces the CARES benchmark for evaluating the trustworthiness of Med-LVLMs.
+- Assesses models across five critical dimensions: trustfulness, fairness, safety, privacy, and robustness.
+- Releases the benchmark and code publicly.
+
+### Method
+- The benchmark is built from seven medical multimodal and image classification datasets.
+- It contains 18K images and 41K question-answer pairs in closed-ended and open-ended formats.
+- Models are evaluated on each of the five dimensions: trustfulness, fairness, safety, privacy, and robustness.
+
+### Result
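+- Models frequently produce factual inaccuracies.
+- Uncertainty estimation is poor, and models tend to be overconfident.
+- Performance differs across demographic groups.
+- Models are vulnerable to attacks.
+- Models leak private information.
+- Out-of-distribution (OOD) samples are handled inadequately.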
+
+### Conclusion
+- The evaluated Med-LVLMs show significant trustworthiness issues and cannot yet be considered reliable.
+- CARES aims to drive standardization and development of more reliable Med-LVLMs.
+
+### Reference
+- Peng Xia, Ze Chen, Juanxi Tian, Yangrui Gong, et al. "CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models." arXiv preprint arXiv:2406.06007v1, 2024.
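
The Result bullets on performance disparities and poor uncertainty estimation name quantities that can be measured fairly simply. The sketch below is one hypothetical way to do so on toy data; it is not the CARES code, and the function names (`accuracy_by_group`, `accuracy_gap`, `overconfident_ratio`) and record layouts are assumptions made for illustration only.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for an iterable of (group, is_correct) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        hits[group] += int(is_correct)
    return {group: hits[group] / totals[group] for group in totals}


def accuracy_gap(records):
    """Largest difference in accuracy between any two demographic groups."""
    acc = accuracy_by_group(records)
    return max(acc.values()) - min(acc.values())


def overconfident_ratio(answers):
    """Fraction of (is_correct, is_confident) answers that are wrong yet confident."""
    if not answers:
        return 0.0
    wrong_and_sure = sum(1 for ok, sure in answers if not ok and sure)
    return wrong_and_sure / len(answers)


# Toy data (hypothetical): group label plus whether the model answered correctly.
fairness_records = [
    ("female", True), ("female", False),
    ("male", True), ("male", True),
]
# Toy data (hypothetical): correctness plus the model's self-reported confidence.
confidence_records = [(True, True), (False, True), (False, False), (True, False)]

print(accuracy_by_group(fairness_records))
print(accuracy_gap(fairness_records))
print(overconfident_ratio(confidence_records))
```

A larger accuracy gap suggests a larger demographic disparity, and a larger overconfident ratio suggests weaker uncertainty estimation. The paper's own metric definitions should be checked before comparing numbers with it.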