PT - JOURNAL ARTICLE
AU - Jeong, Yejin
AU - Schaekermann, Mike
AU - Lin, Steven
TI - Crowdsourcing Dermatology Images with Google Search Ads: Creating a Real-World Skin Condition Dataset for AI Development
AID - 10.1370/afm.22.s1.6186
DP - 2024 Nov 20
TA - The Annals of Family Medicine
PG - 6186
VI - 22
IP - Supplement 1
4099 - http://www.annfammed.org/content/22/Supplement_1/6186.short
4100 - http://www.annfammed.org/content/22/Supplement_1/6186.full
SO - Ann Fam Med 2024 Nov 20; 22
AB - Context: Traditional clinical health datasets often fail to represent the diversity and breadth of real-world disease, influencing research and AI tools developed on those datasets. Dermatology, easily documentable through patient-contributed images, presents an ideal case for innovative dataset creation. Objective: To develop a scalable, representative dermatology dataset using crowdsourcing via internet search ads, enhancing medical education and AI application development. Study Design and Analysis: We employed Google Search ads to gather dermatology images from the public, with informed consent, between March and November 2023.
Images were curated, de-identified, and labeled by dermatologists, providing a dataset of 10,408 images from 5,033 contributors across the United States. Setting or Dataset: The aggregated dataset includes diverse skin conditions and demographic data, and is now publicly accessible on GitHub. Population Studied: Internet users across the United States, representing various demographics and skin types, contributed to the dataset. Intervention/Instrument: Google Search ads targeted individuals searching for skin-related terms, inviting them to contribute images and related information via a web platform. Outcome Measures: Measures included the volume of contributions, demographic diversity, and the diagnostic usability of images as evaluated by dermatologists. Results: The study achieved a median of 22 submissions per day, with a significant representation of diverse skin conditions and demographics. Over 97.5% of contributions were usable, with high dermatologist confidence correlated with the completeness of accompanying data. Conclusions: Crowdsourcing via search ads effectively generates diverse, representative dermatological datasets. The SCIN dataset can significantly enhance dermatological research, education, and AI tool accuracy, particularly in underrepresented communities.