Data Science Education and Training
Big Data and Social Science: A Practical Guide to Methods and Tools. Ian Foster, Rayid Ghani, Ron Jarmin, Frauke Kreuter, and Julia Lane. Chapman and Hall/CRC Press, 2016. (Second edition 2020)
An Experience-Centered Approach to Training Effective Data Scientists. Kit T Rodolfa, Adolfo De Unanue, Matt Gee, and Rayid Ghani. Big Data Journal. 2019.
Taking our Medicine: Standardizing Data Science Education With Practice at the Core. Kit Rodolfa and Rayid Ghani. Commentary, Harvard Data Science Review, 2021.
Change Through Data: A Data Analytics Training Program for Government Employees. Frauke Kreuter, Rayid Ghani, Julia Lane. Harvard Data Science Review, 1(2). 2019.
Machine Learning (Book Chapter). Rayid Ghani and Malte Schierholz. In Big Data and Social Science: A Practical Guide to Methods and Tools. Chapman and Hall/CRC Press, 2016.
Bias, Fairness, and Equity in AI/Machine Learning Systems
Toward Operationalizing Pipeline-aware ML Fairness: A Research Agenda for Developing Practical Guidelines and Tools. Emily Black, Rakshit Naidu, Rayid Ghani, Kit Rodolfa, Daniel Ho, Hoda Heidari. EAAMO ’23: Equity and Access in Algorithms, Mechanisms, and Optimization, Boston, MA, USA, October 2023.
Empirical observation of negligible fairness–accuracy trade-offs in machine learning for public policy. Kit T. Rodolfa, Hemank Lamba, Rayid Ghani. Nature Machine Intelligence 3, 896–904 (2021).
Bias and Fairness (in Machine Learning) Book Chapter. Kit T. Rodolfa, Pedro Saleiro, Rayid Ghani. In Big Data and Social Science: A Practical Guide to Methods and Tools. Chapman and Hall/CRC Press, 2020.
An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings. Hemank Lamba, Kit T. Rodolfa, Rayid Ghani. ACM SIGKDD Explorations, 2021.
Predictive Fairness to Reduce Misdemeanor Recidivism Through Social Service Interventions. K. Rodolfa, E. Salomon, L. Haynes, I. Mendieta, J. Larson, R. Ghani. Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*) 2020.
Aequitas: A Bias and Fairness Audit Toolkit. Pedro Saleiro, Benedict Kuester, Loren Hinkson, Jesse London, Abby Stevens, Ari Anisfeld, Kit T. Rodolfa, Rayid Ghani.
Explainability and Human-AI Interaction
Explainable Machine Learning for Public Policy: Use Cases, Gaps, and Research Directions. Kasun Amarasinghe, Kit Rodolfa, Hemank Lamba, Rayid Ghani. Data and Policy (2023). Cambridge University Press.
Machine Learning Informed Decision-Making with Interpreted Model’s Outputs: A Field Intervention. Leid Zejnilovic, Susana Lavado, Carlos Soares, Íñigo Martínez De Rituerto De Troya, Andrew Bell, Rayid Ghani. Academy of Management Proceedings, 2021.
On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods. Kasun Amarasinghe, Kit T Rodolfa, Sérgio Jesus, Valerie Chen, Vladimir Balayan, Pedro Saleiro, Pedro Bizarro, Ameet Talwalkar, Rayid Ghani. arXiv.
Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness.
Artificial Intelligence for Social Good. Gregory D. Hager, Ann Drobnis, Fei Fang, Rayid Ghani, Amy Greenwald, Terah Lyons, David C. Parkes, Jason Schultz, Suchi Saria, Stephen F. Smith, and Milind Tambe. Computing Community Consortium. March 2017.
Case Studies: Applying Machine Learning/Data Science/AI to tackle Social and Policy Problems
Bandit Data-Driven Optimization. Zheyuan Ryan Shi, Zhiwei Steven Wu, Rayid Ghani, Fei Fang. AAAI-22: the 36th AAAI Conference on Artificial Intelligence, 2022.
A recommendation and risk classification system for connecting rough sleepers to essential outreach services. Harrison Wilde, Lucia L. Chen, Austin Nguyen, Zoe Kimpel, Joshua Sidgwick, Adolfo De Unanue, Davide Veronese, Bilal Mateen, Rayid Ghani, and Sebastian Vollmer. Data & Policy 3 (2021).
Validation of a Machine Learning Model to Predict Childhood Lead Poisoning. JAMA Netw Open. 2020;3(9):e2012734. doi:10.1001/jamanetworkopen.2020.12734
Predictive Analytics for Retention in Care in an Urban HIV Clinic. Arthi Ramachandran, Avishek Kumar, Hannes Koenig, Adolfo De Unanue, Christina Sung, Joe Walsh, John Schneider, Rayid Ghani & Jessica P. Ridgway. Nature Scientific Reports 10, 6421 (2020). https://doi.org/10.1038/s41598-020-62729-x
Using Machine Learning to Help Vulnerable Tenants in New York City. Teng Ye, Rebecca Johnson, Samantha Fu, Jerica Copeny, Bridgit Donnelly, Alex Freeman, Mirian Lima, Joe Walsh, and Rayid Ghani. Proceedings of the 2nd ACM SIGCAS Conference on Computing and Sustainable Societies (COMPASS ’19). ACM, New York, NY, USA, 248-258.
Deploying Machine Learning Models for Public Policy: A Framework. Klaus Ackermann, Joe Walsh, Adolfo De Unánue, Hareem Naveed, Andrea Navarrete Rivera, Sun-Joo Lee, Jason Bennett, Michael Defoe, Crystal Cody, Lauren Haynes and Rayid Ghani. 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2018).
Reducing Incarceration through Prioritized Interventions. Matthew J. Bauman, Kate Boxer, Tzu-Yun Lin, Erika Salomon, Hareem Naveed, Lauren Haynes, Joe Walsh, Jen Helsby, Steve Yoder, Robert Sullivan, Rayid Ghani. ACM SIGCAS Conference on Computing and Sustainable Societies, 2018.
Improving Government Response to Citizen Requests Online. Garren Gaut, Andrea Navarrete, Laila Wahedi, Paul van der Boor, Adolfo de Unánue, Jorge Díaz, Eduardo Clark, Rayid Ghani. ACM SIGCAS Conference on Computing and Sustainable Societies, 2018.
Using Machine Learning to Assess the Risk of and Prevent Water Main Breaks. Avishek Kumar, Syed Ali Asad Rizvi, Benjamin Brooks, Ali Vanderveld, Kevin Hayes Wilson, Chad Kenney, Adria Finch, Andrew Maxwell, Sam Edelstein, Joe Zuckerbraun and Rayid Ghani. 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2018).
Machine Learning for Social Services: A case study of prenatal case management in Illinois. Ian Pan, Laura B. Nolan, Rashida R. Brown, Romana Khan, Paul van der Boor, Daniel G. Harris, Rayid Ghani. American Journal of Public Health, 2017.
Early Intervention Systems – Predicting Adverse Interactions Between Police and the Public. Jennifer Helsby, Samuel Carton, Kenneth Joseph, Ayesha Mahmud, Youngsoo Park, Andrea Navarrete, Klaus Ackermann, Joe Walsh, Lauren Haynes, Crystal Cody, Major Estella Patterson, Rayid Ghani. Criminal Justice Policy Review, 2017.
Building Better Early Intervention Systems. Crystal Cody, Estella Patterson, Kerr Putney, Jennifer Helsby, Joe Walsh, Lauren Haynes, and Rayid Ghani. Police Chief Magazine. International Association of Chiefs of Police. 2016.
Detecting fraud, corruption, and collusion in international development contracts. Emily Grace, Ankit Rai, Elissa Redmiles, Rayid Ghani. 2016 IEEE International Conference on Big Data.
The Legislative Influence Detector: Finding Text Reuse in State Legislation. Burgess et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
Identifying Police Officers at Risk of Adverse Events. Carton et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
Designing Policy Recommendations to Reduce Home Abandonment in Mexico. Ackermann et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
Identifying Earmarks in Congressional Bills. Khabsa et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes. Himabindu Lakkaraju, Everaldo Aguiar, Carl Shan, David Miller, Nasir Bhanpuri, Rayid Ghani, Kecia Addison. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
Predictive Modeling for Public Health: Preventing Childhood Lead Poisoning. Eric Potash, Joe Brew, Alexander Loewi, Subhabrata Majumdar, Andrew Reece, Joe Walsh, Eric Rozier, Emile Jorgenson, Raed Mansour, Rayid Ghani. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
Early Prediction of Code Blue Using Electronic Medical Records. Sriram Somanchi, Samrachana Adhikari, Allen Lin, Elena Eneva, and Rayid Ghani. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
Who, When, and Why: A Machine Learning Approach to Prioritizing Students at Risk of not Graduating High School on Time. Everaldo Aguiar, Himabindu Lakkaraju, Nasir Bhanpuri, David Miller, Ben Yuhas, Kecia Addison, Shihching Liu, Marilyn Powell, and Rayid Ghani. 5th International Learning Analytics and Knowledge (LAK) Conference 2015.
Early Code Blue Prediction Using Patient Medical Records. Sriram Somanchi, Samrachana Adhikari, Allen Lin, Elena Eneva, and Rayid Ghani. Workshop on Machine Learning for Clinical Data Analysis and Healthcare, held with NIPS 2013.