[Edited Books]
Big Data and Social Science: A Practical Guide to Methods and Tools.
Ian Foster, Rayid Ghani, Ron Jarmin, Frauke Kreuter, and Julia Lane.
Chapman and Hall/CRC Press, 2016.
Big Data and Social Science: A Practical Guide to Methods and Tools.
Ian Foster, Rayid Ghani, Ron Jarmin, Frauke Kreuter, and Julia Lane.
Chapman and Hall/CRC Press, Second Edition. 2020
Data Mining for Business Applications.
Editors: Carlos Soares, Rayid Ghani. IOS Press, 2010.
[Book Chapters]
Machine Learning.
Rayid Ghani and Malte Schierholz. In Big Data and Social Science: A Practical Guide to Methods and Tools. Chapman and Hall/CRC Press, 2016.
Bias and Fairness in Machine Learning. K Rodolfa, P Saleiro, R Ghani. In Big Data and Social Science: A Practical Guide to Methods and Tools. Chapman and Hall/CRC Press, 2020.
Machine Learning and Semantic Technologies for Enterprise Knowledge Management.
Rayid Ghani and Divna Djordjevic.
Book Chapter, Context and Semantics in Knowledge Management. Springer 2011.
Data Mining for Consumer Modeling and Personalized Promotions. [pdf] Rayid Ghani, Chad Cumby, Andrew Fano, and Marko Krema.
Book Chapter – Data Mining Methods and Applications. Kenneth D. Lawrence, Stephan Kudyba, Ronald K. Klimberg (Eds.). Auerbach Publications. 2008
Extracting and using Attribute-Value pairs from product descriptions on the web. [pdf] Katharina Probst, Rayid Ghani , Yan Liu, Marko Krema, and Andrew Fano.
Book chapter – Web Mining. 2007
[Edited Proceedings]
Big Data for Social Good. Special Issue of the Big Data Journal, 2015.
Charlie Catlett and Rayid Ghani, Editors.
Volume: 3 Issue 1: March 17, 2015
Proceedings of the KDD Workshop on Data Science for Social Good.
Arindam Banerjee, Lise Getoor, Rayid Ghani, Claire Monteleoni, Matt Rattigan (Eds).
KDD 2014 Workshop.
The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014.
Sofus Macskassy, Claudia Perlich, Jure Leskovec, Wei Wang, Rayid Ghani, Prem Melville (Eds.)
New York, NY, USA, 2014.
The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013.
Inderjit S. Dhillon, Yehuda Koren, Rayid Ghani, Ted E. Senator, Paul Bradley, Rajesh Parekh, Jingrui He, Robert L. Grossman, Ramasamy Uthurusamy (Eds.)
Chicago, IL, USA, August 11-14, 2013. ACM 2013, ISBN 978-1-4503-2174-
[Journal, Conference, and Workshop Papers]
Bandit Data-Driven Optimization. Zheyuan Ryan Shi, Zhiwei Steven Wu, Rayid Ghani, Fei Fang. AAAI-22: the 36th AAAI Conference on Artificial Intelligence. 2022.
Explainable machine learning for public policy: Use cases, gaps, and research directions. K Amarasinghe, KT Rodolfa, H Lamba, R Ghani. Data & Policy 5, e5. 2023
Empirical observation of negligible fairness–accuracy trade-offs in machine learning for public policy. K Rodolfa, H Lamba, R Ghani
Nature Machine Intelligence 3 (10), 896-904. 2021.
An empirical comparison of bias reduction methods on real-world problems in high-stakes policy settings. H Lamba, KT Rodolfa, R Ghani. ACM SIGKDD Explorations Newsletter 23 (1), 69-85. 2021
Taking our medicine: Standardizing data science education with practice at the core. K Rodolfa, R Ghani. Harvard Data Science Review 3 (1). 2021
Machine learning informed decision-making with interpreted model’s outputs: A field intervention. L Zejnilovic, S Lavado, C Soares, Í Martínez De Rituerto De Troya, A Bell, et al. Academy of Management Proceedings 2021 (1), 15424. 2021
A recommendation and risk classification system for connecting rough sleepers to essential outreach services. H Wilde, LL Chen, A Nguyen, Z Kimpel, J Sidgwick, A De Unanue, et al. Data & Policy 3, e2. 2021
Bias and fairness. K Rodolfa, P Saleiro, R Ghani. Big data and social science, 281-312. 2020
Mapping new informal settlements using machine learning and time series satellite images: An application in the Venezuelan migration crisis. I Tingzon, N Dejito, RA Flores, R De Guzman, L Carvajal, KZ Erazo, et al. 2020 IEEE/ITU International Conference on Artificial Intelligence for Good. 2020
Validation of a machine learning model to predict childhood lead poisoning. E Potash, R Ghani, J Walsh, E Jorgensen, C Lohff, N Prachand, et al. JAMA Network Open 3 (9), e2012734. 2020
Predictive analytics for retention in care in an urban HIV clinic. A Ramachandran, A Kumar, H Koenig, A De Unanue, C Sung, J Walsh, …
Scientific Reports 10 (1), 6421. 2020.
Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness. S Vollmer, BA Mateen, G Bohner, FJ Király, R Ghani, P Jonsson, et al.
BMJ 368. 2020.
Predictive Fairness to Reduce Misdemeanor Recidivism Through Social Service Interventions. K. Rodolfa; E. Salomon; L. Haynes; I. Mendieta; J. Larson; R. Ghani. Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*) 2020.
Solve for good: A data science for social good marketplace. R Ghani, L Green, A Bengoa, M Shah. ACM SIGKDD Explorations Newsletter 21 (2), 3-5. 2020.
An Experience-Centered Approach to Training Effective Data Scientists. Kit T Rodolfa, Adolfo De Unanue, Matt Gee, and Rayid Ghani. Big Data Journal. 2019.
Using Machine Learning to Help Vulnerable Tenants in New York City. Teng Ye, Rebecca Johnson, Samantha Fu, Jerica Copeny, Bridgit Donnelly, Alex Freeman, Mirian Lima, Joe Walsh, and Rayid Ghani. Proceedings of the 2nd ACM SIGCAS Conference on Computing and Sustainable Societies (COMPASS ’19). ACM, New York, NY, USA, 248-258.
Deploying Machine Learning Models for Public Policy: A Framework. Klaus Ackermann, Joe Walsh, Adolfo De Unánue, Hareem Naveed, Andrea Navarrete Rivera, Sun-Joo Lee, Jason Bennett, Michael Defoe, Crystal Cody, Lauren Haynes and Rayid Ghani. 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2018).
Reducing Incarceration through Prioritized Interventions. Matthew J. Bauman, Kate Boxer, Tzu-Yun Lin, Erika Salomon, Hareem Naveed, Lauren Haynes, Joe Walsh, Jen Helsby, Steve Yoder, Robert Sullivan, Rayid Ghani. ACM SIGCAS Conference on Computing and Sustainable Societies, 2018.
Improving Government Response to Citizen Requests Online. Garren Gaut, Andrea Navarrete, Laila Wahedi, Paul van der Boor, Adolfo de Unánue, Jorge Díaz, Eduardo Clark, Rayid Ghani. ACM SIGCAS Conference on Computing and Sustainable Societies, 2018.
Using Machine Learning to Assess the Risk of and Prevent Water Main Breaks. Avishek Kumar, Syed Ali Asad Rizvi, Benjamin Brooks, Ali Vanderveld, Kevin Hayes Wilson, Chad Kenney, Adria Finch, Andrew Maxwell, Sam Edelstein, Joe Zuckerbraun and Rayid Ghani. 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2018).
Artificial Intelligence for Social Good. Gregory D. Hager, Ann Drobnis, Fei Fang, Rayid Ghani, Amy Greenwald, Terah Lyons, David C. Parkes, Jason Schultz, Suchi Saria, Stephen F. Smith, and Milind Tambe. Computing Community Consortium. March 2017.
Visualizing Meta-Explanations in Early Intervention Systems for Police Departments [Poster]. Damon Crockett, Joe Walsh, Klaus Ackermann, Andrea Navarrete, Rayid Ghani. IEEE VIS 2017
Machine Learning for Social Services: A case study of prenatal case management in Illinois. Ian Pan, Laura B. Nolan, Rashida R. Brown, Romana Khan, Paul van der Boor, Daniel G. Harris, Rayid Ghani. American Journal of Public Health, 2017.
Early Intervention Systems – Predicting Adverse Interactions Between Police and the Public. Jennifer Helsby, Samuel Carton, Kenneth Joseph, Ayesha Mahmud, Youngsoo Park, Andrea Navarrete, Klaus Ackermann, Joe Walsh, Lauren Haynes, Crystal Cody, Major Estella Patterson, Rayid Ghani. Criminal Justice Policy Review, 2017.
Building Better Early Intervention Systems. Crystal Cody, Estella Patterson, Kerr Putney, Jennifer Helsby, Joe Walsh, Lauren Haynes, and Rayid Ghani. Police Chief Magazine. International Association of Chiefs of Police. 2016
Detecting fraud, corruption, and collusion in international development contracts. Emily Grace, Ankit Rai, Elissa Redmiles, Rayid Ghani. 2016 IEEE International Conference on Big Data.
The Legislative Influence Detector: Finding Text Reuse in State Legislation. Burgess et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
Identifying Police Officers at Risk of Adverse Events. Carton et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
Designing Policy Recommendations to Reduce Home Abandonment in Mexico. Ackermann et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
Identifying Earmarks in Congressional Bills. Khabsa et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).
A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes.
Himabindu Lakkaraju, Everaldo Aguiar, Carl Shan, David Miller, Nasir Bhanpuri, Rayid Ghani, Kecia Addison. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
Predictive Modeling for Public Health: Preventing Childhood Lead Poisoning.
Eric Potash, Joe Brew, Alexander Loewi, Subhabrata Majumdar, Andrew Reece, Joe Walsh, Eric Rozier, Emile Jorgensen, Raed Mansour, Rayid Ghani. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
Early Prediction of Code Blue Using Electronic Medical Records.
Sriram Somanchi, Samrachana Adhikari, Allen Lin, Elena Eneva, and Rayid Ghani. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015).
Who, When, and Why: A Machine Learning Approach to Prioritizing Students at Risk of not Graduating High School on Time.
Everaldo Aguiar, Himabindu Lakkaraju, Nasir Bhanpuri, David Miller, Ben Yuhas, Kecia Addison, Shihching Liu, Marilyn Powell, and Rayid Ghani. 5th International Learning Analytics and Knowledge (LAK) Conference 2015.
Early Code Blue Prediction Using Patient Medical Records.
Sriram Somanchi, Samrachana Adhikari, Allen Lin, Elena Eneva, and Rayid Ghani. Workshop on Machine Learning for Clinical Data Analysis and Healthcare – held with NIPS 2013.
Online Active Learning with Imbalanced Classes. Zahra Ferdowsi, Rayid Ghani, Rafaella Settimi. IEEE International Conference on Data Mining (ICDM 2013).
Targeting and Influencing at Scale: From Presidential Elections to Social Good.
Rayid Ghani.
Talk Abstract – KDD’13. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Top-10 Data Mining Case Studies. Gabor Melli et al. International Journal of Information Technology & Decision Making Vol 11 issue 02. 2012.
Interactive Learning for Efficiently Detecting Errors In Insurance Claims.
Rayid Ghani and Mohit Kumar. Proceedings of the Seventeenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011).
A Machine Learning Based System for Semi-Automatically Redacting Documents.
Chad Cumby and Rayid Ghani. Proceedings of the 23rd Annual Conference on Innovative Applications of Artificial Intelligence (IAAI) 2011.
Framework for interactive classification problems.
Mohit Kumar, Rayid Ghani, Mohak Shah, Jaime Carbonell, Alex Rudnicky. ICML Workshop on Combining Learning Strategies to Reduce Label Cost – held with ICML 2011.
An Online Strategy for Safe Active Learning.
Zahra Ferdowsi, Rayid Ghani, Mohit Kumar. ICML Workshop on Combining Learning Strategies to Reduce Label Cost – held with ICML 2011.
Testing Software In Age Of Data Privacy: A Balancing Act.
Kunal Taneja, Mark Grechanik, Rayid Ghani and Tao Xie.
Joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2011).
Inference Control to Protect Sensitive Information in Text Documents. [pdf] Chad Cumby, Rayid Ghani.
ACM SIGKDD Workshop on Intelligence and Security Informatics held with ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010). 2010
Data Mining to Predict and Prevent Errors in Healthcare Claims Processing. [pdf] Mohit Kumar, Rayid Ghani, and Zhu-Song Mei. Proceedings of the Sixteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010).
Online Cost-Sensitive Learning for Efficient Interactive Classification. [pdf] Rayid Ghani and Mohit Kumar. Budgeted Learning Workshop at the 27th International Conference on Machine Learning (ICML 2010).
Toward Optimal Ordering of Prediction Tasks [pdf]
Abhimanyu Lad, Yiming Yang, Rayid Ghani and Bryan Kisiel. SIAM International Conference on Data Mining (SDM09), 2009
Graph Structure Learning for Task Ordering. Yiming Yang, Henry Shu, Bryan Kisiel, Chad Cumby, Rayid Ghani, Katharina Probst. ICEIS 2009
Improving Knowledge Worker Productivity – the Active integrated approach
P. Warren, N. Kings, I. Thurlow, J. Davies, T. Buerger, E. Simperl, C. Ruiz, J. M. Gomez-Perez, V. Ermolayev, R. Ghani, M. Tilly, T. Bösser, A. Imtiaz. BT Technology Journal (2009)
ACTIVE – Enabling the Knowledge-Powered Enterprise: Semantic Technology for Knowledge Worker Productivity.
Warren, P., Thurlow, I., Ghani, R., Probst, K., Jentzsch, E., Ermolayev, V.
In Proc. 2nd European Semantic Technology Conference (ESTC 2008), Vienna, Austria, Sep. 29 – Oct. 3, 2008
Maximizing Privacy Under Data Distortion Constraints in Noise Perturbation Methods. [pdf] Yaron Rachlin, Katharina Probst, Rayid Ghani. The Second ACM SIGKDD International Workshop on Privacy, Security, and Trust in KDD held with ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008). 2008
Trade-offs in the Use of Bayesian Filtering for Sensor Fusion. [pdf] [powerpoint presentation]
Anatole Gershman, Rayid Ghani, Damian Roqueiro, and Gang Wei. International Workshop on Knowledge Discovery from Sensor Data (Sensor-KDD’07) – held with ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2007).
Towards Interactive Active Learning in MultiView Feature Sets for Information Extraction. [pdf] Katharina Probst, Rayid Ghani. European Conference on Machine Learning (ECML/PKDD 2007).
Semi-supervised Learning of Attribute-Value Pairs from Product Descriptions [pdf] Katharina Probst, Rayid Ghani, Marko Krema, Andy Fano, Yan Liu. Proceedings of the International Joint Conference in Artificial Intelligence 2007 (IJCAI-07).
Semi-Supervised Learning to Extract Attribute-Value Pairs from Product Descriptions on the Web [pdf] [powerpoint presentation] Katharina Probst, Rayid Ghani, Marko Krema, Andrew Fano, and Yan Liu
Workshop on Web Mining at the European Conference on Machine Learning (ECML 2006)
Data Mining for Business Applications: KDD 2006 Workshop Report. [pdf]
Rayid Ghani, Carlos Soares. SIGKDD Explorations December 2006 Vol 8 Issue 2 (2006).
Text Mining to Extract Product Attributes. [pdf]
Rayid Ghani, Katharina Probst, Yan Liu, Marko Krema, and Andrew Fano. SIGKDD Explorations June 2006 Vol 8 Issue 1 (2006).
Using Bayesian Reasoning From Sensor Network for Indoor Surveillance. [pdf] Valery Petrushin, Gang Wei, Rayid Ghani and Anatole Gershman. Workshop on Pervasive Technology Applied: Real-World Experiences with RFID and Sensor Networks (2006)
Learning Individual Consumer Models for Personalized Promotions: A Data Mining Case Study. [powerpoint presentation] Chad Cumby, Andrew Fano, Rayid Ghani, and Marko Krema.
Workshop on Data Mining for Business – held with the European Conference on Machine Learning (ECML/PKDD 2005).
Multiple Sensor Integration for Indoor Surveillance.
Valery Petrushin, Gang Wei, Rayid Ghani and Anatole Gershman. Multimedia Data Mining Workshop – held with 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2005)
Price Prediction and Insurance for Online Auctions [pdf]
Rayid Ghani. 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2005)
A Bayesian Framework for Robust Reasoning from Sensor Networks [pdf]
Valery Petrushin, Rayid Ghani and Anatole Gershman. 2005 AAAI Spring Symposium on AI Technologies for Homeland Security, March 21-23, 2005
Building Intelligent Shopping Assistant Using Individual Consumer Models [pdf]
C. Cumby, A. Fano, R. Ghani and M. Krema
Proceedings of the 2005 International Conference on Intelligent User Interfaces.
Predicting the End-price of Online Auctions [pdf]
R. Ghani and H. Simmons
International Workshop on Data Mining and Adaptive Modelling Methods for Economics and Management held in conjunction with the 15th European Conference on Machine Learning (ECML/PKDD 2004), Pisa, Italy
Mining the Web to Add Semantics to Retail Data Mining
R. Ghani
Invited Paper. Web Mining: From Web to Semantic Web.
Springer Lecture Notes in Artificial Intelligence, Vol. 3209. Berendt, B.; Hotho, A.; Mladenic, D.; van Someren, M.; Spiliopoulou, M.; Stumme, G. (Eds.) 2004
Predicting Customer Shopping Lists from Point-of-sale Purchase Data [pdf] [powerpoint presentation]
Chad Cumby, Andy Fano, Rayid Ghani and Marko Krema
10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2004)
Building Minority Language Corpora by Learning to Generate Web Search Queries [pdf]
Rayid Ghani, Rosie Jones and Dunja Mladenic
Journal of Knowledge and Information Systems (KAIS), 2003
Active Learning for Information Extraction with Multiple View Feature Sets [pdf]
Rayid Ghani, Rosie Jones, Tom Mitchell and Ellen Riloff
Workshop on Adaptive Text Extraction & Mining at the European Conference on Machine Learning (ECML 2003), Dubrovnik, Croatia
Combining Labeled and Unlabeled Data for MultiClass Text Categorization [pdf] [powerpoint presentation]
Rayid Ghani
International Conference on Machine Learning (ICML 2002), 8-12 July 2002, Sydney, Australia
Using Text Mining to Infer Semantic Attributes for Retail Data Mining [pdf] [powerpoint presentation]
Rayid Ghani and Andrew E. Fano
IEEE International Conference on Data Mining, December 9-12, 2002. Maebashi, Japan
Building Recommender Systems Using a Knowledge Base of Product Semantics [pdf]
Rayid Ghani and Andrew Fano
Workshop on Recommendation and Personalization in ECommerce (RPEC 2002) at the Second International Conference on Adaptive Hypermedia and Adaptive Web-based Systems (AH 2002), 28 May 2002, Malaga, Spain
Automatic Training Data Collection For Semi-Supervised Learning of Information Extraction Systems [pdf] Rayid Ghani and Rosie Jones
Accenture Technology Labs Technical Report (2002)
A Comparison of Efficacy and Assumptions of Bootstrapping Algorithms for Training Information Extraction Systems [pdf] [powerpoint presentation]
Rayid Ghani and Rosie Jones (Carnegie Mellon University)
Workshop on Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Data at the Linguistic Resources and Evaluation Conference (LREC 2002), 27 May 2002, Las Palmas, Spain
Hypertext Categorization using Hyperlink Patterns and Meta Data [pdf]
Rayid Ghani, Sean Slattery and Yiming Yang
18th International Conference on Machine Learning (ICML 2001), 2001
A Study of Approaches for Hypertext Categorization [pdf]
Yiming Yang, Sean Slattery and Rayid Ghani
Journal of Intelligent Information Systems, Special Issue on Automatic Text Categorization, 2001
Using Error-Correcting Codes for Efficient Text Classification with a Large Number of Categories
Rayid Ghani
Masters Thesis. Center for Automated Learning & Discovery, Carnegie Mellon University (2001)
Combining Labeled and Unlabeled Data for Text Classification with a Large Number of Categories [pdf] [powerpoint presentation]
Rayid Ghani
First IEEE International Conference on Data Mining, 2001
Online Learning for Query Generation: Finding Documents Matching a Minority Concept on the Web [pdf] [powerpoint presentation]
Rayid Ghani, Rosie Jones and Dunja Mladenic. International Conference on Web Intelligence, 2001
Using the Web to Create Minority Language Corpora [pdf] [powerpoint presentation]
Rayid Ghani, Rosie Jones and Dunja Mladenic
Tenth International Conference on Information and Knowledge Management (CIKM 2001), 2001
Automatic Web Search Query Generation to Create Minority Language Corpora [pdf]
Rayid Ghani, Rosie Jones, and Dunja Mladenic
Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001)
Data Mining on Symbolic Knowledge Extracted from the Web [pdf]
Rayid Ghani, Rosie Jones, Dunja Mladenic, Kamal Nigam and Sean Slattery
Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000), 2000
Analyzing the Effectiveness and Applicability of Co-Training [pdf]
Kamal Nigam and Rayid Ghani
Ninth International Conference on Information and Knowledge Management (CIKM 2000), 2000
Understanding the Behavior of Co-Training [pdf]
Kamal Nigam & Rayid Ghani
Proceedings of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000)
Learning a Monolingual Language Model from a Multilingual Text Database [pdf] [powerpoint presentation]
Rayid Ghani & Rosie Jones
Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2000)
Automatically Building a Corpus for a Minority Language from the Web [pdf]
Rosie Jones & Rayid Ghani
Proceedings of the Student Workshop at the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000)
Using Error-Correcting Codes for Text Classification [pdf]
Rayid Ghani
17th International Conference on Machine Learning (ICML 2000), 2000