[Edited Books]

Big Data and Social Science: A Practical Guide to Methods and Tools. Ian Foster, Rayid Ghani, Ron Jarmin, Frauke Kreuter, and Julia Lane. Chapman and Hall/CRC Press, 2016.

Data Mining for Business Applications.
Editors: Carlos Soares, Rayid Ghani. IOS Press, 2010.

[Book Chapters]

Machine Learning
Rayid Ghani and Malte Schierholz. In Big Data and Social Science: A Practical Guide to Methods and Tools. Chapman and Hall/CRC Press, 2016.

Machine Learning and Semantic Technologies for Enterprise Knowledge Management.
Rayid Ghani and Divna Djordjevic.
Book Chapter – Context and Semantics in Knowledge Management. Springer, 2011.

Data Mining for Consumer Modeling and Personalized Promotions. [pdf]
Rayid Ghani, Chad Cumby, Andrew Fano, and Marko Krema.
Book Chapter – Data Mining Methods and Applications. Kenneth D. Lawrence, Stephan Kudyba, Ronald K. Klimberg (Eds.). Auerbach Publications. 2008

Extracting and using Attribute-Value pairs from product descriptions on the web. [pdf]
Katharina Probst, Rayid Ghani, Yan Liu, Marko Krema, and Andrew Fano.
Book Chapter – Web Mining. 2007

[Edited Proceedings]

Big Data for Social Good. Special Issue of the Big Data Journal, 2015.
Charlie Catlett and Rayid Ghani, Editors.
Volume: 3 Issue 1: March 17, 2015

Proceedings of the KDD Workshop on Data Science for Social Good.
Arindam Banerjee, Lise Getoor, Rayid Ghani, Claire Monteleoni, Matt Rattigan (Eds.).
KDD 2014 Workshop.

The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014.
Sofus Macskassy, Claudia Perlich, Jure Leskovec, Wei Wang, Rayid Ghani, Prem Melville (Eds.)
New York, NY, USA, 2014.

The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013.
Inderjit S. Dhillon, Yehuda Koren, Rayid Ghani, Ted E. Senator, Paul Bradley, Rajesh Parekh, Jingrui He, Robert L. Grossman, Ramasamy Uthurusamy (Eds.)
Chicago, IL, USA, August 11-14, 2013. ACM 2013, ISBN 978-1-4503-2174-7

Proceedings of the 2011 Workshop on Data Mining for Medicine and Healthcare
Editors: Nitesh Chawla, Rayid Ghani, et al.
KDD 2011.

Proceedings of the 2011 Workshop on Machine Learning for Global Challenges
Editors: Arindam Banerjee, Rayid Ghani, Claire Monteleoni, Vikas Sindhwani
ICML 2011.

Proceedings of KDD Workshop on Data Mining for Business Applications.
Rayid Ghani, Carlos Soares, Francoise Soulie-Fogelman, Editors.
KDD 2008.

Proceedings of KDD Workshop on Data Mining for Business Applications. [pdf]
Rayid Ghani, Carlos Soares, Editors.
Proceedings of the KDD Workshop on Data Mining for Business Applications (2006).

Learning from Partially Classified Data.
M. Amini, O. Chapelle, R. Ghani, Editors.
Proceedings of ICML Workshop on Learning from Partially Classified Data (2005).

[Journal, Conference, and Workshop Papers]

Machine Learning for Social Services: A case study of prenatal case management in Illinois. Ian Pan, Laura Nolan, Rashida Brown, Paul van der Boor, Romana Khan, Rayid Ghani, Dan Harris. American Journal of Public Health. Forthcoming, 2017.

Early Intervention Systems: Predicting Adverse Interactions Between Police and the Public. Jennifer Helsby, Samuel Carton, Kenneth Joseph, Ayesha Mahmud, Youngsoo Park, Andrea Navarrete, Klaus Ackermann, Joe Walsh, Lauren Haynes, Crystal Cody, Major Estella Patterson, Rayid Ghani. Criminal Justice Policy Review. 2017.

Building Better Early Intervention Systems. Cody et al. Police Chief Magazine. 2016

Detecting fraud, corruption, and collusion in international development contracts. Emily Grace, Ankit Rai, Elissa Redmiles, Rayid Ghani. 2016 IEEE International Conference on Big Data.

The Legislative Influence Detector: Finding Text Reuse in State Legislation. Burgess et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).

Identifying Police Officers at Risk of Adverse Events. Carton et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).

Designing Policy Recommendations to Reduce Home Abandonment in Mexico. Ackermann et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).

Identifying Earmarks in Congressional Bills. Khabsa et al. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016).

A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes.
Himabindu Lakkaraju, Everaldo Aguiar, Carl Shan, David Miller, Nasir Bhanpuri, Rayid Ghani, Kecia Addison. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015)

Predictive Modeling for Public Health: Preventing Childhood Lead Poisoning.
Eric Potash, Joe Brew, Alexander Loewi, Subhabrata Majumdar, Andrew Reece, Joe Walsh, Eric Rozier, Emile Jorgenson, Raed Mansour, Rayid Ghani. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015)

Early Prediction of Code Blue Using Electronic Medical Records.
Sriram Somanchi, Samrachana Adhikari, Allen Lin, Elena Eneva, and Rayid Ghani. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2015)

Who, When, and Why: A Machine Learning Approach to Prioritizing Students at Risk of not Graduating High School on Time.
Everaldo Aguiar, Himabindu Lakkaraju, Nasir Bhanpuri, David Miller, Ben Yuhas, Kecia Addison, Shihching Liu, Marilyn Powell, and Rayid Ghani. 5th International Learning Analytics and Knowledge (LAK) Conference 2015.

Early Code Blue Prediction Using Patient Medical Records.
Sriram Somanchi, Samrachana Adhikari, Allen Lin, Elena Eneva, and Rayid Ghani. Workshop on Machine Learning for Clinical Data Analysis and Healthcare – held with NIPS 2013.

Online Active Learning with Imbalanced Classes. Zahra Ferdowsi, Rayid Ghani, Raffaella Settimi. IEEE International Conference on Data Mining (ICDM 2013).

Targeting and Influencing at Scale: From Presidential Elections to Social Good.
Rayid Ghani.
Talk Abstract – KDD’13. Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Top-10 Data Mining Case Studies. Gabor Melli et al. International Journal of Information Technology & Decision Making, Vol. 11, Issue 02, 2012.

Interactive Learning for Efficiently Detecting Errors In Insurance Claims.
Rayid Ghani and Mohit Kumar.
Proceedings of the Seventeenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011).

A Machine Learning Based System for Semi-Automatically Redacting Documents.
Chad Cumby and Rayid Ghani. Proceedings of the 23rd Annual Conference on Innovative Applications of Artificial Intelligence (IAAI) 2011.

Framework for interactive classification problems.
Mohit Kumar, Rayid Ghani, Mohak Shah, Jaime Carbonell, Alex Rudnicky.
ICML Workshop on Combining Learning Strategies to Reduce Label Cost – held with ICML 2011

An Online Strategy for Safe Active Learning.
Zahra Ferdowsi, Rayid Ghani, Mohit Kumar.
ICML Workshop on Combining Learning Strategies to Reduce Label Cost – held with ICML 2011

Testing Software In Age Of Data Privacy: A Balancing Act.
Kunal Taneja, Mark Grechanik, Rayid Ghani and Tao Xie.
Joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2011).

Inference Control to Protect Sensitive Information in Text Documents. [pdf]
Chad Cumby, Rayid Ghani.
ACM SIGKDD Workshop on Intelligence and Security Informatics held with ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010). 2010

Data Mining to Predict and Prevent Errors in Healthcare Claims Processing. [pdf]
Mohit Kumar, Rayid Ghani, and Zhu-Song Mei.
Proceedings of the Sixteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010).

Online Cost-Sensitive Learning for Efficient Interactive Classification. [pdf]
Rayid Ghani and Mohit Kumar.
Budgeted Learning Workshop at the 27th International Conference on Machine Learning (ICML 2010).

Toward Optimal Ordering of Prediction Tasks [pdf]
Abhimanyu Lad, Yiming Yang, Rayid Ghani and Bryan Kisiel.
SIAM International Conference on Data Mining (SDM09), 2009

Graph Structure Learning for Task Ordering. Yiming Yang, Henry Shu, Bryan Kisiel, Chad Cumby, Rayid Ghani, Katharina Probst. ICEIS 2009

Improving Knowledge Worker Productivity – the Active integrated approach
P. Warren, N. Kings, I. Thurlow, J. Davies, T. Buerger, E. Simperl, C. Ruiz, J. M. Gomez-Perez, V. Ermolayev, R. Ghani, M. Tilly, T. Bösser, A. Imtiaz
BT Technology Journal (2009)

ACTIVE – Enabling the Knowledge-Powered Enterprise: Semantic Technology for Knowledge Worker Productivity.
Warren, P., Thurlow, I., Ghani, R., Probst, K., Jentzsch, E., Ermolayev, V.
In Proc 2nd European Semantic Technology Conference (ESTC 2008), Vienna, Austria, Sep. 29 – Oct. 3, 2008

Maximizing Privacy Under Data Distortion Constraints in Noise Perturbation Methods. [pdf]
Yaron Rachlin, Katharina Probst, Rayid Ghani.
The Second ACM SIGKDD International Workshop on Privacy, Security, and Trust in KDD held with ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2008). 2008

Trade-offs in the Use of Bayesian Filtering for Sensor Fusion. [pdf] [powerpoint presentation]
Anatole Gershman, Rayid Ghani, Damian Roqueiro, and Gang Wei.
International Workshop on Knowledge Discovery from Sensor Data (Sensor-KDD’07) – held with ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2007).

Towards Interactive Active Learning in MultiView Feature Sets for Information Extraction. [pdf]
Katharina Probst, Rayid Ghani. European Conference on Machine Learning (ECML/PKDD 2007).

Semi-supervised Learning of Attribute-Value Pairs from Product Descriptions [pdf]
Katharina Probst, Rayid Ghani, Marko Krema, Andy Fano, Yan Liu.
Proceedings of the International Joint Conference in Artificial Intelligence 2007 (IJCAI-07).

Semi-Supervised Learning to Extract Attribute-Value Pairs from Product Descriptions on the Web [pdf] [powerpoint presentation]
Katharina Probst, Rayid Ghani, Marko Krema, Andrew Fano, and Yan Liu
Workshop on Web Mining at the European Conference on Machine Learning (ECML 2006)

Data Mining for Business Applications: KDD 2006 Workshop Report. [pdf]
Rayid Ghani, Carlos Soares.
SIGKDD Explorations December 2006 Vol 8 Issue 2 (2006).

Text Mining to Extract Product Attributes. [pdf]
Rayid Ghani, Katharina Probst, Yan Liu, Marko Krema, and Andrew Fano.
SIGKDD Explorations June 2006 Vol 8 Issue 1 (2006).

Using Bayesian Reasoning From Sensor Network for Indoor Surveillance. [pdf]
Valery Petrushin, Gang Wei, Rayid Ghani and Anatole Gershman.
Workshop on Pervasive Technology Applied: Real-World Experiences with RFID and Sensor Networks (2006)

Learning Individual Consumer Models for Personalized Promotions: A Data Mining Case Study. [powerpoint presentation]
Chad Cumby, Andrew Fano, Rayid Ghani, and Marko Krema. 
Workshop on Data Mining for Business – held with the European Conference on Machine Learning (ECML/PKDD 2005).

Multiple Sensor Integration for Indoor Surveillance.
Valery Petrushin, Gang Wei, Rayid Ghani and Anatole Gershman.
Multimedia Data Mining Workshop – held with 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2005)

Price Prediction and Insurance for Online Auctions [pdf]
Rayid Ghani
11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2005)

A Bayesian Framework for Robust Reasoning from Sensor Networks [pdf]
Valery Petrushin, Rayid Ghani and Anatole Gershman
2005 AAAI Spring Symposium on AI Technologies for Homeland Security
March 21-23, 2005

Building Intelligent Shopping Assistant Using Individual Consumer Models [pdf]
C. Cumby, A. Fano, R. Ghani and M. Krema
Proceedings of the 2005 International Conference on Intelligent User Interfaces
January 9-12, 2005

Predicting the End-price of Online Auctions [pdf]
R. Ghani and H. Simmons
International Workshop on Data Mining and Adaptive Modelling Methods for Economics and Management held in conjunction with the 15th European Conference on Machine Learning (ECML/PKDD 2004)
Pisa, Italy

Mining the Web to Add Semantics to Retail Data Mining
R. Ghani
Invited Paper. Web Mining: From Web to Semantic Web.
Springer Lecture Notes in Artificial Intelligence, Vol. 3209. Berendt, B.; Hotho, A.; Mladenic, D.; van Someren, M.; Spiliopoulou, M.; Stumme, G. (Eds.), 2004

Predicting Customer Shopping Lists from Point-of-sale Purchase Data [pdf] [powerpoint presentation]
Chad Cumby, Andy Fano, Rayid Ghani and Marko Krema
10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2004)

Building Minority Language Corpora by Learning to Generate Web Search Queries [pdf]
Rayid Ghani, Rosie Jones and Dunja Mladenic
Journal of Knowledge and Information Systems (KAIS), 2003

Active Learning for Information Extraction with Multiple View Feature Sets [pdf]
Rayid Ghani, Rosie Jones, Tom Mitchell and Ellen Riloff
Workshop on Adaptive Text Extraction & Mining at the European Conference on Machine Learning (ECML 2003), Dubrovnik, Croatia

Combining Labeled and Unlabeled Data for MultiClass Text Categorization [pdf] [powerpoint presentation]
Rayid Ghani
International Conference on Machine Learning (ICML 2002), 8-12 July 2002, Sydney, Australia

Using Text Mining to Infer Semantic Attributes for Retail Data Mining [pdf] [powerpoint presentation]
Rayid Ghani and Andrew E. Fano
IEEE International Conference on Data Mining, December 9-12, 2002. Maebashi, Japan

Building Recommender Systems Using a Knowledge Base of Product Semantics [pdf]
Rayid Ghani and Andrew Fano
Workshop on Recommendation and Personalization in ECommerce (RPEC 2002) at the Second International Conference on Adaptive Hypermedia and Adaptive Web-based Systems (AH 2002), 28 May 2002, Malaga, Spain

Automatic Training Data Collection For Semi-Supervised Learning of Information Extraction Systems [pdf]
Rayid Ghani and Rosie Jones
Accenture Technology Labs Technical Report (2002)

A Comparison of Efficacy and Assumptions of Bootstrapping Algorithms for Training Information Extraction Systems [pdf] [powerpoint presentation]
Rayid Ghani and Rosie Jones (Carnegie Mellon University)
Workshop on Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Data at the Linguistic Resources and Evaluation Conference (LREC 2002), 27 May 2002, Las Palmas, Spain

Hypertext Categorization using Hyperlink Patterns and Meta Data [pdf]
Rayid Ghani, Sean Slattery and Yiming Yang
18th International Conference on Machine Learning (ICML 2001), 2001

A Study of Approaches for Hypertext Categorization [pdf]
Yiming Yang, Sean Slattery and Rayid Ghani
Journal of Intelligent Information Systems—Special Issue on Automatic Text Categorization, 2001

Using Error-Correcting Codes for Efficient Text Classification with a Large Number of Categories
Rayid Ghani
Master's Thesis. Center for Automated Learning & Discovery, Carnegie Mellon University (2001)

Combining Labeled and Unlabeled Data for Text Classification with a Large Number of Categories [pdf] [powerpoint presentation]
Rayid Ghani
First IEEE International Conference on Data Mining, 2001

Online Learning for Query Generation: Finding Documents Matching a Minority Concept on the Web [pdf] [powerpoint presentation]
Rayid Ghani, Rosie Jones and Dunja Mladenic
International Conference on Web Intelligence, 2001

Using the Web to Create Minority Language Corpora [pdf] [powerpoint presentation]
Rayid Ghani, Rosie Jones and Dunja Mladenic
Tenth International Conference on Information and Knowledge Management (CIKM 2001), 2001

Automatic Web Search Query Generation to Create Minority Language Corpora [pdf]
Rayid Ghani, Rosie Jones, and Dunja Mladenic
Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2001)

Data Mining on Symbolic Knowledge Extracted from the Web [pdf]
Rayid Ghani, Rosie Jones, Dunja Mladenic, Kamal Nigam and Sean Slattery
Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000), 2000

Analyzing the Effectiveness and Applicability of Co-Training [pdf]
Kamal Nigam and Rayid Ghani
Ninth International Conference on Information and Knowledge Management (CIKM 2000), 2000

Understanding the Behavior of Co-Training [pdf]
Kamal Nigam & Rayid Ghani
Proceedings of the Workshop on Text Mining at the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000)

Learning a Monolingual Language Model from a Multilingual Text Database [pdf] [powerpoint presentation]
Rayid Ghani & Rosie Jones
Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2000)

Automatically Building a Corpus for a Minority Language from the Web [pdf]
Rosie Jones & Rayid Ghani
Proceedings of the Student Workshop at the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000)

Using Error-Correcting Codes for Text Classification [pdf]
Rayid Ghani
17th International Conference on Machine Learning (ICML 2000), 2000