Contact Information

Computer Science Department
Colgate University
McGregory Hall, 3rd Floor
13 Oak Drive
Hamilton, NY 13346
(tel) 315.228.7719
Charlotte Jablonski, Administrative Assistant
cjablonski@colgate.edu

Event Detail

Department tea: Differentially Private Machine Learning

Start: Tuesday, November 17, 2015, 11:20 a.m.
End: Tuesday, November 17, 2015, 12:10 p.m.
Location: CS department research lounge

For our department tea on November 17, Cindy Han and Abeneazer Chafamo will talk about the research they did with Prof. Hay this past summer. Lunch will be available.

Title: Differentially Private Machine Learning: An Empirical Evaluation of Differentially Private Classifiers
Abstract: Machine learning is a subfield of artificial intelligence that focuses on recognizing and learning patterns from real data in order to make predictions. For our research project, we were particularly interested in classifiers. A classifier is a machine learning method that uses patterns learned from data to assign a label (class) to an observation. For example, classifiers can be used to label an email as spam or to predict a patient’s risk level for a particular disease. In some cases, the data used to build these classifiers is sensitive (e.g., medical data), and people need a privacy guarantee before they volunteer their data. One of the most prevalent methods of trying to ensure privacy is anonymization, the removal of personally identifying information. However, anonymization does not provide sufficient privacy, so a more robust method is needed. Differential privacy is a proposed alternative to anonymization. It ensures that the output of a computation is insensitive to changes in any one individual’s record, and it achieves this by adding random noise to the statistical computations. There has been much past research on differentially private classifiers, but there has not yet been a comprehensive study comparing the existing ones. The goals of the project were to survey the current algorithms in the field, compare them empirically, and propose possible improvements.
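
To make the idea of "adding noise to the statistical computations" concrete, here is a minimal sketch of the Laplace mechanism applied to a simple counting query. It is a standard textbook building block and not necessarily what the speakers evaluated; the function names, the `has_condition` field, and the choice of epsilon are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is the usual
    calibration for epsilon-differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical toy data: 100 patients, 25 of whom have the condition.
    patients = [{"has_condition": i % 4 == 0} for i in range(100)]
    noisy = private_count(patients, lambda p: p["has_condition"], epsilon=0.5)
    print(f"noisy count: {noisy:.2f}")
```

A smaller epsilon means more noise and a stronger privacy guarantee, at the cost of accuracy. Differentially private classifiers face the same trade-off, which is the kind of thing an empirical comparison would measure.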