Security and Privacy of Machine Learning - CIS 700 / Spring 2020
Overview
- Instructor: Ferdinando Fioretto [email]
- Location: Life Science Building 300
- Time: Mon and Wed: 5:15-6:35pm
- Office hours: Fridays 12:30-1:30pm
- Office location: 4-125 CST
- Initial Project Report Deadline: January 31
- Project Progress Report: February 28
Teams
- Team Alderaan: Mu Bai
- Team Coruscant: David Castello; Zuhal Altundal
- Team Kamino: Joel Yuhas; Vedhas Sandeep Patkar
- Team Mandalore: Amin Fallahi; Jindi Wu
- Team Naboo: Ankit Khare; James Kotari
- Team Onderon: Mengyu Liu; Lin Zhang
- Team Tatooine: Pratik Ashok Paranjape; Raman Srivastava
- Team Yavin: Cuong Tran; Hanyi Li
Schedule and material
Below is the calendar for this semester's course. This is a preliminary schedule and will be revised as the semester progresses. I will try to announce any changes in class, but this webpage should be viewed as authoritative. If you have any questions, please contact me.
Module 1: Evasion Attacks
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Jan 13 | Overview & motivation | slides | | |
Jan 15 | Attacks | Mandatory Reading | Team Alderaan | Team Naboo |
Jan 20 | No Class (Martin Luther King Day) | | | |
Jan 22 | Attacks and Adversarial Training | Mandatory Reading | Team Coruscant | Team Naboo |
Jan 27 | Defensive Distillation | Mandatory Reading | Team Kamino | Team Naboo |
Jan 29 | Other Defensive Mechanisms | Mandatory Reading | Team Mandalore | Team Naboo |
Module 2: Poisoning Attacks
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Feb 3 | Introduction | Mandatory Reading | Team Onderon | Team Coruscant |
Feb 5 | Attacks on ML systems | Mandatory Reading | Team Tatooine | Team Coruscant |
Feb 10 | No Class (AAAI) | | | |
Feb 12 | No Class (AAAI) | | | |
Feb 17 | Defense Mechanisms | Mandatory Reading | Team Yavin | Team Coruscant |
Module 3: Privacy Attacks
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Feb 19 | Data Exposure | Mandatory Reading | Team Naboo | Team Alderaan |
Feb 24 | Privacy Attacks in Deep Learning | Mandatory Reading | Team Coruscant | Team Alderaan |
Feb 26 | No Class (CRA) | | | |
Module 4: Differential Privacy
Module 5: Differential Privacy and Machine Learning
Project Review Session
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Mar 30 | Project Review | | All | |
Module 6: Differential Privacy Model Extensions
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Apr 1 | Local DP | Mandatory Reading | Team Kamino | Team Tatooine |
Apr 6 | Temporal DP | Mandatory Reading | Team Naboo | Team Tatooine |
Module 7: Model Robustness
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Apr 8 | Robustness in ML | Mandatory Reading | Team Coruscant | No Notes |
Module 8: Federated Learning
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Apr 13 | Preliminaries | Mandatory Reading | Guest Lecture: Pranay | Team Onderon |
Apr 15 | Seminal papers | Mandatory Reading | Team Tatooine | Team Onderon |
Apr 20 | Privacy | Mandatory Reading | Team Yavin | Team Onderon |
Module 9: Fairness and Bias
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
Apr 22 | Preliminaries | Mandatory Reading | Team Mandalore | Team Yavin |
Apr 27 | Applications | Mandatory Reading | Team Kamino | Team Yavin |
Final Presentation
Date | Topic | Reading / Assignment | Present | Report |
---|---|---|---|---|
May 4 | Poster Session | | | |
Assignments
LaTeX template for assignments
All class notes should be submitted using the AAAI Template.
Paper presentation
In each class, a team of students will present the assigned papers. Different types of presentation are allowed (e.g., slides, interactive demos, or code tutorials). The only requirement is that the presentation should (a) involve the class in active discussion, (b) cover all papers assigned for reading, and (c) last no more than 1 hour and 15 minutes, including discussion.
Class notes
Another team of students will be charged with writing notes synthesizing the content of the presentation and class discussion.
Research projects
Students will work on a semester-long research project. Each project will be presented in the form of a poster on May 4 (tentative).
Grading
Grading scheme
30% paper presentation, 20% class notes, 10% class participation, 40% research project.
Class participation
Course lectures will be driven by the content of the assigned papers. All students are asked to participate actively in the discussion of the paper content during each class.
Lateness policy
The presentation material should be submitted two days prior to the day of the presentation. A 10% per-day late penalty will be applied for delays. If the presentation is not ready on the day the team is scheduled to present, all students in the team will be assigned 0 points.
Integrity
Any instance of sharing or plagiarism, copying, cheating, or other disallowed behavior will constitute a breach of ethics. Students are responsible for reporting any violation of these rules by other students, and failure to do so constitutes an ethical violation that carries similar penalties.
Ethics statement
In this course, you will learn about and explore vulnerabilities that could be exploited to compromise deployed systems. You are trusted to behave responsibly and ethically. You may not attack any system without the permission of its owners, and you may not use anything you learn in this class for evil. If you have doubts about the ethical or legal aspects of what you want to do, check with the course instructor before proceeding. Any activity outside the letter or spirit of these guidelines will be reported to the proper authorities and may result in dismissal from the class.
Final Projects Summary
Automotive Anomaly Detection with Resistance to Adversarial Samples
Members: Mengyu Liu, Lin Zhang
Summary:
A cyber-physical system integrates communication, control, and computation with multiple sensors and actuators connected to the physical world. Attacks from intruders and complex environments can drive the system into abnormal states. Much prior work has designed anomaly detection algorithms to address this problem. Recently, machine learning methods have been applied, but anomalous samples are extremely rare, which limits performance. An effective remedy is data augmentation: a few works have applied generative neural networks to achieve better performance, but none of them explain how the process relates to the physical world or interpret it from a knowledge-based view. Our work proposes a novel perspective on how generative neural networks help exploit the knowledge in collected data across different anomaly detection algorithms. Additionally, we built testbeds simulating autonomous vehicles to examine how the augmented data relates to the physical world.
Presentation
Comparing Model Accuracy for Differential Privacy and Machine Unlearning on Sensitive Data
Members: Vedhas S. Patkar and Joel W. Yuhas
Summary:
Machine learning models, particularly those that employ neural networks, generally need large training datasets. Our aim is to employ existing models that use differential privacy and machine unlearning to determine the best possible outcomes for sensitive and identifiable data, both in terms of privacy and model accuracy. We will set out to find the limitations and advantages of existing architectures, and to identify the changes that can be made to them to provide better outcomes.
Presentation
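For context, the strongest baseline for machine unlearning is "exact" unlearning: retraining the model from scratch without the removed record. A toy sketch of that idea (the nearest-centroid model and the data below are illustrative, not from the project):

```python
# "Exact" machine unlearning sketch: retrain without the removed record.
# The model here is a toy nearest-centroid classifier, chosen only because
# retraining it is cheap enough to show the idea in a few lines.

def train_centroids(data):
    """Train a nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for x, y in data:
        sums[y] = [s + v for s, v in zip(sums.get(y, [0.0] * len(x)), x)]
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def unlearn(data, record):
    """Exact unlearning: retrain from scratch on the dataset minus the record."""
    remaining = [d for d in data if d != record]
    return train_centroids(remaining), remaining

data = [([0.0, 0.0], "a"), ([2.0, 2.0], "a"), ([10.0, 10.0], "b")]
model = train_centroids(data)                      # centroid "a" = [1.0, 1.0]
model2, remaining = unlearn(data, ([2.0, 2.0], "a"))
# After unlearning, centroid "a" = [0.0, 0.0]: no trace of the removed point.
```

Real neural networks make this retraining prohibitively expensive, which is exactly why approximate unlearning and differential privacy are compared as alternatives.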
Asynchronous Federated Learning - Literature Review
Members: David Castello
Summary:
Federated learning is an emerging field of research that seeks to train performant machine learning models across a heterogeneous collection of mobile devices, each with access to private data. Despite the fact that such devices have uneven hardware capabilities and quantities of training data, and that network constraints and hardware failures can occur at any time, state-of-the-art federated learning algorithms operate in synchronous rounds of communication. This work is a literature review of recent efforts to address this contradiction via novel protocols that relax the need to synchronize local training. Following a background study of the influences behind federated learning, six recent research papers are surveyed to assess the field's progress toward robust and effective asynchronous federated learning algorithms.
Presentation
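The synchronous round structure that these asynchronous protocols relax can be sketched in a few lines. This is a toy FedAvg-style loop; the clients, the quadratic local objective, and the hyperparameters are illustrative:

```python
# Toy synchronous federated averaging (FedAvg-style) on scalar models.
# Each client minimizes the mean squared distance to its private points.

def local_update(w, data, lr=0.1, steps=5):
    """A client takes a few gradient steps on its private data."""
    for _ in range(steps):
        grad = sum(w - x for x in data) / len(data)
        w -= lr * grad
    return w

def fedavg_round(w_global, client_data):
    """One synchronous round: every client must report back before the
    server averages, weighting by local dataset size."""
    updates = [(local_update(w_global, d), len(d)) for d in client_data]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total

clients = [[1.0, 1.0], [3.0]]  # two clients with uneven amounts of data
w = 0.0
for _ in range(50):
    w = fedavg_round(w, clients)
# w converges to the weighted mean of all client data, 5/3
```

The key point is the barrier inside `fedavg_round`: the server cannot average until the slowest client finishes, which is the bottleneck the surveyed asynchronous methods target.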
Learning a Fair Model under Privacy Constraints
Members: Cuong Tran
Summary:
In this class project we study binary classification problems from three angles: accuracy, privacy, and fairness. We explore how to maximize the model's accuracy under the constraint that the model's predictions must not discriminate against certain protected groups. At the same time, we require that the model protect the privacy of its training data.
Presentation
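One common way to quantify the non-discrimination constraint is demographic parity: the classifier's positive-prediction rate should be similar across protected groups. A minimal sketch (the predictions and group labels are illustrative):

```python
# Demographic parity gap: |P(pred=1 | group 0) - P(pred=1 | group 1)|.
# A gap of 0 means both groups receive positive predictions at equal rates.

def positive_rate(preds):
    return sum(preds) / len(preds)

def demographic_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rate between two groups."""
    g0 = [p for p, g in zip(preds, groups) if g == 0]
    g1 = [p for p, g in zip(preds, groups) if g == 1]
    return abs(positive_rate(g0) - positive_rate(g1))

preds  = [1, 0, 1, 1, 0, 0, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
gap = demographic_parity_gap(preds, groups)  # |3/4 - 1/4| = 0.5
```

A fair-learning formulation would then bound this gap (or a similar metric) while maximizing accuracy, with privacy imposed separately on the training procedure.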
Improved StarGAN for Generating Adversarial Samples
Members: Jindi Wu, Mu Bai
Summary:
We propose an approach to adversarial sample generation based on StarGAN, an image-to-image translation model that can smoothly generate a set of new images from source images. The images generated by StarGAN can serve as adversarial samples for attacking a deep neural network classifier. Our main aim is to attack the target model toward a specific output label while simultaneously evading the corresponding detection.
Presentation
Constrained Deep Learning for Fast Approximation of Scheduling Problems
Members: James Kotary
Summary:
Combinatorial optimization problems in planning and scheduling are known to lack efficient solution methods, while demand for the frequent and accurate solution of such problems is ever increasing across several engineering fields. Recent developments in machine learning promise new approaches for the fast approximation of constrained optimization problems at significantly reduced time scales. This project will explore the application of constrained deep learning to build a system that predicts accurate solutions to challenging scheduling problems, with favorable properties not reflected in the original problem structure.
Presentation
Privacy Preserving with Preference Elicitation Process
Members: Pratik Ashok Paranjape and Raman Srivastava
Summary:
Preference elicitation is the process of developing a decision support system to generate recommendations for a user. Privacy preservation refers to publicly sharing information about a dataset by describing the patterns of groups within the dataset without disclosing information about the individuals in it. The idea of this project is to implement differential privacy in a recommendation system.
Presentation
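The basic differential privacy building block such a recommender could use is the Laplace mechanism: add noise scaled to sensitivity/epsilon to each released statistic. A minimal sketch (the epsilon and count values are illustrative):

```python
# Laplace mechanism sketch: release value + Lap(0, sensitivity / epsilon).
import math
import random

def laplace_noise(scale, rng=random):
    """Sample Lap(0, scale) via inverse-CDF transform of a uniform draw."""
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(true_count, epsilon, rng=random):
    """Counting queries have sensitivity 1: adding or removing one user
    changes the count by at most 1, so the noise scale is 1/epsilon."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)          # fixed seed only to make the sketch reproducible
noisy = dp_count(100, epsilon=0.5, rng=rng)
```

Smaller epsilon means stronger privacy but larger noise; a recommender built on such noisy counts trades recommendation quality against the privacy of individual ratings.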