Ethics, Fairness, Responsibility, and Privacy in Data Science (DATA 25900) at The University of Chicago

Website for DATA 25900 at UChicago

Schedule

Please check this schedule frequently, as it is likely to change during the quarter.

Legend:

PA stands for Programming Assignment
IP stands for Individual Project
SR stands for Issue Report
R stands for Reading Assignment

#Lecture	Date	Lecture	Keywords	Readings	Important Dates
1	03/30	Course Overview. Introduction to the process of Data Science	introduction, descriptive statistics, inferential statistics	Reading 1: Solon Barocas and Andrew D. Selbst. Big Data’s Disparate Impact. 104 California Law Review 671, 2016.	IP Proposal assigned; SR assigned
2	04/01	Data: Context and Quality	collection, preparation, cleaning, missing data
3	04/06	Pitfalls in Inferential Statistics	multiple hypothesis testing, Bonferroni correction, false discovery rate, statistical vs practical significance	Reading 2.1: danah boyd and Kate Crawford. Critical Questions for Big Data. Information, Communication, and Society, 2012. Reading 2.2: Stephen Stigler. Data Have a Limited Shelf Life. HDSR, 2019. Reading 2.3: Michael Jordan. Artificial Intelligence: The Revolution Hasn’t Happened Yet. HDSR, 2019.	PA1 assigned; IP Proposal due; R1 due
4	04/08	The design of experiments and protection of human subjects	human subjects, experimental design, AB testing
5	04/13	More experimental design. Causality	experiments, causality, observational vs experimental data	Reading 3.1: Department of Health, Education, and Welfare. The Belmont Report. April 18, 1979. Reading 3.2: Michelle N. Meyer. Everything You Need to Know About Facebook’s Controversial Emotion Experiment. Wired, June 30, 2014. Reading 3.3: Robinson Meyer. Everything We Know About Facebook’s Secret Mood Manipulation Experiment. The Atlantic, June 28, 2014.	R2 due
6	04/15	Causation vs Correlation. Causal Inference	pitfalls in communicating results, introduction to causal inference, quasi-experiments
7	04/20	Introduction to Machine Learning	optimization vs generalization, training and test data, models, learning		PA1 due; R3 due
8	04/22	Machine Learning in the Wild	training data, feature engineering, information leakage, concept drift, algorithmic decision making		PA2 assigned
9	04/27	Fairness in Machine Learning	fairness definitions	Reading 4.1: Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner. Machine Bias. ProPublica, May 23, 2016. Reading 4.2: Solon Barocas, Moritz Hardt, Arvind Narayanan. Fairness and Machine Learning, Chapter 2: Classification. fairmlbook.org, 2019.
10	04/29	Visualization and Communication	packaging data products, reproducibility, repeatability, visualization, communication		PA3 assigned
11	05/04	Philosophy of Privacy	privacy definitions, law, technology	Reading 5.1: Daniel Solove. ‘I’ve Got Nothing to Hide’ and Other Misunderstandings of Privacy. San Diego Law Review 44, 2007. Reading 5.2: Arvind Narayanan and Vitaly Shmatikov. Robust De-anonymization of Large Sparse Datasets. In Proc. IEEE Symposium on Security and Privacy, 2008.	R4 due
12	05/06	Anonymity	deanonymization		PA2 due
13	05/11	Statistical Data Privacy	differential privacy, sensitivity		R5 due
14	05/13	Data Lifecycles	provenance, right to be forgotten, data portability		PA4 assigned; PA3 due
15	05/18	Economics of Data and Externalities 1	data ownership, value of data, data markets, markets for privacy, data brokers	Reading 6: Hoi Wai Jackie Cheng. Economic properties of data and the monopolistic tendencies of data economy: policies to limit an Orwellian possibility. ST/ESA/2020/DWP/164, 17 May 2020.
16	05/20	Economics of Data and Externalities 2. Other topics	data unions, cooperatives, strikes		IP due
17	05/25	Project Presentation 1			PA4 due; R6 due
18	05/27	Project Presentation 2			SR due; PA Quiz (24-hour window opens)