
Ethics, Fairness, Responsibility, and Privacy in Data Science (DATA 25900) at The University of Chicago

Website for DATA 25900 at UChicago

Schedule

Please check this schedule frequently, as it will likely change a bit throughout the quarter.

Legend:

| # | Date | Lecture | Keywords | Readings | Important Dates |
|---|------|---------|----------|----------|-----------------|
| 1 | 03/30 | Course Overview. Introduction to the Process of Data Science | introduction, descriptive statistics, inferential statistics | Reading 1: Solon Barocas and Andrew D. Selbst. Big Data’s Disparate Impact. 104 California Law Review 671, 2016. | IP Proposal assigned; SR assigned |
| 2 | 04/01 | Data: Context and Quality | collection, preparation, cleaning, missing data | | |
| 3 | 04/06 | Pitfalls in Inferential Statistics | multiple hypothesis, Bonferroni correction, false discovery rate, statistical vs practical significance | Reading 2.1: danah boyd and Kate Crawford. Critical Questions for Big Data. Information, Communication, and Society, 2012. Reading 2.2: Stephen Stigler. Data Have a Limited Shelf Life. HDSR, 2019. Reading 2.3: Michael Jordan. Artificial Intelligence: The Revolution Hasn’t Happened Yet. HDSR, 2019. | PA1 assigned; IP Proposal due; R1 due |
| 4 | 04/08 | The Design of Experiments and Protection of Human Subjects | human subjects, experimental design, A/B testing | | |
| 5 | 04/13 | More Experimental Design. Causality | experiments, causality, observational vs experimental data | Reading 3.1: Department of Health, Education, and Welfare. The Belmont Report. April 18, 1979. Reading 3.2: Michelle N. Meyer. Everything You Need to Know About Facebook’s Controversial Emotion Experiment. Wired, June 30, 2014. Reading 3.3: Robinson Meyer. Everything We Know About Facebook’s Secret Mood Manipulation Experiment. The Atlantic, June 28, 2014. | R2 due |
| 6 | 04/15 | Causation vs Correlation. Causal Inference | pitfalls in communicating results, introduction to causal inference, quasi-experiments | | |
| 7 | 04/20 | Introduction to Machine Learning | optimization vs generalization, training and test data, models, learning | | PA1 due; R3 due |
| 8 | 04/22 | Machine Learning in the Wild | training data, feature engineering, information leakage, concept drift, algorithmic decision making | | PA2 assigned |
| 9 | 04/27 | Fairness in Machine Learning | fairness definitions | Reading 4.1: Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner. Machine Bias. ProPublica, May 23, 2016. Reading 4.2: Solon Barocas, Moritz Hardt, Arvind Narayanan. Fairness and Machine Learning, Chapter 2: Classification. fairmlbook.org, 2019. | |
| 10 | 04/29 | Visualization and Communication | packaging data products, reproducibility, repeatability, visualization, communication | | PA3 assigned |
| 11 | 05/04 | Philosophy of Privacy | privacy definitions, law, technology | Reading 5.1: Daniel Solove. ‘I’ve Got Nothing to Hide’ and Other Misunderstandings of Privacy. San Diego Law Review 44, 2007. Reading 5.2: Arvind Narayanan and Vitaly Shmatikov. Robust De-anonymization of Large Sparse Datasets. In Proc. IEEE Symposium on Security and Privacy, 2008. | R4 due |
| 12 | 05/06 | Anonymity | deanonymization | | PA2 due |
| 13 | 05/11 | Statistical Data Privacy | differential privacy, sensitivity | | R5 due |
| 14 | 05/13 | Data Lifecycles | provenance, right to be forgotten, data portability | | PA4 assigned; PA3 due |
| 15 | 05/18 | Economics of Data and Externalities 1 | data ownership, value of data, data markets, markets for privacy, data brokers | Reading 6: Hoi Wai Jackie Cheng. Economic properties of data and the monopolistic tendencies of data economy: policies to limit an Orwellian possibility. ST/ESA/2020/DWP/164, 17 May 2020. | |
| 16 | 05/20 | Economics of Data and Externalities 2. Other Topics | data unions, cooperatives, strikes | | IP due |
| 17 | 05/25 | Project Presentation 1 | | | PA4 due; R6 due |
| 18 | 05/27 | Project Presentation 2 | | | SR due; PA Quiz (24-hour window opens) |