
Ethics, Fairness, Responsibility, and Privacy in Data Science (DATA 25900) at The University of Chicago

259, Spring 24 edition

Schedule

Please check this schedule frequently, as it will likely change throughout the quarter.

Legend:

| Lecture # | Date | Lecture | Keywords | Readings | Important Dates |
|---|---|---|---|---|---|
| 1 | 03/19 | Course Overview and Introduction to the Data Science Process | Data science lifecycle. Ethics, fairness, responsibility, and privacy issues. | Reading 1.1: John P. A. Ioannidis. Why Most Published Research Findings Are False. PLOS Medicine, 2005. Reading 1.2: Michael Jordan. Artificial Intelligence: The Revolution Hasn’t Happened Yet. HDSR, 2019. | PA0 assigned; R1 assigned; P assigned |
| 2 | 03/21 | Pitfalls in Inferential Statistics | Multiple hypotheses, Bonferroni correction, false discovery rate, statistical vs practical significance | | |
| 3 | 03/26 | Data Context and Quality | Collection, preparation, cleaning, missing data | Reading 2.1: Mark D. Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship. Nature Scientific Data, 2016. Reading 2.2: Stephen Stigler. Data Have a Limited Shelf Life. HDSR, 2019. | R1 due; PA0 due; R2 assigned; PA1 assigned |
| 4 | 03/28 | Causality and Experiments 1/2 | Causal models, experiments (RCT) | | |
| 5 | 04/02 | Causality and Experiments 2/2 | Causal inference from observational data, human subjects, AB testing, experimental design | Reading 3.1: Department of Health, Education, and Welfare. The Belmont Report. April 18, 1979. Reading 3.2: Robert Bond, Christopher Fariss et al. A 61-million-person experiment in social influence and political mobilization. Nature, 2012. | PA1 due; R2 due; R3 assigned; PA2 assigned |
| 6 | 04/04 | IRB (Cheryl Danton) | | | |
| 7 | 04/09 | Discussion 1/3 | Optimization vs generalization, training and test data, models, learning | Reading 4.1: Nithya Sambasivan et al. “Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI. CHI 2021. Reading 4.2: Wendy Parker. Model Evaluation: An Adequacy-for-Purpose View. 2022 (read the introduction and, optionally, the rest). | R3 due; R4 assigned |
| 8 | 04/11 | Machine Learning in the Wild | Training data, feature engineering, information leakage, concept drift, algorithmic decision making | | PA2 due |
| 9 | 04/16 | Fairness and Interpretability in Machine Learning | Fairness definitions | Reading 5.1: Deirdre K. Mulligan, Joshua A. Kroll, Nitin Kohli, Richmond Y. Wong. This Thing Called Fairness: Disciplinary Confusion Realizing a Value in Technology. CSCW 2019. Reading 5.2: Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner. Machine Bias. ProPublica, May 23, 2016. | R4 due; R5 assigned; PA3 assigned |
| 10 | 04/18 | Visualization and Communication | Packaging data products, reproducibility, repeatability, visualization, communication | | |
| 11 | 04/23 | Discussion 2/3 | | | R5 due |
| 12 | 04/25 | Introduction to Privacy 1/2 | Privacy definitions, law, technology | | PA3 due |
| 13 | 04/30 | Introduction to Privacy 2/2 | Data anonymization and deanonymization, k-anonymity, attacks, indigenous data sovereignty | Reading 6: Shoshana Zuboff. Big other: surveillance capitalism and the prospects of an information civilization. Journal of Information Technology, 2015. | R6 assigned |
| 14 | 05/02 | Statistical Data Privacy | Differential privacy, sensitivity | | PA4 assigned |
| 15 | 05/07 | Data Flows, Lifecycles, Data Markets | Provenance, right to be forgotten, data portability, data brokers, data ownership, value of data, data unions, cooperatives, strikes | Reading 7: Edith Ramirez, Julie Brill, Maureen K. Ohlhausen, Joshua D. Wright, Terrell McSweeny. Data Brokers: A Call for Transparency and Accountability. Federal Trade Commission, May 2014 (read the Executive Summary and then Section 4, “Types of Products”). | R6 due; R7 assigned |
| 16 | 05/09 | Discussion 3/3 | | | |
| 17 | 05/14 | Summary of the quarter via a case study of LLMs | Statistics and the Census Bureau | | R7 due; PA4 due |
| 18 | 05/16 | AMA | | | |
| | 05/17 | No class | | | P due |