Schedule
Please check this schedule frequently, as it will likely change throughout the quarter.
Legend:
- PA stands for Programming Assignment
- SR stands for Issue Report
- R stands for Reading Assignment
Lecture # | Date | Lecture | Keywords | Readings | Important Dates |
---|---|---|---|---|---|
1 | 03/29 | Course Overview and Introduction to the Data Science Process | data science lifecycle; ethics, fairness, responsibility, and privacy issues | Reading 1.1: John P. A. Ioannidis. Why Most Published Research Findings Are False. PLOS Medicine, 2005. Reading 1.2: Michael Jordan. Artificial Intelligence: The Revolution Hasn’t Happened Yet. HDSR, 2019. | PA0 assigned, SR assigned, R1 assigned |
2 | 03/31 | Pitfalls in Inferential Statistics | multiple hypothesis testing, Bonferroni correction, false discovery rate, statistical vs. practical significance | ||
3 | 04/05 | Data Context and Quality | collection, preparation, cleaning, missing data | Reading 2.1: Mark D. Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship. Nature Scientific Data, 2016. Reading 2.2: Stephen Stigler. Data Have a Limited Shelf Life. HDSR, 2019. | R1 due, PA0 due, R2 assigned, PA1 assigned |
4 | 04/07 | Causality 1/2 | causal models, experiments (RCT) | ||
5 | 04/12 | Experiments with Human Subjects | causal inference from observational data, quasi-experiments | Reading 3.1: Department of Health, Education, and Welfare. The Belmont Report. April 18, 1979. Reading 3.2: Robert Bond, Christopher Fariss et al. A 61-million-person experiment in social influence and political mobilization. Nature, 2012. | PA1 due, PA2 assigned, R3 assigned |
6 | 04/14 | Causality 2/2 | human subjects, A/B testing, experimental design | ||
7 | 04/19 | Introduction to Machine Learning 1/2 | optimization vs. generalization, training and test data, models, learning | Reading 4: Nithya Sambasivan et al. “Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI. CHI 2021. | PA2 due, R3 due, PA3 assigned, PA4 assigned, R4 assigned |
8 | 04/21 | Machine Learning in the Wild 2/2 | training data, feature engineering, information leakage, concept drift, algorithmic decision making | ||
9 | 04/26 | Fairness and Interpretability in Machine Learning | fairness definitions | Reading 5.1: Deirdre K. Mulligan, Joshua A. Kroll, Nitin Kohli, Richmond Y. Wong. This Thing Called Fairness: Disciplinary Confusion Realizing a Value in Technology. CSCW 2019. Reading 5.2: Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner. Machine Bias. ProPublica, May 23, 2016. | PA3 due, R4 due |
10 | 04/28 | Discussion 1/2 | packaging data products, reproducibility, repeatability, visualization, communication | ||
11 | 05/03 | Visualization and Communication | packaging data products, reproducibility, repeatability, visualization, communication | | PA5 assigned |
12 | 05/05 | Introduction to Privacy | privacy definitions, law, technology | ||
13 | 05/10 | Anonymization | data anonymization and deanonymization, k-anonymity, attacks | Reading 6: Daniel Solove. ‘I’ve Got Nothing to Hide’ and Other Misunderstandings of Privacy. San Diego Law Review 44, 2007. | PA5 due, R6 assigned |
14 | 05/12 | Statistical Data Privacy | differential privacy, sensitivity | | PA6 assigned |
15 | 05/17 | Data Lifecycles | provenance, right to be forgotten, data portability | Reading 7: Edith Ramirez, Julie Brill, Maureen K. Ohlhausen, Joshua D. Wright, Terrell McSweeny. Data Brokers: A Call for Transparency and Accountability. Federal Trade Commission, May 2014. (Read the Executive Summary and then Section 4, “Types of Products”.) | PA4 due, R6 due, R7 assigned |
16 | 05/19 | Data Markets | data ownership, value of data, data markets, markets for privacy, data brokers | | PA6 due |
17 | 05/24 | Other Topics | data unions, cooperatives, strikes | | SR due, R7 due |
18 | 05/26 | Discussion 2/2 | | | |