Spring 2024
In this course we explore how societal issues of ethics, fairness, responsibility, and privacy affect the data science lifecycle. The data science lifecycle includes data acquisition, cleaning and pre-processing, analysis and use of data (we explore techniques from machine learning, inferential statistics, and causal inference), and communication of data science results. We will consider what additional decisions we must make when data is about individuals and privacy matters. We will also think about data flows, their effect on the modern economy, and their interactions with the topics of the course. We will consider the steps one must follow to conduct data science tasks responsibly, and will delve into the details of fairness and privacy issues that arise along the way. The course has four components: lectures, programming assignments, readings, and a quarter-long project. Through the combination of these four, students who complete the course will learn how to conduct data science tasks responsibly, recognize fairness, privacy, and other important implications, and improve their programming and technical skill set, in addition to developing specific expertise in one of the quarter-long topics. In addition, students will be familiar with the evolving challenges of an increasingly data-driven world, and be capable of asking questions about, and offering answers to, these pressing issues.
Course Information
Instructor: Raul Castro Fernandez (raulcf@uchicago.edu)
Staff: Steven Xia (TA) (stevenxia@uchicago.edu); Victor Brown (TA) (victorfbrown@uchicago.edu); Satadisha Saha Bhowmick (TA) (ssahabhowmick@uchicago.edu); Will Trimble (TA) (wltrimbl@uchicago.edu); Bill Trok (TA) (btrok@uchicago.edu); Amy Nussbaum (TA) (anussbaum@uchicago.edu); Su Karaca (Grader) (sudogakaraca@uchicago.edu); Memphis Cutchlow (Grader) (mcutchlow@uchicago.edu); Diana Chu (Grader) (dianachu@uchicago.edu)
Lectures: Tuesday and Thursday 3:30pm–4:50pm (Central Time), K 120 (Double check the location as it may change before the start of the quarter)
Prerequisites: CMSC/DATA 11900
Office Hours: Note office hours are thematic. Ensure you come to the office hours appropriate for your need.
- Lecture/Readings/General Interest: Raul [Thursdays 1pm - 2pm, JCL 245]
- Programming Assignments: Steven, Victor, Satadisha, Amy [Monday 3:30pm to 4:30pm, Ryerson 275B] [Friday 2:00pm-3:00pm, 257G]
- Projects: Bill [Monday 3pm-4pm in Ryerson 375A, Thursday 10am-11am in Ryerson 256], Will [Wednesday 2:30pm-3:30pm, Friday 2:30pm-3:30pm, Ryerson 257B]
Canvas Site: Go here. We use Canvas for announcements. Use Canvas to submit your reading assignments and weekly project reports.
Offline/Asynchronous discussion: We will use Ed for offline discussion and for announcements. See the expectations on how to use Ed below.
Schedule
The schedule is available here.
The schedule includes a brief description of lecture topics, and it includes readings and dates where assignments are released and due. Check the schedule often as I expect there will be some adjustments to the dates throughout the quarter.
Grading
- Reading Responses (7 in total, 2% each, 14% total). We will assign readings each week. At the end of the week (usually the Sunday before the assignment is due), we will ask a question about the readings and ask you to provide a brief answer. There are two dedicated discussion sections during the quarter where we will discuss the most interesting reading responses. There is a strict no-extension policy. We plan to share your answers with other students and instructors in the class. Let us know if you have concerns about this. No answer will be shared in future courses or with people outside the class.
- Programming Assignments (5 in total, 33%). The programming assignments are an important component of your grade. You work on these individually, but you can discuss high-level ideas with classmates. Your grade is based on the deliverables and a quiz delivered via Gradescope. There is a strict no-extension policy.
- Quarter-Long Project (50%). The most important component of your grade. You will engage in a quarter-long group project (maximum of 2 people). You will deliver a short report and present your work to the class at the end of the quarter. The project will be graded with respect to: i) weekly progress reports, ii) peer assessment, iii) faculty committee assessment, iv) written report, v) presentation, and vi) quality of content.
- Class Participation (3%). A small part of your grade comes from participation. If your grade ends up on the border between two grades (e.g., B+ and A-), participation can sway it. Participation can be earned in several ways: i) being active on Ed, i.e., answering and commenting on questions (asking questions on Ed does not count); ii) actively engaging in discussion and asking questions in class.
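As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is summarized as a fraction between 0 and 1 of the credit earned; the syllabus does not say how individual assignments are aggregated within a component or how totals map to letter grades, so the names `WEIGHTS` and `weighted_total` and the input format are hypothetical, not the staff's actual grading procedure.

```python
# Component weights in percentage points, taken from the grading list above.
# Reading responses: 7 responses at 2% each = 14%.
WEIGHTS = {
    "reading_responses": 7 * 2,
    "programming_assignments": 33,
    "project": 50,
    "participation": 3,
}


def weighted_total(scores):
    """Return a course total out of 100.

    `scores` maps each component name to the fraction of credit earned
    (0.0 to 1.0). This input format is an assumption for illustration.
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the components in WEIGHTS")
    for name, fraction in scores.items():
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical student: 6 of 7 reading responses credited, and so on.
    example = {
        "reading_responses": 6 / 7,
        "programming_assignments": 0.9,
        "project": 0.85,
        "participation": 1.0,
    }
    print(f"Total: {weighted_total(example):.1f} / 100")
```

The four weights sum to 100, so a student with full credit in every component reaches the maximum total.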
How to ask questions and seek help
Before posting on Ed:
- Make sure you’ve consulted the documentation and tutorials for the software you are using. If that does not help, check online for others who may have faced similar questions. You can use search engines for this, and Stack Overflow is a great resource. Finally, if your question is still not resolved, you can use Ed.
- Aim to ask all questions publicly, so other students can answer them.
- Staff have a limited budget for Ed. We will check it about twice a day. We will offer additional help during office hours.
- Aim to answer questions on Ed; this will count towards participation in the class.
- Do not post code directly on Ed, and do not post screenshots of the output you obtain.
- Aim to include all relevant information with your post, so we can help you.
- We will not answer questions that do not follow the above guidelines.
Academic Integrity Policy
The University of Chicago has formal policies related to academic honesty and plagiarism, as described by the university broadly and the college specifically. We abide by these standards in this course. Depending on the severity of the offense, you risk being dismissed altogether from the course. All cases will be referred to the Dean of Students office, which may impose further penalties, including suspension and expulsion. If you have any questions about whether some activity would constitute cheating, please feel free to ask. In addition, we expect all students to treat everyone else in the course with respect, following the norms of proper behavior by members of the University of Chicago community.
Student interactions are an important and useful means to master course material. We recommend that you discuss the material in this class with other students. While it is acceptable to discuss assignments in general terms, it is not acceptable to turn in someone else’s writing or code (or fragments thereof) as your own. When the time comes to write down your answer, you should write it down yourself from your own understanding. Moreover, you should cite any material discussions or written sources, e.g., “Note: I discussed this exercise with Jane Smith.” If one student “helps” another by giving them a copy of their assignment, only to have that other student copy it and turn it in, both students are culpable. If you have any questions about what is or is not proper academic conduct, please ask an instructor. (This description of academic honesty is derived in part from those of Stuart Kurtz and John Reppy).