Spring 2023
In this course we explore how societal issues of ethics, fairness, responsibility, and privacy affect the data science lifecycle. The data science lifecycle includes data acquisition, cleaning and pre-processing, analysis and use of data (we explore techniques from machine learning, inferential statistics, and causal inference), and communication of data science results. We will examine the additional considerations that arise when data is about individuals and privacy matters. We will also think about data flows, their effect on the modern economy, and how they interact with the topics of the course. We will consider the steps one must follow to conduct data science tasks responsibly, and will delve into the details of the fairness and privacy issues that arise along the way. The course has four components: lectures, programming assignments, readings, and a quarter-long project. Through the combination of these four, students who complete the course will learn how to conduct data science tasks responsibly, recognize fairness, privacy, and other important implications, and improve their programming and technical skills. In addition, students will become familiar with the evolving challenges of an increasingly data-driven world and will be capable of asking questions about, and offering answers to, these pressing issues.
Course Information
Instructor: Raul Castro Fernandez (raulcf@uchicago.edu)
Teaching Assistant (TA): Zhiru Zhu (zhiru@uchicago.edu)
Lectures: Tuesday and Thursday 3:30pm–4:50pm, STU 102 (Central Time)
Prerequisites: CMSC/DATA 11900
Office Hours: You can use office hours to discuss any topic we cover in class. Zhiru OH: Monday 10:30am-11:30am and Friday 2:30pm-3:30pm (Zhiru will announce the location for each session on Ed). Raul OH: Tuesday 1pm (JCL 245) and Thursday after class.
Canvas Site: Go here. We use Canvas mostly to link to this page, and sometimes for announcements.
Offline/Asynchronous discussion: We will use Ed for offline discussion and for announcements. See the expectations on how to use Ed below.
Schedule
The schedule is available here.
The schedule includes a brief description of lecture topics, the readings, and the dates when assignments are released and due. Check the schedule often, as I expect there will be some adjustments to the dates throughout the quarter.
Grading
- Reading Responses (7 in total, 2% each, 14% total). We will assign readings each week. At the end of the week we will ask a question about the readings and ask you to provide a brief answer. There is a dedicated discussion section during the quarter where we will discuss the most interesting reading responses. There is a strict no-extension policy. We plan to share your answers with other students and instructors in the class. Let us know if you have concerns about this. No answer will be shared in future courses or with people outside the class.
- Programming Assignment (1 in total, 15%). The programming assignment is an important component of your grade. You work on this individually, but you can discuss high-level ideas with other classmates. Your grade is based on the deliverables and a quiz delivered via Gradescope. There is a strict no-extension policy.
- Data Science Project (1 in total, 10%). In this project, you will work towards investigating a research question. You are responsible for designing the experiments necessary to answer the question, writing the software and other artifacts (e.g., user studies) needed to collect the data, performing the analysis, and writing a report with your findings. There is one deadline, and no extensions are granted.
- Individual Quarter-Long Project (50%). This is the most important component of your grade. You will engage in a quarter-long individual project. You will deliver a short report and present your work to the class and invited guests during a poster session at the end of the quarter. The project will be graded with respect to: i) weekly progress reports, ii) peer assessment, iii) faculty committee assessment, iv) written report, v) poster presentation, and vi) quality of content.
- Issue Report (8%). At the beginning of the quarter we will ask you to prepare a brief report on a topic related to the content of the class, which you will deliver at the end of the quarter.
- Class Participation (3%). A small part of your grade comes from participation. If your grade ends up on the border between two grades (e.g., B+ and A-), participation can sway it. Participation can be earned in several ways: i) being active on Ed, i.e., answering and commenting on questions (asking questions on Ed does not count), and ii) actively engaging in discussion and asking questions in class.
How to ask questions and seek help
Before posting on Ed:
- Make sure you’ve consulted the documentation and tutorials for the software you are using. If that does not help, check online for others who may have faced similar questions. You can use search engines for this, and Stack Overflow is a great resource. Last, if your question is still not resolved, you can use Ed.
- Aim to ask all questions publicly, so other students can answer them.
- Staff have a limited time budget for Ed. We will check it about twice a day. We will offer additional help during office hours.
- Aim to answer questions on Ed; this will count towards participation in the class.
- Do not post code directly on Ed, and do not post screenshots of the output you obtain.
- Aim to include all relevant information with your post, so we can help you.
- We will not answer questions that do not follow the above guidelines.
Academic Integrity Policy
The University of Chicago has formal policies related to academic honesty and plagiarism, as described by the university broadly and the college specifically. We abide by these standards in this course. Depending on the severity of the offense, you risk being dismissed altogether from the course. All cases will be referred to the Dean of Students office, which may impose further penalties, including suspension and expulsion. If you have any questions about whether some activity would constitute cheating, please feel free to ask. In addition, we expect all students to treat everyone else in the course with respect, following the norms of proper behavior by members of the University of Chicago community.
Student interactions are an important and useful means to master course material. We recommend that you discuss the material in this class with other students. While it is acceptable to discuss assignments in general terms, it is not acceptable to turn in someone else’s writing or code (or fragments thereof) as your own. When the time comes to write down your answer, you should write it down yourself from your own understanding. Moreover, you should cite any material discussions or written sources, e.g., “Note: I discussed this exercise with Jane Smith.” If one student “helps” another by giving them a copy of their assignment, only to have that other student copy it and turn it in, both students are culpable. If you have any questions about what is or is not proper academic conduct, please ask an instructor. (This description of academic honesty is derived in part from those of Stuart Kurtz and John Reppy.)