Overview

This program will provide medical students with a comprehensive introduction to data science in the Python programming language. Students in this program will learn how to use powerful tools to analyze time-series data, process images, train machine learning models, and create visualizations. By the end of the program, students will come away with the skills and mindset needed to delve into tasks ranging from speech analysis to image classification.

Students joining this program are expected to have some familiarity with programming, ideally in the Python programming language. To see the prerequisite knowledge required for the program or to get a refresher in Python programming, students are strongly encouraged to look through the materials in the Review tab before the start of the first lecture.

We will be using Google Colab for most of the lecture materials, in-class exercises, and assignments.

Course Objectives

Fall: Data Processing

  • To review how to work with tabular data in Python
  • To learn the basics of digital signal processing with time-series data
  • To learn the basics of image processing with image data
  • To become comfortable with popular Python libraries for time-series and image data (e.g., numpy, pandas, scipy, opencv)
  • To be able to generate useful visualizations using the matplotlib library

Winter: Machine Learning

  • To understand the terminology used to describe different machine learning approaches
  • To be able to execute an end-to-end pipeline for training and evaluating a machine learning model
  • To learn the basics of sklearn, the most popular Python library for machine learning

Course Logistics

  • Internal Course Website: Quercus
  • Instructor: Alex Mariakakis
  • Lectures: Select Wednesdays 12–2:30 PM, MSB 3281
  • Teaching Assistant: Dhruv Verma
  • Instructor Office Hours: Mondays 5–6 PM, Bahen 7266 or Zoom link in Quercus
  • Teaching Assistant Office Hours: Thursdays 3–4 PM, Zoom link in Quercus