New College students in any area of concentration who complete their second year of study and want to finish both the undergraduate and graduate programs in five years are encouraged to follow the accelerated 3+2 curriculum provided below. Students become eligible for this option only after entering the New College undergraduate program and demonstrating strong academic performance. Applicants must satisfy the following minimum conditions before they can be admitted via the 3+2 pathway:
- Complete 2 years of study with Satisfactory evaluations in all academic undertakings.
- Complete the prerequisite courses (see below).
- Be recommended for the 3+2 pathway by a faculty member.
The regular curriculum for students who apply and enter with a Bachelor’s degree is available here.
The following courses must be completed during the first two years of undergraduate study:
- MATH 2400 – Calculus I
- MATH 3250 – Calculus II
- CSCI 2200 – Introduction to Programming in Python
- CSCI 3250 – Intermediate Python or CSCI 2400 – Object Oriented Programming
- MATH 2200 – Probability 1 (Mod 1)
- MATH 4550 – Probability 2 (Mod 2)
- MATH 2320 – Linear Algebra
These courses also count towards satisfying the IDC 5100 Introduction to Data Science Bootcamp course in the graduate program.
IDC 5204 – Applied Statistics I: A course on descriptive and inferential statistics, covering probability theory, linear regression, confidence intervals, hypothesis testing, and modern approaches such as resampling. All methods are illustrated in R, with a focus on methods relevant to data science and on industrial datasets.
IDC 5110 – Data Munging and Exploratory Data Analysis: A course on practical approaches for reshaping, reorganizing, and summarizing relationships in data through exploratory analysis. Principles and methods for preprocessing, normalizing, and validating data are covered, with an emphasis on collaborative and reproducible research.
IDC 5120 – Algorithms for Data Science: Fundamentals of algorithms and measures of performance. Taught in Python, the course includes an exploration of efficient algorithms for sorting and retrieving data, graph algorithms and combinatorial optimization, dynamic programming, randomized algorithms and approximation algorithms.
IDC 5130 – Databases for Data Science: Fundamentals of traditional database design and management. The course surveys and compares database types, including SQL databases (e.g., PostgreSQL, SQLite), NoSQL databases, column-oriented databases (e.g., HBase), and document-oriented databases (e.g., MongoDB). Topics include consistency, availability, scalability, efficiency, and performance in data retrieval and storage.
IDC 5296 – Industrial Seminar Series I: The first offering of a three-semester-long seminar series that hosts professionals and executives from a variety of industrial domains as guest speakers. Each weekly or biweekly seminar covers applications of data science techniques to diverse business problems.
IDC 5295 – Industrial Workshops: This course offers content modules that complement the regular coursework of the graduate program in applied data science. Examples include, but are not limited to, ethics, emerging or trending techniques in data science, domain-specific applications, industrial software platforms or tools, and professional certification modules and exams widely recognized in the industry.
IDC 5205 – Applied Statistics II: A course on statistical modeling, including multiple linear and logistic regression, and more generally, generalized linear models. Emphasis is placed on model formulation, building, assumptions, interpretations, predictions and assessments, with implementation carried out in R and a focus on methods and models relevant for data science using industrial datasets.
IDC 5112 – Data Visualization: A project-centered introduction to the visual display of quantitative information for both knowledge discovery and the communication of results. Over the course of the semester, students develop a visual application in an area of their interest, using data collected from an industrial application or project.
IDC 5210 – Applied Machine Learning: A project-based course covering supervised and unsupervised learning, with an emphasis on working with real industrial data. Topics include Bayesian analysis and other learning paradigms such as regression, clustering, random forests, support vector machines, kernel methods, and neural networks.
IDC 5131 – Distributed Computing: Fundamentals concerning the design and maintenance of massively parallel data sets. Non-relational databases and their management. Algorithms for parallel architectures and associated software tools, including the MapReduce/Hadoop framework and Bigtable.
IDC 5297 – Industrial Seminar Series II: The second offering of a three-semester-long seminar series that hosts professionals and executives from a variety of industrial domains as guest speakers. Each weekly or biweekly seminar covers applications of data science techniques to diverse business problems.
IDC 6293 – Industrial Practicum I: Intended as a summer internship or interterm applied project, this course is the first extensive opportunity for real industry experience offered to students who would like to put their data science knowledge and skills to practical use. The practicum must be completed with an industrial partner of the program or with a company or organization of the student's choosing, under the supervision of a data science faculty member.
IDC 6200 – Advanced Applied Statistics: A second statistical modeling course, with a mix of topics such as generalized additive models, models for longitudinal responses, time series models, survival analysis, statistical learning or Bayesian statistics, with a focus on models relevant for data science. Taught with a project-based focus using real industrial data in an applied business context.
IDC 6215 – Advanced Applied Computing: Advanced topics in computing, such as image processing and object detection, text mining, natural language processing, recurrent neural networks, and reinforcement learning. Taught with a project-based focus using real industrial data in an applied business context.
IDC 6250 – Practical Data Science: Analysis of data and creation of a data science pipeline and deliverable for industry. Working in small groups, students analyze an industry-submitted data set starting with exploratory analysis, followed by statistical or machine learning-based model building, and the construction and presentation of a data product to an industry partner.
IDC 6298 – Industrial Seminar Series III: The third and final offering of a three-semester-long seminar series that hosts professionals and executives from a variety of industrial domains as guest speakers. Each weekly or biweekly seminar covers applications of data science techniques to diverse business problems.
IDC 6294 – Industrial Practicum II: A full semester working in industry as part of a data science team, under the weekly supervision of a data science faculty member to whom the student submits reports. This is the second and final stage of the industrial practicum, in which the student works at an industrial partner company or organization or at a company of their choice. Performance is assessed by both a faculty advisor and a company supervisor.