The field of Data Science is a combination of statistics and computer science methodologies that enable learning from data. A data scientist extracts information from data and is involved in every step required to achieve this goal, from getting acquainted with the data to communicating the results in non-technical language. The Data Science Specialist program prepares students for data science work in industry or government and for graduate studies in Data Science, Computer Science, or Statistics. Students in the program will benefit from a range of advanced courses in Computer Science and Statistics offered by the University of Toronto, as well as from a sequence of three integrative courses designed especially for the program.
The Data Science Specialist program comprises three fundamental and highly integrated aspects. First, students will acquire expertise in statistical reasoning, methods, and inference essential for any data analyst. Second, students will receive in-depth training in computer science: the design and analysis of algorithms and data structures for handling large amounts of data, and best practices in software design. Students will also receive training in machine learning, which lies at the intersection of computer and statistical sciences. The third aspect is the application of computer science and statistics to produce analyses of complex, large-scale datasets, and the communication of the results of these analyses. Students will receive training in these areas by taking integrative courses that are designed specifically for the Data Science Specialist program. These courses involve experiential learning: students will work with real large-scale datasets from the domains of business, government, and/or science. The successful student will combine their expertise in computer and statistical science to produce and communicate analyses of complex large-scale datasets.
Skills that graduates of the program will acquire include proficiency in the statistical reasoning, computational thinking, data manipulation and exploration, visualization, and communication required for work as a data scientist; the ability to apply statistical methods to solve problems in the context of scientific research, business, and government; familiarity and experience with best practices in software development; and knowledge of current software infrastructure for handling large datasets. Graduates of the program will be able to apply machine learning algorithms to large-scale datasets that arise in scientific research, government, and business; create appropriate data visualizations for complex datasets; identify and answer questions that involve applying statistical methods or machine learning algorithms to complex data, and communicate the results; and present the results and limitations of a data analysis at an appropriate technical level for the intended audience.