Official Certificate Program in Data Mining

This program is offered by the Mathematical Sciences Department.

Program Description

Provides current professionals and students with an overview of data mining methods and models, and of how to apply these techniques to large data sets. Other topics covered include web and text mining, mining for genomics and proteomics, and current issues in data mining.

Learning Outcomes

Students in the program will be expected to:

  1. Approach data analysis scientifically, that is, through a systematic process that avoids expensive mistakes by assessing and accounting for the true costs of making various errors;
  2. Apply data science systematically by following an adaptive, iterative, and phased framework that includes the research understanding phase, the data understanding phase, the exploratory data analysis phase, the modeling phase, the evaluation phase, and the deployment phase;
  3. Demonstrate proficiency with leading open-source analytics software such as R and Python, as well as commercial platforms such as IBM SPSS Modeler;
  4. Understand and apply a wide range of clustering, estimation, prediction, and classification algorithms, including k-means clustering, Kohonen clustering, classification and regression trees, logistic regression, k-nearest neighbor, multiple regression, and neural networks; and
  5. Learn more specialized techniques for bioinformatics and text analytics, and examine other current issues in data mining.