Computational statistics is an essential component of modern statistics that often requires efficient algorithms and programming strategies for statistical learning and data analysis. This course introduces the principles and techniques of statistical computing and data management needed for computationally intensive statistical analysis, especially with big data. Topics include management of large data (data structures, data queries), parallelized data analysis, stochastic simulation (Monte Carlo methods, permutation-based inference), numerical optimization in statistical inference (deterministic and stochastic convex analysis, the EM algorithm, etc.), and randomization methods (bootstrap methods). Students will apply these techniques in hands-on projects with real data. Students who have taken the MA590 version of this course cannot also earn credit for MA 551.
No previous programming knowledge or experience is assumed. Some knowledge of probability and statistics, or the equivalent of MA511, is recommended.