Statistical Tools for Big Data (Lecture Series, Part 4)


Prof. Sanjay Chaudhuri, Department of Statistics and Applied Probability, National University of Singapore

13 July, 2017, 15:00-17:00, Room 3999: Survey sampling and big data
17 July, 2017, 15:00-17:00, Room 3999: Regression modelling in statistics and big data analysis
25 July, 2017, 15:00-17:00, Room 3999: Analysis of time series data
26 July, 2017, 10:30-12:30, Room 4981: Penalized likelihood methods and empirical likelihood



With the increasing popularity of web-based interactions, fast computation and a massive reduction in the cost of storage, collecting information on a huge number of characteristics from a massive number of subjects has become easy. Such big datasets contain much useful information that can be gainfully exploited to produce more efficient and user-friendly products. Because of their massive size, the storage, retrieval and analysis of big datasets require new statistical and computational procedures that are scalable and fast. However, some of the basic concepts on which statistical theories are founded are still relevant and can be used to devise such procedures for analysing big datasets. This set of lectures will discuss some foundational concepts in statistics and endeavour to connect them to possible methodologies for the analysis of big data. The course is targeted at doctoral candidates and researchers who are interested in general statistical tools. No background in statistics will be assumed.


Sanjay Chaudhuri is an Associate Professor of Statistics in the Department of Statistics and Applied Probability at the National University of Singapore (NUS). He received his B.Stat (Bachelor of Statistics) and M.Stat (Master of Statistics) degrees from the Indian Statistical Institute in 1998 and 2000, respectively. He obtained his Ph.D. degree from the University of Washington, Seattle in 2005. Sanjay has been with NUS since August 2005. Web: