MATH 185 -- Introduction to Computational Statistics
Announcements
03-19 :: Solution to Midterm 2 [answers]
03-13 :: Solution 6 posted. (Please check and report errors asap.)
03-13 :: Solution to Midterm 2 (Spring 2011) [answers]
03-08 :: Take-home Final posted. Due Tuesday, March 20th, by 11:59 PM. Datasets: [185-final-coins] [185-final-regression-train].
03-08 :: Have a look at Midterm 2 (Spring 2011).
03-05 :: Midterm 2 will be on Friday, March 16th, in place of lecture.
03-05 :: Solution 5 posted. (Please check and report errors asap.)
03-04 :: Homework 6 posted. Due Friday 03-09 by 11:59 PM.
02-24 :: Homework 5 posted. Due Friday 03-02 by 11:59 PM.
02-24 :: Solution 4 posted. (Please check and report errors asap.)
02-16 :: Solution to Midterm 1 [answers] [code]
02-16 :: Midterm 1 grades [19 21 23 25 25 27 27 28 29 31 32 32] out of 35.
(A range 27-35; B range 20-26; C range 15-19)
02-13 :: Homework 4 posted. Due Wednesday 02-22 by 11:59 PM.
02-08 :: Solution 3 posted. (Please check and report errors asap.)
02-02 :: Solution to Midterm 1 (Spring 2011) [answers] [code]
02-01 :: Solution 2 posted. (Please check and report errors asap.)
01-30 :: Homework 3 posted. Due Tuesday 02-07 by 11:59 PM.
01-30 :: Have a look at Midterm 1 (Spring 2011).
01-30 :: Midterm 1 will be on Friday, February 10th, in place of lecture.
01-25 :: Solution 1 posted. (Please check and report errors asap.)
01-25 :: Homework 2 posted. Due Monday 01-30 by 11:59 PM.
01-18 :: Homework 1 posted. Due Monday 01-23 by 11:59 PM.
01-10 :: Extension students and auditors, please send an email to the instructor to be added to the class email list.
01-10 :: Go through the part of the code we covered in class.
01-10 :: Read the whole page. Download R and familiarize yourself with it.
Schedule and class materials
Introduction to R [code]
Univariate categorical data [notes] [code]
Bivariate categorical data [notes] [code]
Univariate count data [notes] [code]
Univariate numerical data [notes] [code]
Bivariate numerical data [notes] [code] [median.test.R]
Multivariate numerical data [notes] [code]
Simple regression [notes] [code]
Multiple linear regression [notes] [code]
Polynomial regression [notes] [code]
Categorical variables in regression [notes] [code]
Model selection [notes] [code]
Data
Some of the data used in lecture and homework are in the data folder (password required).
StatSci.org for datasets, tutorials and other resources.
Datasets from Stats: Data and Models.
Topics: Parametric and nonparametric tests for univariate, bivariate and multivariate samples, both numerical and categorical. These include the two-sample t-test, Wilcoxon signed rank sum test, analysis of variance, Kruskal-Wallis test, chi-square test of independence for contingency tables, simple and multiple linear regression, shrinkage methods, logistic regression, principal component analysis, k-means clustering, multidimensional scaling, Monte Carlo simulations, permutation tests, and the bootstrap.
Prerequisites: basic introduction to statistics at the level of MATH 181A or MATH 183.
Meeting Time: MWF 1:00 - 1:50
Meeting Place: AP&M B412
Instructor:
Ery Arias-Castro {eariasca@math.ucsd.edu} (Please write "MATH 185" in the subject line)
Office Hours: TBA
AP&M 5141
Teaching Assistant(s):
If you cannot make those office hours, please contact either the instructor or a TA for an appointment.
Textbook: Although no textbook is required, consulting a basic textbook on statistics will be helpful. The following books will be on reserve at the library.
[LM] An Introduction to Mathematical Statistics and Its Applications by Larsen and Marx, 4th edition
Stats: Data and Models by De Veaux, Velleman and Bock
Software:
We will use the free statistical package R, which is popular in academia and in research institutions at large.
It is an open-source implementation of the S language, on which S-PLUS is also based.
For an interface that resembles Matlab, check out RStudio.
The following books are specific to the software R (the first few are available online for free; the others will be put on reserve at the library):
An Introduction to R by W.N. Venables, D.M. Smith and the R Development Core Team
simpleR -- Using R for introductory statistics by John Verzani
R for Beginners by Emmanuel Paradis
Using R for introductory statistics by John Verzani
Introductory Statistics with R by Peter Dalgaard
Software for data analysis: programming with R by John M. Chambers
A first course in statistical programming with R by W. John Braun, Duncan J. Murdoch
The R book by Michael J. Crawley
Data analysis and graphics using R: an example-based approach by John Maindonald and W. John Braun
Grading: homework (25%), midterm 1 (25%), midterm 2 (25%), take-home final (25%)
Homework guidelines: Turn in a clean, concise write-up. Include the R code you used at the very end, annotated and, in particular, divided according to problem. There is no need to print any figures; you can refer to the code instead. Always provide a (brief) comment on the results you get.