********** Announcements **********
· For at least the past couple of years, the UCSD catalog has listed MATH 180A as the prerequisite for this course, and not any other statistics- or probability-oriented course such as ECON 120A; however, our WebReg system remains outdated (as of the writing of this note), which may make this look like a very recent change. Students with ECON 120A but not MATH 180A generally will NOT obtain this instructor's consent to enroll in the course.
[A related message: you are strongly encouraged to take MATH 180A even after ECON 120A. We have recently given umbrella approval to petitions from prob & stats majors to graduate with 2 fewer units because of this partial duplication of credit.]
· This course is intended to be taken as a sequence with MATH 181B by all students. Applied math majors are required either to take both or to take another stand-alone course. Students of any major are discouraged from taking only 181A.
· In place of a syllabus: the core material of this course is Chapters 5, 6, and 7 of the textbook (see below), and we may or may not get to the start of Chapter 9. Material in Chapter 3 that was not covered in MATH 180A will be covered as needed. Chapters 4 and 8 will be skipped. The textbook will be followed fairly closely, with the addition of the asymptotic theory of the maximum likelihood estimator (mostly from the Rice book below) as well as a bit more depth on the likelihood ratio test (uniformly most powerful tests).
· The statistical programming language and environment R will be taught during TA sessions and will be used in homework assignments.
· I will be collecting feedback slips every 2 weeks on Fridays, for students to tell me: 1) what you have learned in the past 2 weeks; 2) what you don't understand about what we have discussed so far; 3) any other feedback in general. Students who turn in their slips at the scheduled times will receive 2 bonus points towards their total scores for the quarter. Schedule: section A01 – week 2, section A02 – week 4, section A03 – week 6, section A04 – week 8.
· The midterm will be Monday, May 8, in class. A cheat sheet is allowed (this is true for all exams). Bring sheets of paper (or a blue book) to write the exam on.
· On Wednesday 4/26, in place of lecture, there will be an extra TA session for everyone in the lecture room (CTR 105) from 1-1:50pm.
· For practice (not required; the emphasis is on understanding the concepts and on statistical thinking, not on doing as many problems as possible), please use the textbook problems for the material we have covered in class.
· See
bottom of page for updated grading scheme.
· Finals week office hours: Wed 6/14, 1-3pm, APM 5856. (I do not have office hours on Monday or Tuesday, but the TAs do.)
· Be sure
to bring your ID to the final exam.
Lecture: MWF 1
Instructor:
Ronghui (Lily) Xu
Office: APM 5856
Phone: 534-6380
Email: rxu@ucsd.edu
Office
Hours:
M: 4-5pm
F: 3-4pm
Or, by
appointment
Teaching Assistants:
Jue
(Marquis) Hou, Andrew Ying
Email: j7hou@ucsd.edu, anying@ucsd.edu
Office Hours: see announcements in TritonEd
(TED)
Textbook:
Larsen and
Marx, "An Introduction to Mathematical Statistics and Its Applications"
(6th edition)
Reference
books:
1.
Rice, "Mathematical
Statistics and Data Analysis";
2.
Wackerly et al., "Mathematical
Statistics with Applications"
Additional reading (not required):
Efron, B.
and Hinkley, D.V. (1978) Assessing
the accuracy of the maximum likelihood estimator: observed versus expected
Fisher information. Biometrika, 65, 457-487. (Introduction section only)
Lecture 1 slides (motivation and background mainly, not “required”
material)
Weekly
topics:
Week 1: estimation; method of moments estimator
(MME), case study
5.2.2 (4th ed);
Week 2: maximum likelihood estimator (MLE), interval estimation;
Week 3: unbiasedness, efficiency, mean squared
error (MSE); Cramer-Rao lower bound, Fisher
information;
Week 4: convergence in probability, consistency and asymptotic normality
of the MLE;
Week 5: confidence intervals based on the MLE, asbestos data;
Week 6: hypothesis testing paradigm, type I error, rejection region;
one-sample normal with known variance; p-value; tests for Binomial including
exact;
Week 7: type II error, sample size; tests for non-normal (and
non-Binomial) data; duality
of CI and hypothesis testing, Wald test;
Week 8: likelihood ratio test, Neyman-Pearson
lemma, uniformly most powerful test (see Rice and Wackerly
books for additional materials);
Week 9: distributions related to normal; one-sample t-test;
Week 10: inference about variance of one-sample normal (adjustable
mortgage data); Bayesian inference (notes).
Homework: due each (following) week at TA sessions, or in the TA dropbox by the end of that day – be sure to append your R program code at the back of your assignments, but summarize the relevant results in the 'main' part, as opposed to having the grader look for them among the code. Good presentation is important for any work.
Week 1 (due 4/13): 5.2.18, 23, 26 (only do the estimation part);
R simulation: for n=10, simulate a random sample of size n from N(μ, σ2), where μ = 1 and σ2 = 2; plot the histogram and superimpose the N(1, 2) density function, and compute your estimates of μ and σ2. Repeat for n=100 and 1000. Describe what you observe as well as what you expect.
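The assignment is to be done in R; purely as a language-neutral illustration of the estimation step (plotting omitted, and the function and variable names below are my own), the procedure could be sketched in Python as:

```python
import random
import statistics

def simulate_estimates(n, mu=1.0, sigma2=2.0, seed=42):
    """Draw a sample of size n from N(mu, sigma2) and return the
    method-of-moments estimates of mu and sigma^2."""
    rng = random.Random(seed)
    sample = [rng.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    mu_hat = statistics.fmean(sample)
    # MME of sigma^2 divides by n (not the n-1 of the sample variance)
    sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, sigma2_hat

for n in (10, 100, 1000):
    print(n, simulate_estimates(n))
```

As n grows, both estimates should settle near the true values μ = 1 and σ2 = 2.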
Week 2 (due
4/20): 5.2.10, 14; 5.3.2, 8, 12, 17, 27;
R
simulation: for
n=10, simulate a random sample of size n from N(μ, 2),
where μ = 1. Plot in different figures: 1) the likelihood function of μ, 2) the
log-likelihood function; mark the maximum likelihood estimate in both plots.
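As a sketch of what these plots show (the homework itself should use R): the log-likelihood of μ for a normal sample with known σ2 = 2 is maximized at the sample mean, which a simple grid search confirms. The grid range and names below are my own choices.

```python
import math
import random
import statistics

rng = random.Random(0)
n, mu_true, sigma2 = 10, 1.0, 2.0
sample = [rng.gauss(mu_true, sigma2 ** 0.5) for _ in range(n)]

def log_likelihood(mu):
    # log L(mu) = -(n/2) log(2*pi*sigma2) - sum((x - mu)^2) / (2*sigma2)
    return (-n / 2 * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in sample) / (2 * sigma2))

# Evaluate on a fine grid around the data and locate the maximizer.
grid = [i / 1000 for i in range(-2000, 4001)]
mle_grid = max(grid, key=log_likelihood)
print("grid argmax:", mle_grid, "sample mean:", statistics.fmean(sample))
```

The likelihood and log-likelihood peak at the same μ, since log is monotone.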
Week 3 (due
4/27): 5.4.18 (add: show that the two estimators in the problem are unbiased),
19; 5.5.7, 2, 3
Week 4 (due
5/4): 5.7.1, 3(a), 4;
Construct
(i.e. give an example of) a sequence of real functions gn(x) converging
to a function g(x), such that the corresponding sequence of maximizers
of gn(x) converges to that of g(x).
R simulation: for n=10, simulate a random sample of size n from N(μ, σ2), where μ = 1 and σ2 = 2; compute the sample mean. Repeat the above simulation 500 times, and plot the histogram of the 500 sample means. Now repeat the 500 simulations for n=1,000. Compare these two sets of results for the different sample sizes, and discuss them in the context of consistency.
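A minimal sketch of this repeated-sampling exercise (illustration only; the assignment asks for R, and the histograms are omitted here, with names my own):

```python
import random
import statistics

def sample_means(n, runs=500, mu=1.0, sigma2=2.0, seed=123):
    """Repeat the simulation `runs` times and return the sample mean
    of each run."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.gauss(mu, sigma2 ** 0.5) for _ in range(n))
            for _ in range(runs)]

means_small = sample_means(10)
means_large = sample_means(1000)
# Consistency: the spread of the sample means shrinks as n grows,
# so the histogram for n=1,000 concentrates tightly around mu = 1.
print("sd at n=10:   ", statistics.stdev(means_small))
print("sd at n=1000: ", statistics.stdev(means_large))
```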
Week 5 (due
5/11): For X ~ B(n, p) [think of it as equivalent to a
random sample of size n
from
Bernoulli distribution with probability p], derive the 95% CI for p based on
the MLE, and compare it to the one obtained in Section 5.3 of Larsen&Marx book. Derive also the observed Fisher
information.
R simulation: in class we derived confidence intervals (CI) for the parameter λ of Poisson(λ). Now take sample size n=100, λ=1, and carry out 500 simulation runs. For each simulation, compute the MLE and the 95% CI based on the MLE, and check whether the 95% CI contains the true λ=1. Report, out of the 500 runs, how many times the 95% CI contains the true λ=1; explain whether and why your simulation result is as desired. (Hint: this is similar to the Larsen and Marx example 5.3.1 with Table 5.3.1 on page 296.)
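For illustration only (the assignment asks for R): a sketch of the coverage simulation, assuming the Wald-type interval λ̂ ± 1.96·sqrt(λ̂/n); substitute whichever CI form was derived in class. Python's standard library has no Poisson sampler, so one is built with Knuth's multiplication method.

```python
import math
import random

def rpois(rng, lam):
    # Knuth's method: count how many uniforms can be multiplied
    # together before the product drops to exp(-lam) or below.
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

runs, n, lam, z = 500, 100, 1.0, 1.96
rng = random.Random(7)
covered = 0
for _ in range(runs):
    sample = [rpois(rng, lam) for _ in range(n)]
    lam_hat = sum(sample) / n          # MLE of lambda is the sample mean
    half = z * math.sqrt(lam_hat / n)  # Wald-type 95% CI based on the MLE
    if lam_hat - half <= lam <= lam_hat + half:
        covered += 1
print(covered, "of", runs, "intervals contain the true lambda")
```

The count should land near 0.95 × 500 = 475, up to binomial fluctuation.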
Week 6 (due
5/18): 6.2.7, 1, 8; 6.3.3, 7, 9
Week 7 (due
5/25): 6.4.4, 7, 10, 16 (I think it’s “X>=5” instead of “k>=5” at the end
of the problem), 21;
Discuss: based on the duality of CI and hypothesis testing,
1) how testing against a one-sided alternative hypothesis corresponds to a one-sided CI (i.e. with one end of the CI at + or – infinity);
2) derive the Wald test based on the MLE for the Poisson problem used in the R simulation of the Week 5 assignment.
Week 8 (due
6/1): 6.5.1, 2, 5; do
also:
1) Show that for testing the simple versus simple hypotheses H0: μ = μ0 against H1: μ = μ1, where μ1 > μ0, based on a random sample Y1, …, Yn from N(μ, σ2) where σ2 is known, the likelihood ratio test rejects for large values of the sample mean Ybar.
2) For a random sample of size n=25 from Normal(μ, 1) with the null hypothesis μ=0, plot in the same figure the power curves of the three Z-tests (given in Theorem 6.2.1 of Larsen&Marx) for the following three alternative hypotheses: 1) μ>0; 2) μ<0; 3) μ≠0. Assume a 0.05 significance level. Comment on the relation of your plots to the UMP test.
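A sketch of the power computations behind these curves (illustration only; the plot itself should be done in R). For Z-tests with known σ = 1, the power at a given μ is an explicit normal-CDF expression; the function names below are my own.

```python
from statistics import NormalDist

Phi = NormalDist().cdf
inv_Phi = NormalDist().inv_cdf
n, sigma, alpha = 25, 1.0, 0.05
root_n = n ** 0.5

def power_upper(mu):
    # H1: mu > 0; reject when sqrt(n)*Ybar/sigma > z_{1-alpha}
    z = inv_Phi(1 - alpha)
    return 1 - Phi(z - mu * root_n / sigma)

def power_lower(mu):
    # H1: mu < 0; reject when sqrt(n)*Ybar/sigma < -z_{1-alpha}
    z = inv_Phi(1 - alpha)
    return Phi(-z - mu * root_n / sigma)

def power_two_sided(mu):
    # H1: mu != 0; alpha/2 in each tail
    z = inv_Phi(1 - alpha / 2)
    return (1 - Phi(z - mu * root_n / sigma)) + Phi(-z - mu * root_n / sigma)

for mu in (-0.5, 0.0, 0.5):
    print(mu, round(power_upper(mu), 3), round(power_lower(mu), 3),
          round(power_two_sided(mu), 3))
```

At μ = 0 each curve equals the significance level 0.05, and for μ > 0 the upper-tailed (UMP) test dominates the two-sided test.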
Week 9 (due
6/8): 7.3.4, 5, 13; 7.4.7, 19; also the following:
1) For a random sample of size n from Exponential(λ), and the hypotheses H0: λ = λ0 versus H1: λ ≠ λ0 (similar to problem 6.5.2 from week 8), derive a) the Wald test, and b) the likelihood ratio test using its asymptotic distribution, both assuming a significance level α.
2) Show that the one-sample t-test of Section 7.4 in Larsen&Marx is equivalent to the likelihood ratio test for a random sample from N(μ, σ2) and H0: μ = μ0 versus H1: μ ≠ μ0 with unknown σ2.
Week 10 (not
due): 7.5.9, 16;
For additional practice
(not required, see top of the page) please use the textbook problems of the
materials that we have covered in class.
Grading
(updated): max(30% Homework + 30% Midterm, 35% Homework + 25% Midterm)
+ 40% Final
[The lowest homework score will be dropped]