The course aims at providing the cornerstones of inferential statistics: the concept of statistical model, the tools of point estimation, interval estimation and statistical hypothesis testing. Simple linear regression is also introduced.
Larsen, R.J. and Marx, M.L., An Introduction to Mathematical Statistics and Its Applications, Pearson, 5th edition or later.
Statistics deals with collecting, organizing and interpreting numerical data. Statistical literacy is an essential skill for understanding and making sensible decisions based on the analysis of numerical information. Within this framework, the course aims at providing the cornerstones of inferential statistics: the concept of statistical model, the tools of point estimation, interval estimation and statistical hypothesis testing.
1) Mathematical concepts: basic operations and properties; capital sigma (generalized sum) and pi (generalized product) operators and their properties; functions; special functions (power, exponential, logarithm); derivatives; basic notions of series and integrals
2) At least 9 CFU in Statistics
3) At least the basic notions of probability: random experiment, sample space, events, probability and its properties, conditional probability and its properties
4) At least the basic notions on random variables (rv): definition; discrete vs continuous rv's; cumulative distribution function (cdf), probability mass function (pmf), probability density function (pdf); expectations, with special focus on mean and variance.
5) At least the basic notions on multiple rv: definition; joint, marginal and conditional distributions; expectations, with special focus on covariance and correlation coefficient.
Students without this background can review the relevant chapters of the textbook.
Traditional lectures delivered with a pen tablet. Each lesson is uploaded to the Moodle page.
A preliminary short course is offered before the starting date.
All material is available on the Moodle page.
Type of Assessment
The exam consists of two parts:
1) A written test with exercises. Students may use formulas written on the back of the nine sheets of statistical tables. Duration: 1 hour and 30 minutes; weight: 60%.
2) An oral exam with questions on the theory. Duration: 30 minutes; weight: 40%.
Preliminary short course
1) Random Variable (r.v.): definition; examples; domain of a r.v.; discrete and continuous r.v.'s.
Discrete r.v.: distribution of a discrete r.v. via probability mass function (p.m.f.); properties; examples.
2) Discrete r.v. The distribution of a discrete r.v. via the cumulative distribution function (c.d.f.). Properties of the c.d.f. Examples.
Expectations of discrete r.v.'s: mean, variance, standard deviation (s.d.).
3) Discrete r.v. The mean and the variance of some transformations of a r.v. X: mean and variance of a constant (c), of the de-meaned r.v. (X - mu), of the standardized r.v. (X - mu)/sigma, of a linear transformation (a + b X). Continuous r.v. Motivations: why the p.m.f. does not make sense while the c.d.f. can still play a role. Using the c.d.f. to compute probabilities.
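The transformation rules listed in this lesson, E[a + bX] = a + bE[X] and Var(a + bX) = b^2 Var(X), can be checked numerically. The sketch below (the p.m.f. and the constants a, b are made up for illustration) computes both sides of each identity on a small discrete r.v.:

```python
# Check E[a + bX] = a + b*E[X] and Var(a + bX) = b^2 * Var(X)
# for a small discrete r.v. X (values and probabilities are made up).
values = [0, 1, 2]
probs = [0.2, 0.5, 0.3]

mean_x = sum(v * p for v, p in zip(values, probs))
var_x = sum((v - mean_x) ** 2 * p for v, p in zip(values, probs))

a, b = 3.0, 2.0  # an arbitrary linear transformation a + bX
mean_y = sum((a + b * v) * p for v, p in zip(values, probs))
var_y = sum((a + b * v - mean_y) ** 2 * p for v, p in zip(values, probs))

print(mean_y, a + b * mean_x)  # both equal 5.2
print(var_y, b ** 2 * var_x)   # both equal 1.96
```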
4) Continuous r.v. Definition, interpretation and properties of the p.d.f. Link between c.d.f. and p.d.f of the same r.v. Expectations of continuous
r.v.'s, with a specific emphasis on the mean and on the variance.
5) Multiple r.v.: definition; examples; domain of a multiple r.v.; discrete, continuous and mixed multiple r.v.'s. Multiple discrete r.v.: definition of
the joint p.m.f.; relationships with the marginal p.m.f. and the conditional p.m.f.; properties.
6) Multiple discrete r.v. Expectations involving multiple discrete r.v.'s: mean, variance and standard deviation of the marginal components;
covariance and correlations between couples of random variables and their interpretation. Multiple continuous r.v. Definition of the joint p.d.f.
7) Multiple continuous r.v. Properties of the joint p.d.f. Joint, marginal and conditional p.d.f.'s. Expectations involving multiple continuous r.v.'s:
mean, variance and standard deviation of the marginal components; covariance and correlations between couples of random variables.
Multiple r.v. Independence of r.v.'s; independence versus absence of correlation. Examples.
8) Multiple r.v. Properties of covariance and correlation coefficient. Mean and variance of a portfolio (linear combination) of random variables and some useful special cases. Special r.v.'s. Summary of the points covered in handling special r.v.'s: definition (in terms of p.m.f. or p.d.f.); main expectations (mean and variance); properties; some practical examples (when possible).
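The portfolio rules of lesson 8, E[aX + bY] = aE[X] + bE[Y] and Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y), can be verified on a small joint p.m.f. (the table and weights below are made up for illustration):

```python
# Numerical check of the portfolio rules for W = a*X + b*Y on a
# made-up joint p.m.f. of two binary r.v.'s.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
a, b = 2.0, -1.0  # arbitrary portfolio weights

ex = sum(x * p for (x, y), p in joint.items())
ey = sum(y * p for (x, y), p in joint.items())
vx = sum((x - ex) ** 2 * p for (x, y), p in joint.items())
vy = sum((y - ey) ** 2 * p for (x, y), p in joint.items())
cov = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())

# Mean and variance of W computed directly from the joint p.m.f.
ew_direct = sum((a * x + b * y) * p for (x, y), p in joint.items())
vw_direct = sum((a * x + b * y - ew_direct) ** 2 * p for (x, y), p in joint.items())

print(ew_direct, a * ex + b * ey)                              # both 0.4
print(vw_direct, a**2 * vx + b**2 * vy + 2 * a * b * cov)      # both 0.84
```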
9) Special r.v.'s. The Bernoulli r.v. The Binomial r.v.; The Poisson r.v.
10) Special r.v.'s. The Continuous Uniform r.v. The Normal (or Gaussian) r.v.
11) Special r.v.'s. The use of the Standard Normal tables to compute probabilities and intervals with Normal r.v.'s
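The table lookups of lesson 11 can also be sketched in code: the Standard Normal c.d.f. has the closed form Phi(z) = (1 + erf(z/sqrt(2)))/2, and probabilities for a generic Normal r.v. follow by standardizing, exactly as with the printed tables. A minimal sketch:

```python
from math import erf, sqrt

def phi(z):
    """Standard Normal c.d.f.: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return (1.0 + erf(z / sqrt(2.0))) / 2.0

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), by standardizing as with the tables."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

print(round(phi(1.96), 4))                 # 0.975, the familiar table value
print(round(normal_prob(-1, 1, 0, 1), 4))  # 0.6827, the "68% rule"
```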
12) Special r.v.'s. The Gamma r.v., the Chi-squared r.v., the Student-T r.v., the Fisher-F r.v.
13) Point Estimation. Introduction to the problem and to the concepts of population, sample, parameter, statistic and estimator, statistic value and estimate, sample distribution of a statistic and related summary indices.
14) Point estimation. Properties of estimators: the mean squared error (MSE) and the concepts of relative and absolute efficiency. In search of the most efficient estimator: motivations for restricting the set of possible estimators taken into account; decomposition of the MSE as variance plus bias^2; unbiased estimators.
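The decomposition MSE = variance + bias^2 of lesson 14 can be illustrated by Monte Carlo. The sketch below (sample size, number of replications and the choice of the biased divisor-n variance estimator are all illustrative assumptions) simulates Normal(0, 1) samples, so the true sigma^2 is 1:

```python
import random

# Monte Carlo sketch of MSE = variance + bias^2 for the variance
# estimator with divisor n (biased) under a Normal(0, 1) model.
random.seed(0)  # fixed seed for reproducibility
n, reps, true_var = 10, 20000, 1.0

estimates = []
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    xbar = sum(sample) / n
    estimates.append(sum((x - xbar) ** 2 for x in sample) / n)  # divisor n

mean_est = sum(estimates) / reps
bias = mean_est - true_var      # theory: bias = -sigma^2 / n = -0.1
var_est = sum((e - mean_est) ** 2 for e in estimates) / reps
mse = sum((e - true_var) ** 2 for e in estimates) / reps

print(round(bias, 3))                            # close to the theoretical -0.1
print(abs(mse - (var_est + bias ** 2)) < 1e-6)   # the decomposition holds
```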
15) Point estimation. In search of the most efficient estimator: the Cramer-Rao bound as a benchmark for checking the absolute efficiency of unbiased estimators. The Maximum Likelihood (ML) method: definition of likelihood, log-likelihood and score vector. The ML method at work: the estimation of p in the Bernoulli model.
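For the Bernoulli model of lesson 15, the log-likelihood of a sample x_1, ..., x_n is l(p) = (sum x_i) log p + (n - sum x_i) log(1 - p), maximized at p_hat = sample mean. A minimal sketch (the sample below is made up) confirms this by grid search:

```python
from math import log

# ML estimation of p in the Bernoulli model: the grid maximizer of the
# log-likelihood should coincide with the sample mean.
sample = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]  # made-up data: 7 successes out of 10
s, n = sum(sample), len(sample)

def loglik(p):
    return s * log(p) + (n - s) * log(1.0 - p)

grid = [i / 1000 for i in range(1, 1000)]  # p in {0.001, ..., 0.999}
p_hat = max(grid, key=loglik)

print(p_hat, s / n)  # both 0.7: the ML estimate is the sample mean
```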
16) Point estimation. The ML method at work: the estimation of lambda in the Poisson model; the estimation of mu and/or sigma^2 (depending on whether one or both parameters are unknown) in the Normal model.
17) Point estimation. Derivation of the properties (sample distribution, bias, variance, MSE, check of the Cramer-Rao bound for unbiased
estimators) of the ML estimators computed.
18) Point estimation. ML estimation of parameters of the Gamma model as a motivation for introducing asymptotic properties. Asymptotic
properties: consistency, asymptotic unbiasedness, asymptotic efficiency, asymptotic sample distribution.
19) Point estimation. ML estimators as C.A.N.E. (Consistent Asymptotically Normal Efficient) estimators. Interval Estimation.
Introduction to the statistical problem by comparing interval estimation with point estimation.
20) Interval Estimation. Definition of interval estimate (confidence interval), confidence level, size of the interval. The Pivot method for
finding confidence intervals: definition of pivot quantity and illustration of how the method works in practice. Interval Estimation. Pivots and
corresponding intervals for: the mean of a Normal r.v. (variance known).
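The interval of lesson 20 has the closed form xbar +/- z * sigma / sqrt(n), where z is the Standard Normal quantile for the chosen confidence level. A minimal sketch with made-up numbers (z = 1.96 for a 95% level):

```python
from math import sqrt

# Confidence interval for the mean of a Normal r.v., variance known:
#   xbar +/- z * sigma / sqrt(n)
xbar, sigma, n = 10.0, 2.0, 25  # made-up sample summary
z = 1.96                        # 97.5% Standard Normal quantile (95% level)

half_width = z * sigma / sqrt(n)
ci = (xbar - half_width, xbar + half_width)
print(ci)  # roughly (9.216, 10.784)
```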
21) Interval Estimation. Pivots and corresponding intervals for: the mean of a Normal r.v. (variance unknown); the variance and the s.d. of a
Normal r.v. (mean known and unknown).
22) Interval Estimation. Pivots and corresponding intervals for: the probability of a Bernoulli r.v.; the mean of a Poisson r.v. How to use the
theory behind interval estimation for computing the sample size of a survey aiming at estimating a probability or a mean.
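The sample-size computation of lesson 22 inverts the width of the confidence interval for a probability: with margin of error E and the worst case p = 0.5, n = z^2 p(1 - p) / E^2, rounded up. A minimal sketch (the margins below are illustrative):

```python
from math import ceil

# Sample size needed to estimate a probability with margin of error
# `margin` at a 95% confidence level, using the worst case p = 0.5:
#   n = z^2 * p * (1 - p) / margin^2, rounded up.
def sample_size(margin, z=1.96, p=0.5):
    return ceil(z ** 2 * p * (1.0 - p) / margin ** 2)

print(sample_size(0.03))  # 1068 interviews for a +/- 3-point margin
print(sample_size(0.05))  # 385 interviews for a +/- 5-point margin
```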
23) Testing Hypotheses. Motivations, framework, definition of statistical hypothesis (simple and composite), definition of statistical test.
24) Testing Hypotheses. Table of decisions, type I and type II errors, significance level and power of a test. The Neyman-Pearson lemma and ensuing remarks. Examples.
25) Testing Hypotheses. Comparison of different specifications of the alternative hypothesis (pointwise, unidirectional, bidirectional) and
consequences on the rejection region. More on the role of the power of a test. The factors influencing the power of a test.
26) Testing Hypotheses. Testing hypotheses concerning: the mean parameter of a Normal r.v. (cases sigma^2 known and sigma^2
unknown); the probability parameter of a Bernoulli r.v. The p-value: definition, computation and interpretation.
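The p-value computation of lesson 26 can be sketched for the two-sided z-test on the mean of a Normal r.v. with known sigma (H0: mu = mu0); the numbers below are made up for illustration:

```python
from math import erf, sqrt

# Two-sided z-test for H0: mu = mu0, Normal model with known sigma.
xbar, mu0, sigma, n = 10.5, 10.0, 2.0, 64  # made-up sample summary

z = (xbar - mu0) / (sigma / sqrt(n))       # observed test statistic
phi = lambda t: (1.0 + erf(t / sqrt(2.0))) / 2.0  # Standard Normal c.d.f.
p_value = 2.0 * (1.0 - phi(abs(z)))        # two-sided p-value

print(round(z, 2), round(p_value, 4))      # z = 2.0, p-value ~ 0.0455
```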
27) Testing Hypotheses. Testing hypotheses concerning: the variance of a Normal r.v. (cases mu known and mu unknown); the difference between the probabilities of two independent Bernoulli distributions (with remarks on point estimation and interval estimation in the same setting).
28) Testing Hypotheses. Testing hypotheses concerning: the difference between the means of two Normal r.v.'s by means of independent
samples (with the two variances known; with large samples and the two variances unknown; with the two variances unknown but equal and,
related to this case, the pooled sample variance).
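The pooled sample variance of lesson 28 and the corresponding two-sample t statistic can be sketched as follows (the two samples are made up; the resulting |t| would be compared with a Student-T quantile with n1 + n2 - 2 degrees of freedom):

```python
from math import sqrt

# Pooled sample variance and two-sample t statistic for H0: mu1 = mu2
# (independent samples, variances unknown but assumed equal).
x = [10.1, 9.8, 10.4, 10.0, 9.7]  # made-up sample 1
y = [9.5, 9.9, 9.6, 9.4]          # made-up sample 2

def sample_var(s):
    """Unbiased sample variance (divisor n - 1)."""
    m = sum(s) / len(s)
    return sum((v - m) ** 2 for v in s) / (len(s) - 1)

n1, n2 = len(x), len(y)
sp2 = ((n1 - 1) * sample_var(x) + (n2 - 1) * sample_var(y)) / (n1 + n2 - 2)
t = (sum(x) / n1 - sum(y) / n2) / sqrt(sp2 * (1 / n1 + 1 / n2))

print(round(sp2, 4), round(t, 3))  # compare |t| with the T quantile, 7 d.o.f.
```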
29) Testing Hypotheses. Testing hypotheses concerning: the difference between the means of two Normal r.v.'s, by means of independent
samples, with the Satterthwaite-Welch statistic; the difference between the means of two Normal r.v.'s by means of paired data.
30) Linear Regression Model. Introduction; model definition and corresponding properties; Ordinary Least Squares (OLS) estimators of the parameters; fitted values and residuals.
32) Linear Regression Model. Properties of OLS estimators: their sample distribution; Best Linear Unbiased Estimators (BLUE) and discussion of the Gauss-Markov theorem.
33) Linear Regression Model. Deviance decomposition and R^2 index; predictions of the conditional mean and of the dependent variable for a
given value of the independent variable.
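The OLS formulas and the deviance decomposition of these last lessons can be sketched numerically: beta1_hat = Sxy / Sxx, beta0_hat = ybar - beta1_hat * xbar, and R^2 = 1 - SSres / SStot. The data below are made up for illustration:

```python
# OLS estimation of the simple linear regression y = beta0 + beta1*x + error,
# with R^2 from the deviance decomposition. Made-up data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = sxy / sxx            # slope estimate
b0 = ybar - b1 * xbar     # intercept estimate

fitted = [b0 + b1 * xi for xi in x]
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_tot = sum((yi - ybar) ** 2 for yi in y)
r2 = 1.0 - ss_res / ss_tot  # share of deviance explained by the model

print(round(b1, 2), round(b0, 2), round(r2, 4))  # 1.99, 0.05, 0.9973
```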