Licence Mi2E 2012/2013

 Computational and exploratory statistics

Ch. Robert

Goals

This computing course aims to give students programming skills in R, a free and open-source language available on all platforms. It is not intended as a computer science course but, on the contrary, as a way to understand basic statistical and simulation techniques through computer experiments. Learning how to program is thus a way to build statistical intuition.

The course operates solely through computer classes in small groups and introduces the basic notions of simulation and exploratory statistics through computer-based exercises. An R manual is available; however, students are strongly encouraged to consult further references, either R books or on-line documents.

The course is assessed through two on-line exams (a mid-term and a final), taken under anonymous accounts and graded on the basis of the saved script files. Only printed documents are allowed during these exams. Practice exercises for the mid-term exam are available here (in French), along with a former exam. Here are both versions of the 2012 exam (A and B), along with both solutions (A and B).

(Old version:) This course aims to teach students how to use (with ease) a piece of software called R, the free (open-source and no-cost) counterpart of the S-plus software, the "S" standing for "Statistics". Rather than teaching a "pure" computer science course, we have chosen to base this training on basic notions of exploratory statistics, that is, the statistical analysis of data without strong modelling assumption(s).

The course will therefore make extensive use of the R software, but the basics of programming in R will be covered only during the first lab sessions. Students are encouraged to download the software, available from the R website, onto their own machine (Linux, Unix, Windows and Mac versions are available). A brief introduction to R is provided in a set of lecture notes, but students are strongly encouraged to buy [or download] the references given below. (A recommended investment: this software is sufficient for handling most statistical problems!)

Assessment will take the form of an on-line exam (2009 version) in early January 2011 (resit in September): the exam will take place in a supervised room under a time limit and will consist of multiple-choice questionnaires with answers justified by R programs.

Contacts:    Instructors: Marco Banterle, banterle [at] ceremade.dauphine.fr; Merlin Keller, merlinkeller [at] gmail.com; Robin Ryder, Office B627, ryder [at] ceremade.dauphine.fr; and Christian Robert, Office B638, xian [at] ceremade.dauphine.fr


Plan

The slides used by Christian Robert (but not necessarily by the other instructors) are available here. The initial exercise sheets (feuilles de TP) are here, along with the introductory manual to R.

1. Basics for non-uniform simulation 

2. Monte Carlo methods for integration

3. Bootstrap methods for estimation and tests

4. Non-parametric methods for estimation and tests 


Lectures and lab classes

There is a single introductory lecture, followed by small-group classes in the computer labs. Students must attend the group they have been assigned to or request authorisation to switch groups.


Reference books:

Many books and manuals are available on-line. See e.g. "The R manuals" on the R webpage.

  • R. Drouilhet, P. Lafaye de Micheaux and B. Liquet (2010) Le logiciel R, Springer, Paris
  • C. Robert and G. Casella (2007) Introducing Monte Carlo Methods with R, Springer, New York
  • C. Robert and G. Casella (2010) Méthodes de Monte-Carlo avec R, Springer, Paris
  • W. Venables (1992) Notes on S-PLUS: A Programming Environment for Data Analysis and Graphics. Available on-line
  • W. Venables and B.D. Ripley (1999) Modern Applied Statistics with S-PLUS, Third edition, Springer, New York, NY