Dipartimento di Scienze della Vita e dell'Ambiente - Guida degli insegnamenti (Syllabus)
Knowledge of the topics of the courses on Mathematics and Informatics.
The course (6 credits, 48 hours) consists of theoretical lectures and practical work in the computer laboratory, carried out in small groups of 2-3 students. An e-learning platform is available in parallel with the regular classroom course. It includes the teaching material, self-assessment tests, data and instructions for the experimental exercises, booking for the experimental exercises in the computer laboratory, attendance records for lectures and laboratory exercises, and examination results.
The course enables students to acquire the theoretical and methodological fundamentals of univariate and multivariate statistical analysis as applied to the experimental sciences. In particular, students should know the fundamentals of statistics, hypothesis testing, analysis of variance, and the procedures of cluster analysis, principal component analysis, the k nearest neighbour rule, and canonical variate analysis (discriminant analysis).
Ability to apply the knowledge:
By the end of the course, students should be able to carry out the computer procedures required for statistical data analysis using commercial statistical packages, and to interpret the results correctly.
Carrying out the experimental exercises (alone or in a group) and discussing the results obtained help students develop independent judgement, communication skills (which also benefit from teamwork), and the ability to draw conclusions from experimental data.
Content (lectures, 5 CFU, 40 hours). Theoretical and methodological fundamentals of the main techniques of univariate and multivariate statistical analysis as applied to the experimental sciences. Data and data distribution. Descriptive statistics. Normal distribution. Inference. Confidence interval. Hypothesis testing. Analysis of variance. Linear regression. Multivariate data and information. Ungrouped data analysis: cluster analysis, principal component analysis (PCA). Grouped data analysis: k nearest neighbour rule (KNN), canonical variate analysis (CVA), discrimination and classification. Case studies drawn from biological, archaeological (paleobiological) and chemical problems. Computer laboratory work on a few of the real cases considered during the course.
Laboratory exercises (1 CFU, 8 hours/student). Computer exercises are carried out in small groups (2-3 students per computer). The statistical packages used are Unistat, SIMCA, S-Plus, Parvus and Statgraphics. Exercise no. 1: Histograms, Frequency tables, Summary statistics, Confidence interval, Hypothesis testing. Exercise no. 2: Cluster analysis I. Exercise no. 3: Cluster analysis II, k nearest neighbour rule (KNN). Exercise no. 4: Principal component analysis (PCA). Exercise no. 5: Canonical variate analysis (CVA), also known as discriminant analysis.
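As a rough illustration of the kind of calculation covered in Exercise no. 1 (summary statistics and a confidence interval for the mean), the following is a minimal Python sketch. It is not taken from the course material: the course uses the commercial packages listed above, the function name mean_confidence_interval is hypothetical, the sample values are invented, and the interval uses a normal approximation, whereas a Student's t quantile would normally be preferred for a sample this small.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean (assumption: not the t-based interval)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean, from the sample standard deviation
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


# Hypothetical measurements, invented for illustration only
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]

low, high = mean_confidence_interval(sample)
print(f"n = {len(sample)}")
print(f"mean = {statistics.fmean(sample):.3f}")
print(f"standard deviation = {statistics.stdev(sample):.3f}")
print(f"95% confidence interval = ({low:.3f}, {high:.3f})")
```

The interval is symmetric about the sample mean and narrows as the sample size grows, since the standard error shrinks with the square root of n.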
Methods for assessing learning outcomes:
The assessment is a written examination with open questions, followed by a review and discussion of the script. The examination comprises thirty open questions. These include exercises on hypothesis tests and questions on interpreting the results of a case-study analysis carried out with one of the statistical packages used during the course. Each question is given a score between zero and one. Two points are added to the sum of these scores to obtain the final result of the written examination. The exam is passed when the final score is greater than or equal to 18. During the lecture period, students may also take “in itinere” written tests (1st and 2nd partial test). The result of one partial test may be averaged with the other provided the score obtained is at least 15. A negative or unsatisfactory result in one of the two partial tests can be made up in the immediately following session.
Criteria for assessing learning outcomes:
In the written examination, students must demonstrate a sound knowledge of the basics and methods of univariate statistics (data distributions, inference, hypothesis testing) and multivariate statistics (cluster analysis, principal component analysis, k nearest neighbour rule, canonical variate analysis). The ability to apply this knowledge is also assessed through the written answers to the questions on the hypothesis-test exercises and on the case study presented in the “practical” part of the written examination.
Criteria for measuring learning outcomes:
The final mark is expressed out of thirty. Passing grades range from 18 to 30, and 30 cum laude.
Criteria for conferring final mark:
The final mark is obtained by summing the scores for the 30 questions of the written examination (after its public review and discussion) and adding two points to the sum. Honours (cum laude) are awarded when this total exceeds 30 and the student also demonstrates complete mastery of the subject.
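The marking rule described above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic (thirty scores between zero and one, plus two points, pass at 18, honours above 30 with complete mastery). The function and constant names are hypothetical, the example scores are invented, and no rounding rule is applied because the syllabus does not state one.

```python
from typing import Sequence

NUM_QUESTIONS = 30   # open questions in the written examination
BONUS_POINTS = 2     # points added to the sum of the question scores
PASS_MARK = 18       # minimum final score needed to pass
MAX_MARK = 30        # honours require a total above this value


def final_score(question_scores: Sequence[float]) -> float:
    """Sum the per-question scores (each between 0 and 1) and add the two extra points."""
    if len(question_scores) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} scores, got {len(question_scores)}")
    if any(not 0.0 <= s <= 1.0 for s in question_scores):
        raise ValueError("each question score must lie between 0 and 1")
    return sum(question_scores) + BONUS_POINTS


def outcome(score: float, complete_mastery: bool = False) -> str:
    """Classify a final score; complete_mastery stands for the examiner's judgement."""
    if score < PASS_MARK:
        return "not passed"
    if score > MAX_MARK and complete_mastery:
        return "passed with honours"
    return "passed"


# Hypothetical script: 20 full answers and 10 half answers
scores = [1.0] * 20 + [0.5] * 10
total = final_score(scores)   # 25 from the questions + 2 = 27
print(total, outcome(total))
```

Because two points are added, a student needs at least 16 of the 30 available question points to pass, and a total above 30 is only possible with more than 28 question points.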
O. Vitali. Statistica per le Scienze Applicate. Vol. 2. Cacucci Editore, Bari, 1993.
O. Vitali. Principi di Statistica. Cacucci Editore, Bari, 2003.
M.C. Whitlock, D. Schluter. Analisi statistica dei dati biologici. Zanichelli, Bologna, 2010.
W.W. Daniel. Biostatistica. Edises, Napoli, 1996.
R.R. Sokal, F.J. Rohlf. Biometry. The Principles and Practice of Statistics in Biological Research, W.H. Freeman, San Francisco, 1995.
G. Norman, D. Steiner. Biostatistica. Seconda ediz., Casa Editrice Ambrosiana, Milano, 2015.
W.J. Krzanowski. Principles of Multivariate Analysis. A User’s Perspective, Second edition, Oxford University Press, 2000.
I.T. Jolliffe. Principal Component Analysis, Second edition, Springer-Verlag, New York, 2002.