Acronyms used in CheMoocs

principal component analysis
discriminant factor analysis
hierarchical ascending classification (agglomerative hierarchical clustering)
classification and regression tree
cross-validation
k nearest neighbors
multiple linear regression
near infrared
near infrared spectroscopy
principal component regression
partial least squares regression
sum of the squares of the prediction errors
leave-one-out cross-validation
root mean square error of calibration
root mean square error of cross-validation
root mean square error of prediction
singular value decomposition

Basis of a vector space

A basis of a vector space of dimension $P$ is made up of $P$ linearly independent vectors $\{u_1, u_2, \ldots, u_P\}$, that is to say, none of them can be written as a linear combination of the others. Similarly, a basis of a vector subspace of $\mathbb{R}^P$ of dimension $A$ is defined by $A$ linearly independent vectors of $\mathbb{R}^P$.
A basis is not unique: infinitely many bases can be used to define the same vector space. An orthonormal basis contains only vectors of norm 1 that are all mutually orthogonal. If the matrix $\mathbf{P}$ of dimensions $(P \times A)$ contains the vectors of an orthonormal basis in its columns, then $\mathbf{P}^\top \mathbf{P} = \mathbf{I}_A$. The PCA loadings form an orthonormal basis.
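The orthonormality property can be checked numerically. The following is a minimal sketch, assuming NumPy, random illustrative data and PCA computed through a singular value decomposition of the centred matrix; the names `X`, `Xc`, `P_load` and `gram` are hypothetical and do not come from the glossary.

```python
import numpy as np

# Hypothetical data: N = 20 observations, P = 5 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Xc = X - X.mean(axis=0)            # centre each column before PCA

# Rows of Vt are the right singular vectors, i.e. the PCA loadings.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

A = 2                              # number of components kept
P_load = Vt[:A].T                  # (P x A) matrix, loadings in columns

# For an orthonormal basis, P^T P is the (A x A) identity matrix.
gram = P_load.T @ P_load
is_orthonormal = bool(np.allclose(gram, np.eye(A)))
print(is_orthonormal)
```

Each column of `P_load` has norm 1 and the columns are mutually orthogonal, which is exactly what the product with its transpose tests.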

Factorial map

The factorial map, or score plot, represents the coordinates of the observations on the plane formed by two principal axes, generally axes 1 and 2.
In this representation, each point represents an observation. The points are therefore distinct from each other, as are the samples they represent.
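As an illustration, the coordinates drawn on a score plot can be obtained by projecting the centred data onto the first two loadings. This is a sketch under assumptions (NumPy, random data, PCA through SVD); the variable names are illustrative only.

```python
import numpy as np

# Hypothetical data: N = 30 observations, P = 6 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 6))
Xc = X - X.mean(axis=0)                 # centre each column

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Vt[:2].T                     # (P x 2): axes 1 and 2

# Scores: one row per observation, one column per principal axis.
scores = Xc @ loadings                  # (N x 2)

# Each row gives the (axis 1, axis 2) position of one point on the map.
print(scores[:3])
```

Plotting `scores[:, 0]` against `scores[:, 1]` with any plotting library would then give the factorial map.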

Correlation circle

The correlation circle is used in PCA. It represents the correlation of each initial variable with two principal components, often the first two, as a point on the plane formed by these components.
Figure 1: The correlation circle for 4 variables (Var1, Var2, Var3 and Var4) represented on the plane of principal components 1-2, or axes 1-2.
In the example of Figure 1, Var1 is well explained by axis 1, with a strong positive correlation; Var2 is well explained by axis 2, with a strong negative correlation; Var3 is well explained by axes 1 and 2 together, as shown by its proximity to the circle; finally, Var4 is not explained at all by the first two components and must be explained by other components.
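The coordinates of the variables on the correlation circle can be sketched as follows. This assumes NumPy, synthetic data with 4 variables (not the data of Figure 1) and PCA through SVD; all names are hypothetical.

```python
import numpy as np

# Hypothetical data: 50 observations, 4 variables, one made dependent on two others.
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
X[:, 2] = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=50)

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U[:, :2] * s[:2]               # scores on components 1 and 2

# Coordinates of variable j: its correlations with components 1 and 2.
coords = np.array([
    [np.corrcoef(X[:, j], scores[:, k])[0, 1] for k in range(2)]
    for j in range(X.shape[1])
])

# Squared distance to the origin: close to 1 means well explained by the plane.
radius2 = (coords ** 2).sum(axis=1)
print(coords.round(2), radius2.round(2))
```

Because the two score vectors are centred and orthogonal, every point falls inside or on the unit circle; a variable near the centre, like Var4 in the text, is poorly represented on that plane.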

Correlation coefficient

The Pearson correlation coefficient, like the covariance, measures whether two variables, represented here by the vectors $x$ and $y$, vary in the same direction or not. It is denoted $r$ and its value lies between $1$ (strong positive correlation) and $-1$ (strong negative correlation). A value of 0 indicates no linear relationship between the variables. The correlation between a variable and itself is 1.
Let $\bar{x}$ and $\bar{y}$ be the means of $x$ and $y$, and $x_i$ and $y_i$ their values for the index $i$:
$$r(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}$$
The coefficient of determination $R^2$, between 0 and 1, is the square of the correlation coefficient:
$$R^2_{x,y} = r^2(x, y)$$
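A short numerical illustration of the correlation coefficient and of the coefficient of determination, written as a sketch with NumPy. The function name `pearson_r` and the sample values are assumptions made for the example.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two vectors of the same length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()                   # deviations from the mean of x
    dy = y - y.mean()                   # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical values, chosen to be almost perfectly linear.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = pearson_r(x, y)
r2 = r ** 2                             # coefficient of determination
print(r, r2)
```

The result agrees with `np.corrcoef(x, y)[0, 1]`, which can be used directly in practice.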

Coefficient of determination

 => See Correlation coefficient


Regression coefficients

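Let $\mathbf{X}$ be a matrix of spectra of dimensions $(N \times P)$ and $\mathbf{y}$ a quantitative variable (e.g. gluten content) whose values predicted from $\mathbf{X}$ are denoted $\hat{\mathbf{y}}$. The regression coefficients, or b-coefficients, form a vector of dimensions $(P \times 1)$, denoted $\mathbf{b}$, which satisfies:
$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{E}$$
$\mathbf{E}$ being the error, so that the predicted values are $\hat{\mathbf{y}} = \mathbf{X}\mathbf{b}$. The formula is also written with $\boldsymbol{\beta}$ and $\boldsymbol{\varepsilon}$ instead of $\mathbf{b}$ and $\mathbf{E}$:
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$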
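The entry does not say how the coefficients are estimated. As one possible illustration, the sketch below assumes an ordinary least squares fit with more observations than variables ($N > P$), on synthetic data, using NumPy; `b_true` and the noise level are made up for the example.

```python
import numpy as np

# Hypothetical setting: N = 40 observations, P = 3 variables.
rng = np.random.default_rng(3)
N, P = 40, 3
X = rng.normal(size=(N, P))
b_true = np.array([1.5, -2.0, 0.5])     # assumed "true" coefficients
E = 0.01 * rng.normal(size=N)           # small random error
y = X @ b_true + E

# Least squares estimate of b (assumption: OLS, not stated in the text).
b, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ b                           # predicted values
residuals = y - y_hat                   # estimate of the error term
print(b.round(3))
```

The estimated vector `b` has one coefficient per column of `X`, and the predictions are simply `X @ b`.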

Linear combination

Vectors $\{u_1, u_2, \ldots, u_P\}$ are linked by a linear combination if there exist numbers $\{a_1, a_2, \ldots, a_P\}$, not all zero, such that:
$$a_1 u_1 + a_2 u_2 + \ldots + a_P u_P = \vec{0}$$
$\vec{0}$ being the null vector. Otherwise, the vectors are said to be linearly independent.
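In practice, linear independence can be tested through the rank of the matrix whose columns are the vectors. This is a minimal sketch with NumPy; the helper name `are_independent` and the example vectors are hypothetical.

```python
import numpy as np

def are_independent(vectors):
    """True if the given vectors are linearly independent."""
    M = np.column_stack(vectors)        # one vector per column
    # Independent vectors give a rank equal to the number of vectors.
    return bool(np.linalg.matrix_rank(M) == M.shape[1])

u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
u3 = 2.0 * u1 - 3.0 * u2                # so 2*u1 - 3*u2 - u3 = 0

print(are_independent([u1, u2]))        # two independent vectors
print(are_independent([u1, u2, u3]))    # u3 is a combination of u1 and u2
```

Here the coefficients $a_1 = 2$, $a_2 = -3$, $a_3 = -1$ are not all zero and give the null vector, so the three vectors are linked by a linear combination.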

Collinearity of vectors

Two vectors $x_1$ and $x_2$ are collinear if we can find a number $k$ such that $x_1 = k x_2$. Two collinear vectors lie along the same line in space, but do not necessarily point the same way along it.



Cosine

The cosine is used to measure the angle between two vectors. It lies between $-1$ (two collinear vectors pointing in opposite directions) and $1$ (two collinear vectors pointing in the same direction). It equals 0 for two orthogonal vectors.
The measurement of the cosine is illustrated in Figure 2. The two vectors $u$ and $v$ give two directions, along which lie two sides of a triangle ABC with a right angle at B.
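The three reference values of the cosine can be checked with the usual dot-product formula, $\cos(u, v) = u \cdot v / (\|u\| \, \|v\|)$, which is standard but not written out in this entry. The sketch below assumes NumPy; the function name `cosine` and the vectors are illustrative.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

same_way = cosine([1.0, 2.0], [2.0, 4.0])       # collinear, same direction
opposite = cosine([1.0, 2.0], [-3.0, -6.0])     # collinear, opposite directions
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])     # perpendicular vectors
print(same_way, opposite, orthogonal)
```

Up to rounding, the three values are 1, -1 and 0, matching the cases described in the text.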

Modification date: 18 July 2023 | Publication date: 29 April 2020 | Author: ChemHouse