Fingerprinting: principal component analysis and linear combination fitting

Webb, S. M.

doi:10.1107/S1574870720016705

International Tables for Crystallography, Volume I
X-ray absorption spectroscopy and related techniques
Edited by C. T. Chantler, F. Boscherini and B. Bunker

International Tables for Crystallography (2024). Vol. I. ch. 5.17, pp. 709-715
https://doi.org/10.1107/S1574870720016705

Chapter 5.17. Fingerprinting: principal component analysis and linear combination fitting

Samuel M. Webb^a ^*

^aStanford Synchrotron Light Source, SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
Correspondence e-mail: [email protected]

Complex, heterogeneous systems can be difficult to analyse using the techniques that are often used for pure-phase systems. Often, it is the complexity that is important in the system of interest, and an analysis of the number and types of species present is desired. Fingerprinting techniques are often employed, using various combinations of principal component analysis and linear combination fitting to determine the species that are present. This chapter will introduce the concepts behind these methods, discuss some of their advantages and disadvantages, and discuss several of the related concepts behind fingerprinting and combinatorial analysis of X-ray spectra.

Keywords: XANES; principal component analysis; linear combination fitting; least-squares fitting.

1. Introduction

In many complex, heterogeneous systems, the element of interest may be present as more than one chemical species. In these types of mixed systems, the traditional extended X-ray absorption fine-structure (EXAFS) analysis methods, involving Fourier transforms and theoretical fitting of various atomic shells, generally may not provide easy-to-interpret answers. This can be due to the fact that the complexity of mixing several systems leads to too many independent variables in a shell-by-shell fit, or the phase and amplitude of the many species may overlap and destructively interfere with each other to obscure the presence of atomic shells in the first place. The approach of fingerprinting, consisting of combinations of principal component analysis and linear combination fitting, can be useful for analysing and decomposing multicomponent spectra.

This approach generally assumes that the experimentally obtained X-ray absorption near-edge structure (XANES) or EXAFS data can be described as a linear combination of the spectra of the components that are contained within the sample. These component spectra, or standards, can either be idealized model compounds that have been characterized previously and are believed to be possible candidates for the unknown compound or, in some cases, spectra from a simply varying set of unknown data sets that represent the extrema or `end-members' of the series. Linearity is generally assumed, but is rarely rigorously proven. Linear combination fitting can be a useful approach for both XANES and EXAFS analysis and is generally relatively easy to perform. It is subject to uncertainties of a similar magnitude to those in a typical EXAFS analysis, with errors in composition typically assumed to be at the 5–10% level.

The statistical procedure of principal component analysis (PCA), and similar families of algorithms that are used to cluster or decompose data sets, can be used along with linear combination fitting in the fingerprinting process. Generally, PCA methods can help to determine which standard compounds may best describe the variation in a series of X-ray spectra or may help to extract and determine which types of spectroscopy signals are changing in a complex series of data. In the former usage the derived PCA components rarely have a physical meaning, but can be used to examine, as a collection of mathematical abstractions, the number of unique components in a data series and whether other standard compounds can be `described' as a function of the components that describe the data series. PCA will often be used first in the analysis of a data series to help to guide the subsequent linear combination analysis.

2. Principal component analysis

PCA has been widely used for decades across many fields, including chemistry (Malinowski, 2002; Malinowski et al., 1970), signal processing (Stark & Woods, 1986), meteorology (Weare & Nasstrom, 1982) and astronomy (Yip et al., 2004). The first paper to use PCA for XANES analysis was published in 1992 (Fay et al., 1992) and PCA has since been used extensively in the analysis of both EXAFS and XANES (Ressler et al., 2000; Wasserman et al., 1996, 1999, among many others). The underlying concept behind PCA is to take a set of n samples and to represent this set as a linear combination of orthogonally constrained component vectors which describe the variation of the data set. The desired result in a spectroscopic analysis is that a smaller set of C < n of these components can describe the important features in the data series and the remaining (n − C) components will effectively remain to describe the noise. This statistical procedure is often used as a tool in exploring relationships in a series of closely related yet varied spectra. That is, if one uses PCA to try to analyse samples that are all unrelated (or all the same) then the analysis will not prove very effective. One of the challenges in this description of the procedure, of course, is to effectively evaluate what comprises a sufficient number of important features in the data series.

Mathematically, PCA is defined as an orthogonal linear transformation that transforms the original data set into a new coordinate system such that the greatest variance of the data lies on the first coordinate axis (or the first principal component), followed by the second greatest variance along the second coordinate axis, and so on (Jolliffe, 2002). There are several methods for implementing the PCA procedure, typically either by taking the eigenvalue decomposition of the covariance or correlation matrix or by the singular value decomposition of a data matrix. With the latter formulation, the rectangular m × n matrix X, with n samples of data length m, can be decomposed as $[{\bf X} = {\bf U}\boldsymbol{\lambda}{\bf W}^{\rm T}, \eqno (1)]$ where U is an m × m matrix, the columns of which are orthogonal unit vectors that are the left singular vectors of X. This matrix can be considered to be the set of components into which X is decomposed. λ is an m × n diagonal matrix containing the singular values of X (the square roots of the eigenvalues of $[{\bf X}^{\rm T}{\bf X}]$, and often loosely referred to as eigenvalues). These values indicate how much of a contribution the corresponding component makes to the data set as a whole. W is an n × n matrix, the columns of which are the orthogonal unit vectors that are the right singular vectors of X and are considered to be weights that show how much of the corresponding component is used to construct each column of the data set in X.
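As an illustration of the singular value decomposition route, the following is a minimal sketch using NumPy, not a reference implementation. The synthetic `end_members`, the noise level and the helper names `pca_svd` and `reconstruct` are assumptions made for this example only. Note that `numpy.linalg.svd` returns the right singular vectors as the rows of its third output, and that the economy-size decomposition is used, so that only min(m, n) components are returned.

```python
import numpy as np

def pca_svd(X):
    """SVD of an m x n data matrix whose columns are the n spectra."""
    # Economy-size SVD: U is m x p, s holds p singular values, Wt is p x n,
    # where p = min(m, n).
    U, s, Wt = np.linalg.svd(X, full_matrices=False)
    return U, s, Wt

def reconstruct(U, s, Wt, C):
    """Rebuild the data set from the first C components only."""
    return (U[:, :C] * s[:C]) @ Wt[:C, :]

# Synthetic data (assumed for illustration): six mixtures of two made-up
# end-member shapes, plus a small amount of random noise.
rng = np.random.default_rng(0)
energy = np.linspace(0.0, 1.0, 200)
end_members = np.column_stack([
    0.5 * (1.0 + np.tanh(20.0 * (energy - 0.4))),   # edge-like step
    np.exp(-((energy - 0.6) / 0.05) ** 2),          # peak-like feature
])
fractions = rng.uniform(0.0, 1.0, size=(2, 6))
X = end_members @ fractions + 1e-3 * rng.standard_normal((200, 6))

U, s, Wt = pca_svd(X)
residual = np.linalg.norm(X - reconstruct(U, s, Wt, C=2)) / np.linalg.norm(X)
print("singular values:", s)
print("relative residual with two components:", residual)
```

In this sketch two singular values should dominate and the two-component reconstruction should leave only a noise-level residual, which is the behaviour that the component-counting criteria in the following sections try to quantify.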

2.1. PCA to determine the number of components

Since it will generally be the case that not all of the components are significant, one can set the weights of the insignificant components to zero. In the correct usage, this process will maintain the signal required to reconstruct the data set and eliminate components that correspond to noise. This is analogous to the step in Fourier image compression in which small-amplitude components are zeroed (Manceau et al., 2002 ). Once PCA has been performed, a typical question that is asked in the analysis is how many of the components are actually needed to reproduce the observed spectra in the data set? At a first glance, one can use methods in which components are added to the reconstruction set until the reconstruction agrees with the original data set to within some degree of statistical error. One can also look at the series of components and examine each component to qualitatively determine whether it looks like real signal or noise. The following sections cover several methods that attempt to quantitatively assess the number of components in the PCA analysis. It is important to note that all of these metrics of PCA evaluation do not directly measure the number of physical components in the system, but rather measure the effective dimensionality of the set of data. For instance, if two species are covarying in the system at the same proportions, this would result in a single variable of variation and would not be extracted as two separate components. This limitation should always be considered when using these types of statistical tools as a potential caveat to strict physical interpretations of the number of components determined by PCA.

2.1.1. Variance and scree plots

As a first measure, the number of components required could be determined by examining the amount of cumulative variance explained by the Cth component. This can be set at some level that seems reasonable to explain the data versus the noise, but is again often arbitrarily set to some value, such as 90%. A slightly more analytical procedure would be to set the variance level such that the Cth component explains less than 1/n of the data-set variance; that is, an additional component explains less variance than if all of the components had equal weight.
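Both variance-based criteria can be sketched in a few lines. This is a hedged illustration only: the singular values in `s` are invented, the function names are hypothetical, and the variance carried by each component is taken to be proportional to the square of its singular value.

```python
import numpy as np

def explained_variance(s):
    """Fraction of the total variance carried by each component."""
    var = np.asarray(s, dtype=float) ** 2
    return var / var.sum()

def n_components_cumulative(s, threshold=0.90):
    """Smallest C for which the cumulative explained variance reaches threshold."""
    cum = np.cumsum(explained_variance(s))
    return int(min(np.searchsorted(cum, threshold) + 1, cum.size))

def n_components_equal_weight(s):
    """Number of components that each explain more than 1/n of the variance."""
    frac = explained_variance(s)
    return int(np.sum(frac > 1.0 / frac.size))

s = np.array([10.0, 4.0, 2.0, 0.2, 0.1, 0.1])   # made-up singular values
print(n_components_cumulative(s))      # cumulative 90% criterion
print(n_components_equal_weight(s))    # 1/n criterion
```

For these invented values the two criteria disagree (two components versus one), which illustrates why such thresholds should be treated as guides rather than as definitive answers.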

Another criterion involves using a scree plot (Fig. 1), which plots the eigenvalues against the component number. This plot shows the decreasing rate at which additional components contribute to the overall variance of the data set. In the ideal case, a scree plot will have a steep curve followed by a bend and then a flat or horizontal line. The name of this test originates from the geological term `scree slope' or constant slope, which is the pile of landslide debris that accumulates at the bottom of a steep cliff and is characterized by a gentle slope of debris adjacent to the cliff face (Huggett, 2017). To determine the appropriate number of components, one looks for the bend or `elbow' in the plot, or the point at which the remaining eigenvalues are of roughly equal and relatively small size (Cattell, 1966). There is, however, no strict definition of the scaling of the plot or a determination of whether a linear or a semi-log plot is best. The latter can often help to show the break in slope more clearly, and can help in determining the importance of the lower values in the context of a much larger initial eigenvalue. This also leads to the subjectivity of this test, as the decision on where the bend is located is not always clear for every system, and some may show multiple apparent bends, as also seen in Fig. 1.

Figure 1

Scree plot of eigenvalues for a PCA analysis. The inset shows the full range of eigenvalues, while the main plot displays a detailed region. The position of the break in the slope suggests the number of components. For this data set, the number of components could best be interpreted as 3.

2.1.2. Indicator function

The variance and scree methods, while somewhat more quantitative than the qualitative method of `does the component look like noise?', still lack a definable measurable quantity. Malinowski derived a criterion called the `indicator' or `IND' function (Malinowski, 1977). The IND function is given by $[{\rm IND} = \left({{\textstyle \sum \limits_{\alpha=C+1}^p \lambda_\alpha ^2} \over {n(p - C)^5}} \right)^{1/2},\eqno (2) ]$ where p is the total number of components obtained from the decomposition and C, the number of considered components, may be varied. The number of components that should be considered for further analysis is that at which the IND function is at a minimum. The function is empirical but does seem to work in many situations, particularly those in which the errors in the data are relatively uniformly distributed and are free from systematic and erratic errors (Malinowski, 1977; Manceau et al., 2002).
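Equation (2) can be evaluated directly from the singular values. The sketch below is illustrative only: the singular values are invented, the function name `ind_function` is hypothetical, p is assumed to be the number of singular values supplied and n is passed in explicitly as it appears in equation (2).

```python
import numpy as np

def ind_function(s, n):
    """Evaluate IND for C = 1 .. p-1 from the singular values s."""
    s = np.asarray(s, dtype=float)
    p = s.size
    C = np.arange(1, p)                       # C = p would divide by zero
    # suffix[k] = sum of s[k:]**2, so suffix[C] is the sum over alpha = C+1 .. p
    suffix = np.cumsum((s ** 2)[::-1])[::-1]
    ind = np.sqrt(suffix[1:] / (n * (p - C) ** 5))
    return C, ind

# Made-up singular values: three large 'signal' values and a noise floor.
s = np.array([10.0, 4.0, 2.0, 0.20, 0.19, 0.18, 0.17, 0.16])
C, ind = ind_function(s, n=8)
best_C = int(C[np.argmin(ind)])
print(best_C)
```

For this invented set of values the minimum of IND falls at C = 3, the point at which the remaining singular values form an approximately flat noise floor.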

2.1.3. Normalized sum squared difference (NSS) statistic

As mentioned previously, many of the statistical analysis parameters work well in data sets that are relatively free of noise, have at least a uniformly distributed set of noise and are free from systematic errors. These parameters can often grossly underestimate or overestimate the number of principal components, often the latter when it comes to the analysis of even low-noise XANES data series. The NSS statistic includes an analysis of the experimental error in its process and is generally a very effective statistic for determining the number of principal components (Manceau et al., 2014 ).

The concept behind the NSS statistic parameter, as stated by Manceau et al. (2014), is to quantify the degree to which the fit of a series of PCA abstract components to a denoised data spectrum represents all of the spectra in the original data set about equally well in comparison to the noise level of the spectra. The denoised spectra can be obtained by several procedures, with one of the main issues being that the major features of the spectra need to be maintained, i.e. given the experimental energy resolution and the intrinsic core-hole lifetime of the measurement at the specific edge of the element of interest. The NSS statistic is broken into two parts, with the first being the goodness of fit of C PCA components to the denoised spectra: $[{\rm NSS}({\rm denoised}) = \textstyle \sum \limits_{i=1}^m ({\rm denoised}_i - {\rm fit}_i)^2\big/\textstyle \sum \limits_{i=1}^m ({\rm denoised}_i)^2. \eqno(3)]$ This parameter is calculated for each of the n spectra in the set, being summed over the m data points in the spectra using the notation used previously, and is recalculated for each addition of a new component. The second factor is the normalization of the denoised data to the original data set: $[{\rm NSS}({\rm data}) = \textstyle \sum \limits_{i=1}^m ({\rm data}_i - {\rm denoised}_i)^2\big/\textstyle \sum \limits_{i=1}^m ({\rm data}_i)^2.\eqno (4)]$ These parameters can be visualized as a function of each spectrum in the data set and the number of PCA components, C, to be used. An overall estimator can then be calculated by determining the ratio of these goodness-of-fit parameters, NSS(denoised) and NSS(data), and averaging this over all of the data set. A standard log-based scree plot can again be used to determine the critical number of components, plotting the NSS statistic aggregate as a function of the number of components used. Manceau et al. (2014) found that the NSS statistic is generally less sensitive to the noise in the system than other statistical factors, and also tends to be less sensitive to nonstatistical noise, which is often present in XAS spectra.
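The two NSS terms and their averaged ratio can be sketched as follows. This is a simplified illustration under stated assumptions rather than the published procedure: the abstract components are taken from an SVD of the noisy data, the fit of C components is an orthogonal projection, and the noise-free synthetic spectra (`clean`) stand in for the denoised spectra that in practice would come from a smoothing step. The function name `nss_ratio` is hypothetical.

```python
import numpy as np

def nss_ratio(data, denoised, C):
    """Mean over spectra of NSS(denoised) / NSS(data) for C components.

    data, denoised: m x n arrays with one spectrum per column.
    """
    U, _, _ = np.linalg.svd(data, full_matrices=False)
    basis = U[:, :C]                               # first C abstract components
    fit = basis @ (basis.T @ denoised)             # least-squares fit (orthonormal basis)
    nss_denoised = (np.sum((denoised - fit) ** 2, axis=0)
                    / np.sum(denoised ** 2, axis=0))
    nss_data = (np.sum((data - denoised) ** 2, axis=0)
                / np.sum(data ** 2, axis=0))
    return float(np.mean(nss_denoised / nss_data))

# Synthetic example (assumed): ten mixtures of three made-up end members.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
end_members = np.column_stack([
    0.5 * (1.0 + np.tanh(10.0 * (x - 0.3))),
    np.exp(-((x - 0.4) / 0.05) ** 2),
    np.exp(-((x - 0.7) / 0.10) ** 2),
])
clean = end_members @ rng.uniform(0.1, 1.0, size=(3, 10))
data = clean + 0.01 * rng.standard_normal(clean.shape)

ratios = [nss_ratio(data, clean, C) for C in range(1, 6)]
for C, value in enumerate(ratios, start=1):
    print(C, value)
```

Plotting `ratios` on a logarithmic axis against C gives the log-based scree plot described above; the ratio can only decrease as components are added, and the critical number of components is read from where the decrease levels off.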

2.2. PCA as a method of analysis

As described above, the components that are often extracted from PCA are mathematically abstract and do not have a physical meaning. For XANES, the first component resembles the overall average edge shape and the remaining components resemble the remaining deviation. These additional components are often abstract and difficult to assign meaning. In contrast, in the analysis of EXAFS Wasserman et al. (1999 ) suggested that since the components are orthogonal, the PCA tends to separate components based on the EXAFS shell and thus the components may be analysed using normal EXAFS methods. As noted above, while the PCA analysis does separate orthogonal components, there is not necessarily any direct physical meaning of a component other than to denote the greatest variation in a series of data. Thus, in order for a meaningful interpretation to be extracted for one of the components, there must be variance in the data set in that EXAFS shell, so the entire content of the data set may not be easily separated in this manner. This method of interpretation is limited to data sets that have clearly and continuously changing species across the series, which is only applicable to a limited number of chemical situations, such as following a chemical reaction through time or spatial gradients.

In these cases, such as examining a chemical reaction, PCA can be useful. As one example, Ganio et al. (2018 ) examined the reaction of lapis lazuli sulfur species in the X-ray beam and monitored the change in the XANES pre-edge region to characterize the change of sulfur speciation. PCA was able to extract the change in species, with negative portions of the resulting component representing mass loss of the species and positive portions of the component representing mass gain of the species. This presumes that the loadings of these species all increase with increasing time in a positive manner; otherwise, the sign of the mass loss/gain would be reversed. The magnitudes of the loadings can also be plotted against the variables used in the reaction (i.e. time, dose etc.) to extract real kinetic information from the PCA results, assuming that a simple kinetic rate law can be applied (Ganio et al., 2018 ).

In the next sections, PCA analysis will be used as a foundation to help the eventual goal of fitting the data set to some list of reference spectra. Fig. 2 (a) shows an example of a data set of 16 spatially resolved spectra in a relatively simple set of XANES data. Since the actual components of the PCA may be abstract and may not have an apparent physical meaning, as seen in Fig. 2 (b), interpretation of the PCA results can still be used to obtain useful information as a whole on the data series. The PCA analysis can be used to find out (i) which spectra in the data set are the most unique, or in other words could be used as end-member spectra, and/or (ii) which subset of real reference compounds might most accurately be used to describe the variation in the data set. One can also apply one of several rotation methods to attempt to find a basis series that is easier to interpret than the initial PCA results. These methods will be discussed next.

Figure 2

Example results from PCA analysis. (a) Set of normalized As edge XANES spectra. (b) The resulting first four components. Note that the first, largest component (red) is similar to the overall edge shape. Components in red, green and blue all appear to have significant features of reasonable magnitude, whereas the fourth (black) is approaching noise levels. (c) End-member spectra as determined by the PC loadings plot in Fig. 3 .

2.2.1. Using PCA to find end-member spectra

The process of examining PCA components to determine end-member spectra is typically most useful when using a data set that contains several spectra that may be considered to be relatively chemically pure for the element of interest. This can be performed effectively when examining data sets from highly spatially resolved XAS, such as a microprobe, where the spectral information can be obtained from a relatively homogeneous region. The method can also be used when following chemical reactions, where there may be a clearly defined initial reactant and final product but the presence of an intermediate is unknown.

To determine the characteristic end-member spectra, one needs to understand how each of the data-set spectra contributes to each of the principal components. This is performed by examining the W matrix of weights, or loadings. Positive loadings indicate that a spectrum and a principal component are positively correlated: an increase in one results in an increase in the other. Negative loadings indicate the reverse: a negative correlation. Large loadings, either positive or negative, indicate that a variable has a strong effect on that principal component, so one wants to find which spectra have large loadings. Plots of the loadings of the major principal components, commonly termed biplots, show the distribution of the samples within the principal component eigenvector space. This can help to identify the maxima and minima, as well as other major clusters that delineate important groups of species. Fig. 3 shows an example of how loading plots can be used to help to pick out end-member spectra. The data in the biplot show three distinct extrema clusters in the two-component space: principal component 2 (PC2) versus principal component 3 (PC3). These clusters are denoted by the groupings coloured in green, blue and red. Each of the clusters is well separated from the other data spectra; the green and blue clusters are located at the extreme end of the PC2 axis, and the red cluster, while central in PC2, is separated by a large negative value in PC3. The data points on this plot form a triangle with a cluster of potential end-members at each vertex of the triangle. The cluster of points in cyan lies on a direct line between the green and blue clusters, and can ideally be represented by a mixture of these two end-members. The chosen representative end-member spectra are shown in Fig. 2 (c).

Figure 3

Principal component loadings for the data set in Fig. 2. The loading of PC2 against PC3 shows the trend of three major components, represented in blue, green and red. The cyan points show the mixing between the populations in green and red. The end-member spectra denoted by the blue, green and red groups are shown in Fig. 2(c).

This method, while easy to employ, is also rather subjective. In more complicated systems one would also need to potentially view the clustering of PCA weightings in a multi-dimensional space. This can be attempted with 3D plots and/or by examining a series of successive biplots that look at the influence of higher order components and which of these break up clusters or correlations of points that are grouped together in other biplots.
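A crude, automated first pass at this inspection is to list the spectra with the most extreme loadings on the components of interest. The sketch below is only an aid to, not a replacement for, looking at the biplots. The data are synthetic (three pure spectra in columns 0 to 2 followed by twelve mixtures), and the function name `extreme_loadings` and the choice of PC2 and PC3 are assumptions for this example.

```python
import numpy as np

def extreme_loadings(X, components=(1, 2)):
    """Loadings on the chosen PCs and the spectra at their extremes.

    components are 0-based, so (1, 2) selects PC2 and PC3.
    """
    _, _, Wt = np.linalg.svd(X, full_matrices=False)
    loadings = Wt[list(components), :]      # one row per chosen component
    candidates = set()
    for row in loadings:
        candidates.update((int(np.argmax(row)), int(np.argmin(row))))
    return loadings, sorted(candidates)

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 200)
pure = np.column_stack([np.exp(-((x - c) / 0.08) ** 2) for c in (0.3, 0.5, 0.7)])
# Mixture fractions between 0.2 and 0.6 for each end member; columns sum to 1.
mix_fractions = 0.2 + 0.4 * rng.dirichlet(np.ones(3), size=12).T
X = np.column_stack([pure, pure @ mix_fractions])
X = X + 1e-4 * rng.standard_normal(X.shape)

loadings, candidates = extreme_loadings(X)
print(candidates)   # indices of candidate end-member spectra
```

Because the loadings of a mixture are a weighted average of the loadings of its end members, the extremes in this synthetic case can only be the pure spectra. With real data the candidates should still be checked against the biplots, since an extreme point may equally be an outlier or a poor-quality spectrum.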

Another step after initial PCA analysis is to perform iterative target-factor analysis (ITFA; Rossberg et al., 2003; Scheinost et al., 2005). This process attempts to manipulate the components in a manner where they are able to provide some type of meaningful physical representation of real species. The concept is to create a new basis set of components that are linear combinations of the components from the initial PCA, and have the weights of the linear combination between 0 and 1. The process starts with a Varimax rotation, which adjusts the components by an orthogonal rotation that maximizes the sum of the variances of the squared loading vectors (Kaiser, 1958). This effectively tries to break complex correlations of the various samples that are composed of multiple components of various intensities; that is, it tries to rotate the basis-vector space such that the samples are represented by the smallest number of high-magnitude vectors. ITFA is applied next as a non-orthonormal rotation to normalize the magnitude of the loadings of the vectors between 0 and 1. The result of these processes is that the final component spectra often appear to have highly physical and interpretable representations.
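The Varimax step alone can be sketched with successive pairwise planar rotations, in the spirit of Kaiser's original formulation. This is an illustrative sketch only: it uses the raw criterion without row normalization, it does not implement the subsequent ITFA step, and the function names and test matrix are assumptions for this example.

```python
import numpy as np

def varimax_criterion(L):
    """Sum over columns of the variance of the squared loadings."""
    return float(np.var(L ** 2, axis=0).sum())

def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Raw Varimax by pairwise rotations; returns rotated loadings and rotation R."""
    L = np.array(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                x, y = L[:, i].copy(), L[:, j].copy()
                u, v = x ** 2 - y ** 2, 2.0 * x * y
                num = 2.0 * np.sum(u * v) - 2.0 * u.sum() * v.sum() / p
                den = np.sum(u ** 2 - v ** 2) - (u.sum() ** 2 - v.sum() ** 2) / p
                phi = 0.25 * np.arctan2(num, den)      # optimal angle for this pair
                c, s = np.cos(phi), np.sin(phi)
                G = np.array([[c, -s], [s, c]])
                L[:, [i, j]] = np.column_stack([x, y]) @ G
                R[:, [i, j]] = R[:, [i, j]] @ G        # accumulate the rotation
                largest = max(largest, abs(phi))
        if largest < tol:
            break
    return L, R

# Loadings with a simple structure, deliberately mixed by a 30 degree rotation.
simple = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3)
t = np.deg2rad(30.0)
mixed = simple @ np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]])
rotated, R = varimax(mixed)
print(varimax_criterion(mixed), varimax_criterion(rotated))
```

In this example the rotation recovers the simple structure, with each sample described by a single high-magnitude loading, and the criterion rises accordingly.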

2.2.2. Target transformations

The target transformation is a tool that is commonly used to identify whether a reference compound spectrum is suitable for describing the variance observed in the data-set spectra. The determining factors for this are the significant principal components as defined by the criteria and procedures described in Section 2.1. If the reduced set of components $[\hat U]$ defines the orthogonal basis for the data set, then in order for a reference compound to be deemed suitable it must lie within the vector space defined by this basis set. If there are features in the reference compound that are not found in any of the data-set spectra or combinations of spectra, then this reference will not fit suitably in the basis set defined by $[\hat U]$ and this reference should not be considered as a possible component. The target transform $[\tilde r]$ is thus a projection of the reference spectrum r onto the subspace spanned by the vectors in $[\hat U]$ (Manceau et al., 2002), and in a perfect case the target transform of the reference will match the original reference. Effectively, the target transform is an unconstrained linear combination fit (i.e. with no positivity constraint and no requirement that the weights sum to one) of the set of components ($[\hat U]$) to the reference spectrum.

The trick in this target transform is to define an objective way to decide whether the target transform of the reference compound candidate is truly similar enough. Malinowski used the theory of error to derive the SPOIL value (Malinowski, 1978 ), which measures how replacing one of the abstract principal components with the candidate reference spectrum would increase the fit error. The SPOIL value is a non-negative, dimensionless number. Rule-of-thumb criteria based on model sets of data have shown that SPOIL values that are less than 1.5 are considered to be excellent matches, values of 1.5–3 are good, values of 3–4.5 are fair, values of 4.5–6 are poor and values greater than 6 are unacceptable. The function is given by $[{\rm SPOIL} = \left[{{n(p - C)\textstyle \sum\limits_i ({\tilde r}_i - r_i)^2} \over {(n - C)\textstyle \sum\limits_{\alpha=C+1}^p \lambda_\alpha^2\sum\limits_{a=1}^C \left (\lambda_a^{-1}\sum\limits_i U_i^a r_i \right)^2}} - 1 \right]^{1/2},\eqno(5)]$ where the sums over i are over the data points in each spectrum (Manceau et al., 2002 ). If the value of the radicand is negative, which can happen in cases where the fit is extremely good, then the SPOIL value is defined as 0. The numerator represents the quality of the fit between the target-transformed spectrum and the input reference spectrum, and the denominator represents the amount of noise or error present in the components that are not used in the transform.
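The target transform and equation (5) can be sketched as follows. This is a direct transcription of equation (5) under stated assumptions, not a validated implementation: p is taken to be the number of singular values returned by the economy-size SVD, n is the number of spectra, and the synthetic data set and the three shapes `a`, `b` and `absent` are invented for the example.

```python
import numpy as np

def target_transform(U, r, C):
    """Project the reference r onto the first C components."""
    basis = U[:, :C]
    return basis @ (basis.T @ r)

def spoil(U, s, r, C, n):
    """SPOIL value of reference r following equation (5)."""
    p = s.size
    r_tt = target_transform(U, r, C)
    misfit = np.sum((r_tt - r) ** 2)                      # numerator sum
    unused = np.sum(s[C:] ** 2)                           # components C+1 .. p
    weights = np.sum(((U[:, :C].T @ r) / s[:C]) ** 2)     # sum over a = 1 .. C
    radicand = n * (p - C) * misfit / ((n - C) * unused * weights) - 1.0
    return float(np.sqrt(max(radicand, 0.0)))             # negative radicand -> 0

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 200)
a = np.exp(-((x - 0.3) / 0.08) ** 2)        # present in the data set
b = np.exp(-((x - 0.6) / 0.08) ** 2)        # present in the data set
absent = np.exp(-((x - 0.9) / 0.03) ** 2)   # not present in the data set
X = np.column_stack([f * a + (1.0 - f) * b for f in np.linspace(0.0, 1.0, 6)])
X = X + 1e-3 * rng.standard_normal(X.shape)

U, s, _ = np.linalg.svd(X, full_matrices=False)
spoil_a = spoil(U, s, a, C=2, n=X.shape[1])
spoil_absent = spoil(U, s, absent, C=2, n=X.shape[1])
print(spoil_a, spoil_absent)
```

A reference that genuinely contributes to the data set should give a small SPOIL value, whereas the shape that is absent from the data gives a very large one and would be rejected.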

3. Linear combination fitting

When performing a fingerprinting analysis, one is often analysing a sample, or a set of samples, that are a mixture of different, presumably pure-phase, compounds. This leads to using the principles of fitting the unknown, multicomponent sample(s) to a set of known, measured reference compounds, often referred to as a spectral decomposition. The methods in the previous section (i.e. PCA and its related interpretations) will often be used to help to define the proper reference library to be used in the fitting analysis. A healthy knowledge of the samples, i.e. their provenance and what might be expected, is often useful as well and helps to prevent chasing after chemical species that are not possible or relevant.

3.1. General spectral decomposition methodology

The basic problem presented for an unknown sample spectrum d is to perform the spectral decomposition over a series of C reference components, $[d = \textstyle \sum \limits_{\alpha=1}^C f_\alpha r_\alpha,\eqno(6)]$ where f_α is the fraction of the αth component r_α. This can be written in matrix form as $[{\bf d} = {\bf fR},\eqno (7)]$ where d is the vector of the data spectrum, f is the vector of fractional compositions and R is the matrix of reference compounds. Since there are typically fewer components than the number of data points in the spectrum, the solution to the problem is overdetermined and is solved in a least-squares sense. Typically, this also requires the non-negativity constraint $[\forall \alpha: {f_\alpha } \ge 0]$ , since we would not expect that the mass loading of any component could be negative. In a practical sense, this usually means that any negative results are removed from the component list, or a non-negative constraint is directly applied in the least-squares fitting routine. A second constraint that is often applied is the sum-to-one restriction, such that the sum of all f_α is 1. If this constraint is omitted, then the resultant sum of fractional components can be used as a check on the meaningfulness of the fit. If there is a significant deviation from unity then there is an apparent problem with the assumptions behind the fit. Typically, these assumptions may include completeness of the reference matrix R, artefacts in the data collection leading to decreased quality of the data sets, and shifts in the energy calibration between samples and references. In this latter case, where there can be an offset between the energy calibration of the sample with respect to the references, one can also use an energy shift as a free parameter in the fit. One should be careful when applying this type of extra parameter, as a variable energy shift can lead to nonsensical or misleading results. 
A careful examination should be performed to check that the degree of energy shift is rational and well justified by the experimental conditions. In all cases, all of the resulting parameters of the fit should be checked to make sure that they provide a rational result.
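A minimal sketch of the unconstrained least-squares decomposition, with the sum of the fractions used as a diagnostic, is given below. It is illustrative only: the references are toy Gaussian line shapes rather than measured spectra, they are stored as the columns of `R` (so the model is written as `R @ f` rather than in the row-vector form of equation 7), and the R-factor definition used here (sum of squared residuals divided by the sum of squared data) is one common convention that is assumed for this example.

```python
import numpy as np

def linear_combination_fit(d, R):
    """Unconstrained least-squares fit of d ~ R @ f (references are columns of R)."""
    f, *_ = np.linalg.lstsq(R, d, rcond=None)
    r_factor = np.sum((d - R @ f) ** 2) / np.sum(d ** 2)
    return f, float(r_factor)

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 200)
R = np.column_stack([np.exp(-((x - c) / 0.05) ** 2) for c in (0.2, 0.4, 0.6)])
true_f = np.array([0.7, 0.3, 0.0])
d = R @ true_f + 1e-3 * rng.standard_normal(x.size)

f, r_factor = linear_combination_fit(d, R)
total = f.sum()
print("fractions:", f)
print("sum of fractions:", total, "R factor:", r_factor)
```

Because no sum-to-one constraint is imposed, `total` acts as the check described above: a value far from unity would point to an incomplete reference set, poor data quality or an energy-calibration offset.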

3.2. Combinatorial analysis

While PCA can be used to help to define an appropriate set of reference compounds for a particular assemblage of data spectra, it cannot be applied meaningfully in all situations. This is particularly true in situations where there is not a large set of data spectra to be fitted and the PCA decomposition does not present a well-posed solution. In such cases, combinatorial analysis or brute-force fitting methods can be applied.

3.2.1. Traditional least-squares fitting

Typically, linear combination fitting is performed with an unknown sample and a list of l references. One performs the linear least-squares fit as described in Section 3.1, following the constraints of the system, and looks at the resulting fit. At this point, if non-negativity is not applied as a constraint then the reference with the largest negative fitted fraction is removed and the process is repeated. Once the negative values have been removed, one removes the references with the smallest fitted fractions stepwise and examines the fit to determine whether a significant change in the fit quality occurs on the removal of a reference. This will typically remove those references that make very small contributions that are within the error of the measurement, i.e. those accounting for less than approximately 5% of the total speciation of the element of interest. This approach is the general starting point when conducting XANES or EXAFS linear combination fitting, as often presented in many computer programs used for XAS data fitting (Newville, 2013; Ravel & Newville, 2005; Webb, 2005), and generally reaches satisfactory answers as long as the reference list is complete. As an example, this type of approach was used with an appropriate reference list to determine the average manganese valence states of mixed-valence oxides (Manceau et al., 2012). When this type of fit gives an unsatisfactory result, the general idea is to increase the number of references in the list and/or apply a more thorough investigation of the parameter space defined by the reference list, as described in the more brute-force approaches below.
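The stepwise procedure can be sketched as a simple loop. This is a simplification made for illustration: instead of judging the change in fit quality on each removal, the sketch drops the reference with the lowest fitted fraction whenever that fraction is negative or below a fixed threshold (5% here). The reference shapes, the names `A` to `D` and the function name `stepwise_fit` are hypothetical.

```python
import numpy as np

def stepwise_fit(d, R, names, min_fraction=0.05):
    """Repeatedly refit, dropping the lowest fraction while it is below min_fraction."""
    active = list(range(R.shape[1]))
    while len(active) > 1:
        f, *_ = np.linalg.lstsq(R[:, active], d, rcond=None)
        worst = int(np.argmin(f))          # most negative, or smallest, fraction
        if f[worst] >= min_fraction:
            break
        del active[worst]
    f, *_ = np.linalg.lstsq(R[:, active], d, rcond=None)
    return {names[k]: float(v) for k, v in zip(active, f)}

rng = np.random.default_rng(5)
x = np.linspace(0.0, 1.0, 200)
names = ["A", "B", "C", "D"]
R = np.column_stack([np.exp(-((x - c) / 0.05) ** 2) for c in (0.2, 0.4, 0.6, 0.8)])
d = 0.7 * R[:, 0] + 0.3 * R[:, 2] + 1e-3 * rng.standard_normal(x.size)

result = stepwise_fit(d, R, names)
print(result)
```

In practice the decision to drop a reference would also take into account the change in the figure of merit and prior knowledge of the sample, rather than a fixed threshold alone.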

3.2.2. Matrix fitting

A matrix fit is typically defined as performing a series of fits with a reference compound list in which all combinations of the reference list are used. The fit will then compare the figure of merit of the series of fits: either an R factor or χ². The fit with the best figure of merit will be selected as the correct fit of the unknown. For a reference list of l references the full combinatorial matrix fit computes 2^l − 1 fits, which even for small reference lists can lead to long times to calculate the entire set of fits. The procedure can be shortened in conjunction with a PCA analysis of the number of principal components C determined for the data set as in Section 2.1 . In this situation, one can perform the combinatorial fitting by performing all possible fits up to a maximum of C reference compounds in the reference list.
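A combinatorial matrix fit, optionally limited to at most C references per fit, can be sketched as follows. The example is hedged in the usual way: toy Gaussian shapes stand in for measured references, fits with negative fractions are simply discarded, the R factor is defined as the normalized sum of squared residuals, and the function names are hypothetical.

```python
from itertools import combinations
from math import comb

import numpy as np

def n_fits(l, max_refs=None):
    """Number of fits needed for l references with at most max_refs per fit."""
    max_refs = l if max_refs is None else max_refs
    return sum(comb(l, k) for k in range(1, max_refs + 1))

def matrix_fit(d, R, max_refs=None):
    """Fit every combination of references; return (R factor, combo, f) sorted by R factor."""
    l = R.shape[1]
    max_refs = l if max_refs is None else max_refs
    results = []
    for size in range(1, max_refs + 1):
        for combo in combinations(range(l), size):
            cols = R[:, list(combo)]
            f, *_ = np.linalg.lstsq(cols, d, rcond=None)
            if np.any(f < 0):
                continue                      # reject unphysical negative fractions
            r_factor = float(np.sum((d - cols @ f) ** 2) / np.sum(d ** 2))
            results.append((r_factor, combo, f))
    return sorted(results, key=lambda item: item[0])

rng = np.random.default_rng(6)
x = np.linspace(0.0, 1.0, 200)
R = np.column_stack([np.exp(-((x - c) / 0.05) ** 2) for c in (0.2, 0.4, 0.6, 0.8)])
d = 0.7 * R[:, 0] + 0.3 * R[:, 2] + 1e-3 * rng.standard_normal(x.size)

results = matrix_fit(d, R, max_refs=2)       # C = 2 taken from a prior PCA
best_r, best_combo, best_f = results[0]
print(n_fits(4), n_fits(4, 2), best_combo, best_f)
```

For four references the full matrix fit needs 2^4 − 1 = 15 fits, whereas limiting the fits to two references needs only 10. The saving grows rapidly with the length of the reference list. Limiting the number of references also guards against larger combinations being preferred simply because the extra references fit the noise.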

3.2.3. Cycle fitting

The time involved in matrix-fitting large reference lists was one of the issues that led to development of the `cycle fit' (Kim et al., 2013 ). Cycle fitting will initially fit the experimental spectrum with all l one-component fits for a set of l references and report the associated R factor. The best-fit reference is then selected as the first component for the next cycle with the remaining l − 1 references. This cycle process is repeated, adding the next yth reference with the best-fit R factor and then performing fits with the remaining l − y references, until adding another reference component no longer makes a significant improvement to the figure of merit for fitting. If non-negative least-squares fitting is not being performed, any fits that lead to negative fractions are rejected. Significance for finishing the fit could be considered as a reference that must be >10% of the total fit or one that must cause the R-factor figure of merit to decline by >10%. Generally, one will converge on a best fit much faster than a full combinatorial matrix fit with similar best-fit results, although one is not guaranteed to reach the best answer. An example is where a reference A may be the best fit for a single component, but a set of two or more references B and C together may prove to be a better fit than A alone. In this case, the cycle fit would never reach B and C without A, as A would always effectively be included in the fit. Ideally, the combination of A, B and C would be reached in the cycle fit and the importance of A would be removed. As in all linear combination fits, the sensitivity of the final answer is always worth exploring to ensure that the method utilized is not adding an unexpected bias.
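The cycle-fit idea can be sketched as a greedy forward selection. This is an illustration of the procedure as described here, not the published implementation: the stopping rule used is a decrease of less than 10% in the R factor, fits with negative fractions are rejected, and the toy references and the function name `cycle_fit` are assumptions for the example.

```python
import numpy as np

def r_factor(d, cols):
    """Least-squares fractions and normalized sum of squared residuals."""
    f, *_ = np.linalg.lstsq(cols, d, rcond=None)
    return f, float(np.sum((d - cols @ f) ** 2) / np.sum(d ** 2))

def cycle_fit(d, R, min_improvement=0.10):
    """Add one reference per cycle until the R factor stops improving significantly."""
    chosen, best_r = [], np.inf
    remaining = list(range(R.shape[1]))
    while remaining:
        trials = []
        for k in remaining:
            f, r = r_factor(d, R[:, chosen + [k]])
            if np.all(f >= 0):                 # reject negative fractions
                trials.append((r, k))
        if not trials:
            break
        r, k = min(trials)
        if chosen and r > (1.0 - min_improvement) * best_r:
            break                              # improvement below the threshold
        chosen.append(k)
        remaining.remove(k)
        best_r = r
    f, best_r = r_factor(d, R[:, chosen])
    return chosen, f, best_r

rng = np.random.default_rng(7)
x = np.linspace(0.0, 1.0, 200)
R = np.column_stack([np.exp(-((x - c) / 0.05) ** 2) for c in (0.2, 0.4, 0.6, 0.8)])
d = 0.7 * R[:, 0] + 0.3 * R[:, 2] + 1e-3 * rng.standard_normal(x.size)

chosen, f, best_r = cycle_fit(d, R)
print(chosen, f, best_r)
```

With l references this sketch performs at most l + (l − 1) + ... fits, far fewer than the 2^l − 1 fits of the full matrix approach, but as noted above the first reference selected is never revisited, so the result should be checked for sensitivity to that choice.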

4. Conclusions

Linear combination fitting is one of the fundamental cornerstones of fingerprinting sample analysis. It is most often used to obtain the approximate composition of the sample, on an element mass-balance basis, in terms of a library of known reference spectra. The technique is only as good as the completeness of the reference spectra that are being used. Principal component analysis has many uses towards the same goal of sample fingerprinting, including examining trends in series of spectra, determining end-member spectra and determining the suitability of potential reference compounds. Principal component analysis and linear combination fitting are both powerful tools for fingerprinting analysis and are often used together to validate the fingerprinting process.

References

Cattell, R. B. (1966). Multivariate Behav. Res. 1, 245–276.

Fay, M., Proctor, A., Hoffmann, D., Houalla, M. & Hercules, D. M. (1992). Mikrochim. Acta, 109, 281–293.

Ganio, M., Pouyet, E. S., Webb, S. M., Schmidt Patterson, C. M. & Walton, M. S. (2018). Pure Appl. Chem. 90, 463–475.

Huggett, R. (2017). Fundamentals of Geomorphology, 4th ed. Abingdon: Routledge.

Jolliffe, I. T. (2002). Principal Component Analysis. New York: Springer.

Kaiser, H. F. (1958). Psychometrika, 23, 187–200.

Kim, C. S., Chi, C., Miller, S. R., Rosales, R. A., Sugihara, E. S., Akau, J., Rytuba, J. J. & Webb, S. M. (2013). Environ. Sci. Technol. 47, 8164–8171.

Malinowski, E. R. (1977). Anal. Chem. 49, 612–617.

Malinowski, E. R. (1978). Anal. Chim. Acta, 103, 339–354.

Malinowski, E. R. (2002). Factor Analysis in Chemistry, 3rd ed. New York: John Wiley & Sons.

Malinowski, E. R., Weiner, P. H. & Levinstone, A. R. (1970). J. Phys. Chem. 74, 4537–4542.

Manceau, A., Marcus, M. A. & Grangeon, S. (2012). Am. Mineral. 97, 816–827.

Manceau, A., Marcus, M. & Lenoir, T. (2014). J. Synchrotron Rad. 21, 1140–1147.

Manceau, A., Marcus, M. A. & Tamura, N. (2002). Rev. Mineral. Geochem. 49, 341–428.

Newville, M. (2013). J. Phys. Conf. Ser. 430, 012007.

Ravel, B. & Newville, M. (2005). J. Synchrotron Rad. 12, 537–541.

Ressler, T., Wong, J., Roos, J. & Smith, I. L. (2000). Environ. Sci. Technol. 34, 950–958.

Rossberg, A., Reich, T. & Bernhard, G. (2003). Anal. Bioanal. Chem. 376, 631–638.

Scheinost, A. C., Rossberg, A., Marcus, M., Pfister, S. & Kretzschmar, R. (2005). Phys. Scr. 2005, 1038.

Stark, H. & Woods, J. (1986). Probability, Random Processes and Estimation Theory for Engineers. Upper Saddle River: Prentice-Hall.

Wasserman, S. R., Allen, P. G., Shuh, D. K., Bucher, J. J. & Edelstein, N. M. (1999). J. Synchrotron Rad. 6, 284–286.

Wasserman, S. R., Winans, R. E. & McBeth, R. (1996). Energy Fuels, 10, 392–400.

Weare, B. C. & Nasstrom, J. S. (1982). Mon. Weather Rev. 110, 481–485.

Webb, S. M. (2005). Phys. Scr. 2005, 1011.

Yip, C. W., Connolly, A. J., Vanden Berk, D. E., Ma, Z., Frieman, J. A., SubbaRao, M., Szalay, A. S., Richards, G. T., Hall, P. B., Schneider, D. P., Hopkins, A. M., Trump, J. & Brinkmann, J. (2004). Astron. J. 128, 2603–2630.
