International Tables for Crystallography, Volume I
X-ray absorption spectroscopy and related techniques
Edited by C. T. Chantler, F. Boscherini and B. Bunker

International Tables for Crystallography (2022). Vol. I. Early view chapter
https://doi.org/10.1107/S1574870720007296

## Statistical measure of confidence

Corwin H. Booth$^{a*}$

$^{a}$Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Correspondence e-mail: corwin.booth@gmail.com

The effects of random errors in EXAFS data are considered when estimating fit-parameter uncertainties. Reviewed methods include conventional $\chi^2$ minimization and the F-test.

Keywords: accuracy; errors.

### 1. Introduction

The EXAFS technique is not a count-rate-limited technique; that is, even if one could collect an infinite number of photons, there would still be limitations on the accuracy of the measurement. Although this is ultimately the case due to limitations of sample preparation (Bridges, 2022), beam issues (for example harmonic contamination; Chantler, 2022) and even errors in the lineshapes used for fitting (Booth, 2021), there are many cases where a given measurement (as opposed to the technique) is limited by the number of photons collected simply because the absorbing species in a sample is dilute. There are other cases where a source of error is partially random, such as time-dependent beam-stability issues coupled with sample non-uniformity. In cases such as these, a proper treatment of the random (assumed normally distributed) errors in EXAFS can, in fact, give an accurate estimate of the uncertainties (standard deviations) $s_j$ in the parameters $p_j$ used to determine the fit function $f_i$ at each data point i. The discussion below focuses on conventional error-analysis techniques as applied to EXAFS, much of which is discussed more completely in the literature (see, for example, Filipponi, 1995). Note that other methods are in use, most notably those utilizing reverse Monte Carlo techniques (Gurman & McGreevy, 1990; Curis & Bénazeth, 2005).

Assuming that an estimate $e_i^2$ of the variance on a given measurement $x_i$ is representative of the actual variance of the distribution of $x_i$, one may calculate the statistical $\chi^2$,

$$\chi^2 = \frac{n_{\rm ind}}{n_{\rm tot}} \sum_{i=1}^{n_{\rm tot}} \frac{(x_i - f_i)^2}{e_i^2}, \tag{1}$$

where the usual formula for $\chi^2$ has been scaled to account for the typical issue in spectroscopy that the total number of data points $n_{\rm tot}$ is not necessarily the same as the number of independent data points $n_{\rm ind}$, as will be shown below. With this definition, one may use standard techniques for error analysis, where $\chi^2/\nu \to 1$ for large degrees of freedom, $\nu = n_{\rm ind} - n_{\rm par}$, and $n_{\rm par}$ is the number of parameters used in the fit. In this section, ways of estimating the uncertainties from random errors will be presented. Since determining errors requires an understanding of the information content in EXAFS spectra to a certain degree, some Fourier transform concepts will also be presented.
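As a minimal sketch of this bookkeeping, the following assumes the scaled form $\chi^2 = (n_{\rm ind}/n_{\rm tot}) \sum_i (x_i - f_i)^2/e_i^2$, with $e_i$ the estimated standard deviation of each point. The function names and the numbers are hypothetical and are not taken from any fitting package.

```python
def scaled_chi2(x, f, e, n_ind):
    """Chi-squared scaled by n_ind / n_tot (assumed form, see the text above)."""
    n_tot = len(x)
    if not (len(f) == len(e) == n_tot):
        raise ValueError("x, f and e must have the same length")
    total = sum(((xi - fi) / ei) ** 2 for xi, fi, ei in zip(x, f, e))
    return (n_ind / n_tot) * total


def reduced_chi2(chi2, n_ind, n_par):
    """chi2 / nu with nu = n_ind - n_par; undefined unless nu is positive."""
    nu = n_ind - n_par
    if nu <= 0:
        raise ValueError("need more independent points than fit parameters")
    return chi2 / nu


# Hypothetical numbers, for illustration only.
x = [1.0, 2.1, 2.9, 4.2]   # measured values x_i
f = [1.0, 2.0, 3.0, 4.0]   # fit function f_i
e = [0.1, 0.1, 0.1, 0.1]   # estimated standard deviations e_i
chi2 = scaled_chi2(x, f, e, n_ind=2)
print(chi2, reduced_chi2(chi2, n_ind=2, n_par=1))
```

A reduced value well above 1 would suggest either an inadequate model or underestimated $e_i$; the sketch does not distinguish between the two.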

### 2. Information content

Typical EXAFS fitting techniques are limited in their information content both by the range in k of the data themselves and by the range in r of any model fitted to these data. This limitation exists whether a spectrum is fitted in k-space or r-space, since data are effectively limited in their r range by the number of scattering shells used in a given model. When fitting in r-space with defined ranges in k and r, the number of independent data points $n_{\rm ind}$ is

$$n_{\rm ind} = \frac{2\,\Delta k\,\Delta r}{\pi} + 2, \tag{2}$$

where $\Delta k$ and $\Delta r$ are the widths of the fit ranges in k and r, and the fit ranges are inclusive; that is, the 2 accounts for the data points at the beginning and the end of the fit range. This formula is a variant of the Shannon–Nyquist theorem, but is generally referred to in EXAFS work as 'Stern's rule' (Stern, 1993), and has also been verified in simulations (Booth & Hu, 2009).
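Stern's rule, $n_{\rm ind} = 2\Delta k\,\Delta r/\pi + 2$, is simple to evaluate. The sketch below uses a hypothetical helper name; the ranges are illustrative, pairing the transform range quoted in the Fig. 1 caption with an r range of 1 to 5 Å.

```python
import math


def n_independent(k_min, k_max, r_min, r_max):
    """Number of independent points from Stern's rule: 2*dk*dr/pi + 2."""
    if k_max <= k_min or r_max <= r_min:
        raise ValueError("ranges must have positive width")
    return 2.0 * (k_max - k_min) * (r_max - r_min) / math.pi + 2.0


# Illustrative ranges: k in inverse angstroms, r in angstroms.
n_ind = n_independent(2.5, 15.8, 1.0, 5.0)
print(f"n_ind = {n_ind:.1f}")
```

With these ranges the rule gives roughly 36 independent points, so a model with more than about 35 free parameters could not be fitted uniquely.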

The limitation on $n_{\rm ind}$ is in fact a mathematical identity, a fact that is sometimes misunderstood due to the practice of interpolating data onto a fixed grid in k-space with a constant separation between data points δk, which is a requirement of the fast Fourier transform (FFT) method (Cooley & Tukey, 1965; Gauss, 1866). When applying a particular model to the fit of an EXAFS spectrum, any statistical estimate of the fit-parameter uncertainties in that model depends on not trying to fit to data with more fitting parameters than independent data points. In other words, the degrees of freedom ν of a fit should not be negative if the fit is to be unique.

A corollary to the number of independent data points is the limitation on the number of fit parameters within any particular r range within the total fit range. This limitation is more difficult to quantify owing to the extended range in r of any specific scattering peak, so the usual rule of thumb is to not attempt to fit two scattering shells with equal coordination and pair-distribution widths if they are close enough together that the first beat frequency is outside the fit range; otherwise, the difference in pair distance, $\delta r = r_2 - r_1$, will be too strongly correlated to the Debye–Waller factors. This limit occurs when

$$\delta r = \frac{\pi}{2 k_{\max}}, \tag{3}$$

where $k_{\max}$ is the upper limit of the fit range. In practice, it is important to realize that this limit is a best-case scenario and that, for instance, if the coordination numbers of the two proposed shells are not equal, the correlation between N, r and σ will be stronger. These correlations should be observable in a conventional error analysis.
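The resolution limit can be illustrated numerically. For two shells of equal amplitude and identical phase shifts (a simplifying assumption), the summed signal is proportional to $\sin(2k\bar r + \phi)\cos(k\,\delta r)$, so the first beat node falls at $k\,\delta r = \pi/2$; requiring that node to lie within the data gives $\delta r \geq \pi/(2k_{\max})$. The helper name below is hypothetical.

```python
import math


def min_resolvable_delta_r(k_max):
    """Smallest pair-distance splitting whose first beat node lies at or below k_max."""
    if k_max <= 0:
        raise ValueError("k_max must be positive")
    return math.pi / (2.0 * k_max)


# k_max in inverse angstroms; 15.8 is the upper transform limit quoted for Fig. 1.
delta_r = min_resolvable_delta_r(15.8)
print(f"delta_r >= {delta_r:.3f} angstrom")
```

As the text notes, this is a best-case figure: unequal coordination numbers or widths make the practical limit worse.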

### 3. Estimates of parameter errors from multiple spectra

Bearing in mind the principles in Section 2 when constructing a fit model, the best and easiest method for determining truly random uncertainties of the final fit parameters is to make multiple fits to many spectra and to average the parameter $p_j$ results to determine $s_j$. This method has real advantages over other methods relying on a single spectrum because the uncertainties in the background functions, especially the so-called 'post-edge' atomic background absorption function $\mu_0(E)$, can be better included. For instance, one might mistakenly assume that random noise is strictly proportional to the square root of the collected number of photons and that, therefore, relative uncertainties in χ increase with k; however, errors in $\mu_0(E)$ are generally largest at low k due to background-subtraction issues. Such errors also translate to larger errors at low r, as depicted in Fig. 1. It is important to understand that errors determined in this way still do not account for systematic errors or for errors that are not normally distributed.
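A minimal sketch of this procedure is given below, assuming each spectrum has already been fitted independently and only the resulting values of one parameter are collected. The function name and the pair distances are hypothetical. Both the scatter between fits (sd) and the standard deviation of the mean (sdom) are returned, since which one is quoted as $s_j$ depends on whether the uncertainty of a single measurement or of the averaged result is wanted.

```python
import math
import statistics


def summarize_parameter(fit_values):
    """Mean, sample standard deviation and sdom of one fit parameter."""
    n = len(fit_values)
    if n < 2:
        raise ValueError("at least two independent fits are needed")
    mean = statistics.mean(fit_values)
    sd = statistics.stdev(fit_values)      # scatter between individual fits
    return mean, sd, sd / math.sqrt(n)     # sdom = sd / sqrt(n)


# Hypothetical pair distances (angstroms) from fits to four separate scans.
r_fits = [2.551, 2.549, 2.552, 2.548]
r_mean, r_sd, r_sdom = summarize_parameter(r_fits)
print(f"r = {r_mean:.4f}, sd = {r_sd:.4f}, sdom = {r_sdom:.4f}")
```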

Figure 1. (a) Fit results for sample copper-foil data. Data were transformed between 2.5 and 15.8 Å$^{-1}$ with a 0.3 Å$^{-1}$ wide Gaussian window. Data are from the average of eight scans. Error bars [difficult to discern in this plot, see (b)] were determined from the standard deviation of the mean (sdom) of these scans. (b) Estimated error (sdom) of the modulus in (a).

### 4. Estimates of parameter errors utilizing $\chi^2$ methods

While sophisticated approaches to error estimates in EXAFS data analysis based on a Bayesian formalism that generalizes the least-squares method in multi-parameter space have been illustrated (Krappe & Rossner, 2000), conventional $\chi^2$ methods remain the dominant tool for determining the fit-parameter errors $s_j$.

In order to perform a standard statistical $\chi^2$ analysis to obtain $s_j$, one first needs to have some estimate of $e_i$. A good method is to average many spectra and determine the standard deviation of the mean as an estimate of $e_i$ at each value of k or, alternatively, to do the same on the real and imaginary parts of the Fourier transform in r-space, as depicted in Fig. 1. One may also propagate the k-space-measured errors into the Fourier transform (Curis & Bénazeth, 2000). Unfortunately, most fitting routines do not allow an estimated error as a function of k or r. This limitation is not considered to be very significant, since systematic sources of error generally make estimates of parameter uncertainties somewhat unreliable. An alternative method is to estimate data uncertainties by the magnitude average over some range in r at high r of a Fourier transform where the oscillations in χ(k) have presumably been overwhelmed by multiple-scattering interference and the factor of $1/r^2$ in the EXAFS equation. This estimate does not account for the larger errors due to background corrections, but can still be an overestimate in some spectra because EXAFS oscillations from the structure may still contribute. A copper-foil spectrum is an extreme example, where for a single spectrum for the data in Fig. 1 one would estimate an uncertainty of about $e \simeq 1$ in these $k^3$-weighted units from the high-r data, whereas the standard deviation (sd) between multiple spectra in the typical fit range between about 1 and 5 Å is about ${\rm sd} \gtrsim \sqrt{8} \times 0.02 \simeq 0.06$, given the approximate average in the fit range of 0.02 from the figure and the eight measurements that were averaged.
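The point-by-point estimate of $e_i$ from repeated scans might be sketched as follows, assuming all scans have already been interpolated onto a common grid. The function name and the toy scans are hypothetical. The last lines repeat the conversion used above, sd = √n × sdom, for eight scans with an average sdom of 0.02.

```python
import math
import statistics


def pointwise_sdom(scans):
    """Average several scans and return (means, sdom) at each grid point."""
    n = len(scans)
    if n < 2:
        raise ValueError("at least two scans are needed")
    if len({len(scan) for scan in scans}) != 1:
        raise ValueError("all scans must share the same grid")
    means, errors = [], []
    for column in zip(*scans):             # one column per grid point
        means.append(statistics.mean(column))
        errors.append(statistics.stdev(column) / math.sqrt(n))
    return means, errors


# Three hypothetical scans sampled on a two-point grid.
scans = [[1.0, 2.0], [1.2, 2.0], [0.8, 2.0]]
means, errors = pointwise_sdom(scans)
print(means, errors)

# Scatter between single scans implied by an sdom of 0.02 over eight scans.
sd_single = math.sqrt(8) * 0.02
print(f"sd = {sd_single:.3f}")
```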

Once the uncertainty per data point has been determined, one may perform a standard error estimate using statistical $\chi^2$ minimization methods. Once one has obtained the best fit, for instance by using the Levenberg–Marquardt method, the parameter errors $s_j$ are typically obtained from the diagonal elements of the inverse curvature (Hessian or approximate Hessian) matrix, which gives the covariance matrix (Press et al., 1992). This method works well for high-quality fits and variations of $\chi^2(p_j)$ around its minimum where the parameters $p_j = p_{j0}$ are well behaved, i.e. $\chi^2(p_j)$ is at a global minimum and is a quadratic function in the vicinity of $p_{j0}$. However, this method can underestimate parameter uncertainties in many EXAFS models because the models and parameter correlations are often such that $\chi^2(p_j)$ is not well approximated around its minimum by the low-order Taylor expansion

$$\chi^2(p_j) \simeq \chi^2(p_{j0}) + \frac{1}{2}\left.\frac{\partial^2 \chi^2}{\partial p_j^2}\right|_{p_{j0}} (p_j - p_{j0})^2.$$

A clear example occurs for moderately large values of a Debye–Waller factor parameter $\sigma_j$, which necessarily has asymmetric uncertainties around $\sigma_{j0}$. In these cases, a profiling method may be used that is less sensitive to the expansion approximation above, where one first determines the best-fit parameter $p_{j0}$ and then determines the error around it by freezing all $p_{k \neq j} = p_{k0}$ and varying $p_j$ around $p_{j0}$ until $\chi^2(p_j) = \chi^2(p_{j0}) + 1$ (Arndt & MacGregor, 1966; Bevington & Robinson, 1992), at which point the standard deviation for $p_{j0}$ is $s_j = |p_j - p_{j0}|$. This method naturally allows the determination of both the positive and negative uncertainties around a given $p_{j0}$ and accounts for parameter correlations.
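The profiling step can be sketched for a single parameter as below, assuming the criterion $\chi^2(p_j) = \chi^2(p_{j0}) + 1$ and a $\chi^2$ that rises monotonically on each side of its minimum. The function `profile_bounds` and the toy $\chi^2$ are hypothetical; in a real analysis `chi2` would wrap the fit statistic with the other parameters held at their best-fit values, as described above.

```python
def profile_bounds(chi2, p0, step, tol=1e-8, max_expand=60):
    """Return (minus, plus) distances from p0 at which chi2 rises by 1."""
    target = chi2(p0) + 1.0

    def crossing(direction):
        lo, hi = 0.0, step
        # Grow the bracket until chi2 reaches the target on this side.
        for _ in range(max_expand):
            if chi2(p0 + direction * hi) >= target:
                break
            lo, hi = hi, 2.0 * hi
        else:
            raise RuntimeError("chi2 never rises by 1 on this side")
        # Bisect for the crossing point inside the bracket.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if chi2(p0 + direction * mid) >= target:
                hi = mid
            else:
                lo = mid
        return 0.5 * (lo + hi)

    return crossing(-1.0), crossing(+1.0)


def toy_chi2(p):
    """Toy, deliberately non-quadratic chi2 with its minimum at p = 2."""
    return 100.0 * (1.0 / p - 0.5) ** 2


minus, plus = profile_bounds(toy_chi2, p0=2.0, step=0.01)
print(f"p = 2.0 (-{minus:.3f} / +{plus:.3f})")
```

For this toy function the two bounds differ (about 0.33 below and 0.50 above the minimum), which a single curvature-based $s_j$ could not represent.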

### 5. Using the F-test for model evaluation

As described above, the estimation of uncertainties in the data is critical to determining parameter uncertainties in experiments limited by random variations in the data. If one is trying to determine whether one fitting model is statistically more significant than another, the $\chi^2$ test is best eschewed in favour of the F-test (Bevington & Robinson, 1992), which has been applied in EXAFS methodologies (Joyner et al., 1987; Freund, 1991; Michalowicz et al., 1999; Klementev, 2001; Piazza, 2002; Downward et al., 2007) and is not as dependent on explicitly determining the data errors, although it still assumes they are normally distributed. F is defined as

$$F = \frac{\chi_1^2/\nu_1}{\chi_0^2/\nu_0}, \tag{4}$$

where the subscript 0 denotes the better of the two fits. The assumption in equation (4) is that the fit parameters are independent between $\nu_0$ and $\nu_1$. A formulation that is commonly applied in crystallography (Hamilton, 1965) considers that only some of the parameters are actually different; that is, model 1 is nested within model 0 (for instance, model 0 includes an additional scattering shell):

$$F = \frac{(\chi_1^2 - \chi_0^2)/(\nu_1 - \nu_0)}{\chi_0^2/\nu_0}. \tag{5}$$

Since the estimated data errors are the same for $\chi_0^2$ and $\chi_1^2$ and they approximately cancel in F, the fit residual $R$ can be used instead as long as $R$ is defined such that $R^2 \propto \chi^2$ in equations (4) and (5). Note that some fitting routines, in particular IFEFFIT, define $R$ as the square of the definition used here, that is, $R_{\rm IFEFFIT} = R^2$.

An alternative formulation (Hamilton, 1965) that makes explicit the number of degrees of freedom that have changed between the two fit models utilizes $n_{\rm ind}$, the number of fit parameters in the better fit m, and $b = \nu_1 - \nu_0$ or the number of fit parameters that have changed, depending on the situation:

$$F = \left[\left(\frac{R_1}{R_0}\right)^2 - 1\right]\frac{n_{\rm ind} - m}{b}. \tag{6}$$

Once F has been defined, one can calculate the probability α that the experimentally determined F is actually smaller than the actual F distribution for the model to give the degree of confidence of the fit (Bacchi et al., 1996),

$$\alpha = 1 - I_x\!\left(\frac{n_{\rm ind} - m}{2}, \frac{b}{2}\right), \tag{7}$$

where $I_x$ is the incomplete beta function and $x = (n_{\rm ind} - m)/(n_{\rm ind} - m + bF)$. For $R_0$ to represent a significantly better fit than $R_1$, α needs to be greater than 67%, and is generally not said to have passed the F-test until α ≥ 95%.

A simple example of applying the F-test is in testing whether adding a scattering shell has a significant effect on the fit. Assuming a fit with m = 7 (a two-shell fit with a single $\Delta E_0$ and each shell having individual N, r and σ parameters), b = 3 (testing whether the second shell is necessary), $n_{\rm ind}$ = 20, and gives α = 99.9%, which passes the F-test. Other examples of applying these equations to EXAFS are given by Downward et al. (2007).
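A calculation of this kind can be sketched with the Python standard library alone. The sketch assumes the Hamilton-type form $F = [(R_1/R_0)^2 - 1](n_{\rm ind} - m)/b$ and $\alpha = 1 - I_x[(n_{\rm ind} - m)/2, b/2]$ with $x = (n_{\rm ind} - m)/(n_{\rm ind} - m + bF)$, where $I_x$ is the regularized incomplete beta function, evaluated here with a standard continued-fraction expansion. All function names are hypothetical, and the ratio $R_1/R_0 = 2$ is an illustrative value chosen here, not the value behind the 99.9% quoted above.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step, then odd step, of the continued fraction.
        for aa in (
            m * (b - m) * x / ((a + m2 - 1.0) * (a + m2)),
            -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            return h
    raise RuntimeError("continued fraction did not converge")


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for a, b > 0 and 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_test_confidence(r_ratio, n_ind, m, b):
    """Return (F, alpha) for residual ratio R1/R0 >= 1 (assumed formulas above)."""
    if r_ratio < 1.0:
        raise ValueError("R1/R0 must be >= 1 (subscript 0 is the better fit)")
    dof = n_ind - m
    f_value = (r_ratio ** 2 - 1.0) * dof / b
    x = dof / (dof + b * f_value)
    alpha = 1.0 - regularized_incomplete_beta(dof / 2.0, b / 2.0, x)
    return f_value, alpha


# m, b and n_ind follow the example above; the ratio of 2 is hypothetical.
f_value, alpha = f_test_confidence(r_ratio=2.0, n_ind=20, m=7, b=3)
print(f"F = {f_value:.2f}, alpha = {100.0 * alpha:.2f}%")
```

If SciPy is available, `scipy.special.betainc(a, b, x)` computes the same regularized function and could replace the hand-written helper.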

### References

Arndt, R. A. & MacGregor, M. H. (1966). Methods Comput. Phys. 6, 253.
Bacchi, A., Lamzin, V. S. & Wilson, K. S. (1996). Acta Cryst. D52, 641–646.
Bevington, P. R. & Robinson, D. K. (1992). Data Reduction and Error Analysis for the Physical Sciences, 2nd ed., ch. 11. Boston: WBC/McGraw-Hill.
Booth, C. H. (2021). Int. Tables Crystallogr. I, https://doi.org/10.1107/S1574870720007442.
Booth, C. H. & Hu, Y.-J. (2009). J. Phys. Conf. Ser. 190, 012028.
Bridges, F. (2022). Int. Tables Crystallogr. I. In the press.
Chantler, C. T. (2022). Int. Tables Crystallogr. I. In the press.
Cooley, J. W. & Tukey, J. W. (1965). Math. Comput. 19, 297.
Curis, E. & Bénazeth, S. (2000). J. Synchrotron Rad. 7, 262–266.
Curis, E. & Bénazeth, S. (2005). J. Synchrotron Rad. 12, 361–373.
Downward, L., Booth, C. H., Lukens, W. W. & Bridges, F. (2007). AIP Conf. Proc. 882, 129–131.
Filipponi, A. (1995). J. Phys. Condens. Matter, 7, 9343–9356.
Freund, J. (1991). Phys. Lett. A, 157, 256–260.
Gauss, C. F. (1866). Werke, Vol. 3, pp. 265–320. Göttingen: Akademie der Wissenschaften.
Gurman, S. J. & McGreevy, R. L. (1990). J. Phys. Condens. Matter, 2, 9463–9473.
Hamilton, W. C. (1965). Acta Cryst. 18, 502–510.
Joyner, R. W., Martin, K. J. & Meehan, P. (1987). J. Phys. C Solid State Phys. 20, 4005–4012.
Klementev, K. V. (2001). Nucl. Instrum. Methods Phys. Res. A, 470, 310–314.
Krappe, H. J. & Rossner, H. H. (2000). Phys. Rev. B, 61, 6596–6610.
Michalowicz, A., Provost, K., Laruelle, S., Mimouni, A. & Vlaic, G. (1999). J. Synchrotron Rad. 6, 233–235.
Piazza, F. (2002). J. Phys. Condens. Matter, 14, 11623–11634.
Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. (1992). Numerical Recipes in Fortran 77: The Art of Scientific Computing, 2nd ed., ch. 15. Cambridge University Press.
Stern, E. A. (1993). Phys. Rev. B, 48, 9825–9827.