International Tables for Crystallography, Volume I: X-ray absorption spectroscopy and related techniques. Edited by C. T. Chantler, F. Boscherini and B. Bunker. © International Union of Crystallography 2024
International Tables for Crystallography (2024). Vol. I. ch. 5.15, pp. 702-704
https://doi.org/10.1107/S1574870722005511
Chapter 5.15. Bayesian techniques: an overview
(a) Retired from Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany, and (b) Graduate School of Advanced Integration Science, Chiba University, 1-33 Yayoi-cho, Inage, Chiba 263-8522, Japan
The standard single-scattering formula gives the experimental extended X-ray absorption fine-structure (EXAFS) cross section as a function of the wavenumber in terms of a set of model parameters, including the average distances of the atoms involved in producing the EXAFS signal. To solve the inverse problem of determining the model parameters from the cross section, measured over some range of wavenumbers, a Bayesian approach is used. It identifies the subspace of the model-parameter space in which the data essentially determine the model parameters.
Keywords: EXAFS; model parameters; Bayesian approach.
The analysis of X-ray absorption fine-structure (XAFS) data is traditionally based on least-squares fitting methods. However, if the model contains more parameters than the data alone can determine, this leads to an ill-conditioned system of equations (Stern, 1988; Rehr & Albers, 2000). We instead propose a Bayesian approach (Krappe & Rossner, 1985a,b, 2004). It identifies the subspace of the total model-parameter space in which the data predominantly decide the outcome of the fit, whereas in the complementary space a priori assumptions essentially fix the result.
We start the discussion with the direct problem, in which the X-ray absorption cross section μexp(k) at the K edge or an isolated L edge is calculated as a function of the photoelectron wavenumber k. The calculation is based on the single-scattering formula, equation (1), for monoatomic, unoriented samples or samples with cubic symmetry, valid for k > kcut, where the photoelectron energy $\hbar^2 k^2/(2m)$ lies sufficiently far above the threshold of the photoeffect (Stern, 1988; Zabinsky et al., 1995; Bunker, 1983; Tröger et al., 1994). The formula includes the length correction δRj = δRj∥ + δRj⊥ (Fornasini et al., 2004; Rossner et al., 2006), and the sum over scattering paths is truncated beyond the Jth term, a cutoff that is related to the parameter kcut. The model parameters in this approach are the following: the correction factor S0 for many-electron effects, the half-path distances Rj, j = 1, …, J, of the J scattering paths, the projected Debye–Waller (DW) parameters $\sigma_j^2$ and the anharmonicity parameters C3;j. The correction factor S0 does in fact depend slightly on the path j and on k. We define S0 as the average of the actual correction factor over j and over k in the relevant k-range, with the remaining k and j dependence being absorbed in the scattering amplitudes fj(k, Rj). The absorption coefficient for the embedded absorbing atom μ0(k), the amplitudes fj(k, Rj), the scattering phases φj(k) and the damping parameter λ(k) follow from an (approximate) solution of the n-electron scattering problem and are calculated, for example, by the FEFF code (Zabinsky et al., 1995; Ankudinov, 1996; Rehr & Albers, 2000; Kas et al., 2024).
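As an illustration of the direct problem, the following minimal Python sketch evaluates a single-scattering EXAFS sum in its common textbook (third-cumulant) form. It is not a transcription of equation (1): the length correction δRj is omitted, and the coordination numbers N as well as the constant amplitude, phase and mean free path returned by toy_feff are placeholder assumptions standing in for the output of a code such as FEFF.

```python
import math

def chi_single_scattering(k, s0_factor, paths, feff):
    """Textbook single-scattering EXAFS sum at photoelectron wavenumber k (1/Angstrom).

    s0_factor: overall many-electron amplitude reduction factor.
    paths: list of dicts with keys N (coordination number, an assumption of this
           sketch), R (half-path length), sigma2 (DW parameter), C3 (third cumulant).
    feff:  callable (k, R) -> (amplitude f, phase phi, mean free path lam),
           a stand-in for quantities that a scattering code would supply.
    """
    total = 0.0
    for p in paths:
        f, phi, lam = feff(k, p["R"])
        amplitude = (p["N"] * f / (k * p["R"] ** 2)
                     * math.exp(-2.0 * p["sigma2"] * k ** 2)   # DW damping
                     * math.exp(-2.0 * p["R"] / lam))          # mean-free-path damping
        phase = 2.0 * k * p["R"] + phi - (4.0 / 3.0) * p["C3"] * k ** 3
        total += amplitude * math.sin(phase)
    return s0_factor * total

def toy_feff(k, R):
    # Hypothetical values: unit amplitude, zero phase shift, 10 Angstrom mean free path.
    return 1.0, 0.0, 10.0

paths = [{"N": 12, "R": 2.5, "sigma2": 0.005, "C3": 0.0}]
chi = chi_single_scattering(5.0, 0.9, paths, toy_feff)
```

Passing the scattering quantities in as a callable keeps the model parameters (the entries of paths and s0_factor) separate from the quantities that are computed ab initio, which mirrors the division made in the text.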
To solve the inverse problem, that is, to determine the model parameters from a given set of measured values μexp(El) at energies El, we first obtain the wavenumbers $k_l = [2m(E_l - E_0)]^{1/2}/\hbar$, where E0 is the effective threshold energy. Since the latter is known only approximately, it is often treated as another of the model parameters to be determined by the fit. As usual, a smooth background contribution μback(k) is subtracted from μexp(k); it is obtained from a polynomial extrapolation of the pre-edge μexp into the post-edge region as described by Victoreen (1948) and Milledge (1962). Unfortunately, the precise extrapolation recipe has some influence on the final fit parameters. The μexp(k) are measured with an energy-dependent efficiency A(k). As in Krappe & Rossner (2004), we obtain a smoothed function $\bar\mu_{\rm exp}(k)$ from μexp(k) by a polynomial smoothing procedure, described in detail in the appendix to Krappe & Rossner (2004). Similarly, $\bar\mu_0(k)$ is obtained from μ0(k). The ratio $A(k) = \bar\mu_{\rm exp}(k)/\bar\mu_0(k)$ for k > kedge is then interpreted as the overall efficiency of the experimental setup in the EXAFS energy range k > kcut.
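The conversion from photon energy to photoelectron wavenumber can be sketched in a few lines of Python. The function name, the choice of units (eV and inverse ångström) and the threshold value of 9000 eV are assumptions made for this illustration only.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_E = 9.1093837015e-31    # electron mass, kg
EV = 1.602176634e-19      # one electronvolt in joules

def wavenumber(energy_ev, e0_ev):
    """Photoelectron wavenumber in 1/Angstrom for a photon energy and threshold E0 in eV."""
    if energy_ev < e0_ev:
        raise ValueError("the wavenumber is only defined at or above the threshold E0")
    k_per_metre = math.sqrt(2.0 * M_E * (energy_ev - e0_ev) * EV) / HBAR
    return k_per_metre * 1e-10   # convert 1/m to 1/Angstrom

e0 = 9000.0   # hypothetical threshold energy in eV
k_grid = [wavenumber(e0 + de, e0) for de in (50.0, 100.0, 200.0)]
```

Because E0 enters only through this mapping, treating it as a fit parameter means that the grid kl has to be recomputed whenever E0 is updated.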
The FEFF result for μ0 needs corrections (Krappe & Rossner, 2004). We therefore write $\mu_0(k) = \mu_0^{\rm FEFF}(k) + \delta\mu_0(k)$, where δμ0(k) is represented by a cubic spline on an equally spaced grid of support points. Their number T is chosen so that the spline is just flexible enough for the purpose for which it is introduced (Krappe & Rossner, 2004). The ordinates δμt, t = 1, …, T, are also treated as model parameters, to be determined in the fit together with all the others.
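We therefore have to fit a model function $g(k_l, x)$, composed of the efficiency A(k), the corrected atomic absorption μ0(k) and the fine structure χ(k), to the background-subtracted data at the wavenumbers $k_l$ for l = 1, …, L, where χ is given for k > kcut by equation (1). The model parameters S0, Rj, $\sigma_j^2$, C3;j and δμt, together with E0 if it is fitted, form the components of the N-dimensional vector x.
We give the experimental data, collected in the L-dimensional vector y, a Gaussian probability distribution $P_{\rm exp}(y\,|\,z') \propto \exp(-\tfrac12\Psi_{\rm exp})$, characterized by the quadratic form
$$\Psi_{\rm exp} = (y - z')^{\rm T} F\,(y - z').$$
It is usually assumed that the matrix F, which is the inverse of the variance matrix, is diagonal: $F_{ll'} = \delta_{ll'}/(\Delta y_l)^2$, where $\Delta y_l$ is the standard error of $y_l$. One also has to associate errors with the FEFF calculation, because the electron multiple-scattering problem can only be treated approximately, for instance by including in the sum in equation (1) only terms that contribute more than 4% of the total, and because integrals have to be approximated by finite sums. We again associate with these errors a Gaussian probability that z′ is true for a given x, i.e. $P_{\rm theo}(z'\,|\,x) \propto \exp(-\tfrac12\Psi_{\rm theo})$, with
$$\Psi_{\rm theo} = [z' - g(x)]^{\rm T} B\,[z' - g(x)],$$
where the matrix B is the inverse of the variance matrix.
The conditional probability that the outcome of the observation is y, once x is given, may be expressed in terms of $P_{\rm exp}$ and $P_{\rm theo}$ by
$$P_{\rm cond}(y\,|\,x) = \int {\rm d}^L z'\; P_{\rm exp}(y\,|\,z')\,P_{\rm theo}(z'\,|\,x).$$
The integral can be evaluated analytically and yields a Gaussian in g(x), $P_{\rm cond}(y\,|\,x) \propto \exp(-\tfrac12\Psi_{\rm cond})$, with
$$\Psi_{\rm cond} = [y - g(x)]^{\rm T} C\,[y - g(x)] \qquad (6)$$
in terms of the L × L matrix
$$C = F\,(F + B)^{-1} B = (F^{-1} + B^{-1})^{-1}.$$
We expand g(x) around a first-guess value x(0) for the solution of the inverse problem,
$$g(x) \simeq g(x^{(0)}) + G\,(x - x^{(0)}),$$
where the L × N matrix G is defined as
$$G_{ln} = \left.\frac{\partial g(k_l, x)}{\partial x_n}\right|_{x = x^{(0)}}.$$
Inserting into equation (6) and calling x − x(0) in the following x to simplify the notation, one obtains a second-order polynomial in x,
$$\Psi_{\rm cond} = x^{\rm T} Q\,x - 2\,b^{\rm T} x + {\rm const} \qquad (10)$$
in terms of the N × N matrix Q = GTCG and the vector $b = G^{\rm T} C\,[y - g(x^{(0)})]$. The matrix Q is the inverse of the variance matrix of the distribution Pcond in terms of the variable x.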
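The linearization step can be sketched numerically as follows. The sketch assumes that the experimental and theoretical errors are Gaussian with inverse variance matrices F and B, so that folding the two Gaussians gives the combined matrix C = F(F + B)^-1 B, and it takes the vector b to be G^T C [y − g(x(0))]. The Jacobian G is approximated by central differences, and toy_model is a hypothetical stand-in for the EXAFS model function.

```python
import numpy as np

def combined_precision(F, B):
    """Inverse variance matrix of the convolution of two Gaussians: F (F + B)^-1 B."""
    return F @ np.linalg.solve(F + B, B)

def jacobian(g, x0, step=1e-6):
    """Central-difference estimate of G[l, n] = d g_l / d x_n at x0."""
    x0 = np.asarray(x0, dtype=float)
    columns = []
    for n in range(x0.size):
        dx = np.zeros_like(x0)
        dx[n] = step
        columns.append((g(x0 + dx) - g(x0 - dx)) / (2.0 * step))
    return np.column_stack(columns)

def linearized_system(g, x0, y, F, B):
    """Return Q = G^T C G and b = G^T C (y - g(x0)) for the linearized model."""
    x0 = np.asarray(x0, dtype=float)
    C = combined_precision(F, B)
    G = jacobian(g, x0)
    return G.T @ C @ G, G.T @ C @ (y - g(x0))

# Hypothetical toy model: damped sine with amplitude x[0] and frequency x[1].
k = np.linspace(3.0, 12.0, 40)

def toy_model(x):
    return x[0] * np.sin(x[1] * k) * np.exp(-0.01 * k ** 2)

F = np.eye(k.size) / 0.02 ** 2   # diagonal experimental inverse variances
B = np.eye(k.size) / 0.01 ** 2   # assumed inverse variances of the theory
y = toy_model(np.array([1.0, 5.0]))
Q, b = linearized_system(toy_model, [0.9, 5.0], y, F, B)
```

In this toy case the model is linear in the amplitude, so the unregularized solution of Q x = b recovers the amplitude shift of 0.1 from the first guess.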
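In order to find the probability distribution $P_{\rm post}(x\,|\,y)$ for the parameter values x, once the data y are given, we use Bayes' theorem,
$$P_{\rm post}(x\,|\,y) = \frac{P_{\rm cond}(y\,|\,x)\,P_{\rm prior}(x)}{\int {\rm d}^N x\; P_{\rm cond}(y\,|\,x)\,P_{\rm prior}(x)}.$$
Bayes' theorem therefore solves the inversion problem in probability theory, but at the price of introducing the prior probability Pprior(x), which expresses the knowledge that we have about the model parameters before the experiment is made. Let us assume for the moment that we have an average value x(prior) and a variance matrix A−1, so that $P_{\rm prior}(x) \propto \exp(-\tfrac12\Psi_{\rm prior})$ with
$$\Psi_{\rm prior} = (x - x^{({\rm prior})})^{\rm T} A\,(x - x^{({\rm prior})}), \qquad (12)$$
and let us further restrict the matrix A to be diagonal, $A_{nn'} = \alpha_n\,\delta_{nn'}$, and choose x(prior) = x(0), so that in the shifted variable the quadratic form reduces to $x^{\rm T} A\,x$. Maximizing Ppost yields the normal equations
$$(Q + A)\,x = b. \qquad (13)$$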
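To illustrate how the prior regularizes an ill-conditioned fit, the following sketch solves normal equations of the form (Q + A) x = b with a diagonal A. This form is what maximizing a Gaussian posterior with x(prior) = x(0) gives, and the matrices below are hypothetical numbers chosen so that Q is exactly singular.

```python
import numpy as np

def regularized_solution(Q, b, alpha):
    """Solve (Q + A) x = b with A = diag(alpha); x is the shift away from x(0)."""
    A = np.diag(np.asarray(alpha, dtype=float))
    return np.linalg.solve(Q + A, b)

# Hypothetical example: the data constrain only the sum x[0] + x[1],
# so Q is singular and the unregularized normal equations have no unique solution.
Q = np.array([[100.0, 100.0],
              [100.0, 100.0]])
b = np.array([100.0, 100.0])
x_reg = regularized_solution(Q, b, alpha=[1.0, 1.0])
```

The sum x_reg[0] + x_reg[1] stays close to the value 1 demanded by the data, whereas the difference x_reg[0] − x_reg[1], about which the data say nothing, remains at its prior value of zero.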
An optimal choice of the diagonal matrix A must obviously take the quality of the data into account. Turchin and Nozik (Turchin & Nozik, 1969; Turchin et al., 1970; Turchin, 1985) assume that there is a probability distribution of the αn which depends on the data y. They first define the conditional probability
$$P(y\,|\,\alpha) = \int {\rm d}^N x\; P_{\rm cond}(y\,|\,x)\,P_{\rm prior}(x\,|\,\alpha), \qquad (14)$$
a Gaussian integral that is evaluated in closed form with the help of equations (10) and (12). In the resulting expression, the normalization parameter c only contains terms that do not depend on A, Q or b, while the dependence on the matrices A and Q, including contributions from the normalization factors of Pcond and Pprior, appears explicitly.
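However, instead of the probability $P(y\,|\,\alpha)$ of the data y for given α, the inverse conditional probability $P(\alpha\,|\,y)$ is needed. It is obtained by using Bayes' theorem once more:
$$P(\alpha\,|\,y) = \frac{P(y\,|\,\alpha)\,P_{\rm prior}(\alpha)}{\int {\rm d}^N\alpha\; P(y\,|\,\alpha)\,P_{\rm prior}(\alpha)}.$$
Very often the function $P(y\,|\,\alpha)$ defined in equation (14) is sharply peaked in α-space at a point $\hat\alpha$. One can then choose $P_{\rm prior}(\alpha)$ very broadly without affecting the α dependence of $P(\alpha\,|\,y)$ around the peak. Therefore, close to $\hat\alpha$ one has $P(\alpha\,|\,y) \propto P(y\,|\,\alpha)$. One may use this peak value $\hat\alpha$ as the regularization vector in equation (13). With the condition $\partial P(y\,|\,\alpha)/\partial\alpha_n = 0$ at $\alpha = \hat\alpha$, equation (14) yields N nonlinear equations for the vector of eigenvalues $\hat\alpha$:
$$\frac{1}{\hat\alpha_n} = \left[(Q + \hat A)^{-1}\right]_{nn} + \left\{\left[(Q + \hat A)^{-1} b\right]_n\right\}^2$$
for n = 1, …, N, where $\hat A$ is the diagonal matrix with elements $\hat\alpha_n$.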
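One possible way to solve such a set of nonlinear equations is a fixed-point iteration. The sketch below assumes that, as a function of α, P(y | α) is proportional to [det A / det(Q + A)]^(1/2) exp[b^T (Q + A)^-1 b / 2], which is what the Gaussian integral over x gives for a normalized Gaussian prior. Setting the derivative of its logarithm to zero leads to 1/α_n = [(Q + A)^-1]_nn + x_n^2 with x = (Q + A)^-1 b. The iteration scheme itself is a choice made here, not a prescription from the source; it need not converge in general, and α_n can grow without bound for parameters that the data do not determine.

```python
import numpy as np

def turchin_alpha(Q, b, alpha0, rtol=1e-12, max_iter=10000):
    """Iterate alpha_n <- 1 / ([(Q + A)^-1]_nn + x_n^2) with x = (Q + A)^-1 b."""
    alpha = np.asarray(alpha0, dtype=float).copy()
    for _ in range(max_iter):
        cov = np.linalg.inv(Q + np.diag(alpha))   # variance matrix for current alpha
        x = cov @ b                                # regularized solution for current alpha
        new_alpha = 1.0 / (np.diag(cov) + x ** 2)
        converged = np.all(np.abs(new_alpha - alpha) <= rtol * np.abs(alpha))
        alpha = new_alpha
        if converged:
            break
    # Recompute the solution so that it matches the returned alpha.
    x = np.linalg.solve(Q + np.diag(alpha), b)
    return alpha, x

# Hypothetical diagonal example, for which the stationary point is known in
# closed form: alpha_n = q_n^2 / (b_n^2 - q_n).
Q = np.diag([1.0, 4.0])
b = np.array([2.0, 6.0])
alpha_hat, x_hat = turchin_alpha(Q, b, alpha0=[1.0, 1.0])
```

For a diagonal Q with element q and right-hand side b, the stationarity condition has a positive solution only if b^2 > q, that is, only if the data shift the parameter by more than its statistical uncertainty; otherwise the iteration drives α upwards and the prior value is retained.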
Note that the regularization method sketched above does not require an a priori restriction of the number of model parameters. Instead, it automatically determines the subspace of the whole model-parameter space in which the data determine the outcome of the fit. In the complementary space the a priori values x(prior) determine the fit. Strong error correlations between two model parameters indicate that the data do not determine them independently.
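Error correlations of this kind can be read off from the posterior variance matrix. The sketch below assumes a Gaussian posterior whose variance matrix is (Q + A)^-1 with diagonal A, as follows from the linearized treatment; the numerical values of Q and α are hypothetical.

```python
import numpy as np

def error_correlations(Q, alpha):
    """Return the posterior variance matrix (Q + A)^-1 and its correlation matrix."""
    cov = np.linalg.inv(Q + np.diag(np.asarray(alpha, dtype=float)))
    sigma = np.sqrt(np.diag(cov))          # standard errors of the parameters
    return cov, cov / np.outer(sigma, sigma)

# Hypothetical example in which the data mainly fix the sum of two parameters.
Q = np.array([[2.0, 1.9],
              [1.9, 2.0]])
cov, corr = error_correlations(Q, alpha=[0.1, 0.1])
```

A correlation coefficient close to −1, as obtained here, signals that only a combination of the two parameters is fixed by the data.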
More extended versions of this article, which include applications to some typical EXAFS and magnetic EXAFS examples, can be found in Krappe & Rossner (2004) and Krappe et al. (2014).
References
Ankudinov, A. L. (1996). PhD thesis. University of Washington, USA.
Bunker, G. (1983). Nucl. Instrum. Methods Phys. Res. 207, 437–444.
Fornasini, P., a Beccara, S., Dalba, G., Grisenti, R., Sanson, A., Vaccari, M. & Rocca, F. (2004). Phys. Rev. B, 70, 174301.
Kas, J. J., Vila, F. D. & Rehr, J. J. (2024). Int. Tables Crystallogr. I, ch. 6.8, 764–769.
Krappe, H. J., Holub-Krappe, E., Konishi, T. & Rossner, H. H. (2014). XAS Research Review, Vol. 13. The International X-ray Absorption Society.
Krappe, H. J. & Rossner, H. H. (1985a). Advanced Methods in the Evaluation of Nuclear Scattering Data, edited by H. J. Krappe & R. Lipperheide, pp. 215–222. Berlin, Heidelberg: Springer-Verlag.
Krappe, H. J. & Rossner, H. H. (1985b). Advanced Methods in the Evaluation of Nuclear Scattering Data, edited by H. J. Krappe & R. Lipperheide, pp. 242–248. Berlin, Heidelberg: Springer-Verlag.
Krappe, H. J. & Rossner, H. H. (2004). Phys. Rev. B, 70, 104102.
Milledge, H. J. (1962). International Tables for X-ray Crystallography, Volume III, edited by C. H. MacGillavry & G. D. Rieck, pp. 171–173. Birmingham: The Kynoch Press.
Rehr, J. J. & Albers, R. C. (2000). Rev. Mod. Phys. 72, 621–654.
Rossner, H. H., Schmitz, D., Imperia, P., Krappe, H. J. & Rehr, J. J. (2006). Phys. Rev. B, 74, 134107.
Stern, E. A. (1988). X-ray Absorption: Principles, Applications, Techniques of EXAFS, SEXAFS and XANES, edited by D. C. Koningsberger & R. Prins, pp. 3–52. New York: John Wiley & Sons.
Tröger, L., Yokoyama, T., Arvanitis, D., Lederer, T., Tischer, M. & Baberschke, K. (1994). Phys. Rev. B, 49, 888–903.
Turchin, V. F. (1985). Advanced Methods in the Evaluation of Nuclear Scattering Data, edited by H. J. Krappe & R. Lipperheide, pp. 33–49. Berlin, Heidelberg: Springer-Verlag.
Turchin, V. F., Kozlov, V. P. & Malkevich, M. S. (1970). Usp. Fiz. Nauk, 102, 345–386.
Turchin, V. F. & Nozik, V. Z. (1969). Izv. Akad. Nauk. SSSR Ser. Fiz. Atm. Okeana, 5, 29.
Victoreen, J. A. (1948). J. Appl. Phys. 19, 855–860.
Zabinsky, S. I., Rehr, J. J., Ankudinov, A., Albers, R. C. & Eller, M. J. (1995). Phys. Rev. B, 52, 2995–3009.