SIXPack is a software package that encompasses the entire range of XAS analysis. Sam's Interface for XAS Package, or SIXPack for short, is the unification of the previously named SamXAS and SamView programs into a single analysis package. Thus the package can guide the user through data averaging and calibration, background removal, and many aspects of fitting.
The interface builds on Matt Newville's IFEFFIT engine. It contains six basic modules:
SamView - an interface for averaging, calibrating, and deadtime correcting raw data.
Background subtraction of data to create normalized mu(E) and chi(k) functions.
A FEFF periodic table interface to create single scattering phase and amplitude paths - compatible with FEFF6 through FEFF8.
A GUI for the fitting of experimental EXAFS to theoretically derived phase and amplitude files.
A GUI for linear combination fitting of EXAFS or XANES to experimentally obtained reference spectra.
A principal component analysis routine that decomposes data into a set of orthogonal components. Also features a target transformation feature to check if potential reference spectra are in the component space defined by the samples.
The SamView module is a general purpose XAS data preprocessing program. It reads raw data collected from a beamline and allows the user to view raw data, average scans, calibrate energy, and perform deadtime corrections for solid state detectors. The data formats currently supported are:
SSRL XAScollect binaries
SSRL XAScollect ASCII files
DND-CAT quick-scanning XAS binary files
DND-CAT ASCII files from step XAS scans
DND-CAT ASCII files from multi-element detectors
Generic ASCII format for other formats
Support for ALS data files is coming soon. Support for other formats is planned, possibly through plug-ins. A generic ASCII file loader is provided to accommodate other file formats.
The background subtraction module adds gaussian-type pre-edge removal to the standard complement of background removal and spline fitting. The gaussian method is a useful approach for removing the scattering contributions of low concentration samples collected with solid-state detectors. A special feature of the XANES fitting routine is the ability to correct for self-absorption in spectra obtained using fluorescence yield detectors (e.g. Lytle cells or Ge solid state arrays).
It is important to realize that although SIXPack, IFEFFIT and other programs are powerful tools for processing and analyzing XAS data, nothing replaces experience and sound judgment. This is especially true for EXAFS fitting. It is important to make sure that the results of a fit are physically meaningful and that the model is appropriate before publishing results.
The program is still in its early stages of development and is constantly undergoing evolution and change. The website executables are updated often (around once a month) and can be redownloaded and installed over previous versions. All questions and particularly comments for improvement are welcomed from users.
NOTE: Any directories containing programs and/or data cannot have the '-' character in them! Data files should not contain spaces. These characters will cause the program to malfunction. Use the underscore characters instead. Also, data files must not begin with numbers as the first character.
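The naming rules in this note can be checked before loading files. The following is a minimal sketch and is not part of SIXPack; the function name sixpack_path_problems is hypothetical, and it only tests the three rules stated above.

```python
import os

def sixpack_path_problems(path):
    """Return a list of naming problems for `path`, based on the note above.

    Hypothetical helper, not part of SIXPack. Checks: no '-' in the
    directory, no spaces in the file name, no leading digit in the file name.
    """
    problems = []
    directory, filename = os.path.split(path)
    if "-" in directory:
        problems.append("directory contains '-'")
    if " " in filename:
        problems.append("file name contains a space")
    if filename[:1].isdigit():
        problems.append("file name begins with a number")
    return problems
```

An empty list means that none of the three rules is violated.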
Installation of SIXPack (Win32) can be done in one of two ways.
Direct installation of the SIXPack zip file. Download the zip file and extract it into any directory.
Installation of Matt Newville's IFEFFIT installer. If this option is chosen, the version of SIXPack should be updated from the website to get the latest release. This is done by copying the new SIXPack executable (sixpack.exe) to the IFEFFIT install directory. Additionally, the xbms folder, mnexafsfit.iff, and sixpack.gif should be recopied into the sixpack directory in the IFEFFIT install directory.
After installation, program modules are started from the main menu screen after running the SIXPack executable (sixpack.exe).
USAGE NOTE: Many of the plots are created using IFEFFIT in the PGPLOT (or GrWin under Win32) window. This window must stay open while running the program. If the window is closed, new graphs cannot be created and the program will crash if a plot is requested.
A tutorial that goes through some of the basic functions of SIXPack and highlights some of the special features is being developed. Click here for a link to the tutorial page.
SamView is a basic preprocessing module for XAS data. It uses raw XAS data scan files for input. After loading, the individual data columns (typically I0, I1, I2, IF, etc.) can be observed and inspected, as well as all the data columns of a multi-element detector if used. Several types of manipulation can be performed, including detector deadtime corrections, energy calibration, and data averaging. After manipulation of the input, the processed data can be saved as a two column ASCII text file. The SamView window is large due to the many functions that it contains. On computers with small screen resolutions, the window appears with slider bars on the sides so all of the functions of the module may be accessed.
SIXPack utilizes several data file formats. The package is programmed to automatically recognize SSRL binary and ASCII formats, DND-CAT CS-XAS binaries, step-scan ASCII, and multi-element detector ASCII formats. Additionally, the data loader will present a generic ASCII loader if the format is not recognized. Future plans include the ability to read ALS data files and NetCDF formats. NOTE: Directories containing programs and/or data cannot have the '-' character in them! This will cause the program to malfunction. Use the underscore character instead. Also, data file names must not begin with a number.
Files are loaded into SamView by clicking on the "Add File" or "Add Many Files" button. The "Add File" command allows for adding a single file to the project files list, whereas the "Add Many Files" allows the user to click on several file names for multiple file loading. Files loaded in this manner need only be clicked once to select/deselect. The shift/command keys need not be pressed. If the file formats are recognized, they will be automatically entered in the project files list. Otherwise, the generic file loading dialogue will be presented.
The generic loader shows a text box containing the file contents. Comment and header lines may be denoted by '#', 'A', '%', '*', or '!'. SamView will automatically try to count these lines and suggest the number of lines to skip. This number can be changed in the entry box below. Next, the columns must be assigned their data type. XAS data typically consists of columns containing the energy, the real-time counter (RTC) that gives the time spent collecting data at each point, and the counters. The counters are I0, I1, I2, IFs, and ICRs. ICR channels are the incoming count rates for each element of a solid-state detector. The IF and ICR channels use checkboxes rather than radiobuttons so that multiple data columns may be selected. The 'None' option may also be selected to ignore the datatype. If a datatype is not assigned to a column it will be assigned values of '1'.
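The header counting and column assignment described above can be sketched as follows. This is an illustration only and not SamView's actual code: the function names, the role names, and the choice to count only leading marker lines are assumptions.

```python
# Comment/header markers as listed in the text above.
COMMENT_MARKERS = ("#", "A", "%", "*", "!")

def suggest_skip_lines(lines, markers=COMMENT_MARKERS):
    """Count the leading lines that start with one of the comment markers."""
    count = 0
    for line in lines:
        if line.strip().startswith(markers):
            count += 1
        else:
            break  # first data line reached
    return count

def assign_columns(row, roles, expected=("energy", "rtc", "i0", "i1", "i2")):
    """Map one row of numbers to data types.

    `roles` maps a data type name to a column index (hypothetical names).
    Data types with no assigned column get the value 1.0, as described above.
    """
    return {name: (row[roles[name]] if name in roles else 1.0) for name in expected}
```

For example, a file with two header lines followed by three-column data would yield a suggested skip count of 2, and any unassigned I1 or I2 values would default to 1.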
Data files may be eliminated from the project files list by selecting the file in the list and clicking on the "Remove File" button. This simply removes the file from resident memory. The entire project file list may be cleared by clicking the "Clear All" button. All of these options may also be chosen from the "File" menu on the menu bar.
Data sets can be viewed by selecting the filename in the project file window. The data will be automatically displayed in the graphing window. The graph coordinates corresponding to the cursor are shown in the lower right hand corner of the screen. Zooming-in on the graph window is accomplished by left-clicking and dragging the new bounds of the window. A right-click and drag will zoom-out on the graph proportionally to the size of the window defined. The graph can be returned to its default axis range by clicking on the "Rescale" button.
The graph type displayed in the window is chosen by selecting the appropriate data type in the plot type panel. Individual columns can be displayed by selecting the column type and the column of choice. Data plots of all files in the project window can also be overlaid by clicking on the "Stack Plots" button. Derivatives of the data and smoothing can be applied to further examine the files.
When plotting IF columns, the IF column plotted is selected by clicking the channel element at the right. One can scan through all the channels and verify that all of the fluorescence channels are suitable for analysis. If a channel is corrupted, it can be removed from consideration by clicking the "Zero Sel. Chan." button. This only removes the channel from the currently selected data file. All the fluorescence channels of the data file can be restored by clicking on the "Reload Channels" button.
Data files can also be re-binned on a new energy scale. This is particularly useful for continuous or quick scan XAS data files stored in ASCII format. A common need is to place the data on a "traditional" energy scale that takes into account averaging all measured data points. The algorithm averages all the data points that lie within the limits of each defined bin, weighted by the collection time. This option is chosen by selecting the "Change Data Spacing" option under the "CS-XAS" menu option. The program asks for the constant pre-edge and edge energy spacing, the beginning of k-space, the k-space spacing, and the edge of the energy spectrum. If the edge energy has been defined in the energy correction section (see below), this value will be filled in automatically. The resulting values are placed into the data arrays for the selected file.
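The time-weighted averaging within each bin can be sketched as follows. This is a minimal illustration under stated assumptions and not the SamView implementation: the function name rebin is hypothetical, the bin edges are supplied by the caller, bins are treated as half-open intervals, and empty bins are skipped.

```python
def rebin(energy, signal, rtc, edges):
    """Average `signal` within each bin [edges[i], edges[i+1]),
    weighting every point by its collection time `rtc`.

    Returns (bin centers, weighted averages). Empty bins are skipped.
    """
    centers, values = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = [i for i, e in enumerate(energy) if lo <= e < hi]
        total_time = sum(rtc[i] for i in idx)
        if total_time == 0:
            continue  # no points (or no counting time) in this bin
        centers.append(0.5 * (lo + hi))
        values.append(sum(signal[i] * rtc[i] for i in idx) / total_time)
    return centers, values
```

A point counted for twice as long contributes twice the weight to its bin average.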
After examination of the loaded data files is completed (including energy calibrations below) the data can be averaged and saved by clicking on the "Average Scans" bar. The saved data is in the form of a two-column ASCII file. The format of the saved file (IF, IT1, or IT2) is determined by the plot type selection. All the files in the project files list will be averaged using the energy scale of the first entry on the list. All of the data files that are to be averaged must have the same number of data points. After the files are averaged, the graph window will display the averaged data and ask the user if they wish to save the averaged file. The default filename is the selected datafile with the extension "_fl", "_tr", or "_t2" for IF, IT1, or IT2 data plots respectively.
Energy calibrations can be easily performed in SamView. Clicking on the "Find E0" button locates the first ten local maxima in the spectra. If the inflection point in the edge is the desired energy, the user can click on the first derivative radiobutton, apply smoothing, and locate the maxima. The direction button can be used to toggle through the local maxima. Once the actual edge is entered into the entry field and the return key is pressed, the energy differential is calculated and displayed in the "shift" field below it. The "Apply Shift to Data" button will shift the selected data spectrum by the calculated amount. If the project files are to be averaged and saved, this must be done to the first datafile in the project list.
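The inflection-point approach can be sketched in a few lines. This is only an illustration, not SamView's algorithm: it takes the single largest finite-difference slope instead of listing ten local maxima, applies no smoothing, and assumes the shift is defined as the actual edge energy minus the measured one. All function names are hypothetical.

```python
def find_e0(energy, mu):
    """Return the energy at the maximum of the finite-difference first derivative."""
    best_slope, best_energy = None, None
    for i in range(len(energy) - 1):
        slope = (mu[i + 1] - mu[i]) / (energy[i + 1] - energy[i])
        if best_slope is None or slope > best_slope:
            best_slope = slope
            best_energy = 0.5 * (energy[i] + energy[i + 1])  # interval midpoint
    return best_energy

def energy_shift(measured_e0, actual_e0):
    """Shift to add to the energy axis (assumed sign convention)."""
    return actual_e0 - measured_e0

def apply_shift(energy, shift):
    """Return a new energy axis shifted by `shift` eV."""
    return [e + shift for e in energy]
```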
Deadtime effects can be a very important source of amplitude loss in XAS spectra collected with solid state detectors, such as germanium arrays. The electronic deadtime is primarily associated with the loss of counts occurring in the shaping amplifiers of the data collection system. This method for deadtime correction depends on collection of the incoming count rate (ICR), which generally has a much faster discriminator than the single channel analyzer (SCA) and therefore does not suffer the same effects as the shaped pulses that are fed to the SCA. The corrections that are applied by this method are for the type of deadtime commonly described as "paralyzing" deadtime. This is caused by the time that it takes for the amplifier to recover from counting an incident photon. If another photon arrives during this period, the deadtime is extended again and the photon is not counted. When high incoming fluxes are being measured, this can lead to an extensive number of missed events. At even larger photon fluxes, this method becomes inaccurate, as the ICR also becomes impacted by deadtime effects.
However, these effects can generally be compensated for if the detector deadtime response curve is collected. This is a simple procedure. The response curve is simply the relationship between the windowed counts for the fluorescence line being monitored (SCA) and the total incoming count rate (ICR) for each of the elements in the detector array. In order to apply these corrections to the sample data, the ICR statistics must also be collected along with the data. The deadtime curve can be collected simply by scanning a slit blade through the X-ray beam. The output of the detectors is monitored during this process. It is best to place a concentrated sample in the beam to ensure detector saturation. This effectively collects both the SCA and ICR at a variety of different incident beam intensities, from low intensity to detector saturation.
Mathematically, these deadtime effects can be described by the following equation:
$$ r_{\mathrm{SCA}} = k\, r_{\mathrm{ICR}} \exp(-r_{\mathrm{ICR}}\,\tau) $$
where $\tau$ is the deadtime (in μs), $r_{\mathrm{SCA}}$ is the count rate of the SCA, $r_{\mathrm{ICR}}$ is the ICR, and $k$ is a constant of proportionality. The deadtime curve is fitted to obtain values of $\tau$ and $k$, which can then be used to correct the sample data if the ICR counts are known. In practice, this is done by clicking on the "New DT File" button in the SamView module, which brings up the Deadtime Fitting Dialogue panel. A new deadtime file can be loaded by clicking on the "Load File" button and selecting the appropriate data file. This will load the data and display each of the fluorescence channels in the window. A cutoff can be set by changing the "ICR Cutoff" entry field. This defines the upper count rate in the ICR channel to analyze. The default is 400,000 cts/sec, which is a typical rate at which the deadtime curve begins to deviate from ideal behavior. Clicking on each channel will display the ICR vs. SCA counts in the plot window. The curves are fit by clicking on the "Fit Deadtimes" button and the results are displayed in the fitting results window. These results are saved, using the "Save DT Fit" button, into a file that can be used to correct the data. At this point, the Deadtime Fitting Dialogue panel window may be closed. The filename is automatically placed into the deadtime file entry blank in the SamView module. Corrections are applied to or removed from the data by simply clicking the "Do Deadtime Corrections?" checkbutton. The file can be used at a later time by entering it into the deadtime file entry or selecting it with the file selection button.
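For readers who want to experiment with the paralyzing deadtime relation outside SIXPack (SCA rate = k × ICR × exp(-ICR × deadtime)), the sketch below fits the deadtime and k by linear regression of ln(SCA/ICR) against ICR. This linearization is an assumption chosen for simplicity and is not necessarily how the "Fit Deadtimes" button works; the function names are hypothetical, and the correction step simply undoes the exponential loss.

```python
import math

def fit_deadtime(icr, sca, icr_cutoff=400000.0):
    """Fit sca = k * icr * exp(-icr * tau) for one detector channel.

    Uses ln(sca / icr) = ln(k) - tau * icr, a straight line in icr.
    Only points with 0 < icr <= icr_cutoff are used.
    Returns (tau, k); tau is in seconds if icr is in counts/sec.
    """
    xs, ys = [], []
    for rate_in, rate_out in zip(icr, sca):
        if 0 < rate_in <= icr_cutoff and rate_out > 0:
            xs.append(rate_in)
            ys.append(math.log(rate_out / rate_in))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    tau = -slope
    k = math.exp(mean_y - slope * mean_x)
    return tau, k

def correct_sca(sca, icr, tau):
    """Undo the exp(-icr * tau) loss for one data point (assumed correction form)."""
    return sca * math.exp(icr * tau)
```

With synthetic data generated from a deadtime of 2 μs and k = 0.3, the fit returns those values to within rounding error.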
Background subtraction in SIXPack is done using the Background Removal module. The module performs the basic functions of normalizing the XANES data to a unit step edge, spline fitting to isolate the EXAFS, and saving the mu, chi, and R-space data. All of the background removal functions are performed utilizing IFEFFIT.
Loading data is performed in a similar manner as in the SamView module. Buttons for adding a single file and multiple files correspond to the "Add File" and "Add Many Files" buttons. The input for this module is any two column ASCII file, corresponding to either fluorescence or transmission XAS data. Once the file is loaded, the file name will be displayed in the project files list. Selected files can be removed with the "Remove File" button, and the list cleared by the "Clear All Files" button. The project files list also serves as the means to "select" data files. A double-click in the filename field selects the file, placing an "X" to the left of the name and highlighting in bold the filename in the file parameters panel. The selection of files is utilized mostly for the plotting of the normalized data, as noted below.
The background removal procedure is performed automatically every time a graph is created or data is saved with new fitting parameters. The values that parameterize the background fit are set in the file parameters panel. A fit can also be forced by selecting the "Run Fitting" option under the "Parameters" menu. The first set of parameters are general parameters for the AUTOBK routine of IFEFFIT. These define parameters such as the edge position, k-weighting, and windowing for the spline fits. The Rbkg parameter is used to determine the number of knots in the spline and is an approximation of the lower limit of information in R-space. AUTOBK fits the spline in order to minimize contributions of background leakage into the data from oscillations of lower frequencies than Rbkg. More information on how the background removal procedure is implemented can be found in the AUTOBK reference documentation.
The second set of parameters is for edge normalization of the XANES spectrum. The module fits a first-order (linear) polynomial to the pre-edge region and a second-order polynomial to the post-edge region. Applying these fitted curves to the data results in a flat "zero" background in the pre-edge and a unit-step edge with a constant post-edge of one. The entry fields in this part of the panel define the regions in which the program attempts to fit the polynomials. All values are in eV relative to the edge value. A second option for pre-edge fitting can be chosen by selecting a gaussian pre-edge fit. This is useful for fitting dilute samples that have been collected with a solid state detector. In these cases, there is often "leakage" of Compton scattering into the SCA windowed energy region. This leakage is better fit with a gaussian profile than with a first-order polynomial.
The last set of parameters defines the k-ranges in which the EXAFS spline will be fit. There are entries for these values in both k-space and energy (relative to the edge). These values are linked. Entering a value in the k-space entry will update the appropriate value in the energy entry and vice-versa. The enter key should be pressed whenever entering values to make sure that all boxes are updated.
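The link between the k-space and energy entries follows the standard photoelectron relation k = sqrt(2m(E - E0))/ħ. A minimal sketch of the conversion is shown below; the function names are hypothetical, and the constant is the usual value of 2m/ħ² in units of 1/(eV·Å²).

```python
import math

# 2*m_e/hbar^2 in 1/(eV * Angstrom^2); equivalently hbar^2/(2*m_e) = 3.80998 eV*Angstrom^2
ETOK = 0.262468

def energy_to_k(delta_e):
    """Convert energy above the edge (eV) to k (1/Angstrom)."""
    return math.sqrt(ETOK * delta_e)

def k_to_energy(k):
    """Convert k (1/Angstrom) to energy above the edge (eV)."""
    return k * k / ETOK
```

For example, 100 eV above the edge corresponds to k of about 5.12 inverse Angstroms.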
When fitting many data files, it is often desirable to use the same set of fitting parameters. To reduce repetitive filling of entry fields, all the data parameters of the current selected file can be copied into the other data files in the project file list. This is accomplished with the option under the "Parameters" menu. Selecting "Set All to Current" copies the current parameters to all the data files, whereas the "Set Marked to Current" will do the same to only the data files marked with an "X". The default options (including the default edge selection) can be restored by selecting "Restore Defaults" under this menu option.
The various stages of the background removal can be plotted in a PGPLOT (or GrWin under Win32) window. Selecting one of the plot buttons in the plot type panel creates the graph. Choices of graphs consist of the raw data, pre-edge normalization curves (data, pre- and post-edge fits), spline fits (data and EXAFS background), mu(E) (the unit step normalized data), chi(k) (EXAFS), and R (radial distribution function (RDF)). Similarly, if one of the buttons is selected with the right mouse button, the graph will overlay all data files of the chosen type which are selected with an "X" as mentioned above. Zooming in can be performed in the plot window by selecting the "Plot-Zoom" function under the menu bar. With the mouse, select two points that define the new zoom region. The zoomed-in graph will be created. To zoom out, replot the graph using the plot type buttons.
The chi ranges and Fourier transform (FT) parameters can be selected in the plot options panel. These entries define the k-range and k-weighting for plotting data as well as the FT window, windowing parameters, and R-ranges to plot.
Background subtracted data can be saved into two-column ASCII files for further analysis in other modules or programs. Buttons at the bottom of the module can be used to save the unit normalized XANES (mu), EXAFS (chi), or RDF (R). Selection of these options brings up a "save as file" dialogue.
The least squares fitting module of SIXPack is one of the major data analysis portions of the program. It is utilized to fit experimental data, either XANES or EXAFS, as linear combinations of standard reference compounds. This function is primarily used when fitting data that are mixtures of various compounds, which is common in environmental and geological systems. Various fitting constraints can also be applied. Additionally, special features include allowing the energy scale of the data to shift and a fluorescence self-absorption correction algorithm to be applied. Minimization is performed using Matt Newville's IFEFFIT 'minimize' algorithm.
The primary inputs to the module are the data file to be fit and the set of reference spectra to be considered in the fit. Before files are loaded, the x-limits and weighting should be entered in the fit parameters panel. The data files and component files are chosen by clicking on the file folder icons to the right of the file entries. The data file is loaded into program memory by clicking on the "Load" button. The "Load Components" button similarly loads all of the selected components into memory. Selected components are activated by the checkbox next to the component file entry. Experience has shown that it is a good idea to reload the sample and component files often during the analysis to refresh the storage arrays.
In addition to the reference spectra, a linear component can be added. This is accomplished by clicking the seventh (linear) component to make it active. It adds a straight line to the component fitting. The linear option is useful if there are slight differences in the normalization (usually for XANES spectra) between the various components and data samples. The fitted parameters for the linear component are the slope and intercept of the fitted line.
To ensure that the data file and all component files are on the same grid, they are interpolated when loaded into memory. XANES data are interpolated on a 0.1 eV grid and EXAFS on a 0.05 k-space grid.
Fits can be performed in two ways. First, a fit can be constructed from the fractional values entered in each of the active component slots by pressing the "Construct" button at the top of the fitting window. This option does not fit the data directly, but reconstructs the fit from the user-defined values in the "fComp" entries in each of the respective component fields. This can be useful for checking previous fit values and for seeding values for the minimization routine.
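Placing spectra on a common grid can be sketched as below. The text does not state which interpolation scheme SIXPack uses; this example assumes simple linear interpolation, and the function name interpolate_to_grid is hypothetical.

```python
def interpolate_to_grid(x, y, start, stop, step):
    """Linearly interpolate (x, y) onto the grid start, start+step, ..., stop.

    Assumes x is strictly increasing and covers [start, stop].
    Returns (grid, interpolated values).
    """
    n = int(round((stop - start) / step))
    grid = [start + i * step for i in range(n + 1)]
    out = []
    j = 0  # index of the left-hand point of the current interval
    for g in grid:
        while j < len(x) - 2 and x[j + 1] < g:
            j += 1
        frac = (g - x[j]) / (x[j + 1] - x[j])
        out.append(y[j] + frac * (y[j + 1] - y[j]))
    return grid, out
```

Calling it with step=0.1 (energy in eV) or step=0.05 (k-space) reproduces the grid spacings quoted above.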
The second option is to run a true minimization on the system by clicking the "Fit" button. This runs the routine and places the optimized values in the "fComp" entry fields.
Several options can also be used to constrain the fit. These are located in the fit parameters field. The non-negative fit checkbox ensures that none of the fitted components takes a negative value. This constraint is commonly used because a component cannot have a negative composition percentage in the sample. The next constraint ensures that all the fitted components sum to one. This is commonly done to make sure that all of the "mass" of the sample is accounted for. This constraint is not enforced exactly; instead, a penalty is applied when the components do not sum to one. The penalty factor applied for missing the target sum is entered in the "Chisq wt STone" entry field. The other advanced fitting options are discussed below.
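The "Construct" operation and the penalized sum-to-one constraint can be illustrated with the sketch below. The exact form of the penalty used by SIXPack is not given in the text, so the quadratic penalty here is an assumption, as is rejecting negative fractions by returning infinity. All function and parameter names are hypothetical.

```python
def construct_fit(components, fractions):
    """Weighted sum of component spectra (all on the same grid)."""
    n = len(components[0])
    return [sum(f * c[i] for f, c in zip(fractions, components)) for i in range(n)]

def penalized_chisq(data, components, fractions, sum_weight=0.0, non_negative=False):
    """Sum of squared residuals plus an assumed quadratic sum-to-one penalty.

    sum_weight plays the role of the penalty factor described above.
    With non_negative=True, any negative fraction is rejected outright.
    """
    if non_negative and any(f < 0 for f in fractions):
        return float("inf")
    fit = construct_fit(components, fractions)
    chisq = sum((d - m) ** 2 for d, m in zip(data, fit))
    return chisq + sum_weight * (sum(fractions) - 1.0) ** 2
```

A minimizer would then search for the fractions that make penalized_chisq as small as possible; a larger sum_weight forces the sum closer to one.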
The special fitting options for the SIXPack least squares fitting module are the floating data energy scale and the self-absorption correction. These topics are discussed in this section.
Floating Energy Scales: Due to poor energy calibrations, it is sometimes desirable to allow the energy scale to float. This routine was written initially to give both data and components that ability. However, in practice, this is unfeasible. As a result, only the data energy scale is allowed to float, if the option is chosen by clicking the "Allow Data E to Float?" checkbox under the fit parameters panel. When this option is selected, the energy slider toward the bottom of the screen becomes active. The limits for shifting energy are +/-5eV. If a shift is required that needs more than a few tenths of an eV, the files used should be recalibrated. Fitting with the "Fit" button will now allow the energy scale to float, and at the end of the fit, will move the energy slider to the appropriate position. Graphing of the data now should be performed with the "E shifted" graph option.
Self-Absorption Correction: Fluorescence self-absorption is a more complicated option. Fluorescence self-absorption is caused by significant attenuation of the incident beam by the leading, upstream portions of the sample. The transmitted beam thus acquires an energy-dependent shape corresponding to the absorption spectrum of the sample. This shape in I0 offsets the spectrum of the downstream portion of the sample. Note that this effect depends on the local concentration of the absorber in the beam path, with pure mineral grains having the greatest amount of self-absorption. Thus, even "bulk" dilute samples can have substantial self-absorption effects. The primary observed effect in the data is a dampening of XANES white-lines and resonances and decreases in EXAFS amplitudes. The result in the analysis is a decrease in coordination numbers of the EXAFS and an underestimation of the major components. Since many environmental and geological samples are measured in fluorescence mode, this can be a very common artifact in the data.
The SIXPack implementation of self-absorption (SA) corrections is different from other program algorithms in that it does not require knowledge of the sample matrix before performing the correction. This is particularly important for environmental and geological samples where it may be impractical to characterize the composition and density of the sample matrix. The equations describing the SA phenomena can be reduced into two parameters, which can then be fitted directly in the minimization routine. The mathematics of the derivation are discussed below.
First we need to calculate the fluorescence emission at any given depth $t$ in the sample. This is given by:

$$ I_f(t)\,dt \propto I_0'(t)\,\mu_s(E)\,\rho_s\,dt $$

where $\mu_s(E)$ is the cross section of the sample and $\rho_s$ is its density. $I_0'(t)$ is related to the incident intensity per unit area (photons/sec/cm$^2$) by:

$$ I_0'(t) = I_0\,e^{-\mu_s \rho_s t} $$

Integrating over the depth of the sample (from 0 to $t$) to get the total fluorescence intensity, one gets:

$$ \int_0^{t} I_f(t')\,dt' = \int_0^{t} I_0\,e^{-\mu_s \rho_s t'}\,\mu_s(E)\,\rho_s\,dt' = \mu_s(E)\,\rho_s\,I_0 \int_0^{t} e^{-\mu_s \rho_s t'}\,dt' $$

However, as the fluorescence is emitted from the sample, it will also be attenuated by the sample matrix. Assuming that the sample sits at a 45° angle relative to the incident beam, this factor is given by $e^{-\mu_{k\alpha} \rho_s \sqrt{2}\,t}$, where $\mu_{k\alpha}$ is the sample absorption cross section at the fluorescence line of interest. With specular detection at 45°, the path length of $I_0$ is also $\sqrt{2}\,t$. Thus, we can rewrite the total integral over the path length, from 0 to $\sqrt{2}\,t$:

$$ \int_0^{\sqrt{2}t} I_f(t')\,dt' = \int_0^{\sqrt{2}t} I_0\,e^{-\mu_s \rho_s t'}\,\mu_s(E)\,\rho_s\,e^{-\mu_{k\alpha} \rho_s t'}\,dt' = \mu_s(E)\,\rho_s\,I_0 \int_0^{\sqrt{2}t} e^{-\mu_s \rho_s t'}\,e^{-\mu_{k\alpha} \rho_s t'}\,dt' $$

The integral term above can be rewritten as:

$$ \int_0^{\sqrt{2}t} e^{-\mu_s \rho_s t'}\,e^{-\mu_{k\alpha} \rho_s t'}\,dt' = \int_0^{\sqrt{2}t} e^{-\rho_s t'(\mu_s + \mu_{k\alpha})}\,dt' $$

Using the relation $\int e^{bt'}\,dt' = e^{bt'}/b$ with $b = -\rho_s(\mu_s + \mu_{k\alpha})$, the integral evaluates to:

$$ \int_0^{\sqrt{2}t} e^{-\rho_s t'(\mu_s + \mu_{k\alpha})}\,dt' = \left[ \frac{e^{-\rho_s t'(\mu_s + \mu_{k\alpha})}}{-\rho_s(\mu_s + \mu_{k\alpha})} \right]_0^{\sqrt{2}t} = \frac{e^{-\sqrt{2}\rho_s t(\mu_s + \mu_{k\alpha})}}{-\rho_s(\mu_s + \mu_{k\alpha})} + \frac{1}{\rho_s(\mu_s + \mu_{k\alpha})} = \frac{1 - e^{-\sqrt{2}\rho_s t(\mu_s + \mu_{k\alpha})}}{\rho_s(\mu_s + \mu_{k\alpha})} $$

Hence:

$$ \frac{1}{I_0}\int_0^{\sqrt{2}t} I_f(t')\,dt' \propto \frac{\mu_s(E)\left(1 - e^{-\sqrt{2}\rho_s t(\mu_s + \mu_{k\alpha})}\right)}{\mu_s(E) + \mu_{k\alpha}} $$

We can now make further simplifications, assuming that our sample is essentially "infinitely" thick. In this case, the exponential term can be neglected, as $\rho_s(\mu_s + \mu_{k\alpha}) \gg t^{-1}$. Thus the problem simplifies to:

$$ \frac{I_f}{I_0} \propto \frac{\mu_s(E)}{\mu_s(E) + \mu_{k\alpha}} $$

Since $\mu_s(E)$ corresponds to the absorption cross section of the sample, it is also representative of the undistorted, "ideal" ratio of $I_f / I_0$. By defining $D_f$ as the measured ratio of $I_f / I_0$, the correction factor we need to apply is given by solving the above statement for $\mu_s(E)$:

$$ \mu_s(E) = \frac{D_f\,\mu_{k\alpha}}{1 - D_f} $$

Note that we can then parameterize our correction in terms of two variables: $D_f$ and $\mu_{k\alpha}$. When this is performed in the program, it is normally done on data that has already been normalized to a unit step. This means that although the parameters are correct in terms of scaling the data for the SA effect, they no longer truly correlate to the real values of $D_f$ and $\mu_{k\alpha}$.
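The thick-sample relation and its inverse derived above can be checked numerically with the short sketch below. It only illustrates the two formulas (measured ratio = mu_s / (mu_s + mu_ka), and its solution for mu_s) and is not the fitting code used by SIXPack; the function names are hypothetical.

```python
def sa_distort(mu_s, mu_ka):
    """Thick-sample limit: measured ratio D_f = mu_s / (mu_s + mu_ka)."""
    return mu_s / (mu_s + mu_ka)

def sa_correct(d_f, mu_ka):
    """Solve the thick-sample relation for mu_s: mu_s = D_f * mu_ka / (1 - D_f)."""
    return d_f * mu_ka / (1.0 - d_f)
```

Applying sa_correct to the output of sa_distort with the same mu_ka recovers the original mu_s, which confirms that the two expressions are consistent.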
In practice, the SA corrections are done in an iterative manner in SIXPack. First an initial fit to the SA impacted data is performed using the normal "Fit" function. Next, a fit is done to determine the SA coefficients by clicking the "SA to Fit" button. This only fits the SA parameters applied to the data to the fit derived from the components. Then, a fit is performed with the components to the SA modified data. This is accomplished by pressing the "Fit to SA" button. Note that pressing the "Fit" button here will fit directly to the data, and not the SA corrected data! This process of "SA to Fit" and "Fit to SA" is repeated until the SA coefficients and "fComp" values stabilize and do not change with further iterations. As the SA coefficients become smaller, the SA effect becomes larger. If the SA parameters begin to explode to high values, then the effect of SA on the sample is either very small and cannot be corrected, or the sample is impacted by artifacts of a different nature. At all steps in the procedure, the "new" SA corrected data can be plotted by selecting the "SA Corrected" graph option.
Data plots are created by pressing the "Plot" button. This will bring up the currently selected graphs in a plot window. The type of data that can be plotted are located in the plot options panel. Multiple types of graphs can be plotted on the same set of axes. The program will plot all of the graphs that are selected by the checkboxes. These are discussed briefly next.
Data: This plots the spectrum contained in the data file entry.
Fit: This plots the constructed fit, either from the reconstruction method or direct fitting.
SA Corrected: This option plots the self-absorption corrected data spectrum. Note that the self-absorption correction applies its corrections to the loaded data, not the fit. Thus this spectrum should be compared to the fitted spectrum when comparing results.
E Shifted: This option plots the energy-shifted results of the data spectrum. This is analogous to the SA correction: the shift is applied to the loaded data, and thus the result should be compared to the fitted spectrum.
Residuals: This plots the difference between the loaded data file and the fitted results.
Waterfall: This option plots all of the components involved in the fit, multiplied by their fractional amounts. All of the plots are overlaid.
Two types of data can be saved from this module: the fitted data spectrum and the residuals. These are saved into two-column ASCII files. The option to save these data types is selected from the "Save" item in the menubar.
It can often be time consuming to rebuild a set of components each time a fit is desired. Thus, one can save and load parameter files, or "workspaces". These parameter files save the most important parts of the fitting variables into a text file. SIXPack can also differentiate the parameter files for the various analysis modules and will warn the user if they are trying to load the wrong parameter type. For the least squares fitting module, the parameter files contain the names and paths of all of the input files, the values of the "fComp" entries, and all of the variables in the fit parameters panel. To load or save a workspace, select "Save-Load Workspace" or "Save-Save Workspace" under the menubar.
Principal component analysis (PCA) can be used to decompose a set of data files mathematically into the minimum number of components needed to describe the variance in the data. These primary, or principal, components are those which contain the signal and are mathematically sufficient to reconstruct each of the experimental spectra in some linear combination. The other components in the system are those that refer to the noise. The basic result of this procedure is to determine how many components or reference spectra are needed to describe the set of data files within experimental error. PCA supplements the traditional approach through a global view of speciation within the entire series of spectra. The results from PCA offer constraints that can then be applied to the traditional non-linear least-squares analysis. The PCA window is large due to the many functions that it contains. On computers with small screen resolutions, the window appears with slider bars on the sides so all of the functions of the module may be accessed.
Sample data are loaded in a similar manner to the other modules by clicking on the "Add File" or "Add Many Files" buttons in the sample file list panel. Data files should all be of the same type, i.e. all XANES or all EXAFS spectra. Before loading, the x-limits of the data must be entered into the file parameters entries. These entries set the limits to analyze and the x-weighting to apply to the data. XANES spectra typically have a weighting of zero, whereas EXAFS spectra have weightings of 2 to 3. After files have been loaded, the limits or weighting can be changed in the file parameter entries and the "Reload All" button pressed. To ensure that all input files are on the same grid, they are interpolated when loaded into memory. XANES data are interpolated onto a 0.1 eV grid and EXAFS data onto a 0.05 inverse-angstrom grid in k-space.
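The loading step can be pictured with the short sketch below, which places a spectrum on an even grid and applies the x-weighting. It assumes simple linear interpolation and a weighting of the form y * x**weight; the interpolation scheme SIXPack actually uses is not stated here, and the function name is hypothetical.

```python
def interpolate_weighted(x, y, xmin, xmax, step, weight=0):
    """Linearly interpolate (x, y) onto an even grid and multiply by x**weight.

    Assumes x is ascending and that [xmin, xmax] lies inside the data range.
    """
    npts = int(round((xmax - xmin) / step)) + 1
    grid = [xmin + i * step for i in range(npts)]
    out = []
    j = 0
    for g in grid:
        # Advance to the input interval [x[j], x[j + 1]] that contains g.
        while j < len(x) - 2 and x[j + 1] < g:
            j += 1
        frac = (g - x[j]) / (x[j + 1] - x[j])
        value = y[j] + frac * (y[j + 1] - y[j])
        out.append(value * g ** weight)
    return grid, out


# Made-up data on an uneven grid, resampled with and without weighting.
grid, plain = interpolate_weighted([0.0, 1.0, 2.0], [0.0, 10.0, 20.0], 0.0, 2.0, 0.5)
grid, weighted = interpolate_weighted([0.0, 1.0, 2.0], [0.0, 10.0, 20.0], 0.0, 2.0, 0.5, weight=1)
```

For real data the step would be 0.1 for XANES (in eV) or 0.05 for EXAFS (in k-space), with a weight of 0 or of 2 to 3 respectively, as described above.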
Details of the PCA algorithm can be found in Ressler et al., Environmental Science & Technology, 34 (2000) p. 950-958 and references therein. To summarize, PCA is essentially a singular value decomposition (SVD) problem from linear algebra. The SVD method states that any m x n matrix, A, having a greater or equal number of rows than columns, can be written as the product of an m x n column-orthogonal matrix, E, an n x n diagonal matrix, V, with positive or zero elements, and the transpose of an n x n orthogonal matrix, w. Thus A = E · V · w^T, where w^T is the transpose of w. This process constructs an orthonormal basis set, given as the set of eigenvectors, E, from the input data matrix A. Often in the SVD, some of the eigenvalues (V) are zero or close to zero. This means that the eigenvector corresponding to a near-zero eigenvalue is not an important part of the orthonormal basis set.
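To make the decomposition concrete, the following is a minimal sketch of an SVD written with only the Python standard library. It uses a one-sided Jacobi scheme, which is a swapped-in textbook method chosen for brevity and is not necessarily the routine SIXPack itself calls. The names follow the notation above (E, V, w); the values in V are returned unsorted.

```python
import math


def svd_jacobi(a, sweeps=60, tol=1e-12):
    """One-sided Jacobi SVD of an m x n list of lists with m >= n.

    Returns (e, v, w) such that a[i][j] = sum over k of e[i][k] * v[k] * w[j][k].
    """
    n = len(a[0])
    u = [row[:] for row in a]  # columns converge to E scaled by V
    w = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(r[p] * r[p] for r in u)
                beta = sum(r[q] * r[q] for r in u)
                gamma = sum(r[p] * r[q] for r in u)
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to columns p, q of u and w.
                for r in u + w:
                    r[p], r[q] = c * r[p] - s * r[q], s * r[p] + c * r[q]
        if not rotated:
            break
    v = [math.sqrt(sum(r[k] * r[k] for r in u)) for k in range(n)]
    e = [[r[k] / v[k] if v[k] > 0 else 0.0 for k in range(n)] for r in u]
    return e, v, w


# Made-up 3 x 2 data matrix: three data points, two samples.
a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
e, v, w = svd_jacobi(a)
rebuilt = [[sum(e[i][k] * v[k] * w[j][k] for k in range(2)) for j in range(2)]
           for i in range(3)]
```

Here `rebuilt` reproduces `a` to within rounding error, and the columns of `e` are orthonormal, which is the property the component plots rely on.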
PCA is run on the loaded data files by pressing the "Do PCA Run" button. After the PCA has been completed, the results are presented in the PCA control and PCA results panels. All of the computed components are listed in the PCA control panel. Each entry in this list contains an "X" if it is active, the component number, the eigenvalue, and variances. The reported variance is broken up into two categories. The first is the amount of the total variance of the decomposition that is accounted for by the given component. A large variance signifies a dominant component. The second is the running cumulative total variance for all components up to the selected component. As the number of components increases, the variance attributable to each component generally decreases, and the cumulative variance increases to approach one. The breakdown of the linear combinations that create each data sample from the components is displayed in the PCA results panel. The values in the table are the unscaled coefficients required to reproduce the sample spectra from the components. There are no units attached to these values. The components are listed in the rows of the table and the samples are listed in the columns. The relative contributions of each component to a given sample can be read from the table. The results can be examined further with the graphing options described below.
The components in the PCA control list can be toggled by double-clicking on the component name. This will switch the placement of the "X" before the component name. The active feature is used primarily for graphing the reconstructions of the samples with the active components as described below. The first component, which is essentially the average of all the sample data, commonly dominates the analysis. This is particularly true with XANES spectra, where the first component describes the edge jump and the others account for the small fluctuations between the samples. Since the edge step contains the bulk of the amplitude in this case, it will typically explain more than 90% of the variance in the data. For this reason, the first component has a third state: double-clicking on it once makes it inactive, and a second double-click removes Component 1 from the variance totals. The variance totals are then relative to the variance remaining in the data sets when the average edge step is not considered. To denote this, the line is colored red. A third double-click will bring the variance values back to normal and make the component active again.
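The two variance columns, and the effect of removing Component 1 from the totals, can be sketched as follows. This assumes that a component's variance is its eigenvalue divided by the sum of the eigenvalues considered; SIXPack's exact definition is not given here, and the function name and numbers are hypothetical.

```python
def variance_table(eigenvalues, exclude_first=False):
    """Return (fractions, cumulative) lists, one entry per component.

    With exclude_first=True the first component is left out of the totals
    and its entries are reported as None.
    """
    start = 1 if exclude_first else 0
    total = sum(eigenvalues[start:])
    fractions, cumulative = [], []
    running = 0.0
    for index, value in enumerate(eigenvalues):
        if index < start:
            fractions.append(None)
            cumulative.append(None)
            continue
        fraction = value / total
        running += fraction
        fractions.append(fraction)
        cumulative.append(running)
    return fractions, cumulative


# Made-up eigenvalues with a dominant first component.
eigenvalues = [90.0, 6.0, 3.0, 1.0]
fractions, cumulative = variance_table(eigenvalues)
rest_fractions, rest_cumulative = variance_table(eigenvalues, exclude_first=True)
```

In this example the first component carries 90% of the variance, while the second carries 60% of what remains once the first is excluded.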
Plotting the PCA results can help make sense of the analysis. All plots are chosen using the plot control panel. The panel is filled with buttons to select the various types of plot options. The plots can also be selected from the "Plot" option in the menu bar. Zooming in the graph window is accomplished by clicking and dragging the desired window. Zooming out is performed by a right-click or by the "Plot-Rescale Plot" option under the menu bar. This section will describe each of the plotting options.
Scree Plot: A scree plot is simply a plot of component number versus the corresponding eigenvalue. One can visually see the components progressively become less important by watching the eigenvalues decrease as more components are considered. A break in the slope of this graph as the values come close to zero represents the minimum number of components in the system.
Variance Plot: The variance plot shows the cumulative variance explained as a function of the number of components. As in the scree plot, a break in the slope of the variance plot as it approaches one represents the minimum number of components in the system.
Component Plot: This option plots the currently highlighted component.
All Components: This option plots all of the selected active components (marked by an "X") in the PCA control list.
Sample Plot: This option plots the currently highlighted sample in the sample file list.
All Samples: This option plots all of the data spectra in the sample file list.
Reconstruction/Sample Pair: The reconstruction option plots the highlighted data from the sample file list together with its corresponding reconstruction from the principal components, providing a simple visual inspection of the PCA. The reconstruction is made from only those components which are active. If all of the components are active, then the reproduction will be perfect. As components are removed from the reconstruction, it will gradually become less perfect. The minimum number of required components can also be determined as the number of components needed to accurately reproduce all of the data files within reasonable experimental error. When too few components are selected, there will be obvious mismatches between the samples and the reconstructions.
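A reconstruction of this kind can be sketched in a few lines, assuming that a sample is rebuilt as the sum of each active component multiplied by that sample's coefficient from the PCA results table. The function name and the numbers are hypothetical.

```python
def reconstruct(components, coefficients, active):
    """Rebuild one sample spectrum from the active components.

    components   -- list of component spectra (one list per component)
    coefficients -- the sample's column from the PCA results table
    active       -- list of booleans, True where the component is marked "X"
    """
    npts = len(components[0])
    recon = [0.0] * npts
    for comp, coef, use in zip(components, coefficients, active):
        if use:
            for i in range(npts):
                recon[i] += coef * comp[i]
    return recon


# Made-up example with two components.
components = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]
coefficients = [2.0, 0.5]
sample = [2.5, 1.5, 2.5, 1.5]
full = reconstruct(components, coefficients, [True, True])
partial = reconstruct(components, coefficients, [True, False])
worst = max(abs(s - p) for s, p in zip(sample, partial))
```

With both components active, `full` matches `sample`; with the second component switched off, `worst` shows the size of the mismatch that would appear in the plot.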
Target Transform: This option plots the data pair for the target transformation procedure. This option is described in detail below.
The target transformation procedure attempts to determine whether a chosen reference spectrum (e.g., from a given model compound) can be considered a legitimate "end-member" component. Mathematically, this means that it can be represented in the same mathematical space as defined by the components of the sample spectra. This is done by multiplying the reference spectrum by the eigenvector column and row matrix, which projects it onto the space spanned by the active components. If the resulting spectrum matches the reference well (ca. 1% error or within experimental limits), then the reference spectrum is a possible species in the unknown data sets.
This analysis is done by first selecting a reference spectrum, either by entering the filename or by clicking on the file folder icon in the target entry field. Once again, the x-limits must be entered in the file parameters panel before performing the analysis. If changes are made to the x-limits or weighting, the "Reload All" button will also apply the changes to the target data. Before running the target transform, the components that define the sample space must be selected as active. Ideally, these will be the same components that define the principal component basis. Once the analysis is complete, the original data and the target transform are plotted automatically. The goodness of fit (by chi-square and r-value) is also reported below the target file entry. The chi-square is a traditional sum of the squared differences, whereas the r-value is a measure of the percent misfit.
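The sketch below illustrates the idea of a target transform under stated assumptions: the active components are taken to be orthonormal, the transform is a projection of the reference onto them, chi-square is the sum of squared differences, and the r-value is taken as the summed absolute difference divided by the summed absolute reference. The r-value definition in particular is an assumption, and all names are hypothetical.

```python
def target_transform(reference, components):
    """Project a reference spectrum onto a set of orthonormal components."""
    transformed = [0.0] * len(reference)
    for comp in components:
        coef = sum(c * r for c, r in zip(comp, reference))  # dot product
        for i, c in enumerate(comp):
            transformed[i] += coef * c
    return transformed


def misfit(reference, transformed):
    """Return (chi_square, r_value) for a reference and its transform."""
    diffs = [r - t for r, t in zip(reference, transformed)]
    chi_square = sum(d * d for d in diffs)
    r_value = sum(abs(d) for d in diffs) / sum(abs(r) for r in reference)
    return chi_square, r_value


# Two made-up orthonormal components.
components = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
inside = [3.0, 1.0, 3.0, 1.0]    # lies in the component space
outside = [1.0, 0.0, 0.0, 0.0]   # does not lie in the component space
good = misfit(inside, target_transform(inside, components))
bad = misfit(outside, target_transform(outside, components))
```

A reference that lies in the component space is returned unchanged and gives a misfit of zero, whereas one outside the space comes back visibly altered.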
The save functions are not currently implemented in release 0.30. They will be present in a future release.
The results of the PCA analysis can be saved in a file. The text results of the analysis (the eigenvalues, variance, and reconstruction from the eigenvectors) are saved by selecting "Save-Results" from the menu bar. The eigenvectors corresponding to the components are saved in an ASCII file in a similar manner by selecting "Save-Components" from the menu bar.
The advanced fitting module of SIXPack is the core for EXAFS fitting. It acts as an interface to IFEFFIT, which performs fits of FEFF derived amplitude and phase functions to experimental data. One of the distinct advantages of the IFEFFIT fitting procedure is that it is extremely flexible in the model that can be built. While performing a fit can be complicated, the module makes arranging and organizing the steps relatively simple. The most complicated portion of the fit is to determine and build an appropriate model for the experimental system. These can be very simple shell-by-shell models, or extremely complicated and constrained structural models. For more information on using IFEFFIT to model structures, see Bruce Ravel's course materials on EXAFS Analysis Using FEFF and FEFFIT. This is an extremely powerful introduction and guide to EXAFS fitting that I highly recommend.
The main data file to load for this module is the chi(k) data to fit. The loading of the phase and amplitude FEFF files will be discussed in the next section. The filename can be entered into the data file entry, or the file folder icon to the right of the entry can be clicked to browse through the disk directories. The file must be loaded after the filename is entered by pressing the "Load" button.
Building the fit model is often the most time-consuming step in the procedure. To save time in later fitting sessions, the fitting model can be saved into a parameter file or workspace. These options are discussed in more detail in later sections. There are several steps required in the fit building process. The guide that follows is not a completely comprehensive instruction set, but should get the user started in the correct general direction. Setting up a fit requires creation of FEFF files, definition of paths, and definition of variables. The FEFFIT Variables panel is separated into three tabs: one each for defining paths and variables, and one for examining results.
Creating FEFF Files: The first step before executing a fit is to create the FEFF files needed. For simple computations, single-scattering paths may be all that is required for the fit. In this case, the SS FEFF Path Maker module of SIXPack can be extremely useful (see next chapter). For more complicated calculations on well-defined structures, the ATOMS program by Bruce Ravel can help build FEFF input files from crystallographic coordinates. In practice, the user can mix well defined paths from crystal structures with simple single scattering paths to create the most applicable model for the system of interest.
If FEFF is run independently, it is necessary to force the output of the individual path files. This requires the "PRINT" card of FEFF to read "PRINT 1 0 0 0 0 3" for FEFF8 or "PRINT 1 0 0 3" for FEFF6/7. This creates the feffnnnn.dat files that IFEFFIT uses as input.
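For users who prepare their own input files, the following small sketch forces the required PRINT card into the text of a FEFF input file. The two card strings are those given above. The function name and the sample input are hypothetical, and the sketch assumes that FEFF accepts the PRINT card on any line ahead of the atom list.

```python
PRINT_CARDS = {
    "FEFF8": "PRINT 1 0 0 0 0 3",
    "FEFF6/7": "PRINT 1 0 0 3",
}


def with_print_card(feff_inp, version):
    """Return the input text with its PRINT card replaced by the required one."""
    # Drop any PRINT card already present, then put the required one first.
    kept = [line for line in feff_inp.splitlines()
            if not line.strip().upper().startswith("PRINT")]
    return "\n".join([PRINT_CARDS[version]] + kept) + "\n"


# Made-up fragment of an input file.
original = "TITLE Cu-O test\nPRINT 0 0 0 0\nRMAX 4.0\n"
updated = with_print_card(original, "FEFF6/7")
```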
Defining FEFF Paths: The FEFF paths can be viewed by selecting the "Paths" tab in the FEFFIT Variables panel. Each path that is to be used in the fitting procedure has its own "card" in the path listbox. Clicking on a path name will bring up that path's card. On startup, the first path has already been created. New paths can be added by clicking on the "Add Path" button. Each path has a series of variables that need to be defined.
The entries can be filled with values or variables. I prefer to use a series of variables in defining the paths, and to set up the more complicated mathematical relations between variables in the variable definition page.
Several shortcuts exist for filling out the entries. These are all located under the "Path" menu in the menubar.
Certain shortcuts also exist for the variables. These are located under the "Variables" menu in the menubar.
Defining Variables: Selecting the "Variables" tab in the FEFFIT Variables panel brings up a list of variables used in the fit. This listbox will be initially blank. New variables can be added by clicking on the "Add Variable" button. A faster way of adding variables is to click on the "Add All Variables" button. This searches through the entry fields of the path definitions for variables and enters them automatically in the listbox. Once the variables are added, their conditions and expressions must be set.
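The kind of search that "Add All Variables" performs can be pictured with the sketch below, which scans path entry strings for names that are neither numbers nor built-in words. It is only an illustration of the idea: the function name, the list of reserved words, and the sample entries are all assumptions, not SIXPack's actual code.

```python
import re

# Words treated as built-in rather than user variables (hypothetical list).
RESERVED = {"reff", "sqrt", "exp", "abs", "sin", "cos"}


def collect_variables(path_entries):
    """Return the variable names found in the entries, in order of appearance."""
    found = []
    for expression in path_entries:
        # A name starts with a letter or underscore; plain numbers never match.
        for name in re.findall(r"\b[A-Za-z_]\w*\b", expression):
            if name.lower() not in RESERVED and name not in found:
                found.append(name)
    return found


# Made-up entries from one path card.
entries = ["amp", "e0", "alpha*reff", "sigsqr_1", "2.1", "sqrt(abs(c3))"]
variables = collect_variables(entries)
```

For the entries shown, the names collected are amp, e0, alpha, sigsqr_1, and c3.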
By cleverly defining variables and relations between variables and paths, complicated and well-constrained models can be built to effectively examine EXAFS data.
The fit parameters panel also contains the data range limits, weighting schemes, and windowing functions to be applied in the fit. The "kmin"/"kmax" entries are for the limits in k-space for the fit to consider. The same applies for "Rmin"/"Rmax" in R-space. The "dk" entry refers to the strength of the windowing function. The window is applied in order to remove "ringing" effects from the finite Fourier transform range. Finally, the user can choose whether the fit minimization should be performed in k-space, R-space, or q-space (back-transformed k-space).
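The role of "dk" can be illustrated with a Hanning-style window, sketched below. This assumes a window whose rising and falling edges are each "dk" wide and centered on "kmin" and "kmax"; the exact functional form applied by IFEFFIT depends on the window type chosen, so treat this as an illustration only. The function name is hypothetical.

```python
import math


def hanning_window(k, kmin, kmax, dk):
    """Window value at k; the edges of width dk are centered on kmin and kmax.

    Assumes kmax - kmin >= dk so that the two edges do not overlap.
    """
    if dk <= 0:
        return 1.0 if kmin <= k <= kmax else 0.0  # sharp-edged window
    half = dk / 2.0
    if k < kmin - half or k > kmax + half:
        return 0.0
    if k < kmin + half:
        return math.sin(math.pi * (k - kmin + half) / (2.0 * dk)) ** 2
    if k > kmax - half:
        return math.cos(math.pi * (k - kmax + half) / (2.0 * dk)) ** 2
    return 1.0


# Window values across a k-range of 3 to 12 with dk = 1.
values = [hanning_window(k, 3.0, 12.0, 1.0) for k in (2.0, 3.0, 7.5, 12.0, 13.0)]
```

A larger "dk" gives a more gradual roll-off at the ends of the range, which reduces ringing at the cost of down-weighting the data near "kmin" and "kmax".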
A fit can be constructed in several ways. Clicking the "Fit" button will start a minimization with the given seed values. IFEFFIT will output status reports on the fitting progress in the standard output window. A complicated fit may take quite a while to run! When the fit is complete, the quality-of-fit parameters (chi-square and R-factor) will be printed at the bottom of the window. Before running a fit, it is a good idea to check that the seed values are close to what may be expected in the fit. This helps to reduce the chances of falling into local minima. A fit can be reconstructed from the values in the variable window without going through the minimization routine by clicking the "ff2chi" button. It is often also useful to examine the contribution to the EXAFS signal of one individual path. This can be accomplished by clicking the "path ff2chi" button. This will compute the EXAFS fit of the currently selected path in the FEFF path listbox.
After the fit is complete, the final values of all guessed variables are replaced in their respective entries in the "Variables" tab. A comprehensive result summary is also given in the textbox under the "Results" tab in the FEFFIT Variables panel. If the checkbox for "Print All FEFF Paths" is checked, then the first portion of the output will list each path and the resultant values of each of the path parameters. The second part of the output lists all of the variables and their final values. If a variable was guessed, it will also include an estimation of the error of that parameter.
If the checkbutton in the lower right portion of the window titled "Display Correlations?" is checked, then a window will be displayed that shows the inter-variable correlations. The first several lines summarize the data being fit and the quality-of-fit parameters. Also included are the number of independent points in the dataset and the number of variables being fit. The number of variables should always be less than the number of independent points in order to obtain a meaningful result. After these headers, all variable pairs that have a correlation will be listed. Pairs with an absolute correlation of less than 0.10 will not be listed. If an absolute correlation is "large" (i.e. greater than 0.75), it will be highlighted in red as a possible significant correlation.
It is important to note that the seed values strongly influence the fit. The fit should be tried several times from different seed values to ensure that it has not fallen into a local minimum.
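The two checks described for the correlation window can be sketched as follows. The number of independent points is estimated here with the common Nyquist expression 2 * (kmax - kmin) * (Rmax - Rmin) / pi, which is an assumption; the count that IFEFFIT reports may differ slightly. The 0.10 and 0.75 thresholds are those given above, while the function names and sample values are hypothetical.

```python
import math


def independent_points(kmin, kmax, rmin, rmax):
    """Nyquist estimate of the number of independent points in the data."""
    return 2.0 * (kmax - kmin) * (rmax - rmin) / math.pi


def report_correlations(correlations, hide_below=0.10, flag_above=0.75):
    """Format variable-pair correlations, largest absolute value first."""
    lines = []
    ordered = sorted(correlations.items(), key=lambda item: -abs(item[1]))
    for (first, second), value in ordered:
        if abs(value) < hide_below:
            continue  # too small to be listed
        flag = "  <-- large" if abs(value) > flag_above else ""
        lines.append(f"{first:>8s} {second:<8s} {value:+.2f}{flag}")
    return lines


# Made-up fit: four variables over k = 3 to 12 and R = 1 to 3.
n_idp = independent_points(3.0, 12.0, 1.0, 3.0)
n_variables = 4
correlations = {
    ("amp", "sigsqr"): 0.88,
    ("e0", "delr"): -0.81,
    ("amp", "e0"): 0.05,
    ("delr", "sigsqr"): 0.22,
}
lines = report_correlations(correlations)
```

In this example the estimate is about 11.5 independent points for four variables, and two of the three listed pairs are flagged as large.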
The data can be plotted by pressing one of the three plot buttons in the upper right corner of the window. These buttons will plot in R-space, k-space, or q-space (back-transformed k-space). The plot is produced by IFEFFIT and will be brought up in a separate window. The ranges for the plots, as well as the types of data to plot (i.e. fit, data, real/imaginary/magnitude portions of the data) are selected under the respective "k", "R", or "q" tabs in the plot options panel. The fit data will be the result of the last fit produced, either by direct minimization, reconstruction, or single path reconstruction.
It can often be extremely time-consuming and frustrating to rebuild an EXAFS model each time a fit is desired. Thus, one can save and load parameter files, or "workspaces". These parameter files save the most important parts of the fitting variables into a text file. SIXPack can also differentiate the parameter files for the various analysis modules and will warn the user if they are trying to load the wrong parameter type. To load or save a workspace, select "Save-Load Workspace" or "Save-Save Workspace" under the menubar. In the advanced fitting module, all of the path information, fit parameters, and variables are stored. Results are not stored. It is important to remember that the save function only saves those paths and variables that are checked for use. Paths and variables can therefore be deleted by unchecking their "use" checkboxes and saving the workspace.
The Path Maker module is a utility which makes simple single-scattering path phase and amplitude files using FEFF. These are used for performing EXAFS fits in the Advanced Fitting module. It is important to note that SIXPack does not come with FEFF installed. It merely acts as an interface for running FEFF. The FEFF suite of programs can be obtained through the University of Washington's website. Once installed, the FEFF executable should be placed in the system path so that the Path Maker can run it from the command line prompt without searching for the installation directory.
The first parameters to select are the absorbing and backscattering atoms. This is done by clicking on the periodic table. A left-click selects the absorbing atom, and a right-click selects the scattering atom. Next, in the FEFF options panel, the edge of the element is selected along with the distance between the absorbing-scattering pair and the geometry. The geometry option is not entirely realistic for all possible pairs, but does help FEFF to run smoothly in some cases. It will also set the degeneracy of the path to an appropriate value. Two advanced parameters can also be set: the AFOLP parameter and the exchange parameter. AFOLP determines the amount of overlap of the muffin tin potentials, which is an important parameter for heavier absorbing elements. The exchange parameter specifies the energy-dependent exchange correlation potential model. The default Hedin-Lundqvist model works very well for most cases, but can be changed to several of the other FEFF models if desired.
Before running FEFF, the version that is installed on the user's system must be selected. SIXPack supports the most common releases of FEFF: FEFF6/7 and FEFF8. These two versions differ primarily in the structure of the input files. For EXAFS calculations, their output is very similar. The FEFF command must also be entered into the command entry field. This is the name of the FEFF executable that can be called from the command line prompt. FEFF is run by clicking on the "Run FEFF" button.
The FEFF calculation will take place in the 'fefftemp' directory in the 'sixpack' installation path. The details of the calculation will be printed in the standard output window. Final results will be placed in the appropriate directory in the 'sixpack' path. Results are sorted by the absorbing atom. For example, the FEFF results of a K-edge Cu-O absorbing-scattering pair at 2.1 angstroms will be saved in the 'cupaths' directory and titled 'CUK-O-2100.dat'.
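The naming pattern can be sketched as below. The rule is inferred only from the single Cu-O example above (upper-case absorber and edge, scatterer, distance in thousandths of an angstrom), so its behavior for other edges or for two-letter scatterers is an assumption. The function name is hypothetical.

```python
def path_file_location(absorber, edge, scatterer, distance):
    """Return (directory, filename) for a single-scattering path file.

    distance is the absorber-scatterer separation in angstroms.
    """
    directory = absorber.lower() + "paths"
    thousandths = int(round(distance * 1000))
    filename = f"{absorber.upper()}{edge.upper()}-{scatterer.upper()}-{thousandths}.dat"
    return directory, filename


directory, filename = path_file_location("Cu", "K", "O", 2.1)
```

For the Cu-O pair this returns the 'cupaths' directory and the file name 'CUK-O-2100.dat' from the example.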