⚙️XPS analysis: How it works

Overview

AtomCloud XPS Gen 1 uses a machine learning model to estimate elemental composition from survey spectra acquired with an Al K-alpha X-ray source.

No peak fitting or peak-area calculations are used. Instead, a signature is extracted from the entire spectrum and used to infer the composition.

The Gen 1 model is trained on 300,000+ expert-labeled experimental spectra and simulated spectra, all from an Al K-alpha X-ray source.

Users upload a .vms file containing a survey spectrum to the platform, and the results are ready to view within 10-30 seconds.

No additional inputs are needed to generate results.

How the model makes inferences

The model extracts a signature from an input spectrum by computing summary statistics over rolling windows of various lengths that sample the data. This abstracts the features of the spectrum in a consistent way, decoupling the “information” from the structure of the data (a list of binned intensities).
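A minimal sketch of what rolling-window signature extraction could look like is shown here. It is not the Gen 1 implementation: the function name `extract_signature`, the window lengths, the choice of statistics (mean, standard deviation, maximum), and the number of sampled positions are all assumptions made for illustration.

```python
import numpy as np

def extract_signature(intensities, window_lengths=(8, 32, 128), n_positions=16):
    """Summarize a spectrum with rolling-window statistics (illustrative only)."""
    y = np.asarray(intensities, dtype=float)
    # Scale by the maximum so the signature does not depend on absolute counts.
    peak = y.max()
    if peak > 0:
        y = y / peak
    features = []
    for w in window_lengths:
        # All contiguous windows of length w, shape (len(y) - w + 1, w).
        windows = np.lib.stride_tricks.sliding_window_view(y, w)
        # Sample a fixed number of evenly spaced windows so that spectra with
        # different numbers of bins produce signatures of the same length.
        idx = np.linspace(0, len(windows) - 1, n_positions).astype(int)
        picked = windows[idx]
        features.append(picked.mean(axis=1))
        features.append(picked.std(axis=1))
        features.append(picked.max(axis=1))
    return np.concatenate(features)
```

With these defaults, every spectrum with at least 128 bins maps to a vector of 3 window lengths × 3 statistics × 16 positions = 144 values, regardless of how many bins the original spectrum had. That fixed length is what separates the signature from the raw list of binned intensities.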

Once the spectrum signature has been extracted, the model infers the elemental composition by applying the relationships between composition and spectrum signatures that were learned during training.

How the model is trained

Model training requires a dataset of training spectra and a composition label for each spectrum.

To train the model, a signature is extracted from every training spectrum. These signatures are correlated with their labels to generate a matrix that encapsulates the relationships between spectrum signature and composition.
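As a rough illustration of how signatures and labels can be turned into a single matrix, the sketch below fits a ridge-regularized linear map from signatures to compositions. The actual estimator used by Gen 1 is not described here, so linear least squares is only a stand-in, and the names `fit_signature_to_composition`, `predict_composition`, and the `ridge` parameter are hypothetical.

```python
import numpy as np

def fit_signature_to_composition(signatures, compositions, ridge=1e-3):
    """Fit a weight matrix W so that signatures @ W approximates compositions."""
    X = np.asarray(signatures, dtype=float)    # shape (n_spectra, n_features)
    Y = np.asarray(compositions, dtype=float)  # shape (n_spectra, n_elements)
    # Ridge-regularized normal equations: W = (X^T X + ridge * I)^-1 X^T Y
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)

def predict_composition(weights, signature):
    """Apply the learned matrix to one signature and normalize to fractions."""
    raw = np.asarray(signature, dtype=float) @ weights
    raw = np.clip(raw, 0.0, None)  # fractions cannot be negative
    total = raw.sum()
    return raw / total if total > 0 else raw
```

Training then reduces to one call on the stacked signatures and labels, and inference on a new spectrum is a single matrix product followed by normalization.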

Where the training data comes from

Our training data comes from two sources.

  1. Experimental data labeled by experts: a compilation of publicly available reference spectra

  2. Simulated spectra

Our spectrum simulation is based on the NIST SESSA framework and generates spectra as they would appear on an instrument.

  • The background and the interactions that necessitate relative sensitivity factor (RSF) adjustments in peak fitting are baked into the simulated training data.

  • Spectra are generated for compounds with physically compatible element combinations.

  • Added noise and peak broadening are used to further approximate real experimental spectra and add robustness to the model.

Because the composition labels are themselves the inputs to the simulation, each simulated spectrum is a realistic analog of an experimental spectrum whose label is known by construction, which yields high-quality training data.
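The idea of generating a labeled spectrum from a known composition, then adding peak broadening and noise, can be sketched with a toy simulator. This is far simpler than SESSA and is not the production pipeline: the peak table `TOY_PEAKS` holds approximate C 1s and O 1s binding energies for illustration only, the background is flat rather than physically modeled, and peak heights are not weighted by sensitivity factors. All names and default values are assumptions.

```python
import numpy as np

# Approximate binding energies in eV, used only for illustration.
TOY_PEAKS = {"C": 285.0, "O": 532.0}

def simulate_toy_spectrum(composition, energies, width=1.5,
                          counts=5000.0, background=200.0, rng=None):
    """Return a noisy toy spectrum for a {element: atomic fraction} mapping."""
    rng = np.random.default_rng() if rng is None else rng
    energies = np.asarray(energies, dtype=float)
    # Flat background: a simplification of the real inelastic background.
    signal = np.full_like(energies, background)
    for element, fraction in composition.items():
        center = TOY_PEAKS[element]
        # A Gaussian of standard deviation `width` stands in for peak broadening.
        signal += counts * fraction * np.exp(-0.5 * ((energies - center) / width) ** 2)
    # Poisson counting noise approximates detector statistics.
    return rng.poisson(signal).astype(float)

# Usage: the composition passed in is also the training label.
energies = np.arange(0.0, 1100.0, 1.0)
label = {"C": 0.7, "O": 0.3}
spectrum = simulate_toy_spectrum(label, energies, rng=np.random.default_rng(0))
```

Because the label is the simulation input, there is no labeling step and no labeling error. Varying `width` and the random seed gives many distinct spectra for the same composition.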
