EELS data analysis

This project presents workflows that are useful for EELS data analysis. Topics include background removal, the Hartree-Slater cross-section function, low-loss analysis (band gap and plasmon shift calculations) and high-loss analysis (L2/L3 ratio analysis).

Data loading and background subtraction

This script handles a dataset of Electron Energy Loss Spectroscopy (EELS) spectra. Here’s what the code does:

Import necessary modules and packages: This includes numpy for general numerical operations, pylab and matplotlib.pyplot for plotting and visualization, sympy for symbolic mathematics, scipy.optimize and scipy.integrate for curve fitting and integration, glob for file handling, re for regular expression operations, nmmn.plots for the parula colormap, and hyperspy.api for EELS data analysis.
Define paths and load data: The script sets a path template for the EELS data files and a path for saving results. The spectrum images SI-025 to SI-034 are then loaded into a list named spectra.
Define and apply shifts: The script defines one shift per line scan (in nanometers) to compensate for the drift of the sample during multiple line scan acquisitions. These shifts are then applied to the axis offset of the loaded spectra.
Crop spectra to a common size: The script identifies the smallest navigation and signal dimensions across all loaded spectra. It then slices each spectrum in the list down to these dimensions (the code names the result rebinned_spectra, although the data are cropped rather than rebinned), so that all spectra have the same shape.
Sum spectra: The script creates a deep copy of the first cropped spectrum, and then adds the rest of the cropped spectra to it.
Plot summed spectrum: Finally, the script plots the summed spectrum image. A commented-out line shows how to average the signal across the navigation axis before plotting.

Some portions of the script are commented out and therefore do not run. They are kept for reference and possible future use.


# -*- coding: utf-8 -*-
"""
Created on Fri Jun  2 15:25:29 2023

@author: adchmielews
"""
from numpy import cos,sin,exp,pi,sqrt,log,log10  
import numpy as np                      						
import pylab as py  
import glob                    						              					
from scipy.optimize import curve_fit
from scipy.integrate import simpson  # "simps" was removed in SciPy 1.14; simpson is the current name
import sympy as sy
import matplotlib.pyplot as plt	
import re
import nmmn.plots
import hyperspy.api as hs 
parula=nmmn.plots.parulacmap()	

#path to load the data
path = '/Users/chmilew/Documents/Python Projects/EELS/Mn L2_L3 dans STO film - LSMO/SI-{:03d}/EELS Spectrum Image (high-loss) (aligned).dm3'
#path to save the data
path2 = '/Users/chmilew/Documents/Python Projects/EELS/Mn L2_L3 dans STO film - LSMO/'
# Load all the EELS spectra
spectra = [hs.load(path.format(i)) for i in range(25, 35)]

############## If we want to join all 2D spectrum images to make one 2D spectrum images ############


# Defining the shifts (in nm) in case we need to shift some spectra (due to the drift of the sample during multiple line scan acquisitions)
shifts = [0.08,-0.07,-0.15,-0.15,-0.15,0,0,0,0,0]

# Applying the shifts
for i, spectrum in enumerate(spectra):
    spectrum.axes_manager[-1].offset += shifts[i]
    
########To verify if the shift was applied, plot this before and after shift application ###########
# Let's plot the 3rd spectrum after the shift
# plt.plot(spectra[2].axes_manager.signal_axes[0].axis,spectra[2].data[0])

"""
s.inav[:min_nav_dim]: This operation slices the navigation axis of the spectrum s 
up to the smallest navigation dimension found in all the spectra, effectively removing any extra navigation points.

s.isig[:min_sig_dim]: This operation slices the signal axis of the spectrum s 
up to the smallest signal dimension found in all the spectra, effectively removing any extra signal channels.
"""


# Find the smallest navigation and signal dimensions
min_nav_dim = min([s.axes_manager.navigation_size for s in spectra])
min_sig_dim = min([s.axes_manager.signal_size for s in spectra])

# Crop all spectra to the smallest navigation and signal dimensions
# (slicing with inav/isig removes the extra points; no rebinning is involved)
rebinned_spectra = [s.inav[:min_nav_dim].isig[:min_sig_dim] for s in spectra]

# Sum all the spectra
s_sum = rebinned_spectra[0].deepcopy()
for s in rebinned_spectra[1:]:
    s_sum += s

# Average the signal across the navigation axis
# s_sum = s_sum.mean(0)

# Plot the summed spectrum image (it is only an average if the line above is uncommented)
s_sum.plot()
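
The crop-and-sum step can be tried without any microscope files. The sketch below is a minimal stand-in that uses plain NumPy arrays of shape (positions, energy channels) in place of the HyperSpy signals. The array sizes and the helper name crop_and_sum are made up for illustration and are not part of the original script.

```python
import numpy as np

def crop_and_sum(scans):
    """Crop every 2D array (positions, channels) to the smallest common shape, then sum."""
    n_pos = min(s.shape[0] for s in scans)
    n_chan = min(s.shape[1] for s in scans)
    cropped = [s[:n_pos, :n_chan] for s in scans]
    return np.sum(cropped, axis=0)

# Three hypothetical line scans with slightly different sizes
scans = [np.ones((50, 2048)), np.ones((48, 2048)), np.ones((50, 2040))]
total = crop_and_sum(scans)
print(total.shape)  # (48, 2040)
```

Cropping keeps only the points that exist in every scan, so the sum never mixes a real value with a missing one.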

When processing Electron Energy-Loss Spectroscopy (EELS) data, one of the key steps is background removal. Core-loss edges sit on a smoothly decaying background, and subtracting it isolates the edge signal so that later steps (peak fitting, intensity ratios) measure the edge rather than the background. The code below implements this step.

First, the code extracts the individual spectra from the summed EELS spectrum image, one per navigation position.

It then retrieves the energy axis, that is, the energy value of every channel. All spectra share this axis.

Next, the code defines a power-law function, a*x**b, which is the usual model for the pre-edge background in EELS.

The power law is fitted only inside a pre-edge energy window, here 610 to 620 eV (Xstart and Xend). The code finds the channel indices that correspond to these two energies and stores the energy and intensity values inside the window for each spectrum.

Using the curve_fit function from the SciPy package, the code then fits the power-law function to each spectrum within that window.
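
A power-law fit is sensitive to its starting point, because the prefactor is typically many orders of magnitude larger than the exponent. The sketch below, on synthetic data with assumed parameter values, shows one common way to obtain an initial guess: a straight-line fit in log-log space, which is then refined with curve_fit. It illustrates the technique and is not the original script.

```python
import numpy as np
from scipy.optimize import curve_fit

def powerlaw(x, a, b):
    return a * x**b

# Synthetic decaying background with 0.2 % noise (assumed values, for illustration only)
rng = np.random.default_rng(0)
energy = np.linspace(600.0, 700.0, 1001)
counts = powerlaw(energy, 1.0e12, -3.0) * (1 + 0.002 * rng.standard_normal(energy.size))

# Pre-edge fitting window
window = (energy >= 610) & (energy <= 620)

# log(I) = log(a) + b*log(E): a linear fit gives starting values for a and b
b0, log_a0 = np.polyfit(np.log(energy[window]), np.log(counts[window]), 1)
popt, pcov = curve_fit(powerlaw, energy[window], counts[window],
                       p0=[np.exp(log_a0), b0])

# Extrapolate the fitted background and subtract it
background = powerlaw(energy, *popt)
residual = counts - background
print("fitted exponent:", popt[1])
```

Inside the fitting window the residual should scatter around zero; a systematic slope there suggests that the window is too close to the edge.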

The fitted background is then extrapolated to higher energies and subtracted from each spectrum, leaving the edge signal on a baseline close to zero.

The background-subtracted spectra are plotted with a vertical offset between them so that the result can be inspected.

For later comparison, all spectra must share the same energy axis. The code therefore builds a common energy axis and resamples each background-subtracted spectrum onto it by linear interpolation (np.interp).
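
The resampling step can be sketched as follows. The two energy axes and the linear test signal are hypothetical; the point is how the common range and step are chosen before calling np.interp.

```python
import numpy as np

# Two hypothetical spectra sampled on slightly different energy axes
e1 = np.arange(610.0, 700.0, 0.25)
e2 = np.arange(611.0, 702.0, 0.50)
y1 = 2.0 * e1 + 1.0
y2 = 2.0 * e2 + 1.0

# Common range: latest start and earliest end, so no spectrum is extrapolated
start = max(e1[0], e2[0])
stop = min(e1[-1], e2[-1])
# Coarsest step, so no spectrum is sampled more finely than it was measured
step = max(e1[1] - e1[0], e2[1] - e2[0])
common = np.arange(start, stop, step)

# Linear interpolation of each spectrum onto the common axis
resampled = np.vstack([np.interp(common, e, y) for e, y in [(e1, y1), (e2, y2)]])
print(resampled.shape)
```

Because np.interp interpolates linearly, the linear test signal is reproduced exactly, which makes the example easy to check.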

Finally, the code creates a new EELSSpectrum object from the background-subtracted, resampled data. The metadata and axes are copied from the original summed spectrum, and the energy axis is updated (scale, offset and size) to match the resampled data.

The resulting spectrum image is then plotted.



############### Home made method for background removal #######################


# Extract the individual spectra from s_sum
num_spectra = s_sum.axes_manager.navigation_size
spectra = [s_sum.inav[i].data for i in range(num_spectra)]

# Get energy values from the first axis
energy_values = s_sum.axes_manager.signal_axes[0].axis

py.xlabel("Energy (eV)")
py.ylabel("EELS intensity (arb. units)")

def powerlaw(x,a,b):
    return a*(x**b)


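Xstart = 610  # lower bound of the background fitting window (eV)
Xend = 620    # upper bound of the background fitting window (eV)

# All spectra share one energy axis, so the window indices are the same for each.
# Take the channel closest to each bound: a tolerance-based search can return
# zero or several channels depending on the energy dispersion.
i_start = int(np.argmin(np.abs(energy_values - Xstart)))
i_end = int(np.argmin(np.abs(energy_values - Xend)))
start = [i_start] * len(spectra)
end = [i_end] * len(spectra)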


# Data inside the fitting window (the same energy window for every spectrum)
Xfit = [energy_values[start[0]:end[0]] for _ in spectra]
Yfit = [spec[start[0]:end[0]] for spec in spectra]

# Fit the power law to the pre-edge window of each spectrum
popt2, pcov2 = [], []
for A in range(len(spectra)):
    popt, pcov = curve_fit(powerlaw, Xfit[A], Yfit[A], maxfev=80000)
    popt2.append(popt)
    pcov2.append(pcov)

# Subtract the fitted background from the window start onward.
# Each entry of data is an (n_channels, 2) array with columns [energy, intensity].
data = []
for A in range(len(spectra)):
    e = energy_values[start[A]:]
    data.append(np.column_stack([e, spectra[A][start[A]:] - powerlaw(e, *popt2[A])]))
    
#plot of colored spectra

fig,ax = plt.subplots(1,1)
[x.set_linewidth(1.5) for x in ax.spines.values()]
# plt.gca().invert_yaxis()

NUM_COLORS = len(spectra)
OFFSET = np.linspace(0,200000,len(spectra))
colors = parula(np.linspace(0,1,NUM_COLORS))
for A in range(len(spectra)):
    plt.plot(data[A][:,0],data[A][:,1] + np.max(OFFSET) - OFFSET[A],color = colors[A])
    
py.xlabel("Energy (eV)")
py.ylabel("EELS intensity (arb. units)")
plt.tight_layout()
plt.show()

#### Here we convert the final signal with background removed that is stored in 'data' as a spectrum 2D image ####

# Find the maximum energy value that is present in all spectra
# Determine the common energy range
min_energy = max([d[0, 0] for d in data])  # start from the maximum of the minimum energies
max_energy = min([d[-1, 0] for d in data])  # end at the minimum of the maximum energies

# Determine the common energy step size
step_sizes = [d[1, 0] - d[0, 0] for d in data]  # energy differences between consecutive steps
step_size = max(step_sizes)  # use the maximum step size to ensure all data can be covered

# Create a new energy axis that covers the common energy range with the common step size
new_energy_axis = np.arange(min_energy, max_energy, step_size)

# Resample each spectrum so that it has the new energy axis
resampled_data = [np.interp(new_energy_axis, d[:, 0], d[:, 1]) for d in data]

# Get the navigation shape and the size of the energy axis
nav_shape = s_sum.axes_manager.navigation_shape
energy_size = len(new_energy_axis)

# Initialize an empty array with the required shape
data_array = np.zeros(nav_shape + (energy_size,), dtype=float)

# Fill the data_array with the intensity values from resampled_data
for idx, d in enumerate(resampled_data):
    nav_idx = np.unravel_index(idx, nav_shape)
    data_array[nav_idx] = d

# Create a new EELSSpectrum object with the modified data
s_background_removed = hs.signals.EELSSpectrum(data_array)

# Copy the metadata and axes properties from s_sum
s_background_removed.metadata = s_sum.metadata.deepcopy()
s_background_removed.axes_manager = s_sum.axes_manager.deepcopy()

# Update the energy axis properties to match the resampled data
energy_axis = s_background_removed.axes_manager.signal_axes[0]
energy_axis.scale = step_size
energy_axis.offset = min_energy
energy_axis.size = energy_size

# Plot
s_background_removed.plot()

The next script analyzes the two peaks in each background-subtracted spectrum, the Mn L3 and L2 edges. Each peak is modeled with a Gaussian, and the fitted Gaussians give the peak positions and the areas used for the L3/L2 ratio.

The first segment of the code defines two functions: a simple linear function and a double Gaussian function. The linear function serves as a model for the residual background, while the double Gaussian function models the two peaks, each described by its height, position, and width.

The code then prepares the figure, with one color per spectrum, and defines two background windows: one before the edges (630.5 to 638.7 eV) and one after them (660 to 700 eV). The ‘ratios’ list is also initialized; it will store the ratio of the integrated signals under the two Gaussian curves.

The core of the analysis is a loop over the spectra. For each spectrum, the code restricts the data to the region of interest (630 to 700 eV) and builds Boolean masks that select the two background windows.

Using curve_fit, the code fits a straight line to the data in each background window, which gives one line for the pre-edge region and one for the post-edge region. It also estimates the two peak positions as the energies of maximum intensity within 642 to 650 eV and 653 to 656.85 eV.

The background is then built piecewise: the pre-edge line below the first peak position, the post-edge line above the second peak position, and between the two peaks a line with the post-edge slope and an adjusted intercept. This background is subtracted from the intensity to give the spectrum used for the Gaussian fit.

The subtracted data are then fitted with the double Gaussian. The initial guess uses the peak maxima found above, and bounds keep the first peak position within 642 to 647 eV and the second within 653 to 656.85 eV. The fitted positions of the two peaks are stored for later plotting.
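
A bounded double-Gaussian fit of this kind can be sketched on synthetic data as follows. The peak parameters, noise level and width bounds are assumptions chosen for illustration. One detail worth showing is that curve_fit raises an error when the initial guess lies outside the bounds, so the guess is clipped into them first.

```python
import numpy as np
from scipy.optimize import curve_fit

def double_gaussian(x, a1, x01, s1, a2, x02, s2):
    return (a1 * np.exp(-(x - x01)**2 / (2 * s1**2)) +
            a2 * np.exp(-(x - x02)**2 / (2 * s2**2)))

# Synthetic two-peak spectrum (assumed values, for illustration only)
rng = np.random.default_rng(1)
energy = np.linspace(630.0, 700.0, 701)
truth = [1000.0, 644.0, 1.5, 450.0, 655.0, 1.8]
signal = double_gaussian(energy, *truth) + rng.normal(0.0, 5.0, energy.size)

# Bounds: peak positions restricted to their windows, widths kept strictly positive
lower = np.array([0.0, 642.0, 0.1, 0.0, 653.0, 0.1])
upper = np.array([np.inf, 647.0, 10.0, np.inf, 656.85, 10.0])

# Initial guess from the raw maxima inside each peak window
m1 = (energy >= 642) & (energy <= 650)
m2 = (energy >= 653) & (energy <= 656.85)
p0 = np.array([signal[m1].max(), energy[m1][signal[m1].argmax()], 1.0,
               signal[m2].max(), energy[m2][signal[m2].argmax()], 1.0])
p0 = np.clip(p0, lower, upper)  # a guess outside the bounds would raise ValueError

popt, _ = curve_fit(double_gaussian, energy, signal, p0=p0, bounds=(lower, upper))
print("peak positions:", popt[1], popt[4])
```

Using a small positive lower bound for the widths, rather than zero, avoids a division by zero if the optimizer pushes a width toward its limit.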

To compare the two peaks, the code integrates each fitted Gaussian numerically with quad from scipy.integrate, over 630 to 650 eV for the first peak and 650 to 660 eV for the second. Each integral is the area under the corresponding peak within those limits.
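
The area under a Gaussian also has a closed form, which is a useful cross-check for the numerical integration. Over the whole axis the area is a·σ·√(2π); over finite limits it involves the error function. The sketch below uses a hypothetical helper, gaussian_area, and made-up peak parameters.

```python
import math
import numpy as np

def gaussian_area(a, x0, sigma, lo=-math.inf, hi=math.inf):
    """Area of a*exp(-(x-x0)**2 / (2*sigma**2)) between lo and hi, via erf."""
    def z(x):
        return (x - x0) / (sigma * math.sqrt(2.0))
    return a * sigma * math.sqrt(math.pi / 2.0) * (math.erf(z(hi)) - math.erf(z(lo)))

# Made-up fit results for the two peaks: (height, position, width)
peak1 = (1000.0, 644.0, 1.5)
peak2 = (450.0, 655.0, 1.8)

area1 = gaussian_area(*peak1, lo=630.0, hi=650.0)
area2 = gaussian_area(*peak2, lo=650.0, hi=660.0)
ratio = area1 / area2
print("ratio of integrated signals:", ratio)

# Numerical cross-check of area1 with a fine trapezoidal sum
x = np.linspace(630.0, 650.0, 200001)
y = peak1[0] * np.exp(-(x - peak1[1])**2 / (2 * peak1[2]**2))
numeric1 = np.sum((y[1:] + y[:-1]) / 2) * (x[1] - x[0])
```

With finite limits, part of a wide peak can fall outside the integration range, so the closed form over the whole axis and the windowed integral can differ noticeably.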

Next, the ratio of the two integrated signals (first peak divided by second peak) is calculated, printed, and stored in the ‘ratios’ list. A spectrum whose second integral is zero is skipped with a warning.

At the end of each iteration, the background-subtracted spectrum is plotted with a vertical offset, and the two fitted Gaussians are drawn as shaded areas under it.

After all spectra are processed, the elapsed time is printed and the final figure is displayed.

Background subtraction with Hartree-Slater cross-section function and fitting with double Gaussians:

Batch fitting plot:

Mn L2&L3 edge shifts plot and L3/L2 ratio:



import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import time
from scipy.integrate import quad

# Define simple linear function
def linear(x, a, b):
    return a * x + b

# Define double Gaussian function
def double_gaussian(x, a1, x01, sigma1, a2, x02, sigma2):
    return (a1 * np.exp(-(x - x01)**2 / (2 * sigma1**2)) + 
            a2 * np.exp(-(x - x02)**2 / (2 * sigma2**2)))

# Plot of colored spectra
fig, ax = plt.subplots(1, 1)
[x.set_linewidth(1.5) for x in ax.spines.values()]

NUM_COLORS = len(data)
OFFSET = np.linspace(0, 300000, len(data))
colors = parula(np.linspace(0, 1, NUM_COLORS))

# Define fitting windows
window1 = [630.5, 638.7]
window2 = [660,700]

ratios = []  # Create a list to store the ratios
max_gauss_1 = []
max_gauss_2 = []

start_time = time.time()

data2 = data[:45]

for A in range(len(data2)):
    energy = data2[A][:,0]
    intensity = data2[A][:,1]
    mask = (energy >= 630) & (energy <= 700)  # boolean mask
    energy = energy[mask]
    intensity = intensity[mask]

    # Create boolean masks for fitting windows
    mask1 = (energy >= window1[0]) & (energy <= window1[1])
    mask2 = (energy >= window2[0]) & (energy <= window2[1])

    # Fit a straight line in each background window:
    # a_guess = (slope, intercept) before the edges, c_guess = (slope, intercept) after them
    a_guess, _ = curve_fit(linear, energy[mask1], intensity[mask1])
    c_guess, _ = curve_fit(linear, energy[mask2], intensity[mask2])

    # Estimate positions of two Gaussian peaks
    peak_mask1 = (energy >= 642) & (energy <= 650)
    peak_mask2 = (energy >= 653) & (energy <= 656.85)
    x01 = energy[peak_mask1][np.argmax(intensity[peak_mask1])]
    x02 = energy[peak_mask2][np.argmax(intensity[peak_mask2])]

    window3 = [x01,x02]

    mask3 = (energy >= window3[0]) & (energy <= window3[1])

    # Compute the background over the entire energy range
    bg = np.piecewise(energy, 
                      [energy <= window3[0], 
                       (energy > window3[0]) & (energy <= window3[1]),
                       energy > window3[1]], 
                      [lambda x: linear(x, *a_guess), 
                       lambda x: linear(x, c_guess[0], (2*c_guess[1] - x01*c_guess[0])/3), 
                       lambda x: linear(x, *c_guess)])

    # Subtract the background
    bg_subtracted = intensity - bg

    # Fit subtracted data with two Gaussian peaks
    # Set lower and upper bounds for parameters
    lower_bounds = [0, 642, 0, 0, 653, 0]  # x01 >= 642, x02 >= 653
    upper_bounds = [np.inf, 647, np.inf, np.inf, 656.85, np.inf]  # x01 <= 647, x02 <= 656.85

    # Initial guess, clipped into the bounds: curve_fit raises an error if p0 lies
    # outside them (x01 is searched up to 650 eV but bounded at 647 eV)
    p0 = [np.max(intensity[peak_mask1]), x01, 1, np.max(intensity[peak_mask2]), x02, 1]
    p0 = np.clip(p0, lower_bounds, upper_bounds)

    popt, _ = curve_fit(double_gaussian, energy, bg_subtracted, p0=p0, bounds=(lower_bounds, upper_bounds))
    
    # Store the maximum of the two gaussians for further plotting
    
    max_gauss_1.append(popt[1])
    max_gauss_2.append(popt[4])

    # Calculate integrated signals under each Gaussian curve
    integral1, _ = quad(lambda x: popt[0] * np.exp(-(x - popt[1])**2 / (2 * popt[2]**2)), 630,650)
    integral2, _ = quad(lambda x: popt[3] * np.exp(-(x - popt[4])**2 / (2 * popt[5]**2)), 650,660)

    # Print ratio of integrated signals
    if integral2 != 0:
        ratio = integral1 / integral2
        ratios.append(ratio)
        print("Ratio of integrated signals for spectrum {}: {:.2f}".format(A, ratio))
    else:
        print("Warning: Integral of second peak for spectrum {} is zero. Skipping this spectrum.".format(A))

    # Plot subtracted data and double Gaussian fit
    plt.plot(energy, bg_subtracted + np.max(OFFSET) - OFFSET[A], color=colors[A])
    # plt.plot(energy, double_gaussian(energy, *popt) + np.max(OFFSET) - OFFSET[A], '--', color='red')
    plt.fill_between(energy, popt[0]*np.exp(-(energy - popt[1])**2 / (2 * popt[2]**2)) + np.max(OFFSET) - OFFSET[A],np.max(OFFSET) - OFFSET[A], alpha=0.6, color=colors[A])
    plt.fill_between(energy, popt[3]*np.exp(-(energy - popt[4])**2 / (2 * popt[5]**2)) + np.max(OFFSET) - OFFSET[A],np.max(OFFSET) - OFFSET[A], alpha=0.3, color=colors[A])

end_time = time.time()
elapsed_time = end_time - start_time
print("Time elapsed: ", elapsed_time, "seconds")

plt.xlabel("Energy (eV)")
plt.ylabel("EELS intensity (arb. units)")
plt.tight_layout()
plt.show()

Data augmentation

This script prepares the data for a convolutional neural network (CNN) that performs multiclass classification with Keras. The data are spectra stored in pickle files. The first part of the script imports the required libraries: numpy and pandas for data handling, matplotlib for plotting, and specific functions and classes from the machine learning libraries Keras and sklearn.

The next phase is data loading. Three datasets, ‘Mn2_C’, ‘Mn3_C’, and ‘Mn4_C’, are read from the specified directory into pandas dataframes, one per class. They are then combined into a single dataframe, Mn_All, which is converted into a numpy array.

A list named labels is generated in which the spectra from the ‘Mn2_C’, ‘Mn3_C’, and ‘Mn4_C’ datasets are labelled 0, 1, and 2 respectively. The data is then divided into training and testing sets with an 85% to 15% ratio, using the train_test_split() function from sklearn.

Data augmentation then takes place, using principal component analysis (PCA) fitted on the training data. Each training spectrum is split into a reconstruction from the first 10 principal components, treated as the signal, and the remainder, treated as noise. The noise part is multiplied by 50 factors between 1 and 5 (the variable named SNR in the code, which acts as a noise multiplier) and added back to the signal, so that every training spectrum yields 50 augmented copies with increasing noise. Each copy is normalized to its maximum. Training on these noisier copies is intended to make the model more robust.
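
The signal/noise split behind this augmentation can be sketched with NumPy alone, computing the PCA through a singular value decomposition of the mean-centered data. The synthetic spectra, the dataset size and the five noise factors are assumptions for illustration; the original script uses sklearn's PCA and 50 factors.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical training set: 40 spectra of 200 channels (one peak plus small noise)
x = np.linspace(0, 1, 200)
centers = rng.uniform(0.3, 0.7, size=40)
X = (np.exp(-(x[None, :] - centers[:, None])**2 / 0.005)
     + 0.02 * rng.standard_normal((40, 200)))

# PCA via SVD of the mean-centered data
mu = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
scores = U * S                                   # projections onto the components

n_comp = 10
X_hat = scores[:, :n_comp] @ Vt[:n_comp] + mu    # low-rank "signal" part
residual = scores[:, n_comp:] @ Vt[n_comp:]      # discarded components, used as noise

# One augmented copy of the whole training set per noise factor
noise_scales = np.linspace(1, 5, 5)
augmented = np.concatenate([X_hat + k * residual for k in noise_scales])
augmented /= augmented.max(axis=1, keepdims=True)  # normalize each spectrum to its maximum
print(augmented.shape)  # (200, 200)
```

With a factor of 1 the original spectra are recovered exactly, since the signal and noise parts add up to the input; larger factors amplify only the part that PCA assigned to noise.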

Before the data is fed to the CNN, the script undertakes several preprocessing steps. The spectra are cropped to channels 100 to 599 (500 channels) and reshaped to (samples, 500, 1), the input format that Keras expects for 1D convolutions. Each set is then mean-centered by subtracting its overall mean and scaled by dividing by its maximum, so that training and test inputs are on a comparable scale.

The labels for the classes (0, 1, and 2) are one-hot encoded in preparation for use in the CNN model. This encoding process transforms each integer label into a binary vector with the index of the integer label marked as 1 while the remainder of the vector is populated with 0’s. This format is more appropriate for classification tasks where classes are not ordinal.
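
One-hot encoding is easy to reproduce with NumPy, which makes the format concrete. The five labels below are made up for illustration; the original script uses to_categorical from Keras.

```python
import numpy as np

# Hypothetical integer labels: 0, 1 and 2 stand for the Mn2_C, Mn3_C and Mn4_C datasets
labels = np.array([0, 1, 2, 1, 0])

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(3, dtype="float32")[labels]
print(one_hot)

# The integer label is recovered as the position of the 1
decoded = one_hot.argmax(axis=1)
```

The same argmax step is what turns the network's predicted probabilities back into class labels.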

Finally, the script plots one spectrum from the training set and one from the testing set as a visual check. This completes the data preparation; the next steps are to define the CNN architecture, compile the model, and train it on the prepared data.


import os
import sys
import numpy as np
import pandas as pd
from keras import utils as np_utils  # keras.utils.np_utils is gone in recent Keras; keras.utils provides to_categorical
from keras.optimizers import Adam
from keras.models import Sequential
from matplotlib import pyplot as plt
from sklearn.decomposition import PCA
from keras.callbacks import ModelCheckpoint
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
# Import all layers from keras.layers: the pooling/convolutional submodules are not available in recent Keras
from keras.layers import GlobalAveragePooling1D
from keras.layers import BatchNormalization
from keras.layers import Dropout, Activation, Dense, Flatten
from keras.layers import Convolution1D, AveragePooling1D, MaxPooling1D


#################################

path = '/Users/chmilew/Documents/Python Projects/EELS/Mn L2_L3 dans STO film - LSMO/Mn_Classifier_CNNs-master/Data/'

Mn2_C = (pd.read_pickle(path+'/Mn2_Larger_Clean_Thin.pkl'))
Mn3_C = (pd.read_pickle(path+'/Mn3_Larger_Clean_Thin.pkl'))
Mn4_C = (pd.read_pickle(path+'/Mn4_Larger_Clean_Thin.pkl'))
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
Mn_All = pd.concat([Mn2_C, Mn3_C, Mn4_C], ignore_index=True)
Mn_All = np.array(Mn_All)

# One integer label per spectrum: 0 for Mn2_C, 1 for Mn3_C, 2 for Mn4_C
labels = [0] * len(Mn2_C) + [1] * len(Mn3_C) + [2] * len(Mn4_C)
    

#train-test split
X_train, X_test, y_train, y_test = train_test_split(Mn_All, labels, test_size=0.15, random_state=13)

#data augmentation using principal components
noise_aug = []
noise = np.copy(X_train)
mu = np.mean(noise, axis=0)
pca = PCA()
noise_model = pca.fit(noise)
nComp = 10
Xhat = np.dot(pca.transform(noise)[:,:nComp], pca.components_[:nComp,:])
noise_level = np.dot(pca.transform(noise)[:,nComp:], pca.components_[nComp:,:])
Xhat += mu
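SNR = np.linspace(1,5,50)  # noise multipliers: a larger value adds more noise
for i in range(len(SNR)):
    augmented = SNR[i]*noise_level + Xhat
    # normalize each augmented spectrum to its own maximum
    augmented /= np.max(augmented, axis=1, keepdims=True)
    noise_aug.append(augmented)
# Stack the augmented copies: one block of training spectra per noise multiplier
# (sizes are taken from the data instead of being hard-coded)
X_train = np.array(noise_aug).reshape(len(SNR)*noise.shape[0], noise.shape[1])
y_train = [item for i in range(len(SNR)) for item in y_train]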

#cropping
X_train = X_train[:,100:600]
X_test = X_test[:,100:600]

#formatting for keras
X_train = np.array(X_train).astype('float32')
X_train = X_train.reshape(X_train.shape + (1,))
X_train -=  np.mean(X_train)
X_train /= np.max(X_train)
X_test = np.array(X_test).astype('float32')
X_test = X_test.reshape(X_test.shape + (1,))
X_test -= np.mean(X_test)   
X_test /= np.max(X_test)

y_train = np.array(y_train)
y_test = np.array(y_test)
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

print("Total of "+str(num_classes)+" classes.")
print("Data mean-centered, normalized and hot-encoded.")
print("Total of "+str(len(X_train))+" training samples.")
plt.plot(X_test[0],label='X_test')
plt.plot(X_train[0],label='X_train')
plt.legend()
plt.show()

Creation of the model

The Sequential model is a linear stack of layers that allows the user to build a neural network layer by layer, from input to output. Here, each layer has exactly one input tensor and one output tensor.

The model architecture is defined by sequentially adding various layers:

A 1-dimensional convolutional layer (Convolution1D) is added as the first layer of the model, taking an input shape of (500, 1), that is, 500 energy channels with one value per channel. This layer has 2 filters, each of size 9, and uses a rectified linear unit (ReLU) activation function.
An average pooling layer (AveragePooling1D) is added next. This layer down-samples its input along the energy axis by averaging over a window of two adjacent points (the Keras default), which halves the length.
Batch normalization (BatchNormalization) is then applied. It normalizes the activations of the previous layer over each batch, keeping the mean activation close to 0 and the activation standard deviation close to 1. This often improves the model’s performance.

The same structure (convolution, pooling, batch normalization) is then repeated four more times with different parameters: the kernel sizes are 7, 7, 5 and 3, and the numbers of filters are 2, 4, 4 and 8. The kernels therefore get smaller while the number of filters grows.

A dropout layer (Dropout) is added after the fifth convolutional block to prevent overfitting. It randomly sets 10% of input units to 0 at each update during training time.

Next, another 1D convolutional layer is added, with 3 filters of size 1 (one filter per class), followed by a global average pooling layer (GlobalAveragePooling1D). This layer averages each filter’s output over the whole length, which yields one value per class and an output of shape (batch, 3). Replacing dense layers with this averaging keeps the number of parameters small, which helps to limit overfitting and computational cost.

Finally, a softmax activation function is applied which transforms the output to a probability distribution over the target classes. This function is typically used in multiclass classification problems.
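
The softmax function itself is short enough to write out. The sketch below is a plain NumPy version with made-up scores, not the Keras implementation; subtracting the maximum before exponentiating is a standard trick to avoid overflow and does not change the result.

```python
import numpy as np

def softmax(z):
    """Turn a vector (or rows) of scores into probabilities that sum to 1."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift by the max for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

# Made-up scores for the three classes of one spectrum
scores = np.array([[2.0, 0.5, -1.0]])
probs = softmax(scores)
print(probs, probs.argmax(axis=-1))
```

The predicted class is the one with the highest probability, here the first one.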

The model is then compiled with the Adam optimizer, using categorical cross-entropy as the loss function (which is suitable for multiclass classification tasks) and tracking accuracy as the metric.

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv1d (Conv1D)             (None, 492, 2)            20        
                                                                 
 average_pooling1d (AverageP  (None, 246, 2)           0         
 ooling1D)                                                       
                                                                 
 batch_normalization (BatchN  (None, 246, 2)           8         
 ormalization)                                                   
                                                                 
 conv1d_1 (Conv1D)           (None, 240, 2)            30        
                                                                 
 average_pooling1d_1 (Averag  (None, 120, 2)           0         
 ePooling1D)                                                     
                                                                 
 batch_normalization_1 (Batc  (None, 120, 2)           8         
 hNormalization)                                                 
                                                                 
 conv1d_2 (Conv1D)           (None, 114, 4)            60        
                                                                 
 average_pooling1d_2 (Averag  (None, 57, 4)            0         
 ePooling1D)                                                     
                                                                 
 batch_normalization_2 (Batc  (None, 57, 4)            16        
 hNormalization)                                                 
                                                                 
 conv1d_3 (Conv1D)           (None, 53, 4)             84        
                                                                 
 average_pooling1d_3 (Averag  (None, 26, 4)            0         
 ePooling1D)                                                     
                                                                 
 batch_normalization_3 (Batc  (None, 26, 4)            16        
 hNormalization)                                                 
                                                                 
 conv1d_4 (Conv1D)           (None, 24, 8)             104       
                                                                 
 average_pooling1d_4 (Averag  (None, 12, 8)            0         
 ePooling1D)                                                     
                                                                 
 batch_normalization_4 (Batc  (None, 12, 8)            32        
 hNormalization)                                                 
                                                                 
 dropout (Dropout)           (None, 12, 8)             0         
                                                                 
 conv1d_5 (Conv1D)           (None, 12, 3)             27        
                                                                 
 global_average_pooling1d (G  (None, 3)                0         
 lobalAveragePooling1D)                                          
                                                                 
 loss (Activation)           (None, 3)                 0         
                                                                 
=================================================================
Total params: 405
Trainable params: 365
Non-trainable params: 40
_________________________________________________________________
None
CNN Model created.
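
The output shapes and parameter counts in this summary can be checked by hand. The sketch below assumes what the summary implies: convolutions without padding and with stride 1, pooling with window 2 and stride 2, and four parameters per channel in each batch normalization layer (two trainable, two non-trainable). The helper names are made up for illustration.

```python
def conv_out(length, kernel):
    """Output length of a convolution without padding, stride 1."""
    return length - kernel + 1

def pool_out(length, pool=2):
    """Output length of pooling with window 2 and stride 2."""
    return length // pool

def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per filter."""
    return (kernel * in_ch + 1) * out_ch

length, channels = 500, 1
total = 0
shapes = []
for filters, kernel in [(2, 9), (2, 7), (4, 7), (4, 5), (8, 3)]:
    total += conv_params(kernel, channels, filters)
    length = conv_out(length, kernel)
    shapes.append((length, filters))
    length = pool_out(length)
    shapes.append((length, filters))
    total += 4 * filters  # batch norm: gamma, beta, moving mean, moving variance
    channels = filters

total += conv_params(1, channels, 3)  # final convolution: 3 filters of size 1
print(shapes)
print(total)  # 405
```

The 40 non-trainable parameters are the moving means and variances of the batch normalization layers: 2 × (2 + 2 + 4 + 4 + 8).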


model = Sequential()
activation = 'relu'
model.add(Convolution1D(2, 9, input_shape=(500,1), activation=activation))
model.add(AveragePooling1D())
model.add(BatchNormalization())

model.add(Convolution1D(2, 7, activation=activation))
model.add(AveragePooling1D())
model.add(BatchNormalization())

model.add(Convolution1D(4, 7, activation=activation))
model.add(AveragePooling1D())
model.add(BatchNormalization())

model.add(Convolution1D(4, 5, activation=activation))
model.add(AveragePooling1D())
model.add(BatchNormalization())

model.add(Convolution1D(8, 3, activation=activation))
model.add(AveragePooling1D())
model.add(BatchNormalization())

model.add(Dropout(0.10))
model.add(Convolution1D(3, 1))
model.add(GlobalAveragePooling1D())

model.add(Activation('softmax', name='loss'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

print(model.summary())
print("CNN Model created.")

Model fitting

This block of code is fitting (i.e., training) the previously defined model to the data and saving the weights for the best model during training.

Here’s a step-by-step explanation:

np.random.seed(seed): This line sets the seed of numpy’s random number generator, so that numpy produces the same random numbers each time the code is run. This helps reproducibility, although it only covers numpy: the random generators of Keras and its backend (used, for example, for weight initialization) are seeded separately.
best_model_file = '/Users/chmilew/Documents/Python Projects/EELS/Mn L2_L3 dans STO film - LSMO/Mn_Classifier_CNNs-master/best_weights/highest_val_acc_weights_epoch199-train_loss0.026_.h5': This line defines the filepath where the weights of the best model (i.e., the model with the highest validation accuracy) will be saved during training.
best_model = ModelCheckpoint(best_model_file, monitor='val_acc', verbose = 1, save_best_only = True): Here, a Keras callback is created that checks the monitored metric after every epoch and saves the model only when that metric improves (save_best_only = True). The monitor string must match a key that Keras actually logs. The plotting code further down reads hist.history['val_accuracy'], which indicates that this Keras version names the metric 'val_accuracy' rather than 'val_acc'; with a name that is not logged, the callback cannot find the metric and skips saving.
hist = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=epochs, batch_size=batch_size, callbacks = [best_model], shuffle = True, verbose=1): This is the line where the model training happens. The model is trained on X_train and y_train, with a specified number of epochs and batch size. The validation_data parameter makes the model evaluate its performance on the test data (X_test, y_test) after each epoch. The shuffle parameter being True means that the order of the training samples is shuffled before each epoch. The callbacks parameter contains the best_model callback, which is invoked at the end of each epoch. Finally, verbose=1 displays a progress bar with the loss and metrics for each epoch.
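
The decision rule behind save_best_only can be illustrated in a few lines of plain Python. This is a sketch of the logic only, with a hypothetical helper and made-up validation accuracies; it is not Keras code.

```python
def best_epochs(val_accuracy):
    """Return the 1-based epochs at which a save-best-only checkpoint would write a file."""
    best = float("-inf")
    saved = []
    for epoch, acc in enumerate(val_accuracy, start=1):
        if acc > best:  # strictly better than every previous epoch
            best = acc
            saved.append(epoch)
    return saved

# Made-up validation accuracies for six epochs
history = [0.61, 0.74, 0.72, 0.80, 0.80, 0.86]
print(best_epochs(history))  # [1, 2, 4, 6]
```

Since the same file path is reused, each save overwrites the previous one, and the file left after training holds the model from the best epoch.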


#Params
epochs = 10
batch_size = 512
seed = 7

# fit and run our model
np.random.seed(seed)
best_model_file = '/Users/chmilew/Documents/Python Projects/EELS/Mn L2_L3 dans STO film - LSMO/Mn_Classifier_CNNs-master/best_weights/highest_val_acc_weights_epoch199-train_loss0.026_.h5'
best_model = ModelCheckpoint(best_model_file, monitor='val_acc', verbose = 1, save_best_only = True)
hist = model.fit(X_train,
                 y_train,
                 validation_data=(X_test, y_test),
                 epochs=epochs,
                 batch_size=batch_size,
                 callbacks = [best_model],
                 shuffle = True,
                 verbose=1)
print("done")
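The checkpointing rule can be illustrated without Keras. The following is a minimal sketch of the "save only when the monitored value improves" idea, not the Keras implementation itself; the function name best_only_checkpoints and the accuracy values are hypothetical and chosen only for illustration.

```python
def best_only_checkpoints(val_accuracies):
    """Return the 1-based epochs at which a 'save best only' rule would save."""
    best = float("-inf")
    saved_epochs = []
    for epoch, val_acc in enumerate(val_accuracies, start=1):
        if val_acc > best:  # strictly better than every earlier epoch
            best = val_acc
            saved_epochs.append(epoch)
    return saved_epochs


# Made-up validation accuracies for five epochs (illustration only).
example_history = [0.61, 0.74, 0.72, 0.80, 0.79]
print(best_only_checkpoints(example_history))  # [1, 2, 4]
```

In this example the file would be overwritten at epochs 1, 2 and 4, and the model kept on disk at the end is the one from epoch 4, even though training continued to epoch 5.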


#summarize history for accuracy
plt.figure(figsize=(15, 5))
plt.rcParams.update({'font.size': 16})

plt.subplot(1, 2, 1)
plt.plot(hist.history['accuracy'], linewidth = 3)
plt.title('Model Training Accuracy')
plt.ylabel('Training Accuracy')
plt.xlabel('Epoch')

# summarize history for loss
plt.subplot(1, 2, 2)
plt.plot(hist.history['loss'], linewidth = 3)
plt.title('Model Training Loss')
plt.ylabel('Cross Entropy Loss')
plt.xlabel('Epoch')
plt.savefig(path+'training_accuracy.png')
plt.show()

plt.figure(figsize=(10, 8))

plt.plot(hist.history['val_accuracy'], linewidth = 3)
plt.plot(hist.history['accuracy'], linewidth = 3)
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Test', 'Train'], loc='lower right')
plt.savefig(path+'test_accuracy.png')

plt.show()

Testing model accuracy

The table below is the confusion matrix for the test set. It visualizes the performance of the classifier by showing, for each class, how many instances were classified correctly and how many were confused with another class. Each row of the matrix represents the instances of an actual class, while each column represents the instances of a predicted class.

Here’s a breakdown of this specific confusion matrix:

For the Mn2+ class, 155 instances were correctly classified (true positives), while 2 instances were wrongly classified as Mn3+ and none were classified as Mn4+.
For the Mn3+ class, 141 instances were correctly classified (true positives), and none were misclassified as other classes.
For the Mn4+ class, 118 instances were correctly classified (true positives), while 58 instances were wrongly classified as Mn3+.

The overall accuracy of the model is 87.34%, meaning it correctly classified 414 of the 474 test instances.

From the confusion matrix, we can conclude that the model recovers almost all Mn2+ and Mn3+ instances but struggles with the Mn4+ class: about a third of the Mn4+ instances (58 of 176) are misclassified as Mn3+. As a consequence, a Mn3+ prediction is less reliable than the other two, since 60 of the 201 spectra predicted as Mn3+ actually belong to another class. This is the main area to improve in future iterations of the model. A detailed analysis could involve checking the features of these misclassified instances and adjusting the model’s architecture or training strategy accordingly.

Confusion Matrix of Test Set
      Mn2+  Mn3+  Mn4+
Mn2+   155     2     0
Mn3+     0   141     0
Mn4+     0    58   118
Accuracy: 87.34%
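
The per-class figures behind this reading can be recomputed from the matrix above. The sketch below uses only numpy and the values printed in the table; the variable names are illustrative and are not part of the original script.

```python
import numpy as np

labels = ["Mn2+", "Mn3+", "Mn4+"]
# Rows are actual classes, columns are predicted classes (values from the table above).
cm = np.array([[155, 2, 0],
               [0, 141, 0],
               [0, 58, 118]])

true_positives = np.diag(cm)
recall = true_positives / cm.sum(axis=1)     # fraction of each actual class that is recovered
precision = true_positives / cm.sum(axis=0)  # fraction of each predicted class that is correct
accuracy = true_positives.sum() / cm.sum()

for name, r, p in zip(labels, recall, precision):
    print(f"{name}: recall={r:.3f}, precision={p:.3f}")
print(f"accuracy={accuracy:.4f}")
```

The recall of Mn4+ comes out at about 0.67 and the precision of Mn3+ at about 0.70, while the overall accuracy matches the 87.34% reported by model.evaluate.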


import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

# Predict the whole test set in one call, then convert the predicted
# probabilities and the one-hot labels to class indices.
y_test_pred = np.argmax(model.predict(X_test), axis=1)
y_test_labels = np.argmax(y_test, axis=1)

print("Confusion Matrix of Test Set")
class_names = ["Mn2+", "Mn3+", "Mn4+"]
# Rows are actual classes, columns are predicted classes.
conf_matrix = pd.DataFrame(confusion_matrix(y_true=y_test_labels, y_pred=y_test_pred),
                           index=class_names, columns=class_names)
print(conf_matrix)
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
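
The label decoding in the evaluation code relies on np.argmax: the index of the largest value in each row is taken as the class. Here is a minimal sketch with made-up probabilities, assuming the class order Mn2+, Mn3+, Mn4+ suggested by the column labels; the arrays are illustrative and not taken from the dataset.

```python
import numpy as np

# Made-up softmax outputs for three spectra (illustration only); columns are
# assumed to follow the class order Mn2+, Mn3+, Mn4+.
probabilities = np.array([[0.90, 0.08, 0.02],
                          [0.10, 0.70, 0.20],
                          [0.05, 0.55, 0.40]])
# One-hot encoded true labels for the same three spectra.
one_hot_truth = np.array([[1, 0, 0],
                          [0, 1, 0],
                          [0, 0, 1]])

predicted = np.argmax(probabilities, axis=1)  # index of the most probable class per row
actual = np.argmax(one_hot_truth, axis=1)     # index of the 1 in each one-hot row
n_correct = int((predicted == actual).sum())
print(predicted.tolist(), actual.tolist(), n_correct)  # [0, 1, 1] [0, 1, 2] 2
```

The third spectrum shows the kind of error seen in the confusion matrix: a true Mn4+ (index 2) decoded as Mn3+ (index 1) because the Mn3+ probability is slightly larger.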

Data Augmentation effect

Published by Adrian Chmielewski on 05/05/2023
