import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import statistics as stt
from scipy import stats
from scipy.optimize import curve_fit
Jie Xu & Dominik C. Hezel
January 25, 2023
This program can be used to:
1. Upload data files from Neptune: click the ‘Browse files’ button, then use ‘Command+A’ to select all ‘.exp’ data files required from Neptune.
2. Set up parameters for isotopic data: (1) drag the sliders to choose the background and signal areas. The orange zone represents the background, and the blue zone is the signal area. (2) Set your outlier factor: the smaller the number, the more data are cut as outliers, which are shown as red spots. (3) Set the bulge factor for 11B: this relates to the bulge correction from 10B; the factor of 0.6 used here was defined by Dr. Axel Gerdes. (4) Choose your standard for the intra-sequence instrumental drift correction: its name, the ‘A/B/C/D’ label inside the name of the standard, and the regression level.
3. Upload your log file from the laser: click the ‘Browse files’ button and choose your laser file. (Check that it matches the isotopic data.)
4. Set up parameters for the corrected boron concentration from signal intensity: (1) the regression level for the boron concentration correction. (2) Insert the ablation depth of the selected reference and of the other samples, if known. Otherwise, keep the default. (3) Select the shape of your spots: circle or square. (4) Tell us whether you used split stream or not.
*5. Upload your trace element file if you used split stream. (Not necessary.)
Input datafiles from Neptune: (1) four types of datafile (‘.dat’, ‘.exp’, ‘.log’, ‘.TDT’) are produced for each measurement. Only ‘.exp’ files can be read successfully, and they can also be opened in Excel. (2) Datafiles are named in the format ‘num-A/B/C/D/U’ (e.g. ‘001-A’, ‘002-B’): num is the sequence number, A/B/C/D are four labels for standards, and U is the label for unknown samples. Attention: all datafiles of one sequence need to be uploaded at once! (3) Inside each ‘.exp’ datafile, the data start at the 23rd row, and the columns ‘9.9’, ‘10B’, ‘10.2’ and ‘11B’ are necessary for data processing according to our method.
Input laser file: this is a csv file produced during ablation. Please check all recorded information, which should be in the same order as the Neptune datafiles. Errors may happen here.
Input trace element datafile: the trace element data from Ladr. Raw data should be processed with Ladr and appended here.
Required packages
Uploading files: multiple files can be selected at once, and only ‘.exp’ files are allowed.
if st.button('clear uploaded data'):
    st.session_state.uploaded_files = []
if 'uploaded_files' in st.session_state and len(st.session_state.uploaded_files) != 0:
    uploaded_files = st.session_state.uploaded_files
else:
    st.session_state.uploaded_files = st.file_uploader('upload files', type=['exp'], accept_multiple_files=True)
-> Include here explanations of what the functions do to the data, e.g., why the regression, why higher orders: or – why the subtraction of the backgrounds, what two backgrounds exist: the ‘normal’ one, and one from an unknown source
def selSmpType(dataFiles)
Get the sequence number of each datafile from its file name. The sequence number is used later for the instrumental drift correction.
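The body of selSmpType is not shown here. A minimal sketch of what such a function could look like, assuming file names follow the ‘num-A/B/C/D/U’ pattern described above (the name sel_smp_type_sketch and the regular expression are illustrative assumptions, not the original implementation):

```python
import re


def sel_smp_type_sketch(data_files):
    """Return the leading sequence number of each file name as an int.

    Hypothetical helper: assumes names such as '001-A.exp' or '013-U.exp'.
    """
    seq_numbers = []
    for name in data_files:
        match = re.match(r"\s*(\d+)-[ABCDU]", str(name))
        if match is None:
            raise ValueError(f"unexpected file name: {name!r}")
        seq_numbers.append(int(match.group(1)))
    return seq_numbers


print(sel_smp_type_sketch(["001-A.exp", "002-U.exp", "013-B.exp"]))
```

Returning integers is a choice made for this sketch; whatever type is returned has to match the ‘ Sequence Number’ column of the laser file for the later merge to work.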
def outlierCorrection(data, factorSD)
Outlier rejection: data lying more than factorSD standard deviations away from the mean are treated as outliers. The first function (outlierCorrection_plot) returns a boolean mask and is used for plotting; the second one (outlierCorrection) returns the retained values and is used for calculation.
def outlierCorrection_plot(data, factorSD):
    element_signal = np.array(data)
    mean = np.mean(element_signal, axis=0)
    sd = np.std(element_signal, axis=0)
    # True for values inside mean +/- factorSD * sd, False for outliers
    fil = (data < mean + factorSD * sd) & (data > mean - factorSD * sd)
    return fil
def outlierCorrection(data, factorSD):
    element_signal = np.array(data)
    mean = np.mean(element_signal, axis=0)
    sd = np.std(element_signal, axis=0)
    # keep only the values inside mean +/- factorSD * sd
    return [x for x in data if (x > mean - factorSD * sd) and (x < mean + factorSD * sd)]
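To make the effect of the outlier factor concrete, here is a small worked example. It is a sketch using only the standard library (statistics.pstdev is the population standard deviation, which is also what np.std computes by default); the function name and the numbers are made up for illustration.

```python
import statistics


def outlier_mask(values, factor_sd):
    """Return True for values within factor_sd standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD, same default as np.std
    lower, upper = mean - factor_sd * sd, mean + factor_sd * sd
    return [lower < v < upper for v in values]


ratios = [1.0, 1.0, 1.0, 1.0, 10.0]   # illustrative values, not real data
mask = outlier_mask(ratios, 1.5)      # mean = 2.8, sd = 3.6 -> window (-2.6, 8.2)
kept = [r for r, keep in zip(ratios, mask) if keep]
print(mask)
print(kept)
```

With a factor of 1.5 only the value 10.0 falls outside the window. A larger factor widens the window and keeps more data.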
def parseBoronTable(file)
Find the useful data body in the uploaded ‘.exp’ datafiles.
def parseBoronTable(file):
    #content = file.read()
    content = file.getvalue().decode("utf-8")
    fname = file.name
    # the data table starts at the 'Cycle/Time' header and ends before the '*** Cup' summary
    _start = content.find("Cycle\tTime")
    _end = content.find("***\tCup")
    myTable = content[_start:_end-1]
    # write the table to a temporary file (the 'temp' folder must exist) and read it back
    cleanFname = f"temp/{fname}_cleanTable"
    with open(cleanFname, "w") as fh:
        fh.write(myTable)
    df = pd.read_csv(cleanFname,
                     sep='\t',
                     # dtype="float" #not working -->time
                     )
    return df, fname
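The same slicing idea can be tried without writing a temporary file. The following is only a sketch: the demo text imitates the two markers (‘Cycle\tTime’ and ‘***\tCup’) but is not a real Neptune export, and extract_cycle_table is a hypothetical helper name.

```python
import csv
import io


def extract_cycle_table(content):
    """Return the rows between the 'Cycle/Time' header and the '*** Cup' line."""
    start = content.find("Cycle\tTime")
    end = content.find("***\tCup")
    if start == -1 or end == -1:
        raise ValueError("table markers not found")
    table_text = content[start:end].strip()
    # parse the tab-separated block in memory
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return list(reader)


# made-up miniature file, for illustration only
demo = (
    "some header line\n"
    "Cycle\tTime\t9.9\t10B\t10.2\t11B\n"
    "1\t10:00:01\t0.01\t1.00\t0.02\t4.05\n"
    "2\t10:00:02\t0.01\t1.01\t0.02\t4.09\n"
    "***\tCup\tsummary\n"
)
rows = extract_cycle_table(demo)
print(len(rows), rows[0]["10B"], rows[1]["11B"])
```

All values come back as strings here, which mirrors why the main code converts the selected columns with astype(float) afterwards.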
def sig_selection()
Plot the signal selection zone.
# st.session_state.sample_plot = st.selectbox(
# 'Which is your sample to plot?',
# (st.session_state.uploaded_files))
# def sig_selection():
# average_B = []
# df_data, filename = parseBoronTable(st.session_state.sample_plot)
# df_data = df_data[['Cycle', '9.9', '10B', '10.2', '11B']].astype(float)
# fig, ax = plt.subplots()
# ax.plot(df_data['11B'], label='11B', c='green')
# ax.plot(df_data['10B'], label='10B', c='firebrick')
# ax.set_ylabel('signal intensity')
# ax.set_xlabel('cycle')
# x = df_data['11B'].index.to_numpy()
# ax.fill_between(x, max(df_data['11B']), where=(
# x < st.session_state.sig_end) & (x > st.session_state.sig_str), alpha=0.5)
# ax.fill_between(x, max(df_data['11B']), where=(
# x < st.session_state.bac_end) & (x > st.session_state.bac_str), alpha=0.5)
# ax.legend()
# return fig
def bacground_sub(factorSD, factor_B11)
Background subtraction. ‘9.9’, ‘10B’, ‘10.2’ and ‘11B’ are the useful data here. (1) The noise (mean background) is subtracted from each cup. (2) The bulge is defined by ‘9.9’ and ‘10.2’: the average of ‘9.9’ and ‘10.2’ is subtracted from 10B, and factor_B11 (0.6 by default) times this average is subtracted from 11B. (3) The outliers are plotted. (4) The mean 11B, the mean 11B/10B with its standard error, and the name of the datafile are returned.
def bacground_sub(factorSD, factor_B11):
    average_B = []
    for i in st.session_state.uploaded_files:
        df_data, filename = parseBoronTable(i)
        df_data = df_data[['Cycle', '9.9', '10B', '10.2', '11B']].astype(float)
        # mean background per cup, subtracted from every cycle of the signal window
        df_bacground_mean = df_data[st.session_state.bac_str:st.session_state.bac_end].mean()
        df_signal = df_data[st.session_state.sig_str:st.session_state.sig_end]
        df_bacground_sub = df_signal - df_bacground_mean
        # bulge correction from the mean of the '9.9' and '10.2' cups
        df_bacground_sub['10B_bulc_sub'] = df_bacground_sub['10B'] - \
            (df_bacground_sub['9.9']+df_bacground_sub['10.2'])/2
        df_bacground_sub['11B_bulc_sub'] = df_bacground_sub['11B'] - \
            factor_B11*(df_bacground_sub['9.9']+df_bacground_sub['10.2'])/2
        df_bacground_sub['11B/10B'] = df_bacground_sub['11B_bulc_sub'] / \
            df_bacground_sub['10B_bulc_sub']
        fil = outlierCorrection_plot(df_bacground_sub['11B/10B'], factorSD)
        res_iso = df_bacground_sub['11B/10B'][fil]
        res_iso_outlier = df_bacground_sub['11B/10B'][~fil]
        res_11B = outlierCorrection(df_bacground_sub['11B'], factorSD)
        if i == st.session_state.sample_plot:
            fig1, ax = plt.subplots()
            ax.plot(df_bacground_sub['11B/10B'], 'ko')
            ax.plot(res_iso_outlier, 'ro', label='outliers')
            ax.set_ylabel('$^{11}B$/$^{10}B$')
            ax.legend()
            st.pyplot(fig1)
        # 'se' is the standard error of the mean of the retained ratios
        average_B.append({'filename': filename, '11B': np.mean(
            res_11B), '11B/10B_row': np.mean(res_iso), 'se': np.std(res_iso)/np.sqrt(len(res_iso))})
    df = pd.DataFrame(average_B)
    st.session_state.average_B = df
    return df
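The arithmetic of the background and bulge correction for a single cycle can be followed in this small sketch. The intensities are invented round numbers chosen for easy checking, and corrected_ratio is a hypothetical helper that mirrors the formulas above for one row only.

```python
def corrected_ratio(signal, background, factor_b11=0.6):
    """Background- and bulge-corrected 11B/10B for one cycle."""
    # (1) subtract the mean background of each cup
    net = {cup: signal[cup] - background[cup] for cup in signal}
    # (2) the bulge is estimated from the mean of the '9.9' and '10.2' cups
    bulge = (net["9.9"] + net["10.2"]) / 2
    b10 = net["10B"] - bulge
    b11 = net["11B"] - factor_b11 * bulge
    return b11 / b10


# invented intensities, for illustration only
signal = {"9.9": 0.03, "10B": 1.10, "10.2": 0.05, "11B": 4.20}
background = {"9.9": 0.01, "10B": 0.10, "10.2": 0.01, "11B": 0.20}

ratio = corrected_ratio(signal, background)
print(round(ratio, 4))   # (4.00 - 0.6 * 0.03) / (1.00 - 0.03) = 3.982 / 0.97
```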
def polynomFit(inp, *args)
Polynomial model used as the regression function.
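The body of polynomFit is not shown here. A minimal sketch that would fit the way it is called below (curve_fit passes the x values first and then one argument per coefficient, and p0=[0]*(order+1) sets the number of coefficients) could look as follows. The name polynom_fit_sketch and the coefficient order (lowest power first) are assumptions.

```python
def polynom_fit_sketch(inp, *args):
    """Evaluate args[0] + args[1]*inp + args[2]*inp**2 + ... at inp.

    Works for plain numbers and, element-wise, for NumPy arrays.
    """
    return sum(coef * inp ** power for power, coef in enumerate(args))


# 1 + 2*x + 3*x**2 evaluated at x = 2
print(polynom_fit_sketch(2.0, 1.0, 2.0, 3.0))
```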
def regression(x, y, ref_stand, order, listname)
Get the correction function for the intra-sequence instrumental drift.
def regression(x, y, ref_stand, order, listname):
    x_use = np.array(x)
    # fit a polynomial of the given order through the standard measurements
    popt, pcov = curve_fit(polynomFit, xdata=x_use, ydata=y, p0=[0]*(order+1))
    fitData = polynomFit(x_use, *popt)
    res = []
    for unknown in listname:
        # correction factor = reference value / fitted value at this position
        y_unknown = ref_stand / polynomFit(unknown, *popt)
        res.append({'factor': y_unknown})
    return pd.DataFrame(res)
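How the correction factors come about can be illustrated with a simplified stand-in: a first-order (straight line) least-squares fit written in plain Python instead of the higher-order curve_fit polynomial used above. All names and numbers below are illustrative assumptions.

```python
def linear_fit(x, y):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def drift_factors(std_index, std_values, ref_value, all_index):
    """Correction factor = reference value / fitted standard value at each position."""
    intercept, slope = linear_fit(std_index, std_values)
    return [ref_value / (intercept + slope * i) for i in all_index]


# the standard was measured at positions 0, 2 and 4 and drifts upwards (made-up values)
std_index = [0, 2, 4]
std_values = [4.00, 4.02, 4.04]
factors = drift_factors(std_index, std_values, 4.05, [0, 1, 2, 3, 4])
print([round(f, 5) for f in factors])
```

Multiplying each measured ratio by the factor at its position brings the standard back to its reference value and applies the same interpolated correction to the unknowns measured in between.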
def regression_plot(x, y, ref_stand, order, listname)
Return a plot of the regression line.
def regression_plot(x, y, ref_stand, order, listname):
    fig, ax = plt.subplots()
    ax.plot(x, y, label='measured', marker='o', linestyle='none')
    x_use = np.array(x)
    popt, pcov = curve_fit(polynomFit, xdata=x_use, ydata=y, p0=[0]*(order+1))
    fitData = polynomFit(x_use, *popt)
    ax.plot(x_use, fitData, label='polyn. fit, order '+str(order), linestyle='--')
    ax.legend(loc='upper left', bbox_to_anchor=(1.05, 1))
    return fig
def prepare_trace(datafile)
Prepare the trace element datafile from Ladr: change the column titles and convert the data format from str to float.
def prepare_trace(datafile):
    if 'LR' in datafile.columns[14]:
        del datafile['44Ca(LR)']
        del datafile['26Mg(LR)']
    else:
        del datafile['44Ca']
        del datafile['26Mg']
    # strip the mass numbers and the '(LR)' suffix from the column titles
    datafile.columns = datafile.columns.str.replace(r'\d+', '', regex=True)
    datafile.columns = datafile.columns.str.replace('(LR)', '', regex=False)
    # collect all 'below detection limit' entries such as '<0.5'
    res = []
    for i in range(13, len(datafile.columns)):
        for j in datafile.iloc[:, i]:
            if isinstance(j, str) and '<' in j:
                res.append(j)
    RES = datafile.replace(to_replace=res, value='nan', regex=True)
    RES2 = RES.replace(
        {'ERROR: Error (#1002): Internal standard composition can not be 0': np.nan})
    RES3 = RES2.replace(
        {'ERROR: Error (#1003): Calibration RM composition does not contain analyte element': np.nan})
    RES4 = RES3.iloc[:, 13:].astype(float)
    columns = RES4.iloc[:, 13:].columns
    RES4[columns] = RES4.iloc[:, 13:]
    RES4[' Sequence Number'] = RES3['LB#']
    return RES4
def processData()
Use the functions above for the intra-sequence instrumental drift correction.
def processData():
    st.set_option('deprecation.showPyplotGlobalUse', False)
    st.subheader('1.1 select your background and signal area')
    st.session_state.bac_str, st.session_state.bac_end = st.slider('Select background', 0, 200, (5, 70))
    st.session_state.sig_str, st.session_state.sig_end = st.slider('Select signal', 0, 200, (95, 175))
    st.pyplot(sig_selection())
    st.subheader('1.2 Please set your outlier and bulge factor')
    outlier_factor = st.number_input(
        'insert your outlier factor (data further than outlier_factor times the sd from the mean will be cut)',
        value=1.5)
    bulc_factor = st.number_input(
        'insert your bulge factor for 11B correction', value=0.6)
    if "average_B" in st.session_state:
        df_data = st.session_state.average_B
    else:
        df_data = bacground_sub(outlier_factor, bulc_factor)
    st.subheader(
        '1.3 Please choose your standard for boron isotopes correction')
    standard = st.selectbox(
        'NIST 612 or B5 for correction?',
        ('NIST SRM 612', 'B5'))
    # reference values are kept as floats; int() would truncate them to 4, 8 or 35
    if standard == 'B5':
        number_iso = 4.0332057
        number_trace = 8.42
        SRM951_value = 4.0492
    if standard == 'NIST SRM 612':
        number_iso = 4.05015
        number_trace = 35.0
        SRM951_value = 4.0545
    st.session_state.standard_values = {
        "number_iso": number_iso,
        "number_trace": number_trace,
        "SRM951_value": SRM951_value
    }
    st.session_state.sample_correction = st.selectbox(
        'Which type is your chosen standard?',
        ('A', 'B', 'C', 'D'))
    st.session_state.default_reg_level = 4
    st.session_state.regress_level = st.number_input(
        'insert your regression level (4 is recommended)',
        step=1, value=st.session_state.default_reg_level, format='%d')
    # select the measurements of the chosen standard
    fil = df_data['filename'].str.contains(st.session_state.sample_correction)
    df_data_B = df_data[fil]
    df_data[' Sequence Number'] = selSmpType(df_data['filename'])
    y_isotope = df_data_B['11B/10B_row']
    y_11B = df_data_B['11B']
    x = df_data_B.index.to_numpy()
    factor_iso = regression(x, y_isotope,
                            number_iso,
                            st.session_state.regress_level if "regress_level" in st.session_state else st.session_state.default_reg_level,
                            df_data.index.to_numpy()
                            )
    df_data['factor_iso'] = factor_iso
    df_data['11B/10B_corrected'] = df_data['factor_iso']*df_data['11B/10B_row']
    # delta notation relative to NIST SRM 951, in per mil
    df_data['δ11B'] = ((df_data['11B/10B_corrected']/SRM951_value)-1)*1000
    df_data['δ11B_se'] = (df_data['se']*df_data['factor_iso']/SRM951_value)*1000
    st.session_state.df_data = df_data
    st.session_state.df_data_B = df_data_B
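The last step of processData converts the drift-corrected ratio into δ11B (in per mil) relative to the SRM 951 value. A small stand-alone sketch of that conversion, with hypothetical function names and a made-up measured ratio:

```python
def delta_11b(ratio_corrected, srm951_ratio):
    """delta11B in per mil: (R_sample / R_SRM951 - 1) * 1000."""
    return (ratio_corrected / srm951_ratio - 1) * 1000


def delta_11b_se(se_raw, factor_iso, srm951_ratio):
    """Standard error of the raw ratio, propagated into per mil."""
    return se_raw * factor_iso / srm951_ratio * 1000


srm951 = 4.0545                       # value used above for NIST SRM 612
sample_ratio = srm951 * 1.01          # made-up sample, 1 % heavier than SRM 951
print(round(delta_11b(sample_ratio, srm951), 3))

# the reference value has to stay a float: int() truncates towards zero
print(int(4.0545))
```

A sample ratio that is 1 % higher than the reference corresponds to δ11B = 10 per mil, which is a quick way to check the sign and scale of the result.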
def processLaser()
Use the functions above and volume factors for the corrected boron concentrations.
def processLaser():
    if "df_data" in st.session_state:
        st.header('2. Please upload your log file from Laser')
        st.session_state.uploaded_laser_file = st.file_uploader("Choose a laser file", type='csv')
        if st.session_state.uploaded_laser_file is not None:
            st.session_state.df_Laser = pd.read_csv(st.session_state.uploaded_laser_file)
            st.session_state.df_Laser_part1 = st.session_state.df_Laser[
                st.session_state.df_Laser[' Laser State'] == 'On'].iloc[:, [13, 20, 21]]
            st.session_state.df_Laser_part2 = st.session_state.df_Laser[
                st.session_state.df_Laser[' Sequence Number'].notnull()].iloc[:, [1, 4]]
            st.session_state.df_Laser_res = pd.concat([st.session_state.df_Laser_part2.reset_index(
                drop=True), st.session_state.df_Laser_part1.reset_index(drop=True)], axis=1)
            st.session_state.df_map1 = st.session_state.df_Laser_res.merge(st.session_state.df_data, on=' Sequence Number')
            st.subheader('2.1 B concentration correction')
            st.session_state.regress_level_B = st.number_input('insert your regression level for [B] (4 is recommended)',
                                                               step=1,
                                                               value=st.session_state.default_reg_level,
                                                               format='%d'
                                                               )
            y_isotope = st.session_state.df_data_B['11B/10B_row']
            y_11B = st.session_state.df_data_B['11B']
            x = st.session_state.df_data_B.index.to_numpy()
            factor_B = regression(x, y_11B, st.session_state.standard_values["number_trace"],
                                  st.session_state.regress_level_B if "regress_level_B" in st.session_state else st.session_state.default_reg_level,
                                  st.session_state.df_data.index.to_numpy()
                                  )
            st.session_state.df_map1['factor_B'] = factor_B
            depth_ref = st.number_input('insert the ablation depth of selected reference / µm', value=30.0)
            depth_sample = st.number_input('insert the ablation depth of other samples / µm', value=30.0)
            depth_ratios = []
            for i in st.session_state.df_map1['filename'].str.contains('A'):
                if i:
                    depth_ratio = 1
                else:
                    depth_ratio = depth_sample / depth_ref
                depth_ratios.append(depth_ratio)
            st.session_state.df_map1['depth_correction'] = depth_ratios
            spot_shape = st.selectbox(
                'What is the type of your spots?',
                ('circle', 'square'))
            if spot_shape == 'circle':
                # the ablated area scales with (diameter / 2) ** 2
                st.session_state.df_map1[' Spot Size (um)'] = st.session_state.df_Laser_res[' Spot Size (um)']
                ref = ((st.session_state.df_map1[st.session_state.df_map1['filename'].str.contains(
                    st.session_state.sample_correction)][' Spot Size (um)']/2)**2).mean()
                st.session_state.df_map1['[B]_corrected'] = st.session_state.df_map1['11B']*st.session_state.df_map1['factor_B'] * (
                    ref / ((st.session_state.df_map1[' Spot Size (um)']/2)**2) / depth_ratios)
            if spot_shape == 'square':
                # the ablated area scales with side ** 2
                dia = st.session_state.df_map1[' Spot Size (um)']
                spotsize = dia.str.split(' ').str[0].apply(float)
                st.session_state.df_map1[' Spot Size (um)'] = spotsize
                ref = ((st.session_state.df_map1[st.session_state.df_map1['filename'].str.contains(
                    st.session_state.sample_correction)][' Spot Size (um)'])**2).mean()
                st.session_state.df_map1['[B]_corrected'] = st.session_state.df_map1['11B']*st.session_state.df_map1['factor_B'] * (
                    ref / ((st.session_state.df_map1[' Spot Size (um)'])**2) / depth_ratios)
            st.session_state.df_map1 = st.session_state.df_map1
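The volume correction at the end of processLaser scales the drift-corrected 11B signal by the ratio of ablated areas and by the depth ratio. A simplified single-spot sketch (the function name and the numbers are assumptions for illustration; the code above works on whole columns and uses the mean area of the standard spots as reference):

```python
def volume_corrected_signal(signal_11b, factor_b, spot_size, ref_spot_size,
                            depth_ratio=1.0, shape="circle"):
    """Scale a drift-corrected 11B signal to the ablated volume of the reference."""
    if shape == "circle":
        # area is proportional to (diameter / 2) ** 2; pi cancels in the ratio
        area, ref_area = (spot_size / 2) ** 2, (ref_spot_size / 2) ** 2
    elif shape == "square":
        area, ref_area = spot_size ** 2, ref_spot_size ** 2
    else:
        raise ValueError(f"unknown spot shape: {shape!r}")
    return signal_11b * factor_b * (ref_area / area) / depth_ratio


# a 25 um spot ablates a quarter of the area of a 50 um reference spot
print(volume_corrected_signal(2.0, 10.0, spot_size=25.0, ref_spot_size=50.0))
# a sample ablated twice as deep as the reference halves the result
print(volume_corrected_signal(2.0, 10.0, 25.0, 50.0, depth_ratio=2.0))
```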
def maping()
Upload the trace element datafile and merge the laser parameters, isotopic results and trace element compositions into one file based on the sequence number.
def maping():
    if "df_map1" in st.session_state:
        st.subheader('2.2 export results or append your trace elements')
        trace_file = st.selectbox(
            'split stream or not?',
            ('Split stream', 'No'))
        if trace_file == 'No':
            st.session_state.df_all = st.session_state.df_map1
        elif trace_file == 'Split stream':
            st.header('3. Please upload your trace element data processed from Ladr')
            st.session_state.trace = st.file_uploader("Choose a file", type='csv', accept_multiple_files=True)
            if "trace" in st.session_state and len(st.session_state.trace) > 0:
                trace_file = pd.read_csv(st.session_state.trace[0])
                #trace_file = pd.read_csv('2022-11-28-Si corrected-B5.csv')
                df_trace = prepare_trace(trace_file)
                st.session_state.df_all = st.session_state.df_map1.merge(df_trace, on=' Sequence Number')
                # fig4, ax = plt.subplots()
                # ax.plot([0,1],[0,1], transform=ax.transAxes, c = 'red')
                # ax.scatter(st.session_state.df_all['[B]_corrected'], st.session_state.df_all['B'], s =70, c = 'darkorange', edgecolors = 'black')
                # ax.set_ylabel('[B]_measured by Element')
                # ax.set_xlabel('[B]_corrected by Neptune')
                # st.pyplot(fig4)
        if "df_all" in st.session_state:
            st.session_state.df_all.to_csv('final.csv')
            st.write(st.session_state.df_all)
            result_csv = st.session_state.df_all.to_csv().encode('utf-8')
            st.download_button(
                label='download results as .csv',
                data=result_csv,
                file_name='boron results.csv',
                mime='text/csv',
            )
Run the functions:
What exactly is the output, likely best with screen shots.
–> ‘Sequence Number’ column: the number of the datafile within the whole sequence.
–> ‘Comment’ column: the sample name, as labelled by yourself during the measurement.
–> ‘Spot size (um)’, ‘Laser HV (kV)’, ‘Laser Energy (mJ)’: useful information selected from the laser parameters.
–> ‘filename’ column: the name of the datafile.
–> from ‘11B’ to ‘factor_iso’: all results from Neptune. ‘[B]_corrected’ is the B concentration calculated from 11B. The ‘δ11B’ and ‘δ11B_se’ columns are the calculated isotope results and their errors.
–> from ‘Li’, ‘B’ to ‘U’: all trace element results from Element XR.
(the following is copied from what was a ‘text’ file.) 1. csv files are changed from original .exp file 2. data automatically from machine can be found in ‘data/original data type’.
For a demonstration of a line plot on a polar axis, see Figure 1.