pip install mag4Requirement already satisfied: mag4 in /Users/dominik/anaconda3/lib/python3.11/site-packages (0.0.214)
Note: you may need to restart the kernel to use updated packages.
In den vergangenen Jahren werden immer mehr Daten publiziert, deren Menge zukünftig noch schneller anwachsen wird. Es ist nicht leicht, all den Publikationen dieser erfreulich großen Mengen immer neuer Informationen zu folgen. Jupyter Notebooks wurden gerade auch dafür entwickelt, auf große Mengen an Daten zuzugreifen, diese zu visualisieren, analysieren und auf Problemstellungen anzuwenden. Natürlich ist das nur möglich, wenn die Daten in entsprechenden Formaten vorliegen. Oftmals ist daher im ersten Schritt ein ›Daten clean-up‹ notwendig. Seit 2020 gibt es nun endlich mehrere Initiativen, darunter so große wie die Nationale Forschungsdatenbank Initiative (NFDI) der DFG, GWK und anderer, die versuchen gemeinsam mit internationalen Partnern Daten besser verfügbar zu machen. Schon bestehende Datenbanken wie GeoROC, EarthChem oder auch MetBase erlauben schon seit langer Zeit solchen Zugang, und damit Data Science in z.B. der Geo- und Kosmochemie. In dieser Einheit steigen wir in das neue Feld der Data Science in der Mineralogie ein.
Zunächst müssen wir uns etwas anschauen, wie Datensätze und eine Datenbank überhaupt aufgebaut sind. Wir sind zwar sehr damit vertraut, Tabellen zu erstellen, als Datenbank taugen die meisten jedoch selten etwas. Datenbanken, bzw. einzelne Datensätze erstellt man mit einigen wenigen, einfachen Regeln, die wir nun kennen lernen wollen – danach sehen auch die meisten Tabellen besser aus. Außerdem lernen wir noch ein wenig hilfreiches, und Datenbank bezogenen Vokabular kennen.
pandas: shape(), info, columns, set_option(›display.max_rows’, 20), python: type() set_option(‘display.max_rows’, 20), drop_duplicates(), df[›category name‹], >, <, &
Available Notebooks: Lecture, Exercise, Solution
Python selbst stellt nur einen sehr begrenzten Umfang an Befehlen zur Daten-Manipulation zur verfügung. Allerdings gibt es sehr viele zusätzliche Befehlspakete – genannt: libraries oder packages – für alles mögliche, auch eines zur Daten-Manipulation. Diese heißt (warum auch immer) ›pandas‹, und dessen Grundzüge lernen wir nun kennen.
Requirement already satisfied: mag4 in /Users/dominik/anaconda3/lib/python3.11/site-packages (0.0.214)
Note: you may need to restart the kernel to use updated packages.
Import pandas
Display the data
df = mg.get_data('Mineral Data Example Dataset')
pd.set_option('display.max_rows', df.shape[0])
pd.set_option('display.max_columns', df.shape[1])
print(df.shape)
print(df.columns)(25, 28)
Index(['mineral', 'oxygens', 'sample', 'no.', 'SiO2', 'TiO2', 'Al2O3', 'Cr2O3',
'FeO', 'MnO', 'NiO', 'MgO', 'CaO', 'Na2O', 'K2O', 'total oxides', 'Si',
'Ti', 'Al', 'Cr', 'Fe', 'Mn', 'Ni', 'Mg', 'Ca', 'Na', 'K',
'total cations'],
dtype='object')
Select the data of a single category
0 42.51
1 42.16
2 52.33
3 51.54
4 57.09
5 59.19
6 58.33
7 53.87
8 47.10
9 47.72
10 42.72
11 42.83
12 40.43
13 42.46
14 45.09
15 46.72
16 46.36
17 47.78
18 0.59
19 35.13
20 35.52
21 3.24
22 0.16
23 51.72
24 32.79
Name: SiO2, dtype: float64
0 ol
2 cpx
4 opx
8 fs
16 fsp/mes
18 sp
19 ol/sp
Name: mineral, dtype: object
Using a filter
| mineral | oxygens | sample | no. | SiO2 | TiO2 | Al2O3 | Cr2O3 | FeO | MnO | NiO | MgO | CaO | Na2O | K2O | total oxides | Si | Ti | Al | Cr | Fe | Mn | Ni | Mg | Ca | Na | K | total cations | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ol | 4 | chd#2 | 1 | 42.51 | 0.04 | 0.07 | 0.04 | 0.18 | 0.00 | 0.00 | 55.26 | 1.58 | 0.0 | 0.0 | 99.68 | 1.003 | 0.001 | 0.002 | 0.001 | 0.004 | 0.000 | 0.0 | 1.944 | 0.040 | 0.0 | 0.0 | 2.995 |
| 11 | ol | 4 | chd#3 | 17 | 42.83 | 0.03 | 0.11 | 0.16 | 0.45 | 0.00 | 0.01 | 56.10 | 0.46 | 0.0 | 0.0 | 100.15 | 1.004 | 0.001 | 0.003 | 0.003 | 0.009 | 0.000 | 0.0 | 1.961 | 0.012 | 0.0 | 0.0 | 2.992 |
| 13 | ol | 4 | chd#3 | 21 | 42.46 | 0.05 | 0.27 | 0.10 | 0.32 | 0.03 | 0.01 | 55.26 | 0.57 | 0.0 | 0.0 | 99.07 | 1.006 | 0.001 | 0.008 | 0.002 | 0.006 | 0.001 | 0.0 | 1.951 | 0.014 | 0.0 | 0.0 | 2.989 |
type() with df being the variable in this function.
This is the Pandas DataFrame dataset (=row) numbering, starting with the mathematical 0.
Yes, the first row contains the column category names, sometimes also called attributes, not to be confused with attributes as part of a code.
& (and) and | (or)? Start with an example and explain this.
&
| mineral | oxygens | sample | no. | SiO2 | TiO2 | Al2O3 | Cr2O3 | FeO | MnO | NiO | MgO | CaO | Na2O | K2O | total oxides | Si | Ti | Al | Cr | Fe | Mn | Ni | Mg | Ca | Na | K | total cations | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | opx | 6 | chd#2 | 8 | 57.09 | 0.22 | 0.76 | 0.48 | 1.92 | 0.08 | 0.03 | 36.87 | 0.67 | 0.01 | 0.0 | 98.13 | 1.971 | 0.006 | 0.031 | 0.013 | 0.055 | 0.002 | 0.001 | 1.897 | 0.025 | 0.001 | 0.0 | 4.002 |
| 5 | opx | 6 | chd#2 | 10 | 59.19 | 0.19 | 0.63 | 0.51 | 0.67 | 0.20 | 0.03 | 38.06 | 0.67 | 0.01 | 0.0 | 100.16 | 1.988 | 0.005 | 0.025 | 0.014 | 0.019 | 0.006 | 0.001 | 1.906 | 0.024 | 0.001 | 0.0 | 3.988 |
| 6 | cpx | 6 | chd#2 | 11 | 58.33 | 0.30 | 0.98 | 0.53 | 0.33 | 0.14 | 0.00 | 34.70 | 5.02 | 0.01 | 0.0 | 100.34 | 1.977 | 0.008 | 0.039 | 0.014 | 0.009 | 0.004 | 0.000 | 1.754 | 0.182 | 0.001 | 0.0 | 3.989 |
Only those datasets are shown for which the SiO2 values are larger than 50 wt% and the CaO values are larger than 10 wt%.
|
| mineral | oxygens | sample | no. | SiO2 | TiO2 | Al2O3 | Cr2O3 | FeO | MnO | NiO | MgO | CaO | Na2O | K2O | total oxides | Si | Ti | Al | Cr | Fe | Mn | Ni | Mg | Ca | Na | K | total cations | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ol | 4 | chd#2 | 1 | 42.51 | 0.04 | 0.07 | 0.04 | 0.18 | 0.00 | 0.00 | 55.26 | 1.58 | 0.00 | 0.00 | 99.68 | 1.003 | 0.001 | 0.002 | 0.001 | 0.004 | 0.000 | 0.000 | 1.944 | 0.040 | 0.000 | 0.000 | 2.995 |
| 1 | ol | 4 | chd#2 | 9 | 42.16 | 0.10 | 0.06 | 0.20 | 3.14 | 0.07 | 0.03 | 53.81 | 0.24 | 0.00 | 0.00 | 99.81 | 1.004 | 0.002 | 0.002 | 0.004 | 0.063 | 0.001 | 0.001 | 1.910 | 0.006 | 0.000 | 0.000 | 2.992 |
| 2 | cpx | 6 | chd#2 | 4 | 52.33 | 2.00 | 4.09 | 0.81 | 0.32 | 0.10 | 0.00 | 19.44 | 21.00 | 0.01 | 0.01 | 100.11 | 1.872 | 0.054 | 0.172 | 0.023 | 0.010 | 0.003 | 0.000 | 1.037 | 0.805 | 0.001 | 0.000 | 3.977 |
| 3 | cpx | 6 | chd#2 | 6 | 51.54 | 1.74 | 5.72 | 0.56 | 0.67 | 0.09 | 0.00 | 19.12 | 20.75 | 0.04 | 0.00 | 100.23 | 1.842 | 0.047 | 0.241 | 0.016 | 0.020 | 0.003 | 0.000 | 1.019 | 0.795 | 0.003 | 0.000 | 3.984 |
| 4 | opx | 6 | chd#2 | 8 | 57.09 | 0.22 | 0.76 | 0.48 | 1.92 | 0.08 | 0.03 | 36.87 | 0.67 | 0.01 | 0.00 | 98.13 | 1.971 | 0.006 | 0.031 | 0.013 | 0.055 | 0.002 | 0.001 | 1.897 | 0.025 | 0.001 | 0.000 | 4.002 |
| 5 | opx | 6 | chd#2 | 10 | 59.19 | 0.19 | 0.63 | 0.51 | 0.67 | 0.20 | 0.03 | 38.06 | 0.67 | 0.01 | 0.00 | 100.16 | 1.988 | 0.005 | 0.025 | 0.014 | 0.019 | 0.006 | 0.001 | 1.906 | 0.024 | 0.001 | 0.000 | 3.988 |
| 6 | cpx | 6 | chd#2 | 11 | 58.33 | 0.30 | 0.98 | 0.53 | 0.33 | 0.14 | 0.00 | 34.70 | 5.02 | 0.01 | 0.00 | 100.34 | 1.977 | 0.008 | 0.039 | 0.014 | 0.009 | 0.004 | 0.000 | 1.754 | 0.182 | 0.001 | 0.000 | 3.989 |
| 7 | cpx | 6 | chd#2 | 12 | 53.87 | 1.10 | 3.09 | 0.51 | 0.21 | 0.08 | 0.00 | 22.00 | 19.24 | 0.00 | 0.00 | 100.10 | 1.911 | 0.029 | 0.129 | 0.014 | 0.006 | 0.002 | 0.000 | 1.164 | 0.731 | 0.000 | 0.000 | 3.988 |
| 10 | ol | 4 | chd#3 | 15 | 42.72 | 0.05 | 1.08 | 0.11 | 0.37 | 0.01 | 0.00 | 54.32 | 0.68 | 0.53 | 0.00 | 99.87 | 1.005 | 0.001 | 0.030 | 0.002 | 0.007 | 0.000 | 0.000 | 1.904 | 0.017 | 0.024 | 0.000 | 2.991 |
| 11 | ol | 4 | chd#3 | 17 | 42.83 | 0.03 | 0.11 | 0.16 | 0.45 | 0.00 | 0.01 | 56.10 | 0.46 | 0.00 | 0.00 | 100.15 | 1.004 | 0.001 | 0.003 | 0.003 | 0.009 | 0.000 | 0.000 | 1.961 | 0.012 | 0.000 | 0.000 | 2.992 |
| 12 | ol | 4 | chd#3 | 18 | 40.43 | 0.02 | 0.03 | 0.11 | 11.16 | 0.00 | 0.01 | 46.57 | 0.33 | 0.01 | 0.00 | 98.67 | 1.009 | 0.000 | 0.001 | 0.002 | 0.233 | 0.000 | 0.000 | 1.733 | 0.009 | 0.000 | 0.000 | 2.989 |
| 13 | ol | 4 | chd#3 | 21 | 42.46 | 0.05 | 0.27 | 0.10 | 0.32 | 0.03 | 0.01 | 55.26 | 0.57 | 0.00 | 0.00 | 99.07 | 1.006 | 0.001 | 0.008 | 0.002 | 0.006 | 0.001 | 0.000 | 1.951 | 0.014 | 0.000 | 0.000 | 2.989 |
| 18 | sp | 4 | #24 - CAI | 74 | 0.59 | 0.27 | 67.71 | 0.63 | 5.54 | 0.03 | 0.00 | 23.43 | 0.14 | 0.07 | 0.00 | 98.41 | 0.015 | 0.005 | 1.972 | 0.012 | 0.115 | 0.001 | 0.000 | 0.863 | 0.004 | 0.003 | 0.000 | 2.990 |
| 19 | ol/sp | 4 | #24 - CAI | 76 | 35.13 | 0.58 | 14.39 | 0.30 | 8.06 | 0.05 | 0.00 | 33.12 | 4.09 | 1.04 | 0.13 | 96.89 | 0.886 | 0.011 | 0.428 | 0.006 | 0.170 | 0.001 | 0.000 | 1.246 | 0.111 | 0.051 | 0.004 | 2.913 |
| 21 | sp | 4 | #24 - CAI | 84 | 3.24 | 1.36 | 65.06 | 0.08 | 3.05 | 0.01 | 0.01 | 27.71 | 0.39 | 0.02 | 0.00 | 100.93 | 0.077 | 0.024 | 1.826 | 0.002 | 0.061 | 0.000 | 0.000 | 0.984 | 0.010 | 0.001 | 0.000 | 2.985 |
| 22 | sp | 4 | #24 - CAI | 85 | 0.16 | 0.51 | 69.36 | 0.26 | 2.41 | 0.00 | 0.00 | 25.71 | 0.08 | 0.01 | 0.00 | 98.50 | 0.004 | 0.009 | 1.988 | 0.005 | 0.049 | 0.000 | 0.000 | 0.932 | 0.002 | 0.000 | 0.000 | 2.990 |
| 23 | cpx | 6 | #24 - CAI | 75 | 51.72 | 0.20 | 3.27 | 0.05 | 0.27 | 0.02 | 0.00 | 17.69 | 24.25 | 0.43 | 0.05 | 97.95 | 1.908 | 0.006 | 0.142 | 0.001 | 0.008 | 0.001 | 0.000 | 0.973 | 0.959 | 0.031 | 0.002 | 4.031 |
All datasets with SiO2 values larger than 50 wt% or with CaO values are larger than 10 wt% are shown.
Alternatively stated: All datasets with SiO2 values larger than 50 wt% are shown and all datasets with CaO values larger than 10 wt% are shown.
✓With a filter
✗With the .sel() method
✗With the sel() function
Start with importing the pandas library.
Then we want to get some information about the dataset: First we need to read in some data. Use the ‘Mineral Data Example Dataset’ exercise file.
Store the data in the variable df.
Print some information about this file, such as its number of rows, columns.
In addition, use the .info() method to display the data types of the categories. What data type, for example, is MgO?
Then use the set_option('display.max_xxx') with xxx being rows/columns attribute to automatically display all rows and columns of the data.
Finally, display the category names.
df = mg.get_data('Mineral Data Example Dataset')
print(df.shape)
print(df.info())
pd.set_option('display.max_rows', df.shape[0])
pd.set_option('display.max_columns', df.shape[1])
print(df.columns)(25, 28)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25 entries, 0 to 24
Data columns (total 28 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 mineral 25 non-null object
1 oxygens 25 non-null int64
2 sample 25 non-null object
3 no. 25 non-null int64
4 SiO2 25 non-null float64
5 TiO2 25 non-null float64
6 Al2O3 25 non-null float64
7 Cr2O3 25 non-null float64
8 FeO 25 non-null float64
9 MnO 25 non-null float64
10 NiO 25 non-null float64
11 MgO 25 non-null float64
12 CaO 25 non-null float64
13 Na2O 25 non-null float64
14 K2O 25 non-null float64
15 total oxides 25 non-null float64
16 Si 25 non-null float64
17 Ti 25 non-null float64
18 Al 25 non-null float64
19 Cr 25 non-null float64
20 Fe 25 non-null float64
21 Mn 25 non-null float64
22 Ni 25 non-null float64
23 Mg 25 non-null float64
24 Ca 25 non-null float64
25 Na 25 non-null float64
26 K 25 non-null float64
27 total cations 25 non-null float64
dtypes: float64(24), int64(2), object(2)
memory usage: 5.6+ KB
None
Index(['mineral', 'oxygens', 'sample', 'no.', 'SiO2', 'TiO2', 'Al2O3', 'Cr2O3',
'FeO', 'MnO', 'NiO', 'MgO', 'CaO', 'Na2O', 'K2O', 'total oxides', 'Si',
'Ti', 'Al', 'Cr', 'Fe', 'Mn', 'Ni', 'Mg', 'Ca', 'Na', 'K',
'total cations'],
dtype='object')
Extracting data
Extract a subset of from the df data imported above, e.g., all MnO values. Now extract all MnO values larger the detection limit of 0.02 wt%.
Finally, we would like to know which samples are in the dataset. Therefore display all sample names in the dataset, but each no more than once.
0 0.00
1 0.07
2 0.10
3 0.09
4 0.08
5 0.20
6 0.14
7 0.08
8 0.00
9 0.03
10 0.01
11 0.00
12 0.00
13 0.03
14 0.00
15 0.00
16 0.00
17 0.00
18 0.03
19 0.05
20 0.02
21 0.01
22 0.00
23 0.02
24 0.03
Name: MnO, dtype: float64
| mineral | oxygens | sample | no. | SiO2 | TiO2 | Al2O3 | Cr2O3 | FeO | MnO | NiO | MgO | CaO | Na2O | K2O | total oxides | Si | Ti | Al | Cr | Fe | Mn | Ni | Mg | Ca | Na | K | total cations | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ol | 4 | chd#2 | 9 | 42.16 | 0.10 | 0.06 | 0.20 | 3.14 | 0.07 | 0.03 | 53.81 | 0.24 | 0.00 | 0.00 | 99.81 | 1.004 | 0.002 | 0.002 | 0.004 | 0.063 | 0.001 | 0.001 | 1.910 | 0.006 | 0.000 | 0.000 | 2.992 |
| 2 | cpx | 6 | chd#2 | 4 | 52.33 | 2.00 | 4.09 | 0.81 | 0.32 | 0.10 | 0.00 | 19.44 | 21.00 | 0.01 | 0.01 | 100.11 | 1.872 | 0.054 | 0.172 | 0.023 | 0.010 | 0.003 | 0.000 | 1.037 | 0.805 | 0.001 | 0.000 | 3.977 |
| 3 | cpx | 6 | chd#2 | 6 | 51.54 | 1.74 | 5.72 | 0.56 | 0.67 | 0.09 | 0.00 | 19.12 | 20.75 | 0.04 | 0.00 | 100.23 | 1.842 | 0.047 | 0.241 | 0.016 | 0.020 | 0.003 | 0.000 | 1.019 | 0.795 | 0.003 | 0.000 | 3.984 |
| 4 | opx | 6 | chd#2 | 8 | 57.09 | 0.22 | 0.76 | 0.48 | 1.92 | 0.08 | 0.03 | 36.87 | 0.67 | 0.01 | 0.00 | 98.13 | 1.971 | 0.006 | 0.031 | 0.013 | 0.055 | 0.002 | 0.001 | 1.897 | 0.025 | 0.001 | 0.000 | 4.002 |
| 5 | opx | 6 | chd#2 | 10 | 59.19 | 0.19 | 0.63 | 0.51 | 0.67 | 0.20 | 0.03 | 38.06 | 0.67 | 0.01 | 0.00 | 100.16 | 1.988 | 0.005 | 0.025 | 0.014 | 0.019 | 0.006 | 0.001 | 1.906 | 0.024 | 0.001 | 0.000 | 3.988 |
| 6 | cpx | 6 | chd#2 | 11 | 58.33 | 0.30 | 0.98 | 0.53 | 0.33 | 0.14 | 0.00 | 34.70 | 5.02 | 0.01 | 0.00 | 100.34 | 1.977 | 0.008 | 0.039 | 0.014 | 0.009 | 0.004 | 0.000 | 1.754 | 0.182 | 0.001 | 0.000 | 3.989 |
| 7 | cpx | 6 | chd#2 | 12 | 53.87 | 1.10 | 3.09 | 0.51 | 0.21 | 0.08 | 0.00 | 22.00 | 19.24 | 0.00 | 0.00 | 100.10 | 1.911 | 0.029 | 0.129 | 0.014 | 0.006 | 0.002 | 0.000 | 1.164 | 0.731 | 0.000 | 0.000 | 3.988 |
| 9 | fs | 8 | chd#2 | 5 | 47.72 | 0.06 | 31.87 | 0.01 | 0.71 | 0.03 | 0.00 | 0.58 | 16.88 | 2.12 | 0.00 | 99.98 | 2.200 | 0.002 | 1.732 | 0.000 | 0.027 | 0.001 | 0.000 | 0.040 | 0.834 | 0.190 | 0.000 | 5.026 |
| 13 | ol | 4 | chd#3 | 21 | 42.46 | 0.05 | 0.27 | 0.10 | 0.32 | 0.03 | 0.01 | 55.26 | 0.57 | 0.00 | 0.00 | 99.07 | 1.006 | 0.001 | 0.008 | 0.002 | 0.006 | 0.001 | 0.000 | 1.951 | 0.014 | 0.000 | 0.000 | 2.989 |
| 18 | sp | 4 | #24 - CAI | 74 | 0.59 | 0.27 | 67.71 | 0.63 | 5.54 | 0.03 | 0.00 | 23.43 | 0.14 | 0.07 | 0.00 | 98.41 | 0.015 | 0.005 | 1.972 | 0.012 | 0.115 | 0.001 | 0.000 | 0.863 | 0.004 | 0.003 | 0.000 | 2.990 |
| 19 | ol/sp | 4 | #24 - CAI | 76 | 35.13 | 0.58 | 14.39 | 0.30 | 8.06 | 0.05 | 0.00 | 33.12 | 4.09 | 1.04 | 0.13 | 96.89 | 0.886 | 0.011 | 0.428 | 0.006 | 0.170 | 0.001 | 0.000 | 1.246 | 0.111 | 0.051 | 0.004 | 2.913 |
| 24 | cpx | 6 | #24 - CAI | 81 | 32.79 | 1.35 | 26.06 | 0.46 | 2.53 | 0.03 | 0.00 | 17.46 | 15.75 | 0.80 | 0.20 | 97.43 | 1.228 | 0.038 | 1.150 | 0.014 | 0.079 | 0.001 | 0.000 | 0.975 | 0.632 | 0.058 | 0.010 | 4.185 |
read_csv(), maptplotlib.pyplot, plt, plot(), scatter(), show(), xlabel(), legend(), xlim()
Available Notebooks: Lecture, Exercise, Solution
Eine weitere library ist die ›matplotlib‹, die praktische alle nur denkbaren Plots und deren noch so komplizierten Darstellungen ermöglicht. In diese nehmen wir nun einen ersten, kleinen Einblick.
A simple line plot
A simple scatter plot
Formating elements
This is the name in pandas for a table with pandas specific formatting, indices, structure, etc.
A line plot has categories on the x-axis and a corresponding value on the y-axis. A category could simply be a number, but also chemical elements, e.g., the REE. The values on the y-axis would then be e.g., their concentrations.
A scatter plot has values on the x- and y-axis, e.g., Mg- and Si- values of a rock or mineral on the x- and y-axes.
✗import plt
✓import matplotlib.pyplot as plt
✗import matplotlib as plt
✗import pyplot as plt
✗True
✓False
Start with importing the pandas and matplotlib libraries.
Then read the example file Mineral Data Example Dataset 3, and store it a variable.
Display the categories and the data table. Use .tolist() to display the categories in a compact form.
Then plot the FeO data on the y-axis and label it with FeO.
Make the code flexibel, so that FeO can be quickly replaced by another oxide.
Then plot two oxides against each other, again flexible, and add a legend.
Finally, add data from the file Mineral Data Example Dataset to the latter plot.
['No.', 'SiO2', 'TiO2', 'Al2O3', 'FeO', 'MnO', 'MgO', 'CaO', 'Na2O', 'K2O', 'P2O5', 'Cr2O3', 'NiO', 'Total', 'Comment']
| No. | SiO2 | TiO2 | Al2O3 | FeO | MnO | MgO | CaO | Na2O | K2O | P2O5 | Cr2O3 | NiO | Total | Comment | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 57.2700 | 1.6275 | 17.9100 | 6.4200 | 0.1614 | 2.2800 | 3.46 | 4.4600 | 5.4500 | 0.8939 | 0.0026 | 0.0000 | 99.9354 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 1 | 2 | 57.2800 | 1.5939 | 18.0100 | 6.4400 | 0.1372 | 2.3100 | 3.49 | 4.4300 | 5.4300 | 0.8622 | 0.0000 | 0.0070 | 99.9903 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 2 | 4 | 57.1600 | 1.6335 | 18.0300 | 6.4500 | 0.1535 | 2.3000 | 3.48 | 4.4700 | 5.4300 | 0.9481 | 0.0109 | 0.0163 | 100.0823 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 3 | 5 | 57.3100 | 1.6235 | 17.9800 | 6.4300 | 0.1502 | 2.2700 | 3.48 | 4.3800 | 5.4300 | 0.9043 | 0.0000 | 0.0000 | 99.9580 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 4 | 7 | 57.3900 | 1.6314 | 18.0800 | 6.5100 | 0.1466 | 2.2900 | 3.51 | 4.4000 | 5.4400 | 0.8663 | 0.0161 | 0.0000 | 100.2804 | GProbe Mitte Grid 010x011 Grid 020x021 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 192 | 287 | 1.3845 | 0.0000 | 0.0045 | 0.0010 | 0.0000 | 1.3767 | 49.92 | 0.0713 | 0.0000 | 0.0104 | 0.0070 | 0.0000 | 52.7754 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 193 | 289 | 0.7298 | 0.0082 | 0.0000 | 0.0333 | 0.0000 | 1.4190 | 49.99 | 0.0640 | 0.0000 | 0.0128 | 0.0000 | 0.0000 | 52.2571 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 194 | 290 | 0.6210 | 0.0072 | 0.0006 | 0.0060 | 0.0000 | 1.3827 | 50.26 | 0.0655 | 0.0000 | 0.0000 | 0.0046 | 0.0000 | 52.3476 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 195 | 292 | 0.5870 | 0.0000 | 0.0000 | 0.0072 | 0.0000 | 1.3888 | 50.19 | 0.0542 | 0.0026 | 0.0000 | 0.0000 | 0.0000 | 52.2299 | GProbe Mitte Grid 010x011 Grid 020x021 |
| 196 | 293 | 0.6773 | 0.0000 | 0.0084 | 0.0000 | 0.0000 | 1.4192 | 50.44 | 0.0713 | 0.0000 | 0.0002 | 0.0216 | 0.0143 | 52.6523 | GProbe Mitte Grid 010x011 Grid 020x021 |
197 rows × 15 columns
.iloc(), .loc(), .set_index()
Available Notebooks: Lecture, Exercise, Solution
Nun wollen wir lernen, wie leicht mit Pandas jegliche gewünschten Daten aus einer Datenbank, also z.B. einer einfachen Excel-Tabelle extrahiert werden können. Das ist besonders wichtig und hilfreich, wenn die Datenbank sehr groß ist, oder wenn man immer andere Datenbanken verwendet, die aber immer anders aufgebaut sind. Anders aufgebaut meint, dass z.B. die Kategorienamen in anderer Reihenfolge stehen, oder evtl. auch etwas anders lauten. Dann müsste bei einem Code mit Pandas nur der entsprechende Kategoriename geändert werden, der Code selbst bliebe gleich.
df = mg.get_data("Banda Arc") #could add: , index_col = 'xxx'
#pd.set_option('display.max_rows', df.shape[0])
#pd.set_option('display.max_columns', df.shape[1])
print(df.columns.tolist())
df['Citations', 'Tectonic Setting', 'Location', 'Location Comment', 'Latitude (Min)', 'Latitude (Max)', 'Longitude (Min)', 'Longitude (Max)', 'Land or Sea', 'Elevation (Min)', 'Elevation (Max)', 'Sample Name', 'Rock Name', 'Age (a, Min)', 'Age (a, Max)', 'Geol', 'Age', 'Eruption Day', 'Eruption Month', 'Eruption Year', 'Rock Texture', 'Rock Type', 'Drill Depth (Min)', 'Drill Depth (Max)', 'Alteration', 'Mineral', 'Material', 'Si', 'Ti', 'B', 'Al', 'Cr', 'Fe3+', 'Fe2+', 'Fetot(2+)', 'Ca', 'Mg', 'Mn', 'Ni', 'K', 'Na', 'P', 'H2O', 'H2OP', 'H2OM', 'H2Otot', 'CO2', 'CO', 'F', 'Cl', 'Cl2', 'OH', 'CH4', 'SO2', 'SO3', 'SO4', 'S', 'LOI', 'Volatiles', 'O', 'Others', 'HE(CCM/G)', 'HE(CCMSTP/G)', 'HE3(CCMSTP/G)', 'HE3(AT/G)', 'HE4(CCM/G)', 'HE4(CCMSTP/G)', 'HE4(AT/G)', 'HE4(MOLE/G)', 'HE4(NCC/G)', 'HE(NCC/G)', 'Li', 'Be', 'B.1', 'C', 'CO2.1', 'F.1', 'Na.1', 'Mg.1', 'Al.1', 'P.1', 'S.1', 'Cl.1', 'K.1', 'Ca.1', 'Sc', 'Ti.1', 'V', 'Cr.1', 'Mn.1', 'Fe', 'Co', 'Ni.1', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Rb', 'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In', 'Sn', 'Sb', 'Te', 'I', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Th', 'U', 'Nd143/Nd144', 'Nd143/Nd144 initial', 'e Nd', 'Sr87/Sr86', 'SR87_SR86_INI', 'PB206_PB204', 'PB206_PB204_INI', 'PB207_PB204', 'PB207_PB204_INI', 'PB208_PB204', 'PB208_PB204_INI', 'OS184_OS188', 'OS186_OS188', 'OS187_OS186', 'OS187_OS188', 'RE187_OS186', 'RE187_OS188', 'HF176_HF177', 'HE3_HE4', 'HE3_HE4(R/R(A))', 'HE4_HE3', 'HE4_HE3(R/R(A))', 'K40_AR40', 'AR40_K40', 'Unique Id', 'Unnamed: 171']
| Citations | Tectonic Setting | Location | Location Comment | Latitude (Min) | Latitude (Max) | Longitude (Min) | Longitude (Max) | Land or Sea | Elevation (Min) | Elevation (Max) | Sample Name | Rock Name | Age (a, Min) | ... | OS186_OS188 | OS187_OS186 | OS187_OS188 | RE187_OS186 | RE187_OS188 | HF176_HF177 | HE3_HE4 | HE3_HE4(R/R(A)) | HE4_HE3 | HE4_HE3(R/R(A)) | K40_AR40 | AR40_K40 | Unique Id | Unnamed: 171 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / KISU / KISU / INDIAN O... | NaN | -8.38 | -8.38 | 124.41 | 124.41 | SUBAERIAL | NaN | NaN | s BUA [3085] | NOT GIVEN [3085] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.57 | NaN | NaN | NaN | NaN | 104872 | NaN |
| 1 | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / ROMANG / PACIFIC OCEAN | NaN | -7.55 | -7.55 | 127.42 | 127.42 | SUBAERIAL | NaN | NaN | s RO7B [3085] | NOT GIVEN [3085] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.99 | NaN | NaN | NaN | NaN | 104873 | NaN |
| 2 | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / BANDA / PACIFIC OCEAN | NaN | -4.52 | -4.52 | 129.87 | 129.87 | SUBAERIAL | NaN | NaN | s BA113A [3085] | NOT GIVEN [3085] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.01 | NaN | NaN | NaN | NaN | 104874 | NaN |
| 3 | [3910] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / TERNATE / TERNATE / PA... | NaN | 0.80 | 0.80 | 127.33 | 127.33 | SUBAERIAL | NaN | NaN | s T7 [3910] | NOT GIVEN [3910] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 11648-T7 | NaN |
| 4 | [5178] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / TREWEG / PACIFIC OCEAN | NaN | -8.00 | -8.00 | 124.00 | 124.00 | SUBAERIAL | NaN | NaN | s TR1.2.1 [5178] | ANDESITE, BASALTIC [5178] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 120443 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 465 | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s A-4 [2708] | BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-A-4 | NaN |
| 466 | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s A-5 [2708] | BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-A-5 | NaN |
| 467 | [3147] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AM104A1 [3147] | ANDESITE [3147] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AM104A1 | NaN |
| 468 | [3140][2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AS-2 [3140] | ANDESITE, BASALTIC [3140] / BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AS-2 | NaN |
| 469 | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AS-9 [2708] | BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AS-9 | NaN |
470 rows × 172 columns
Using .iloc[] (-> integer location)
# General Syntax: iloc[row_position, column_position]
# just '.tolist()' everything to obtain the desired values only.
# Select a single dataset/row by its index
df.iloc[5]
# Select a sequence of datasets/row by their indices
df.iloc[4:9]
# Select a single series by its index
df.iloc[:, 5]
# Select multiple datasets/rows by their index
df.iloc[[4, 8]]
# Select a sequence of datasets/rows by their index
df.iloc[4:8]
df.iloc[4:]
df.iloc[:8]
# Select only values from one sereies, but of only specific rows by their index
df.iloc[[4, 8], 4]
# Select values from multiple datasets/rows and multiple series by their index
df.iloc[[4, 8], [4, 6]]
# Select values from multiple series and a sequence of datasets/rows by their index
df.iloc[4:8, [4, 6]]
# Select values from a sequence of series and a sequence of datasets/rows by their index
df.iloc[4:8, 4:7]
# Select a single value using its index
df.iloc[120, 100]3.0
Using .loc[] (-> location)
# General Syntax: loc[row_label, column_label]
# Single value using index and location
df.loc[3, 'Mg']
# The first part only selects datasets indexed 3 and 5, the latter selects the desired series
df.loc[[3, 5], 'Mg']
# The first part only selects datasets indexed 3 and 5, the latter selects the desired series
df.loc[[3, 5], ['Mg', 'Si']]
# The first part selects a sequence of datasets, the latter selects the desired series
df.loc[3:5, ['Mg', 'Si']]| Mg | Si | |
|---|---|---|
| 3 | 27378.016 | 251106.082 |
| 4 | 20322.448 | 260174.321 |
| 5 | 21106.400 | 262885.444 |
Making a series indices to use it in .loc[]
0 VOLCANIC ROCK
47 PLUTONIC ROCK
50 METAMORPHIC ROCK
358 MANTLE XENOLITH
Name: Rock Type, dtype: object
| Citations | Tectonic Setting | Location | Location Comment | Latitude (Min) | Latitude (Max) | Longitude (Min) | Longitude (Max) | Land or Sea | Elevation (Min) | Elevation (Max) | Sample Name | Rock Name | Age (a, Min) | ... | OS186_OS188 | OS187_OS186 | OS187_OS188 | RE187_OS186 | RE187_OS188 | HF176_HF177 | HE3_HE4 | HE3_HE4(R/R(A)) | HE4_HE3 | HE4_HE3(R/R(A)) | K40_AR40 | AR40_K40 | Unique Id | Unnamed: 171 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rock Type | |||||||||||||||||||||||||||||
| VOLCANIC ROCK | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / KISU / KISU / INDIAN O... | NaN | -8.38 | -8.38 | 124.41 | 124.41 | SUBAERIAL | NaN | NaN | s BUA [3085] | NOT GIVEN [3085] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.57 | NaN | NaN | NaN | NaN | 104872 | NaN |
| VOLCANIC ROCK | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / ROMANG / PACIFIC OCEAN | NaN | -7.55 | -7.55 | 127.42 | 127.42 | SUBAERIAL | NaN | NaN | s RO7B [3085] | NOT GIVEN [3085] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.99 | NaN | NaN | NaN | NaN | 104873 | NaN |
| VOLCANIC ROCK | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / BANDA / PACIFIC OCEAN | NaN | -4.52 | -4.52 | 129.87 | 129.87 | SUBAERIAL | NaN | NaN | s BA113A [3085] | NOT GIVEN [3085] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.01 | NaN | NaN | NaN | NaN | 104874 | NaN |
| VOLCANIC ROCK | [3910] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / TERNATE / TERNATE / PA... | NaN | 0.80 | 0.80 | 127.33 | 127.33 | SUBAERIAL | NaN | NaN | s T7 [3910] | NOT GIVEN [3910] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 11648-T7 | NaN |
| VOLCANIC ROCK | [5178] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / TREWEG / PACIFIC OCEAN | NaN | -8.00 | -8.00 | 124.00 | 124.00 | SUBAERIAL | NaN | NaN | s TR1.2.1 [5178] | ANDESITE, BASALTIC [5178] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 120443 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| VOLCANIC ROCK | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s A-4 [2708] | BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-A-4 | NaN |
| VOLCANIC ROCK | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s A-5 [2708] | BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-A-5 | NaN |
| VOLCANIC ROCK | [3147] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AM104A1 [3147] | ANDESITE [3147] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AM104A1 | NaN |
| VOLCANIC ROCK | [3140][2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AS-2 [3140] | ANDESITE, BASALTIC [3140] / BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AS-2 | NaN |
| VOLCANIC ROCK | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AS-9 [2708] | BASALT [2708] | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AS-9 | NaN |
470 rows × 171 columns
# Select all datasets with a specific characteristic in the index series
df.loc["MANTLE XENOLITH"]
df.loc["MANTLE XENOLITH", ['Mg']]
df.loc[["MANTLE XENOLITH"], ['Mg', 'Si']]| Mg | Si | |
|---|---|---|
| Rock Type | ||
| MANTLE XENOLITH | 275589.280 | 210392.4935 |
| MANTLE XENOLITH | 264372.736 | 212028.5160 |
| MANTLE XENOLITH | 248693.696 | 206372.5525 |
| MANTLE XENOLITH | 252734.064 | 204923.5040 |
| MANTLE XENOLITH | 240854.176 | 207961.8315 |
| MANTLE XENOLITH | 241216.000 | 209130.4190 |
| MANTLE XENOLITH | 236572.592 | 204456.0690 |
| MANTLE XENOLITH | 236813.808 | 207167.1920 |
| MANTLE XENOLITH | 222702.672 | 213290.5905 |
| MANTLE XENOLITH | 212752.512 | 195855.2650 |
| MANTLE XENOLITH | 220411.120 | 201978.6635 |
| Rock Type | Citations | Tectonic Setting | Location | Location Comment | Latitude (Min) | Latitude (Max) | Longitude (Min) | Longitude (Max) | Land or Sea | Elevation (Min) | Elevation (Max) | Sample Name | Rock Name | ... | OS186_OS188 | OS187_OS186 | OS187_OS188 | RE187_OS186 | RE187_OS188 | HF176_HF177 | HE3_HE4 | HE3_HE4(R/R(A)) | HE4_HE3 | HE4_HE3(R/R(A)) | K40_AR40 | AR40_K40 | Unique Id | Unnamed: 171 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | VOLCANIC ROCK | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / KISU / KISU / INDIAN O... | NaN | -8.38 | -8.38 | 124.41 | 124.41 | SUBAERIAL | NaN | NaN | s BUA [3085] | NOT GIVEN [3085] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.57 | NaN | NaN | NaN | NaN | 104872 | NaN |
| 1 | VOLCANIC ROCK | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / ROMANG / PACIFIC OCEAN | NaN | -7.55 | -7.55 | 127.42 | 127.42 | SUBAERIAL | NaN | NaN | s RO7B [3085] | NOT GIVEN [3085] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.99 | NaN | NaN | NaN | NaN | 104873 | NaN |
| 2 | VOLCANIC ROCK | [3085] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / BANDA / PACIFIC OCEAN | NaN | -4.52 | -4.52 | 129.87 | 129.87 | SUBAERIAL | NaN | NaN | s BA113A [3085] | NOT GIVEN [3085] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.01 | NaN | NaN | NaN | NaN | 104874 | NaN |
| 3 | VOLCANIC ROCK | [3910] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / TERNATE / TERNATE / PA... | NaN | 0.80 | 0.80 | 127.33 | 127.33 | SUBAERIAL | NaN | NaN | s T7 [3910] | NOT GIVEN [3910] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 11648-T7 | NaN |
| 4 | VOLCANIC ROCK | [5178] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / TREWEG / PACIFIC OCEAN | NaN | -8.00 | -8.00 | 124.00 | 124.00 | SUBAERIAL | NaN | NaN | s TR1.2.1 [5178] | ANDESITE, BASALTIC [5178] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 120443 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 465 | VOLCANIC ROCK | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s A-4 [2708] | BASALT [2708] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-A-4 | NaN |
| 466 | VOLCANIC ROCK | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s A-5 [2708] | BASALT [2708] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-A-5 | NaN |
| 467 | VOLCANIC ROCK | [3147] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AM104A1 [3147] | ANDESITE [3147] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AM104A1 | NaN |
| 468 | VOLCANIC ROCK | [3140][2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AS-2 [3140] | ANDESITE, BASALTIC [3140] / BASALT [2708] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AS-2 | NaN |
| 469 | VOLCANIC ROCK | [2708] | CONVERGENT MARGIN | BANDA ARC / BANDA ARC / AMBON / BANDA SEA / PA... | NaN | -3.61 | -3.61 | 128.11 | 128.11 | SUBAERIAL | NaN | NaN | s AS-9 [2708] | BASALT [2708] | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8829-AS-9 | NaN |
470 rows × 172 columns
Index is an integer number that can be used to refer to a row or column. A series is a pandas object, more specific, a single column with still pandas formating. More than two series togehter then constitute the pandas dataframe.
✓.iloc[] and .loc[]
✗.integerloc[] and .location[]
✗.intergerlocation[] and .location[]
✓True
✗False
Start with reading the file West African Craton into a variable and displaying the categories in a compact form.
Display the table, but with only 5 rows and columns.
Select the first 4 dataset.
Select the series number 4, 17, and 99.
What types of rock (-> ‘Rock Type’) are represented in the database?
Select all metamorphick rocks. (This is an already tricky one.)
Again, select all metamorphic rocks, but only display the series ‘Mg’ and ‘Ti’.
Reset the indices first, then select datasets 10 throughout and including 20, but only with the columns MgO and Cr2O3.
df = mg.get_data('West African Craton')
pd.set_option('display.max_rows', 5)
pd.set_option('display.max_columns', 5)
print(df.columns.tolist())
df['Citations', 'Tectonic Setting', 'Location', 'Location Comment', 'Latitude (Min)', 'Latitude (Max)', 'Longitude (Min)', 'Longitude (Max)', 'Land or Sea', 'Elevation (Min)', 'Elevation (Max)', 'Sample Name', 'Rock Name', 'Age (a, Min)', 'Age (a, Max)', 'Geol', 'Age', 'Eruption Day', 'Eruption Month', 'Eruption Year', 'Rock Texture', 'Rock Type', 'Drill Depth (Min)', 'Drill Depth (Max)', 'Alteration', 'Mineral', 'Material', 'Si', 'Ti', 'B', 'Al', 'Cr', 'Fe3+', 'Fe2+', 'Fetot(2+)', 'Ca', 'Mg', 'Mn', 'Ni', 'K', 'Na', 'P', 'H2O', 'H2OP', 'H2OM', 'H2Otot', 'CO2', 'CO', 'F', 'Cl', 'Cl2', 'OH', 'CH4', 'SO2', 'SO3', 'SO4', 'S', 'LOI', 'Volatiles', 'O', 'Others', 'HE(CCM/G)', 'HE(CCMSTP/G)', 'HE3(CCMSTP/G)', 'HE3(AT/G)', 'HE4(CCM/G)', 'HE4(CCMSTP/G)', 'HE4(AT/G)', 'HE4(MOLE/G)', 'HE4(NCC/G)', 'HE(NCC/G)', 'Li', 'Be', 'B.1', 'C', 'CO2.1', 'F.1', 'Na.1', 'Al.1', 'P.1', 'S.1', 'Cl.1', 'K.1', 'Ca.1', 'Sc', 'Ti.1', 'V', 'Cr.1', 'Mn.1', 'Fe', 'Co', 'Ni.1', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Rb', 'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In', 'Sn', 'Sb', 'Te', 'I', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Th', 'U', 'Nd143/Nd144', 'Nd143/Nd144 initial', 'e Nd', 'Sr87/Sr86', 'SR87_SR86_INI', 'PB206_PB204', 'PB206_PB204_INI', 'PB207_PB204', 'PB207_PB204_INI', 'PB208_PB204', 'PB208_PB204_INI', 'OS184_OS188', 'OS186_OS188', 'OS187_OS186', 'OS187_OS188', 'RE187_OS186', 'RE187_OS188', 'HF176_HF177', 'HE3_HE4', 'HE3_HE4(R/R(A))', 'HE4_HE3', 'HE4_HE3(R/R(A))', 'K40_AR40', 'AR40_K40', 'Unique Id']
| Citations | Tectonic Setting | ... | AR40_K40 | Unique Id | |
|---|---|---|---|---|---|
| 0 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184947 |
| 1 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184948 |
| ... | ... | ... | ... | ... | ... |
| 41 | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247031 |
| 42 | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247032 |
43 rows × 170 columns
| Citations | Tectonic Setting | ... | AR40_K40 | Unique Id | |
|---|---|---|---|---|---|
| 0 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184947 |
| 1 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184948 |
| 2 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184949 |
| 3 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184950 |
4 rows × 170 columns
| Latitude (Min) | Eruption Day | Rb | |
|---|---|---|---|
| 0 | 8.5 | NaN | 0.0 |
| 1 | 8.5 | NaN | 0.0 |
| ... | ... | ... | ... |
| 41 | 9.5 | NaN | 4.0 |
| 42 | 9.5 | NaN | 39.0 |
43 rows × 3 columns
0 VOLCANIC ROCK
23 METAMORPHIC ROCK
38 PLUTONIC ROCK
Name: Rock Type, dtype: object
| Citations | Tectonic Setting | ... | AR40_K40 | Unique Id | |
|---|---|---|---|---|---|
| Rock Type | |||||
| METAMORPHIC ROCK | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247013 |
| METAMORPHIC ROCK | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247014 |
| ... | ... | ... | ... | ... | ... |
| METAMORPHIC ROCK | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247031 |
| METAMORPHIC ROCK | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247032 |
19 rows × 169 columns
| Mg | Ti | |
|---|---|---|
| Rock Type | ||
| METAMORPHIC ROCK | 48243.200 | 6592.740 |
| METAMORPHIC ROCK | 34373.280 | 10788.120 |
| ... | ... | ... |
| METAMORPHIC ROCK | 151845.472 | 659.274 |
| METAMORPHIC ROCK | 246643.360 | 1618.218 |
19 rows × 2 columns
| Rock Type | Citations | ... | AR40_K40 | Unique Id | |
|---|---|---|---|---|---|
| 0 | VOLCANIC ROCK | [5205] | ... | NaN | 184947 |
| 1 | VOLCANIC ROCK | [5205] | ... | NaN | 184948 |
| ... | ... | ... | ... | ... | ... |
| 41 | METAMORPHIC ROCK | [8624] | ... | NaN | 247031 |
| 42 | METAMORPHIC ROCK | [8624] | ... | NaN | 247032 |
43 rows × 170 columns
Available Notebooks: Lecture, Exercise, Solution
Zum Schluss dieser Einheit wollen wir die geballte Kraft von Pandas mit der von maplotlib kombinieren. D.h., wir selektieren zunächst mit Pandas die gewünschten Daten aus einer Datenbank, und visualisieren diese umgehend und einfach mit Hilfe der matplotlib Bibliothek.
pandas and matplotlib.pyplot
✗True
✓False
Import the 2 databases we require for plotting, plus the one to load data.
Then, we need a database, which shall be West African Craton, and as usual we want to display some sensible information.
Then display datasets 5 through 25, and flexibly plot two elements against each other that are also shown as lables on the respective axes, as well as a legend indicating what data are shown.
The data are in wt-ppm. Change this to wt%.
If we want to concatenate strings, we can simply do this using + between two strings, e.g.: 'a' + 'b', which will result in 'ab'. Try using this to add units to the axes labelling.
['Citations', 'Tectonic Setting', 'Location', 'Location Comment', 'Latitude (Min)', 'Latitude (Max)', 'Longitude (Min)', 'Longitude (Max)', 'Land or Sea', 'Elevation (Min)', 'Elevation (Max)', 'Sample Name', 'Rock Name', 'Age (a, Min)', 'Age (a, Max)', 'Geol', 'Age', 'Eruption Day', 'Eruption Month', 'Eruption Year', 'Rock Texture', 'Rock Type', 'Drill Depth (Min)', 'Drill Depth (Max)', 'Alteration', 'Mineral', 'Material', 'Si', 'Ti', 'B', 'Al', 'Cr', 'Fe3+', 'Fe2+', 'Fetot(2+)', 'Ca', 'Mg', 'Mn', 'Ni', 'K', 'Na', 'P', 'H2O', 'H2OP', 'H2OM', 'H2Otot', 'CO2', 'CO', 'F', 'Cl', 'Cl2', 'OH', 'CH4', 'SO2', 'SO3', 'SO4', 'S', 'LOI', 'Volatiles', 'O', 'Others', 'HE(CCM/G)', 'HE(CCMSTP/G)', 'HE3(CCMSTP/G)', 'HE3(AT/G)', 'HE4(CCM/G)', 'HE4(CCMSTP/G)', 'HE4(AT/G)', 'HE4(MOLE/G)', 'HE4(NCC/G)', 'HE(NCC/G)', 'Li', 'Be', 'B.1', 'C', 'CO2.1', 'F.1', 'Na.1', 'Al.1', 'P.1', 'S.1', 'Cl.1', 'K.1', 'Ca.1', 'Sc', 'Ti.1', 'V', 'Cr.1', 'Mn.1', 'Fe', 'Co', 'Ni.1', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Rb', 'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In', 'Sn', 'Sb', 'Te', 'I', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Th', 'U', 'Nd143/Nd144', 'Nd143/Nd144 initial', 'e Nd', 'Sr87/Sr86', 'SR87_SR86_INI', 'PB206_PB204', 'PB206_PB204_INI', 'PB207_PB204', 'PB207_PB204_INI', 'PB208_PB204', 'PB208_PB204_INI', 'OS184_OS188', 'OS186_OS188', 'OS187_OS186', 'OS187_OS188', 'RE187_OS186', 'RE187_OS188', 'HF176_HF177', 'HE3_HE4', 'HE3_HE4(R/R(A))', 'HE4_HE3', 'HE4_HE3(R/R(A))', 'K40_AR40', 'AR40_K40', 'Unique Id']
| Citations | Tectonic Setting | ... | AR40_K40 | Unique Id | |
|---|---|---|---|---|---|
| 0 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184947 |
| 1 | [5205] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 184948 |
| ... | ... | ... | ... | ... | ... |
| 41 | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247031 |
| 42 | [8624] | ARCHEAN CRATON (INCLUDING GREENSTONE BELTS) | ... | NaN | 247032 |
43 rows × 170 columns