.. _example_datasets:

:orphan:

================
Example datasets
================

Explore our example datasets on this page. Understanding their properties helps you choose the best one for trialling twinLab's capabilities. Many of these datasets are used in our tutorials, which you can find on the :ref:`examples` page.

The `quickstart` dataset
~~~~~~~~~~~~~~~~~~~~~~~~

The `quickstart` dataset is a simple, non-contextual dataset. In the dataset, the rows are the samples and the columns are:

* x
* y

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Dataset property
     -
   * - Author
     - Dr Freddy Wordingham (digiLab Solutions Ltd.)
   * - Provenance
     - Generated randomly with packages like `numpy` and `scipy`.
   * - Copyright information
     - MIT license
   * - Size
     - 20
   * - Shape
     - (10,2)

We can see how the dataset is distributed if we plot the points as a scatterplot:

.. image:: ../../_static/images/example_datasets_images/quickstart.png
   :width: 400
   :alt: A scatterplot of the quickstart dataset.

The `advancedstart` dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `advancedstart` dataset is a simple, non-contextual dataset. In the dataset, the rows are the samples and the columns are:

* x
* y
* z

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Dataset property
     -
   * - Author
     - Dr Alexander Mead (digiLab Solutions Ltd.)
   * - Provenance
     - Generated randomly with packages like `numpy` and `scipy`.
   * - Copyright information
     - MIT license
   * - Size
     - 75
   * - Shape
     - (25,3)

We can see how the dataset is distributed if we plot the points as a 3D scatterplot:

.. image:: ../../_static/images/example_datasets_images/advancedstart.png
   :width: 400
   :alt: A scatterplot of the advancedstart dataset.

The `biscuits` dataset
~~~~~~~~~~~~~~~~~~~~~~

The `biscuits` dataset explores a hypothetical pricing optimisation problem for the manager of a biscuit factory.
In the dataset, the rows are the samples and the columns are:

* Pack price in GBP
* The number of biscuits per pack
* The number of packs sold
* Profit made in GBP

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Dataset property
     -
   * - Author
     - Dr Freddy Wordingham (digiLab Solutions Ltd.)
   * - Provenance
     - Generated randomly with packages like `numpy` and `scipy`.
   * - Copyright information
     - MIT license
   * - Size
     - 48
   * - Shape
     - (12,4)

We can see how the dataset is distributed if we plot scatterplots. Because this is a 4D dataset, we present two plots to show the distribution of points more clearly; each plot represents one of the y-values via its colorbar:

.. image:: ../../_static/images/example_datasets_images/biscuits_a.png
   :width: 400
   :alt: The first 3D scatterplot of the biscuits dataset.

.. image:: ../../_static/images/example_datasets_images/biscuits_b.png
   :width: 400
   :alt: The second 3D scatterplot of the biscuits dataset.

The `gardening` dataset
~~~~~~~~~~~~~~~~~~~~~~~

The `gardening` dataset explores a hypothetical growth optimisation problem for an intrepid gardener seeking to understand what makes their plants grow best. In the dataset, the rows are the samples and the columns are:

* Sunlight in hours per day
* The number of times the garden was watered per week
* The number of units of fruit produced

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Dataset property
     -
   * - Author
     - Dr Alexander Mead (digiLab Solutions Ltd.)
   * - Provenance
     - Generated randomly with packages like `numpy` and `scipy`.
   * - Copyright information
     - MIT license
   * - Size
     - 75
   * - Shape
     - (25,3)

We can see how the dataset is distributed if we plot the points on a 3D scatterplot:

.. image:: ../../_static/images/example_datasets_images/gardening.png
   :width: 400
   :alt: A 3D scatterplot of the gardening dataset.

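
In each table above, the Size entry is the total number of values in the dataset: the number of rows multiplied by the number of columns given in Shape. The following is a minimal sketch in plain Python that checks this relationship using only the figures quoted in the tables above. The names `SHAPES`, `SIZES` and `size_of` are illustrative and are not part of twinLab.

```python
# Shapes quoted in the tables above, as (rows, columns).
SHAPES = {
    "quickstart": (10, 2),
    "advancedstart": (25, 3),
    "biscuits": (12, 4),
    "gardening": (25, 3),
}

# Sizes quoted in the same tables.
SIZES = {
    "quickstart": 20,
    "advancedstart": 75,
    "biscuits": 48,
    "gardening": 75,
}


def size_of(shape):
    """Return the total number of values in a table of the given shape."""
    rows, columns = shape
    return rows * columns


for name, shape in SHAPES.items():
    # Each quoted Size should equal rows * columns from the quoted Shape.
    assert size_of(shape) == SIZES[name], name
    print(f"{name}: {shape[0]} rows x {shape[1]} columns = {size_of(shape)} values")
```

The same check can be applied to any other dataset on this page by adding its quoted Shape and Size to the two dictionaries.
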
The `tritium-desorption-small` dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `tritium-desorption-small` dataset explores the macroscopic transport of tritium in fusion reactor materials. In the dataset, the rows are the samples and the columns are:

* E1, E2, E3: the detrapping energies of the tritium traps in a reactor.
* n1, n2: the densities of the intrinsic traps.
* y0-y623: the flux of tritium across the trap boundary as a function of time, in atomic fractions.

The dataset is created from simulations of `Achlys `_ using the software `UM-Bridge `_. Achlys models the macroscopic transport (and subsequent desorption) of tritium through fusion reactor materials using Foster-McNabb equations.

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Dataset property
     -
   * - Authors
     - Dr Mikkel Lykkegaard (digiLab Solutions Ltd.) and Dr Anne Reinarz (Durham University)
   * - Provenance
     - This dataset was created as part of the simulations carried out in Seelinger et al. 2024 (`arXiv: 2402.13768v4 `_). The software used to generate this dataset was UM-Bridge. More details about the simulation and the generated dataset can be found in the UM-Bridge documentation, in both the `inverse benchmark `_ documentation and the `benchmark `_ documentation.
   * - Copyright information
     - MIT license
   * - Size
     - 251,600
   * - Shape
     - (400,629)

The `tritium-desorption-temperature-grid` dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `tritium-desorption-temperature-grid` dataset is an accompaniment to the `tritium-desorption-small` dataset.

The grid is derived from accompanying simulations of `Achlys `_ using the software `UM-Bridge `_. Achlys models the macroscopic transport (and subsequent desorption) of tritium through fusion reactor materials using Foster-McNabb equations.

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Dataset property
     -
   * - Authors
     - Dr Mikkel Lykkegaard (digiLab Solutions Ltd.) and Dr Anne Reinarz (Durham University)
   * - Provenance
     - This dataset was derived as part of the simulations carried out in Seelinger et al. 2024 (`arXiv: 2402.13768v4 `_). The software used to generate this dataset was UM-Bridge. More details about the simulation and the generated dataset can be found in the UM-Bridge documentation, in both the `inverse benchmark `_ documentation and the `benchmark `_ documentation.
   * - Copyright information
     - MIT license
   * - Size
     - 623
   * - Shape
     - (623,1)

The `jet-confinement` dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This dataset explores energy confinement in magnetic fusion devices and how it changes with device parameters. Derived from a larger dataset that features dozens of fusion experiments around the world, it is a subset that describes the outcome of high-confinement mode experiments from the Joint European Torus (JET). JET is a record-breaking tokamak located in the UK.

In the dataset, the rows are the samples and the columns are:

* Magnetic field strength: the intensity of the magnetic field, in teslas, applied to confine the plasma within the tokamak.
* Plasma current: the electric current flowing through the plasma, in amperes.
* Thermal power: the estimated amount of heat energy, in watts, consumed by the plasma.
* Major radius: the distance, in meters, from the center of the tokamak to the center of the plasma.
* Elongation: the ratio of the plasma's height to its width.
* Electron density: the number of electrons per cubic meter of plasma.
* Effective mass number: the average atomic mass (amu) of the ions in the plasma, weighted by their abundance.
* Inverse aspect ratio: the ratio of the plasma's minor radius to its major radius.
* Energy confinement time: the duration, in seconds, for which the plasma retains its energy before it is lost to the surroundings.

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Dataset property
     -
   * - Authors
     - Verdoolaege, G. (Ghent University),
       Kaye, S. M. (Princeton University),
       Angioni, C. (Max-Planck-Institut für Plasmaphysik),
       Kardaun, O. W. J. F. (Max-Planck-Institut für Plasmaphysik),
       Maslov, M. (United Kingdom Atomic Energy Authority),
       Romanelli, M. (United Kingdom Atomic Energy Authority),
       Ryter, F. (Max-Planck-Institut für Plasmaphysik),
       and Thomsen, K. (Max-Planck-Institut für Plasmaphysik).
   * - Provenance
     - This dataset is a subset of JET experiment data from the `ITPA Global H-mode Confinement Database `_. This dataset was published by Princeton Plasma Physics Laboratory, Princeton University. It was funded by the United States Department of Energy and the Euratom Research and Training Programme.
   * - Copyright information
     - Creative Commons Attribution 4.0 International (CC BY)
   * - Size
     - 29160
   * - Shape
     - (3240, 9)

We can see how the energy confinement time is distributed against the magnetic field strength if we plot the data as a scatterplot:

.. image:: ../../_static/images/example_datasets_images/jet-confinement.png
   :width: 400
   :alt: A scatterplot of the energy confinement time versus the magnetic field strength for JET experiments.
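
The column definitions above imply a simple relationship: because the inverse aspect ratio is the minor radius divided by the major radius, the minor radius can be recovered by multiplying the two columns together. The following is a minimal sketch in plain Python. The function name `minor_radius`, the dictionary keys and the sample values are hypothetical and chosen for illustration only; they are not taken from the dataset or from twinLab.

```python
def minor_radius(major_radius_m, inverse_aspect_ratio):
    """Recover the minor radius (m) from the major radius and inverse aspect ratio.

    Uses the definition: inverse aspect ratio = minor radius / major radius.
    """
    if major_radius_m <= 0:
        raise ValueError("major radius must be positive")
    return inverse_aspect_ratio * major_radius_m


# Hypothetical sample values, NOT a row from the jet-confinement dataset.
sample = {"major_radius_m": 3.0, "inverse_aspect_ratio": 0.25}

a = minor_radius(sample["major_radius_m"], sample["inverse_aspect_ratio"])
print(f"minor radius = {a} m")
```

A derived column like this can be useful when a model or plot is easier to interpret in terms of the minor radius than the inverse aspect ratio.
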