twinlab.Emulator.benchmark
- Emulator.benchmark(params=<twinlab.params.BenchmarkParams object>, verbose=False)
Benchmark the predicted uncertainty of a trained emulator.

A test dataset must have been defined for this method to produce a result; otherwise None is returned. This means that train_test_ratio must be less than 1 when training the emulator.

This method returns the calibration curve of a trained emulator, which can be used to assess the quality of its uncertainty predictions. The calibration curve plots the fraction of values predicted to lie within a given confidence interval against the fraction of values that actually lie within that interval. 100 monotonically increasing values between 0 and 1 are returned for each output dimension of the emulator, in the form of a pandas.DataFrame. These values can be plotted on the y-axis, with the x-axis taken to be 100 equally spaced values between 0 and 1 (inclusive). A well-calibrated emulator will have a curve close to the line y = x. If the curve deviates from this line, the emulator may be under- or overconfident; the exact interpretation depends on the shape of the deviation. See the documentation for BenchmarkParams() for more information on the available benchmark types.

- Parameters:
params (BenchmarkParams, optional) – A parameter-configuration object that contains optional parameters for benchmarking an emulator.
verbose (bool, optional) – Display detailed information about the operation while running.
- Returns:
Either a pandas.DataFrame containing the calibration curve for an emulator, or None if there is no test data.
- Return type:
pandas.DataFrame, None
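Since the returned curve is meant to be compared against the line y = x, a short sketch of that comparison may help. The snippet below builds a stand-in DataFrame with the same shape as the benchmark output (100 monotonically increasing values between 0 and 1 in a single output column); the column name "y" and the synthetic values are illustrative assumptions, not output from the library itself:

```python
import numpy as np
import pandas as pd

# Stand-in for the DataFrame returned by Emulator.benchmark():
# 100 monotonically increasing values between 0 and 1 per output column.
# The curve shape here (t ** 1.2) is purely illustrative.
curve = pd.DataFrame({"y": np.linspace(0.0, 1.0, 100) ** 1.2})

# The x-axis is 100 equally spaced values between 0 and 1 (inclusive).
x = np.linspace(0.0, 1.0, 100)

# A well-calibrated emulator lies close to the line y = x; one way to
# summarise the deviation is the maximum absolute calibration error.
max_error = (curve["y"] - x).abs().max()
print(f"Maximum calibration error: {max_error:.3f}")
```

With a real emulator, the synthetic DataFrame would be replaced by `curve = emulator.benchmark()`, after first checking that the result is not None.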
Example

emulator = tl.Emulator("quickstart")
emulator.benchmark()

       y
0    0.0
1    0.0
2    0.0
3    0.0
4    0.0
..   ...
95   1.0
96   1.0
97   1.0
98   1.0
99   1.0