twinlab.Emulator.benchmark#

Emulator.benchmark(params=BenchmarkParams(), verbose=False)[source]#

Benchmark the predicted uncertainty of a trained emulator.

A test dataset must have been defined for this method to produce a result (otherwise None is returned); this means that train_test_ratio must have been less than 1 when the emulator was trained. The method returns the calibration curve of a trained emulator, which can be used to assess the quality of its uncertainty predictions. The calibration curve plots the fraction of values predicted to fall within a given confidence interval against the fraction of values that actually fall within that interval.

100 monotonically increasing values between 0 and 1 are returned for each output dimension of the emulator, in the form of a pandas.DataFrame. These values can be plotted on the y-axis, with the x-axis taken to be 100 equally spaced values between 0 and 1 (inclusive). A well-calibrated emulator will produce a curve close to the line y = x. If the curve deviates from this line, the emulator may be under- or overconfident; the exact interpretation depends on the type of curve. See the documentation for BenchmarkParams() for more information on the available benchmark types.
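For illustration, the calibration curve might be plotted against the ideal line y = x as in the minimal sketch below. It assumes matplotlib and numpy are available, that an emulator named "quickstart" exists, and that it was trained with a test dataset; the column iteration assumes one DataFrame column per output dimension, as described above.

import matplotlib.pyplot as plt
import numpy as np
import twinlab as tl

emulator = tl.Emulator("quickstart")
curve = emulator.benchmark()

if curve is not None:
    # x-axis: 100 equally spaced values between 0 and 1 (inclusive)
    expected = np.linspace(0, 1, 100)
    # Dashed reference line y = x for a perfectly calibrated emulator
    plt.plot(expected, expected, "k--", label="ideal")
    # One calibration curve per output dimension of the emulator
    for column in curve.columns:
        plt.plot(expected, curve[column], label=column)
    plt.xlabel("Predicted fraction within interval")
    plt.ylabel("Observed fraction within interval")
    plt.legend()
    plt.show()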

Parameters:
  • params (BenchmarkParams, optional) – A parameter-configuration object containing optional parameters for benchmarking an emulator (a usage sketch follows the example below).

  • verbose (bool, optional) – Display detailed information about the operation while running.

Returns:

Either a pandas.DataFrame containing the calibration curve for an emulator, or None if there is no test data.

Return type:

pandas.DataFrame or None

Example

import twinlab as tl

emulator = tl.Emulator("quickstart")
emulator.benchmark()
      y
0   0.0
1   0.0
2   0.0
3   0.0
4   0.0
..  ...
95  1.0
96  1.0
97  1.0
98  1.0
99  1.0
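
The params argument can be used to select the benchmark type. The sketch below is illustrative only: the available type names are listed in the BenchmarkParams() documentation, and "quantile" here is an assumed example value, not a confirmed default.

import twinlab as tl

emulator = tl.Emulator("quickstart")
# "quantile" is an illustrative benchmark type; see BenchmarkParams() for the available types
params = tl.BenchmarkParams(type="quantile")
curve = emulator.benchmark(params=params, verbose=True)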