twinlab.BenchmarkParams

class twinlab.BenchmarkParams(type='quantile')

Parameter configuration for benchmarking a trained emulator.

Variables:

type (str, optional) –

Specifies the type of emulator benchmark to be performed. Can be one of:

  • "quantile": The calibration curve is calculated over statistical quantiles.

  • "interval": The calibration curve is calculated over confidence intervals.

The default is "quantile".

For example, for a well-calibrated emulator one would expect 10 percent of the unseen data points (from the test set) to lie outside the emulator’s 90 percent confidence bound. If a given confidence interval contains less data than expected, the model is overconfident (its intervals are too narrow), whereas if it contains more, the model is underconfident (its intervals are too wide).

The calibration curve necessarily equals 0 at the beginning and 1 at the end, because the fraction of data covered rises from none at the narrowest bound to all of it once the entire distribution is covered. Curves are also necessarily monotonically non-decreasing. Convex calibration curves (those below the line y = x) indicate that the model is overconfident, while concave calibration curves (those above the line y = x) indicate that the model is underconfident. A curve can also lie both above and below the line y = x, indicating regions of under- and overconfidence and possibly non-Gaussianity in the data.
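The 90 percent example above can be checked numerically. The following is a minimal sketch using only the Python standard library, not twinlab's implementation: the helper `coverage`, the synthetic test set, and the assumption of Gaussian predictive distributions are all illustrative choices that do not come from the twinlab API.

```python
from statistics import NormalDist


def coverage(y_true, means, stds, level):
    """Fraction of observations inside the central `level` predictive interval.

    Hypothetical helper: assumes each prediction is Gaussian with the
    given mean and standard deviation.
    """
    # Half-width of the central interval, in units of standard deviations.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    inside = sum(abs(y - m) <= z * s for y, m, s in zip(y_true, means, stds))
    return inside / len(y_true)


# Synthetic test set: evenly spaced quantiles of a standard normal,
# standing in for unseen data whose true spread is exactly 1.
n = 1000
truth = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
means = [0.0] * n

# Predicted spread matches the data: about 90% of points fall inside.
well = coverage(truth, means, [1.0] * n, 0.90)
# Predicted spread too small (intervals too narrow): fewer points inside.
narrow = coverage(truth, means, [0.5] * n, 0.90)
# Predicted spread too large (intervals too wide): more points inside.
wide = coverage(truth, means, [2.0] * n, 0.90)

print(well, narrow, wide)
```

Comparing the observed fraction with the nominal level at many levels, rather than only at 90 percent, gives one point of the calibration curve per level.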

If type = "quantile" then the calibration curve is calculated over statistical quantiles extending from negative infinity to positive infinity.
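One way to picture the quantile variant is as a sketch under the assumption of Gaussian predictions: for each quantile level p, count the fraction of test points that fall at or below the predicted p-quantile. The function `quantile_calibration` and the synthetic data below are hypothetical and only illustrate the idea; twinlab's internal computation may differ.

```python
from statistics import NormalDist


def quantile_calibration(y_true, means, stds, levels):
    """Observed fraction of data at or below each predicted quantile.

    Hypothetical helper: assumes Gaussian predictive distributions.
    """
    # Probability integral transform: where each observation sits in the
    # predicted cumulative distribution (0 = far lower tail, 1 = far upper tail).
    pit = [NormalDist(m, s).cdf(y) for y, m, s in zip(y_true, means, stds)]
    return [sum(u <= p for u in pit) / len(pit) for p in levels]


# Synthetic, perfectly calibrated case: the data are evenly spaced
# quantiles of the same standard normal that the model predicts.
n = 1000
truth = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
levels = [0.0, 0.25, 0.5, 0.75, 1.0]

curve = quantile_calibration(truth, [0.0] * n, [1.0] * n, levels)
print(list(zip(levels, curve)))  # points of the calibration curve
```

In this sketch a perfectly calibrated model traces the line y = x, and the curve starts at 0 and ends at 1 by construction.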

If type = "interval" then the calibration curve is calculated over confidence intervals, starting from the mean of the distribution and extending outwards in both directions simultaneously until the entire confidence interval is covered at negative/positive infinity.
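The interval variant can be sketched in the same spirit. Assuming Gaussian predictions, each test point has a smallest central confidence level whose interval (grown symmetrically outward from the mean) just contains it; the curve at level p is the fraction of points whose required level is at most p. The name `interval_calibration` and the data are illustrative assumptions, not part of twinlab.

```python
from statistics import NormalDist


def interval_calibration(y_true, means, stds, levels):
    """Observed fraction of data inside each central confidence interval.

    Hypothetical helper: assumes Gaussian predictive distributions.
    """
    unit = NormalDist()
    # Smallest central confidence level whose interval contains each point:
    # 0 for a point exactly at the mean, approaching 1 far out in either tail.
    needed = [
        abs(2.0 * unit.cdf((y - m) / s) - 1.0)
        for y, m, s in zip(y_true, means, stds)
    ]
    return [sum(c <= p for c in needed) / len(needed) for p in levels]


# Synthetic test set: evenly spaced quantiles of a standard normal.
n = 1000
truth = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
means = [0.0] * n
levels = [0.0, 0.5, 0.9, 1.0]

# Predicted spread matches the data: the curve follows y = x.
matched = interval_calibration(truth, means, [1.0] * n, levels)
# Predicted spread halved: intervals are too narrow, so each one
# contains less data than its nominal level.
too_narrow = interval_calibration(truth, means, [0.5] * n, levels)

print(matched)
print(too_narrow)
```

Because the interval grows in both directions at once, this curve is insensitive to which tail a point lies in, unlike the quantile version.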

__init__(type='quantile')

Methods

__init__([type])

unpack_parameters()