twinlab.Dataset.analyse_variance
- Dataset.analyse_variance(columns, verbose=False)
Return an analysis of the variance retained per dimension after performing singular value decomposition (SVD) on the dataset.
SVD is useful for understanding how much variance in the dataset is retained after projecting it into a new basis. SVD components are naturally ordered by the amount of variance they retain, with the first component retaining the most variance. A decision can be made about how many dimensions to keep based on the cumulative variance retained. This analysis is usually performed on either the set of input or output columns of the dataset.
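As an illustration of the idea described above, the following is a minimal NumPy sketch of a cumulative-variance calculation based on SVD. It is not twinlab's implementation: the helper name `cumulative_variance`, the choice to mean-centre the columns, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def cumulative_variance(data: np.ndarray) -> np.ndarray:
    """Cumulative fraction of variance retained by the first k SVD components.

    Hypothetical helper for illustration only. Entry k of the result is the
    fraction retained when keeping k components, so entry 0 is always 0.
    """
    # Assumption: centre each column so squared singular values measure variance.
    centred = data - data.mean(axis=0)
    # Singular values are returned in descending order.
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variance = singular_values**2
    fractions = np.cumsum(variance) / variance.sum()
    # Prepend the zero-dimension case.
    return np.concatenate(([0.0], fractions))


# Synthetic two-column data in which the second column largely follows the first.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=200)])

curve = cumulative_variance(data)
print(curve)
```

Because the two synthetic columns are strongly correlated, the first component should account for most of the variance, which is the situation in which dropping later dimensions loses little information.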
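- Parameters:
columns – The columns to analyse, given as a list of column names such as ["x", "y"]; typically either the input or the output columns of the dataset.
verbose – Optional; defaults to False.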
- Returns:
A pandas.DataFrame containing the variance analysis.
- Return type:
pandas.DataFrame
Example
import twinlab as tl
dataset = tl.Dataset("quickstart")
dataset.analyse_variance(columns=["x", "y"])  # Typically either input or output columns
   Number of Dimensions  Cumulative Variance
0                     0             0.000000
1                     1             0.925741
2                     2             1.000000
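One way to act on a table like this is to pick the smallest number of dimensions whose cumulative variance reaches a chosen threshold. The sketch below rebuilds the example output by hand as a pandas.DataFrame so that it runs without a twinlab account. The helper `dimensions_for` and the threshold values are hypothetical and not part of twinlab.

```python
import pandas as pd

# The example output above, reconstructed manually for illustration.
variance = pd.DataFrame(
    {
        "Number of Dimensions": [0, 1, 2],
        "Cumulative Variance": [0.000000, 0.925741, 1.000000],
    }
)


def dimensions_for(variance: pd.DataFrame, threshold: float) -> int:
    """Smallest number of dimensions whose cumulative variance meets the threshold.

    Hypothetical helper; assumes 0 < threshold <= 1 so that at least one row qualifies.
    """
    reached = variance[variance["Cumulative Variance"] >= threshold]
    return int(reached["Number of Dimensions"].min())


print(dimensions_for(variance, 0.90))
print(dimensions_for(variance, 0.95))
```

With these values, one dimension is enough for a 90% threshold, while a 95% threshold requires both.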