predict-statistics

trainModels()

function trainModels(
fullData: Object[],
targetName: string,
options: object): number;

Defined in: statistics/predict-statistics.js:103

Trains an XGBoost model on preprocessed data and evaluates its performance.

XGBoost (eXtreme Gradient Boosting) builds decision trees sequentially, with each new tree fitted to correct the errors of the ensemble built so far. In each round it fits a tree to the gradient of the loss with respect to the current predictions (for squared error, these are the residuals) and adds that tree's output, scaled by the learning rate, to the ensemble. Regularization terms penalize tree complexity to limit overfitting, and sparsity-aware split finding learns a default direction for missing values at each split.
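The boosting loop described above can be illustrated with a deliberately small sketch. This is not XGBoost and not this module's implementation: it uses depth-1 trees (stumps) on a single feature, squared-error loss, and no regularization or missing-value handling. The names `fitStump` and `boost` are hypothetical and exist only for this illustration.

```javascript
// Illustrative gradient boosting for squared-error regression.
// NOT XGBoost: single feature, depth-1 trees, no regularization.

// Find the threshold split on xs that best fits the residuals.
function fitStump(xs, residuals) {
  const mean = (v) => v.reduce((s, r) => s + r, 0) / v.length;
  const sse = (v, m) => v.reduce((s, r) => s + (r - m) ** 2, 0);
  let best = null;
  for (const threshold of new Set(xs)) {
    const left = [];
    const right = [];
    xs.forEach((x, i) => (x <= threshold ? left : right).push(residuals[i]));
    if (left.length === 0 || right.length === 0) continue; // not a real split
    const leftValue = mean(left);
    const rightValue = mean(right);
    const error = sse(left, leftValue) + sse(right, rightValue);
    if (best === null || error < best.error) {
      best = { threshold, leftValue, rightValue, error };
    }
  }
  return best;
}

// Each round fits a stump to the current residuals and adds it,
// scaled by eta, to the ensemble's predictions.
function boost(xs, ys, { eta = 0.1, nrounds = 50 } = {}) {
  const base = ys.reduce((s, y) => s + y, 0) / ys.length;
  const stumps = [];
  let preds = ys.map(() => base);
  const apply = (s, x) => (x <= s.threshold ? s.leftValue : s.rightValue);
  for (let round = 0; round < nrounds; round++) {
    const residuals = ys.map((y, i) => y - preds[i]);
    const stump = fitStump(xs, residuals);
    if (stump === null) break; // no valid split left
    stumps.push(stump);
    preds = preds.map((p, i) => p + eta * apply(stump, xs[i]));
  }
  const predict = (x) => stumps.reduce((p, s) => p + eta * apply(s, x), base);
  return { predict, rounds: stumps.length };
}

// Usage: a step-shaped target that one split can describe.
const sampleX = [1, 2, 3, 4, 5, 6];
const sampleY = [1, 1, 1, 5, 5, 5];
const sketchModel = boost(sampleX, sampleY, { eta: 0.1, nrounds: 50 });
console.log(sketchModel.predict(2), sketchModel.predict(5));
```

With a small `eta`, each round removes only a fraction of the remaining residual, which is why a lower learning rate needs more rounds to converge.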

Parameters

Parameter | Type | Description

fullData

Object[]

Preprocessed training data as array of objects with numeric values only

targetName

string

Name of the target variable column to predict

options

{ xgbParams: { verbosity: number; max_depth: number; eta: number; objective: string; nthread: number; subsample: number; colsample_bytree: number; min_child_weight: number; gamma: number; alpha: number; lambda: number; early_stopping_rounds: number; seed: number; nrounds: number; }; testSize: number; featuresToUse: string[]; }

Configuration options for model training

options.xgbParams

{ verbosity: number; max_depth: number; eta: number; objective: string; nthread: number; subsample: number; colsample_bytree: number; min_child_weight: number; gamma: number; alpha: number; lambda: number; early_stopping_rounds: number; seed: number; nrounds: number; }

XGBoost hyperparameters such as eta (learning rate) and max_depth.

// General Parameters

options.xgbParams.verbosity

number

[default=1] Controls the verbosity of XGBoost's output. 0: silent mode (no messages); 1: warnings only; 2: info messages; 3: debug messages.

// Tree Booster Parameters (Control tree structure)

options.xgbParams.max_depth

number

[default=6] Maximum depth of each tree. Controls model complexity: deeper trees can capture more feature interactions but are more likely to overfit. Reduced from 8 to 6 to limit tree complexity and prevent overfitting.

options.xgbParams.eta

number

[default=0.3, alias: learning_rate] Step size shrinkage. Controls how much weight each new tree receives in a boosting round. Smaller values (for example 0.1) make boosting more conservative by shrinking each tree's contribution. Set to 0.1, which requires more trees but tends to improve generalization.

options.xgbParams.objective

string

Specifies the learning task and objective. 'reg:squarederror': regression with squared loss (minimizes MSE). Other options include classification, ranking, and other regression objectives.

options.xgbParams.nthread

number

Number of parallel threads used for training. Set to 4 to use multiple cores without overwhelming the system.

options.xgbParams.subsample

number

[default=1] Fraction of training instances sampled for each tree. Values < 1 draw a random sample of the training data before each tree is grown. Set to 0.9 to reduce overfitting by introducing randomness while still using most of the data.

options.xgbParams.colsample_bytree

number

[default=1] Fraction of features sampled for each tree. Controls feature sampling per tree, similar to Random Forest. Set to 0.9 to reduce overfitting and create more diverse trees.

options.xgbParams.min_child_weight

number

[default=1] Minimum sum of instance weight (Hessian) required in a child. For squared-error regression with unweighted data, this equals the minimum number of instances in a leaf node. Set to 3 to prevent the model from creating overly specific rules based on a few samples.

options.xgbParams.gamma

number

[default=0, alias: min_split_loss] Minimum loss reduction required to split a leaf node further. Set to 0.1 to make splitting more conservative and reduce overfitting.

// Regularization Parameters

options.xgbParams.alpha

number

[default=0, alias: reg_alpha] L1 regularization on leaf weights. Encourages sparsity by pushing small weights to zero. Set to 0 because gamma is already used for regularization.

options.xgbParams.lambda

number

[default=1, alias: reg_lambda] L2 regularization on leaf weights. Penalizes large weights to prevent overfitting (similar to Ridge regression). The default value of 1 provides moderate regularization.

// Learning Control Parameters

options.xgbParams.early_stopping_rounds

number

Number of rounds without improvement in the validation metric after which training stops. Set to 20 so that no further trees are added once the model has stopped improving, which limits overfitting.

options.xgbParams.seed

number

[default=0] Random number seed for reproducibility. Set to 42 to ensure consistent results across training runs.

options.xgbParams.nrounds

number

Maximum number of boosting rounds (trees to build). Set to 1000 to compensate for the lower learning rate (eta), allowing the model to converge slowly but more accurately; early stopping may end training sooner.

options.testSize

number

Proportion of data to use for testing (default: 0.2)

options.featuresToUse

string[]

Specific feature columns to use for training
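Collecting the values that the parameter descriptions above mention ("Set to ...") gives an options object along the following lines. This is a sketch assembled from those descriptions, not a copy of the module's actual defaults. The `verbosity` value uses the documented XGBoost default, and the feature names are hypothetical placeholders borrowed from the example data.

```javascript
// Options assembled from the values named in the parameter descriptions.
// Assumption: these mirror the module's intended configuration.
const documentedOptions = {
  xgbParams: {
    verbosity: 1,                  // XGBoost default: warnings only
    max_depth: 6,                  // reduced from 8 to limit complexity
    eta: 0.1,                      // conservative learning rate
    objective: 'reg:squarederror', // regression with squared loss
    nthread: 4,
    subsample: 0.9,                // row sampling per tree
    colsample_bytree: 0.9,         // feature sampling per tree
    min_child_weight: 3,
    gamma: 0.1,                    // minimum loss reduction to split
    alpha: 0,                      // L1 regularization off
    lambda: 1,                     // L2 regularization at default
    early_stopping_rounds: 20,
    seed: 42,
    nrounds: 1000,                 // high cap to pair with the low eta
  },
  testSize: 0.2,
  featuresToUse: ['feature1', 'feature2'], // hypothetical column names
};
```

A low `eta` paired with a high `nrounds` and `early_stopping_rounds` lets training run until the validation metric stops improving rather than for a fixed number of trees.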

Returns

number

R² value (coefficient of determination) indicating model accuracy
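For reference, R² is conventionally computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that standard formula. The function name `rSquared` is hypothetical, and whether `trainModels()` computes the value exactly this way is an assumption.

```javascript
// Standard coefficient of determination: R^2 = 1 - SS_res / SS_tot.
// Assumption: illustrative helper, not the module's own code.
function rSquared(actual, predicted) {
  if (actual.length === 0 || actual.length !== predicted.length) {
    throw new Error('actual and predicted must be non-empty and equal in length');
  }
  const mean = actual.reduce((sum, y) => sum + y, 0) / actual.length;
  // Squared error of the model's predictions.
  const ssRes = actual.reduce((sum, y, i) => sum + (y - predicted[i]) ** 2, 0);
  // Squared error of always predicting the mean.
  const ssTot = actual.reduce((sum, y) => sum + (y - mean) ** 2, 0);
  if (ssTot === 0) return NaN; // undefined when the target is constant
  return 1 - ssRes / ssTot;
}

console.log(rSquared([1, 2, 3], [1.1, 1.9, 3.2]));
```

A value of 1 means the predictions match the targets exactly, 0 means the model does no better than predicting the mean, and the value can be negative on held-out data when the model does worse than the mean.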

See

XGBoost_parameters

Example

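// A single row is shown for shape only; real training needs many rows.
const data = [
  {
    "feature1": 1,
    "feature2": 2,
    "target": 3
  }
];
// Only some xgbParams are overridden here.
const options = {
  xgbParams: {
    verbosity: 0,
    max_depth: 7,
    eta: 0.07,
    objective: 'reg:squarederror',
    nthread: 4,
  }
};
// Train on the data and log the returned R² value.
const accuracy = await trainModels(data, 'target', options);
console.log(accuracy);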

predictFuture()

function predictFuture(futureData: Object[], options: object): Promise<Object[]>;

Defined in: statistics/predict-statistics.js:243

Predicts energy output for future data using the trained XGBoost model

Parameters

Parameter | Type | Description

futureData

Object[]

Array of weather data objects for future dates

options

{ }

Returns

Promise<Object[]>

Promise resolving to array of data objects with predictions


saveModel()

function saveModel(modelPath: string): Promise<void>;

Defined in: statistics/predict-statistics.js:280

Saves the trained XGBoost model to the specified file path

Parameters

Parameter | Type | Description

modelPath

string

Path where the model should be saved

Returns

Promise<void>

Promise that resolves when the model is saved


loadModel()

function loadModel(modelPath: string): Promise<void>;

Defined in: statistics/predict-statistics.js:289

Loads a trained XGBoost model from the specified file path

Parameters

Parameter | Type | Description

modelPath

string

Path to the saved model file

Returns

Promise<void>

Promise that resolves when the model is loaded


calculateRollingStats()

function calculateRollingStats(
data: any[],
field: string,
window: number): any[];

Defined in: statistics/predict-statistics.js:303

Calculates rolling statistics for one field across an array of data objects.

Parameters

Parameter | Type | Default value | Description

data

any[]

undefined

Array of data objects

field

string

undefined

Field name to calculate rolling stats for

window

number

7

Rolling window size

Returns

any[]

Array with added rolling statistics
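The reference above does not say which statistics are computed or how the added fields are named. As a rough illustration of the general technique only, the sketch below computes a trailing-window mean and population standard deviation. The function name `rollingStatsSketch` and the `_rolling_mean` / `_rolling_std` field suffixes are assumptions, not the module's actual output.

```javascript
// Trailing-window rolling mean and standard deviation for one field.
// Assumption: the statistics and output field names are hypothetical.
function rollingStatsSketch(data, field, window = 7) {
  return data.map((row, i) => {
    // Use up to `window` rows ending at the current row.
    const start = Math.max(0, i - window + 1);
    const values = data.slice(start, i + 1).map((r) => r[field]);
    const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
    const variance =
      values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
    // Return a copy so the input rows are not mutated.
    return {
      ...row,
      [`${field}_rolling_mean`]: mean,
      [`${field}_rolling_std`]: Math.sqrt(variance),
    };
  });
}

const readings = [{ v: 1 }, { v: 2 }, { v: 3 }, { v: 4 }];
console.log(rollingStatsSketch(readings, 'v', 2));
```

Early rows use a shorter window because fewer than `window` values are available; an implementation could instead emit `null` for those rows.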