predict-statistics
trainModels()
```ts
function trainModels(
   fullData: Object[],
   targetName: string,
   options: object): number;
```
Defined in: statistics/predict-statistics.js:103
Trains an XGBoost model on preprocessed data and evaluates its performance.
XGBoost (eXtreme Gradient Boosting) builds decision trees sequentially, with each new tree correcting the errors made by the ensemble of previous trees. It minimizes a loss function by gradient descent: each boosting round fits a new tree to the residuals (errors) of the prior trees and adds its shrunken prediction to the ensemble. The algorithm applies regularization to prevent overfitting and handles missing values effectively through its sparsity-aware split-finding approach.
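The boosting loop itself is easy to sketch. The snippet below is an illustrative JavaScript example, not this module's implementation: it fits one-feature decision stumps to the residuals of the running prediction and shrinks each tree's contribution by the learning rate (eta).

```js
// Illustrative only -- minimal gradient boosting for squared loss,
// where each round's residuals are the negative gradient.
function fitStump(xs, residuals) {
  // Greedy search over midpoints between sorted x values; keep the split
  // whose two leaf means minimize squared error on the residuals.
  const mean = a => a.reduce((s, v) => s + v, 0) / a.length;
  const order = xs.map((_, i) => i).sort((a, b) => xs[a] - xs[b]);
  let best = null;
  for (let k = 1; k < order.length; k++) {
    const threshold = (xs[order[k - 1]] + xs[order[k]]) / 2;
    const left = [], right = [];
    for (let i = 0; i < xs.length; i++) {
      (xs[i] < threshold ? left : right).push(residuals[i]);
    }
    if (!left.length || !right.length) continue;
    const mL = mean(left), mR = mean(right);
    const sse = residuals.reduce(
      (s, r, i) => s + (r - (xs[i] < threshold ? mL : mR)) ** 2, 0);
    if (!best || sse < best.sse) best = { threshold, mL, mR, sse };
  }
  if (!best) return () => mean(residuals); // no usable split: predict the mean
  return x => (x < best.threshold ? best.mL : best.mR);
}

function boost(xs, ys, { eta = 0.1, rounds = 100 } = {}) {
  const base = ys.reduce((s, v) => s + v, 0) / ys.length; // initial prediction
  const trees = [];
  let preds = ys.map(() => base);
  for (let r = 0; r < rounds; r++) {
    const residuals = ys.map((y, i) => y - preds[i]); // errors of the ensemble
    const stump = fitStump(xs, residuals);            // new tree fits residuals
    trees.push(stump);
    preds = preds.map((p, i) => p + eta * stump(xs[i])); // shrunken update
  }
  return x => trees.reduce((p, t) => p + eta * t(x), base);
}

const model = boost([1, 2, 3, 4, 5], [1.2, 1.9, 3.2, 3.8, 5.1]);
console.log(model(3).toFixed(2)); // close to 3.2
```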
Parameters
Parameter | Type | Description |
---|---|---|
`fullData` | `Object[]` | Preprocessed training data as an array of objects with numeric values only |
`targetName` | `string` | Name of the target variable column to predict |
`options` | `object` | Configuration options for model training |
`options.xgbParams` | `object` | XGBoost hyperparameters such as `learning_rate`, `max_depth`, etc. |
**General parameters** | | |
`options.xgbParams.verbosity` | `number` | [default=1] Controls the verbosity of XGBoost's output: 0 = silent (no messages), 1 = warnings only, 2 = info messages, 3 = debug messages |
**Tree booster parameters (control tree structure)** | | |
`options.xgbParams.max_depth` | `number` | [default=6] Maximum depth of each tree; controls model complexity. Higher values create more complex trees that may overfit. Reduced from 8 to 6 to limit tree complexity and prevent overfitting. |
`options.xgbParams.eta` | `number` | [default=0.3, alias: `learning_rate`] Step-size shrinkage; controls how much weight each new tree receives per boosting round. Smaller values make the model more robust by shrinking feature weights. Set to 0.1 for more conservative boosting, requiring more trees but improving generalization. |
`options.xgbParams.objective` | `string` | Specifies the learning task and objective. `'reg:squarederror'` is regression with squared loss (minimizes MSE); other options include classification objectives, ranking, and other regression metrics. |
`options.xgbParams.nthread` | `number` | Number of parallel threads used for training. Set to 4 to utilize multi-core processing without overwhelming the system. |
`options.xgbParams.subsample` | `number` | [default=1] Fraction of training instances used per tree. Values < 1 randomly sample the training data for each tree. Set to 0.9 to reduce overfitting by introducing randomness while still using most of the data. |
`options.xgbParams.colsample_bytree` | `number` | [default=1] Fraction of features used per tree; controls feature sampling for each tree, similar to Random Forest. Set to 0.9 to reduce overfitting and create diverse trees. |
`options.xgbParams.min_child_weight` | `number` | [default=1] Minimum sum of instance weight in a child; controls the minimum number of instances needed in a leaf node. Set to 3 to prevent the model from creating overly specific rules based on few samples. |
`options.xgbParams.gamma` | `number` | [default=0, alias: `min_split_loss`] Minimum loss reduction required to make a split. Set to 0.1 to make splitting more conservative and reduce overfitting. |
**Regularization parameters** | | |
`options.xgbParams.alpha` | `number` | [default=0, alias: `reg_alpha`] L1 regularization on weights; encourages sparsity by penalizing non-zero weights (feature selection). Set to 0 since gamma is being used for regularization. |
`options.xgbParams.lambda` | `number` | [default=1, alias: `reg_lambda`] L2 regularization on weights; penalizes large weights to prevent overfitting (similar to Ridge regression). The default of 1 provides moderate regularization. |
**Learning control parameters** | | |
`options.xgbParams.early_stopping_rounds` | `number` | Stops adding trees when the validation metric does not improve for the specified number of rounds. Set to 20 to prevent overfitting by stopping once the model stops improving. |
`options.xgbParams.seed` | `number` | [default=0] Random number seed for reproducibility. Set to 42 to ensure consistent results across training runs. |
`options.xgbParams.num_boost_round` | `number` | Number of boosting rounds (trees to build). Set to 1000 to compensate for the lower learning rate (eta), allowing the model to converge slowly but more accurately. |
— | `number` | Proportion of data to use for testing (default: 0.2) |
— | `string[]` | Specific feature columns to use for training |
Returns
number
R² value (coefficient of determination) indicating model accuracy
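As a reminder of what this metric measures, R² compares the model's squared error against a mean-only baseline. A hypothetical helper (not part of this module) could compute it like so:

```js
// Hypothetical helper: coefficient of determination R² = 1 - SS_res / SS_tot.
function rSquared(actual, predicted) {
  const mean = actual.reduce((s, v) => s + v, 0) / actual.length;
  const ssTot = actual.reduce((s, v) => s + (v - mean) ** 2, 0);
  const ssRes = actual.reduce((s, v, i) => s + (v - predicted[i]) ** 2, 0);
  return 1 - ssRes / ssTot; // 1 = perfect fit, 0 = no better than the mean
}
```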
Example
```js
let data = [
  {
    "feature1": 1,
    "feature2": 2,
    "target": 3
  }
];
let options = {
  xgbParams: {
    verbosity: 0,
    max_depth: 7,
    eta: 0.07,
    objective: 'reg:squarederror',
    nthread: 4,
  }
};
let accuracy = await trainModels(data, 'target', options);
console.log(accuracy); // R² on the held-out test split
```
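For heavier tuning, the hyperparameters described in the table above can be assembled into one configuration. This is a sketch combining those values; whether every key (e.g. `num_boost_round`) is read from `xgbParams` depends on the module's implementation.

```js
let tunedOptions = {
  xgbParams: {
    verbosity: 0,               // silent
    max_depth: 6,               // cap tree complexity
    eta: 0.1,                   // conservative learning rate
    objective: 'reg:squarederror',
    nthread: 4,
    subsample: 0.9,             // sample 90% of rows per tree
    colsample_bytree: 0.9,      // sample 90% of features per tree
    min_child_weight: 3,        // avoid leaves based on few samples
    gamma: 0.1,                 // conservative splitting
    alpha: 0,                   // L1 off; gamma handles regularization
    lambda: 1,                  // moderate L2
    early_stopping_rounds: 20,  // stop when validation stalls
    seed: 42,                   // reproducible runs
    num_boost_round: 1000       // many rounds to offset the low eta
  }
};
let r2 = await trainModels(data, 'target', tunedOptions);
```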
predictFuture()
```ts
function predictFuture(futureData: Object[], options: object): Promise<Object[]>;
```
Defined in: statistics/predict-statistics.js:243
Predicts energy output for future data using the trained XGBoost model.
Parameters
Parameter | Type | Description |
---|---|---|
`futureData` | `Object[]` | Array of weather data objects for future dates |
`options` | `object` | ‐ |
Returns
Promise<Object[]>
Promise resolving to array of data objects with predictions
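A sketch of the intended flow, assuming a model has already been trained with trainModels and that `futureData` carries the same feature columns as the training data (the field names here are hypothetical):

```js
// Hypothetical field names; use the same feature columns as in training.
let futureData = [
  { "feature1": 1.5, "feature2": 2.5 }  // no target column for future dates
];
let predictions = await predictFuture(futureData, {});
console.log(predictions); // input objects augmented with predicted values
```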
saveModel()
```ts
function saveModel(modelPath: string): Promise<void>;
```
Defined in: statistics/predict-statistics.js:280
Saves the trained XGBoost model to the specified file path.
Parameters
Parameter | Type | Description |
---|---|---|
`modelPath` | `string` | Path where the model should be saved |
Returns
Promise<void>
Promise that resolves when the model is saved
loadModel()
```ts
function loadModel(modelPath: string): Promise<void>;
```
Defined in: statistics/predict-statistics.js:289
Loads a trained XGBoost model from the specified file path.
Parameters
Parameter | Type | Description |
---|---|---|
`modelPath` | `string` | Path to the saved model file |
Returns
Promise<void>
Promise that resolves when the model is loaded
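A round-trip sketch of the two functions; the model path is a placeholder:

```js
// Persist the trained model, then restore it later (or in another process).
await saveModel('./models/energy-model.json');  // placeholder path
await loadModel('./models/energy-model.json');
// The loaded model can then serve predictFuture() calls as before.
```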
calculateRollingStats()
```ts
function calculateRollingStats(
   data: any[],
   field: string,
   window: number): any[];
```
Defined in: statistics/predict-statistics.js:303
Calculates rolling statistics for a given field across an array of data objects.
Parameters
Parameter | Type | Default value | Description |
---|---|---|---|
`data` | `any[]` | – | Array of data objects |
`field` | `string` | – | Field name to calculate rolling stats for |
`window` | `number` | – | Rolling window size |
Returns
any[]
Array with added rolling statistics
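Which statistics are added is implementation-specific; as an illustration, a trailing rolling mean could be computed like this (the output key is hypothetical):

```js
// Sketch of a trailing rolling mean; the real function may add more stats
// (e.g. rolling min/max/std) under its own field names.
function rollingMean(data, field, window) {
  return data.map((row, i) => {
    const start = Math.max(0, i - window + 1);
    const slice = data.slice(start, i + 1);
    const mean = slice.reduce((s, r) => s + r[field], 0) / slice.length;
    return { ...row, [`${field}_rolling_mean`]: mean };  // hypothetical key
  });
}
```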