topic-distribution

Topics

extractTopicTermGroupsLDA()

```ts
function extractTopicTermGroupsLDA(sentences, options?): any[]
```

Latent Dirichlet allocation (Dirichlet is pronounced "dee-ree-klay") is used in natural language processing to discover the abstract topics in a collection of documents. It is a generative probabilistic model that assumes each document is a mixture of topics, where a topic is a probability distribution over words. LDA uses Bayesian inference, here via Gibbs sampling, to learn the topics and each document's topic mixture simultaneously and without supervision.

• Latent Dirichlet Allocation (LDA) with Gibbs Sampling Explained
• Latent Dirichlet Allocation
• Topic Models (YouTube)
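
To make the inference step concrete, the sketch below shows the core update of a collapsed Gibbs sampler for LDA in TypeScript. This is a minimal illustration of the general technique, not the library's internal implementation; every name in it (sampleTopic, docTopicCounts, and so on) is hypothetical.

```ts
// Collapsed Gibbs sampling step for LDA (illustrative sketch, not library code).
// For one token w in document d, resample its topic from
//   p(z = k) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
// where all counts exclude the token's current assignment.
function sampleTopic(
  d: number,                    // document index
  w: number,                    // word (vocabulary) index
  docTopicCounts: number[][],   // n_dk: tokens in doc d assigned to topic k
  topicWordCounts: number[][],  // n_kw: occurrences of word w assigned to topic k
  topicTotals: number[],        // n_k: total tokens assigned to topic k
  vocabSize: number,            // V
  alpha: number,                // Dirichlet prior on document-topic distributions
  beta: number                  // Dirichlet prior on topic-word distributions
): number {
  const K = topicTotals.length;
  const weights = new Array<number>(K);
  let sum = 0;
  for (let k = 0; k < K; k++) {
    weights[k] =
      (docTopicCounts[d][k] + alpha) *
      ((topicWordCounts[k][w] + beta) / (topicTotals[k] + vocabSize * beta));
    sum += weights[k];
  }
  // Draw topic k with probability weights[k] / sum.
  let r = Math.random() * sum;
  for (let k = 0; k < K; k++) {
    r -= weights[k];
    if (r <= 0) return k;
  }
  return K - 1; // numerical fallback
}
```

A full sampler would sweep this update over every token in the corpus for numberOfIterations passes, which is how the sampling options documented below come into play.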

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| sentences | string[] | Array of input sentences. |
| options? | { alpha: number; beta: number; numberOfIterations: number; numberOfTermsPerTopic: number; topicCount: number; valueBurnIn: number; valueSampleLag: number; } | Configuration options for LDA. |
| options.alpha? | number | default=0.1 - Dirichlet prior on document-topic distributions. |
| options.beta? | number | default=0.01 - Dirichlet prior on topic-word distributions. |
| options.numberOfIterations? | number | default=1000 - Number of iterations for the LDA algorithm. |
| options.numberOfTermsPerTopic? | number | default=10 - Number of terms to show for each topic. |
| options.topicCount? | number | default=10 - Number of topics to extract. |
| options.valueBurnIn? | number | default=100 - Number of burn-in iterations. |
| options.valueSampleLag? | number | default=10 - Lag between samples. |
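
Assuming the sampler follows the usual MCMC bookkeeping (an assumption; the exact counting is internal to the library), the three sampling options interact as follows: the first valueBurnIn sweeps are discarded, and one sample is kept every valueSampleLag sweeps thereafter.

```ts
// With the defaults above, the number of samples averaged into the
// final topic estimates would be roughly:
const numberOfIterations = 1000;
const valueBurnIn = 100;
const valueSampleLag = 10;
const collected = Math.floor((numberOfIterations - valueBurnIn) / valueSampleLag); // 90 samples
```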

Returns

any[]

  • Array of topics, each containing term-probability pairs.
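
A minimal usage sketch, assuming the function is imported from the ai-research-agent package (the exact import path and the shape of the returned topic objects are assumptions, not confirmed by this page):

```ts
import { extractTopicTermGroupsLDA } from "ai-research-agent"; // hypothetical import path

const sentences = [
  "The cat sat on the mat.",
  "Dogs and cats are common pets.",
  "The stock market rallied after the earnings report.",
  "Investors watched bond yields closely.",
];

const topics = extractTopicTermGroupsLDA(sentences, {
  topicCount: 2,            // extract two topics
  numberOfTermsPerTopic: 5, // keep the five strongest terms per topic
  alpha: 0.1,
  beta: 0.01,
  numberOfIterations: 1000,
  valueBurnIn: 100,
  valueSampleLag: 10,
});

// Each entry is a topic as term-probability pairs; logged here because
// the exact object shape is not documented on this page.
topics.forEach((topic, i) => console.log(`Topic ${i}:`, topic));
```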

Author

ai-research-agent (2024)