
weigh-relevance-frequency


Match

weighRelevanceTermFrequency()

function weighRelevanceTermFrequency(document, query, options?): number

📈📝 WRITEFAT: Weigh Relevance by Inference of Topics, Entities, and Frequency Averages for Terms

Calculates term specificity for a single document with the BM25 formula, using Wikipedia term frequencies as the baseline IDF.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `document` | `string` | A single document to calculate the score for |
| `query` | `string` | Phrase to search tf and idf for each word |
| `options?` | `{ avgDocWordCount: number; normalizeLength: number; saturationWeight: number; totalWikiPages: number; }` | — |
| `options.avgDocWordCount?` | `number` | Estimated average word count of all documents |
| `options.normalizeLength?` | `number` | Controls document length normalization. Ranges from 0 to 1, with 0.75 as a common default. At `normalizeLength = 1`, full length normalization is applied and longer documents are penalized more heavily; at 0, document length is ignored. |
| `options.saturationWeight?` | `number` | Controls the impact of term frequency saturation. Typically ranges from 1.2 to 2.0, with 1.5 as a common default. As `saturationWeight` increases, repeated occurrences of a term in a document become more significant. |
| `options.totalWikiPages?` | `number` | Total number of Wikipedia pages used to calculate IDF |

Returns

number

score for term specificity
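The parameters above map onto the standard BM25 formula: `saturationWeight` plays the role of k1 and `normalizeLength` the role of b. The following is a minimal, self-contained sketch of that interaction, not the library's actual implementation; the fixed per-term Wikipedia page count is an illustrative assumption standing in for the real frequency data.

```javascript
// BM25 sketch: saturationWeight = k1, normalizeLength = b.
function bm25Score(document, query, options = {}) {
  const {
    avgDocWordCount = 500,
    normalizeLength = 0.75,  // b: 0 = no length penalty, 1 = full penalty
    saturationWeight = 1.5,  // k1: higher = repeated terms count more
    totalWikiPages = 6_000_000,
  } = options;

  const docWords = document.toLowerCase().split(/\W+/).filter(Boolean);
  const queryWords = query.toLowerCase().split(/\W+/).filter(Boolean);

  let score = 0;
  for (const term of queryWords) {
    const tf = docWords.filter((w) => w === term).length;
    if (tf === 0) continue;

    // Stand-in IDF: pretend every query term appears on 1000 Wikipedia
    // pages; the library derives this count from real frequency data.
    const pagesWithTerm = 1000;
    const idf = Math.log(
      1 + (totalWikiPages - pagesWithTerm + 0.5) / (pagesWithTerm + 0.5)
    );

    // Term-frequency saturation, scaled by document-length normalization.
    const lengthNorm =
      1 - normalizeLength +
      normalizeLength * (docWords.length / avgDocWordCount);
    score += (idf * tf * (saturationWeight + 1)) /
             (tf + saturationWeight * lengthNorm);
  }
  return score;
}
```

A query term that is absent contributes nothing, and each extra matching term raises the score, so `bm25Score("the cat sat on the mat", "cat mat")` exceeds `bm25Score("the cat sat on the mat", "cat")`.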

Author

ai-research-agent (2024)

Other

calculatePhraseSpecificity()

function calculatePhraseSpecificity(phrase, options): number

Calculates overall domain specificity after Query Resolution to Phrases. Words are tokenized into phrases, and each phrase's specificity is calculated from how many Wikipedia pages it appears in.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `phrase` | `string` | — |
| `options` | `any` | — |

Returns

number

domain specificity score, roughly 0 to 12 (rare phrases score high, common phrases near 0)
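The idea can be sketched as an inverse-document-frequency lookup over Wikipedia page counts. This is an illustrative assumption, not the library's implementation: `pageCounts` is a hypothetical lookup table, and the natural-log IDF is one plausible choice for producing a roughly 0-12 range.

```javascript
// Sketch: rare phrases (few Wikipedia pages) score high, ubiquitous
// phrases score near 0. pageCounts is a hypothetical lookup standing in
// for the library's Wikipedia-derived frequency data.
function phraseSpecificity(
  phrase,
  { totalWikiPages = 6_000_000, pageCounts = {} } = {}
) {
  // Treat unseen phrases as the rarest possible (1 page).
  const pages = pageCounts[phrase.toLowerCase()] ?? 1;
  // Natural-log IDF: a phrase on nearly every page maps to ~0,
  // a phrase on a single page maps to ~ln(totalWikiPages).
  return Math.log(totalWikiPages / (pages + 1));
}
```

Under these assumptions, a technical phrase appearing on a few hundred pages scores well above a stopword-like phrase appearing on nearly every page.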