html-utils

HTML Utilities

convertHTMLToEscapedHTML()

function convertHTMLToEscapedHTML(str: string, unescape: boolean): string;

Defined in: extractor/html-to-content/html-utils.js:63

Converts HTML special characters such as & " ' ` ’ to their escaped entity codes (for example, & to &amp;), or vice versa. It handles named entities and hexadecimal numeric character references.

Parameters

| Parameter | Type | Default value | Description |
| --- | --- | --- | --- |
| str | string | undefined | The string to process. |
| unescape | boolean | true | If true, converts entity codes to characters. If false, converts characters to entity codes. |

Returns

string

The processed string.

Example

var normalHTML = convertHTMLToEscapedHTML('&lt;p&gt;This &amp; that &copy; 2023 '+
'&quot;Quotes&quot; &#x27;Apostrophes&#x27; &euro;100 &#x263A;&lt;/p&gt;', true)
console.log(normalHTML) // Returns: "<p>This & that © 2023 "Quotes" 'Apostrophes' €100 ☺</p>"
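To make the two directions concrete, here is a minimal standalone sketch of entity escaping and unescaping. It is not the library's implementation: `NAMED_ENTITIES`, `unescapeEntities`, and `escapeEntities` are hypothetical names, and the entity table is deliberately tiny.

```javascript
// Hypothetical lookup table; a real implementation would cover many more names.
const NAMED_ENTITIES = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'],
  ["apos", "'"], ["copy", "©"], ["euro", "€"],
]);

// Decode named entities and hexadecimal references such as &#x263A;.
function unescapeEntities(str) {
  return str.replace(/&(#x[0-9a-f]+|[a-z]+);/gi, (match, body) => {
    if (body[0] === "#") {
      const codePoint = parseInt(body.slice(2), 16);
      // Leave out-of-range references untouched instead of throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown names are returned unchanged.
    return NAMED_ENTITIES.get(body) ?? match;
  });
}

// Encode only the characters that are unsafe in HTML text or attributes.
function escapeEntities(str) {
  const codes = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };
  return str.replace(/[&<>"']/g, (ch) => codes[ch]);
}

const decoded = unescapeEntities("&lt;b&gt;Caf&#xE9; &amp; bar&lt;/b&gt;");
const encoded = escapeEntities("a < b & 'c'");
```

Both functions work in a single pass over the string, so an already escaped `&amp;` is not decoded or encoded twice.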

convertMarkdownToHTML()

function convertMarkdownToHTML(content: string, toHtml: boolean): string;

Defined in: extractor/html-to-content/html-utils.js:218

Converts Markdown text to HTML, or HTML to Markdown when toHtml is false. It handles the following Markdown elements:

  • Headers (h1 to h6)
  • Bold text
  • Italic text
  • Unordered lists
  • Ordered lists
  • Paragraphs
  • Images
  • Links
  • Code blocks

Parameters

| Parameter | Type | Default value | Description |
| --- | --- | --- | --- |
| content | string | undefined | The Markdown or HTML content to be converted. |
| toHtml | boolean | true | If true, converts Markdown to HTML. If false, converts HTML to Markdown. |

Returns

string

The resulting HTML string, or the Markdown string when toHtml is false.

Example

const markdown = "# Header\n\nThis is **bold** and *italic* text.\n\n* List item 1\n* List item 2";
const html = convertMarkdownToHTML(markdown);
console.log(html);
// Output:
// <h1>Header</h1>
// <p>This is <strong>bold</strong> and <em>italic</em> text.</p>
// <ul><li>List item 1</li><li>List item 2</li></ul>
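For readers curious how such a conversion can work, the following is a rough sketch covering only headers, bold, italic, unordered lists, and paragraphs. It is an illustration under simplifying assumptions (blocks separated by blank lines, `\n` line endings, no HTML escaping), not the library's code; `renderInline` and `markdownToHtmlSketch` are hypothetical names.

```javascript
// Inline pass: bold runs before italic so "**" is not consumed as two "*".
function renderInline(text) {
  return text
    .replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>")
    .replace(/\*(.+?)\*/g, "<em>$1</em>");
}

// Block pass: blocks are separated by one or more blank lines.
function markdownToHtmlSketch(markdown) {
  return markdown
    .trim()
    .split(/\n{2,}/)
    .map((block) => {
      // A single-line block starting with 1 to 6 "#" characters is a header.
      const heading = block.match(/^(#{1,6})\s+(.*)$/);
      if (heading) {
        const level = heading[1].length;
        return `<h${level}>${renderInline(heading[2])}</h${level}>`;
      }
      const lines = block.split("\n");
      // A block where every line starts with "* " or "- " is an unordered list.
      if (lines.every((line) => /^[*-]\s+/.test(line))) {
        const items = lines.map(
          (line) => `<li>${renderInline(line.replace(/^[*-]\s+/, ""))}</li>`
        );
        return `<ul>${items.join("")}</ul>`;
      }
      // Anything else is treated as a paragraph.
      return `<p>${renderInline(lines.join(" "))}</p>`;
    })
    .join("\n");
}
```

Splitting into blocks first and applying inline rules second keeps list markers from being mistaken for emphasis.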

convertMathLaTexToImage()

function convertMathLaTexToImage(html: string): string;

Defined in: extractor/html-to-content/html-utils.js:10

Converts LaTeX <math> equations found inside HTML into easy-to-read SVG and HTML using KaTeX.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| html | string | HTML containing LaTeX math. |

Returns

string

HTML with the equations rendered as SVG.
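As an illustration of the general approach, the sketch below finds each `<math>` span and swaps it for rendered markup. It makes no claim about the library's internals: `replaceMathTags` and the `renderTex` callback are hypothetical, and a stand-in renderer is used so the snippet runs without dependencies.

```javascript
// Replace each <math>...</math> span with whatever the injected renderer returns.
// `renderTex` is a hypothetical callback of the form (tex: string) => string.
function replaceMathTags(html, renderTex) {
  return html.replace(/<math(?:\s[^>]*)?>([\s\S]*?)<\/math>/gi, (match, tex) => {
    try {
      return renderTex(tex.trim());
    } catch (err) {
      return match; // keep the original markup if the equation fails to render
    }
  });
}

// Stand-in renderer so the sketch runs on its own.
const fakeRender = (tex) => `<span class="math">${tex}</span>`;
const rendered = replaceMathTags(
  "<p>Euler: <math>e^{i\\pi} + 1 = 0</math></p>",
  fakeRender
);
```

With KaTeX loaded, one plausible callback is `(tex) => katex.renderToString(tex, { throwOnError: false })`.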


convertURLToAbsoluteURL()

function convertURLToAbsoluteURL(base: string, relative: string): string;

Defined in: extractor/html-to-content/html-utils.js:138

Converts a relative URL to an absolute URL using a base URL.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| base | string | Base URL of the domain. |
| relative | string | Partial URL such as ../images/image.jpg or #hash. |

Returns

string

The absolute URL.

Example

var absoluteURL = convertURLToAbsoluteURL('https://example.com', 'images/image.jpg')
console.log(absoluteURL) // Returns: "https://example.com/images/image.jpg"
var protocolRelativeURL = convertURLToAbsoluteURL('https://example.com', '//images/image.jpg')
console.log(protocolRelativeURL) // Returns: "https://images/image.jpg"
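For comparison, the same kind of resolution can be sketched with the standard `URL` constructor available in browsers and Node.js. This is an alternative illustration, not the library's implementation, and `toAbsoluteUrl` is a hypothetical name.

```javascript
// Resolve `relative` against `base` using the standard URL parser.
function toAbsoluteUrl(base, relative) {
  try {
    return new URL(relative, base).href;
  } catch (err) {
    // The base was not a valid absolute URL; return the input unchanged.
    return relative;
  }
}

const fromPath = toAbsoluteUrl("https://example.com", "images/image.jpg");
// "https://example.com/images/image.jpg"
const fromParent = toAbsoluteUrl("https://example.com/docs/guide/page.html", "../images/image.jpg");
// "https://example.com/docs/images/image.jpg"
const fromHash = toAbsoluteUrl("https://example.com/page", "#hash");
// "https://example.com/page#hash"
```

The standard parser treats a leading `//` as a protocol-relative URL, so the first path segment after it becomes the host name.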

Author

ai-research-agent (2024)


copyHTMLToClipboard()

function copyHTMLToClipboard(html: string, options: object): Promise<void>;

Defined in: extractor/html-to-content/html-utils.js:424

Copies HTML to the clipboard. Pasting into a rich text field inserts rich text. Pasting into a plain text field inserts plain text, Markdown, or HTML, depending on options.pastePlainFormat.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| html | string | The HTML content to be copied. |
| options | { pastePlainFormat: number; } | The options object. |
| options.pastePlainFormat | number | Format pasted into plain text fields: 0 = plain text, 1 = Markdown, 2 = HTML. Default: 0. |

Returns

Promise<void>

A promise that resolves when the HTML is copied to the clipboard.
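One way to get this dual behavior is to place two flavors, `text/html` and `text/plain`, on the clipboard in one item. The sketch below assumes a browser with the asynchronous Clipboard API. `stripTags`, `buildClipboardPayload`, `copyHtmlSketch`, and the `toMarkdown` option are hypothetical names for illustration, not part of the documented API.

```javascript
// Crude tag stripper for the plain-text fallback (illustrative only).
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

// Build the two clipboard flavors. `toMarkdown` is a hypothetical converter
// supplied by the caller; it is only used when pastePlainFormat is 1.
function buildClipboardPayload(html, { pastePlainFormat = 0, toMarkdown } = {}) {
  let plain;
  if (pastePlainFormat === 2) {
    plain = html; // paste raw HTML source into plain text fields
  } else if (pastePlainFormat === 1 && typeof toMarkdown === "function") {
    plain = toMarkdown(html);
  } else {
    plain = stripTags(html);
  }
  return { "text/html": html, "text/plain": plain };
}

// Browser-only: write both flavors in a single clipboard item.
async function copyHtmlSketch(html, options) {
  const payload = buildClipboardPayload(html, options);
  const item = new ClipboardItem({
    "text/html": new Blob([payload["text/html"]], { type: "text/html" }),
    "text/plain": new Blob([payload["text/plain"]], { type: "text/plain" }),
  });
  await navigator.clipboard.write([item]);
}
```

Rich text editors typically read the `text/html` flavor and plain text fields read `text/plain`, which is why a single copy can paste differently depending on the target. Browsers generally require a secure context and a user gesture before `navigator.clipboard.write` succeeds.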

Author

ai-research-agent (2024)

Other

convertHTMLToMarkdown()

function convertHTMLToMarkdown(html: any): any;

Defined in: extractor/html-to-content/html-utils.js:362

Parameters

| Parameter | Type |
| --- | --- |
| html | any |

Returns

any