html-to-basic-html

HTML Utilities

convertHTMLToBasicHTML()

function convertHTMLToBasicHTML(html: string, options?: object): string;

Defined in: extractor/html-to-content/html-to-basic-html.js:31

Strips HTML down to roughly 30 basic markup tags, including lists, tables, and images. Converts anchors and relative URLs to absolute URLs. Basic HTML supports the same elements as Markdown, a format for writing plain text. Since Markdown is converted to HTML for display anyway, it is better to edit basic HTML directly in a rich text editor.

Mozilla DOM Reference
Source Code of Browser HTML DOM
RegExp JS V8 Code
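
The conversion of anchors and relative URLs to absolute URLs described above can be sketched with the standard WHATWG `URL` constructor. This is a minimal illustration, not the library's actual code: `absolutizeUrl` is a hypothetical helper name, and the fallback behavior for unparseable input is an assumption.

```javascript
// Sketch only: `absolutizeUrl` is a hypothetical helper, not part of the library.
function absolutizeUrl(href, baseUrl) {
  try {
    // The URL constructor resolves `href` against `baseUrl`.
    // Already-absolute URLs are returned in normalized form.
    return new URL(href, baseUrl).href;
  } catch {
    // Assumption: leave the value untouched when it cannot be resolved
    // (for example, a relative path with no base URL).
    return href;
  }
}

console.log(absolutizeUrl("/docs/intro", "https://example.com/blog/post"));
// prints "https://example.com/docs/intro"
console.log(absolutizeUrl("../img/a.png", "https://example.com/blog/post/"));
// prints "https://example.com/blog/img/a.png"
console.log(absolutizeUrl("#section", "https://example.com/page"));
// prints "https://example.com/page#section"
```

Anchors such as `#section` resolve against the full page URL, which is why a base URL is needed to make them absolute.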

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| html | string | Any page's HTML to process |
| options? | { allowedAttributes: string; allowTags: string; formatting: boolean; images: boolean; links: boolean; url: string; videos: boolean; } | |
| options.allowedAttributes? | string | default="text,tag,href,src,type,width,height,id,data" - List of allowed HTML attributes |
| options.allowTags? | string | default="br,p,u,b,i,em,strong,h1,h2,h3,h4,h5,h6,blockquote,code,ul,ol,li,dd,dl,table,th,tr,td,sub,sup" - Comma-separated list of allowed HTML tags. |
| options.formatting? | boolean | default=true - Whether to include formatting |
| options.images? | boolean | default=true - Whether to include images |
| options.links? | boolean | default=true - Whether to include links |
| options.url? | string | base URL for converting relative URLs to absolute |
| options.videos? | boolean | default=true - Whether to include videos or not |
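
To make the `allowTags` option concrete, here is a rough sketch of how a comma-separated allowlist could be merged with defaults and applied to markup. The default strings are copied from the parameters above; everything else (`resolveOptions`, `parseList`, `keepAllowedTags`, and the regex-based approach) is a hypothetical illustration, and the library's own implementation may work quite differently.

```javascript
// Defaults copied from the documented parameters; the merge logic is an assumption.
const DEFAULT_OPTIONS = {
  allowedAttributes: "text,tag,href,src,type,width,height,id,data",
  allowTags:
    "br,p,u,b,i,em,strong,h1,h2,h3,h4,h5,h6,blockquote,code,ul,ol,li,dd,dl,table,th,tr,td,sub,sup",
  formatting: true,
  images: true,
  links: true,
  videos: true,
};

// Hypothetical helper: caller-supplied options override the defaults.
function resolveOptions(options = {}) {
  return { ...DEFAULT_OPTIONS, ...options };
}

// Turn "p, b ,i" into a Set of lowercase names, ignoring stray whitespace.
function parseList(csv) {
  return new Set(
    csv.split(",").map((name) => name.trim().toLowerCase()).filter(Boolean)
  );
}

// Naive regex filter: removes tags that are not on the allowlist and keeps
// their inner text. It does not sanitize attributes or drop script contents.
function keepAllowedTags(html, allowTags) {
  const allowed = parseList(allowTags);
  return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g, (tag, name) =>
    allowed.has(name.toLowerCase()) ? tag : ""
  );
}

const options = resolveOptions({ allowTags: "p,b" });
const input = '<div class="x"><p>Hi <span>there</span> <b>you</b></p></div>';
console.log(keepAllowedTags(input, options.allowTags));
// prints "<p>Hi there <b>you</b></p>"
```

A regex pass like this is only suitable for illustration; robust HTML cleanup generally needs a real parser.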

Returns

string

HTML reduced to basic text-formatting tags

Author

ai-research-agent (2024)

Other

addDOMFunctions()

function addDOMFunctions(domObject: any): any;

Defined in: extractor/html-to-content/html-to-basic-html.js:224

Parameters

| Parameter | Type |
| --- | --- |
| domObject | any |

Returns

any