html-to-basic-html

HTML Utilities

convertHTMLToBasicHTML()

function convertHTMLToBasicHTML(html, options?): string

Strips HTML down to about 30 basic markup tags, including lists, tables, and images, and converts anchors and relative URLs to absolute URLs. Basic HTML supports the same elements as Markdown, the plain-text format commonly used for writing. Because Markdown is converted to HTML for display anyway, basic HTML is the better format to edit in a rich text editor.

Mozilla DOM Reference
Source Code of Browser HTML DOM
RegExp JS V8 Code
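The tag-allowlist idea described above can be illustrated with a minimal sketch, assuming a regex-based approach. `stripToAllowedTags` is a hypothetical helper written for this example, not the library's implementation. For brevity it drops every attribute, whereas the documented function accepts an `allowedAttributes` option.

```javascript
/**
 * Hypothetical helper (not part of the library): keeps only tags named in a
 * comma-separated allowlist and removes all attributes.
 * Simplification: attribute values containing ">" are not handled.
 */
function stripToAllowedTags(html, allowTags) {
  // Build a lookup set from the comma-separated list, ignoring stray spaces.
  const allowed = new Set(
    allowTags
      .split(",")
      .map((t) => t.trim().toLowerCase())
      .filter(Boolean)
  );

  return html
    // Remove script and style elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // Remove HTML comments.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Keep allowed tags without attributes; drop other tags but keep their text.
    .replace(/<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g, (_match, slash, name) => {
      const tag = name.toLowerCase();
      return allowed.has(tag) ? `<${slash}${tag}>` : "";
    });
}

const sample =
  '<div class="post"><h1 id="t">Title</h1><script>alert(1)</script>' +
  '<p style="color:red">Hello <span>big</span> <b>world</b></p></div>';

const basic = stripToAllowedTags(sample, "br,p,b,i,h1");
console.log(basic);
// → <h1>Title</h1><p>Hello big <b>world</b></p>
```

Regex stripping is only an approximation of real HTML parsing. A DOM-based approach is more robust for malformed or adversarial markup.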

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| html | string | Any page's HTML to process |
| options? | { allowedAttributes: string; allowTags: string; formatting: boolean; images: boolean; links: boolean; url: string; videos: boolean; } | |
| options.allowedAttributes? | string | default="text,tag,href,src,type,width,height,id,data" - Comma-separated list of allowed HTML attributes |
| options.allowTags? | string | default="br,p,u,b,i,em,strong,h1,h2,h3,h4,h5,h6,blockquote,code,ul,ol,li,dd,dl,table,th,tr,td,sub,sup" - Comma-separated list of allowed HTML tags |
| options.formatting? | boolean | default=true - Whether to include formatting |
| options.images? | boolean | default=true - Whether to include images |
| options.links? | boolean | default=true - Whether to include links |
| options.url? | string | Base URL for converting relative URLs to absolute |
| options.videos? | boolean | default=true - Whether to include videos |

Returns

string

HTML containing only basic text-formatting tags
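How a base URL such as `options.url` can turn relative links into absolute ones is sketched below, assuming the standard `URL` constructor available in browsers and Node.js. `absolutizeUrls` is a hypothetical helper for illustration, and the library may resolve URLs differently.

```javascript
/**
 * Hypothetical helper (not part of the library): rewrites quoted href/src
 * values against a base URL using the standard URL constructor.
 */
function absolutizeUrls(html, baseUrl) {
  return html.replace(
    /\b(href|src)\s*=\s*(["'])(.*?)\2/gi,
    (match, attr, quote, value) => {
      try {
        // new URL(relative, base) resolves "../a", "/a", "a", and "#frag";
        // an already-absolute value ignores the base.
        const absolute = new URL(value, baseUrl).href;
        return `${attr}=${quote}${absolute}${quote}`;
      } catch {
        // Leave values that cannot be parsed as URLs untouched.
        return match;
      }
    }
  );
}

const page = "https://example.com/blog/post.html";
const out = absolutizeUrls(
  '<a href="../about">About</a> <img src="/img/logo.png"> <a href="#top">Top</a>',
  page
);
console.log(out);
// → <a href="https://example.com/about">About</a> <img src="https://example.com/img/logo.png"> <a href="https://example.com/blog/post.html#top">Top</a>
```

Note that an in-page anchor such as `#top` resolves to the full page URL plus the fragment, which keeps the link valid after the extracted HTML is moved elsewhere.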

Author

ai-research-agent (2024)

Other

addDOMFunctions()

function addDOMFunctions(domObject): any

Parameters

| Parameter | Type |
| --- | --- |
| domObject | any |

Returns

any