html-to-content
ExtractedContent
Defined in: extractor/html-to-content/html-to-content.js:83
Properties
| Property | Type | Description | Defined in |
|---|---|---|---|
| | | The author's name | extractor/html-to-content/html-to-content.js:87 |
| | | The full citation for the author | extractor/html-to-content/html-to-content.js:85 |
| | | A shortened version of the author's name | extractor/html-to-content/html-to-content.js:86 |
| | | The publication date | extractor/html-to-content/html-to-content.js:88 |
| | | The extracted main content in HTML format | extractor/html-to-content/html-to-content.js:90 |
| | | The source of the content | extractor/html-to-content/html-to-content.js:89 |
| | | The title of the content | extractor/html-to-content/html-to-content.js:84 |
extractContentAndCite()
function extractContentAndCite(documentOrHTML: any, options: object): any;
Defined in: extractor/html-to-content/html-to-content.js:30
Extracts the main content and citation information from a document or HTML string.
Parameters
| Parameter | Type | Description |
|---|---|---|
| documentOrHTML | any | The document or HTML string to extract content from |
| options | object | Optional configuration options |
| | | default=true - Whether to preserve formatting in the extracted content |
| | | default=true - Whether to include images in the extracted content |
| | | default=true - Whether to include links in the extracted content |
| | | The URL of the original document, if available, used to convert relative URLs to absolute ones |
| | | default=false - Selects which extractor runs first: false uses Mozilla Readability, true uses Postlight Mercury. If the first extractor returns fewer than 200 characters, the alternate is used instead |
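The extractor fallback described in the last row can be sketched as follows. This is a minimal illustration, not the module's actual implementation: `extractWithFallback`, `MIN_CONTENT_LENGTH`, and the two stub extractors are hypothetical names, the extractors are plain functions standing in for Mozilla Readability and Postlight Mercury, and the sketch assumes the 200-character check applies to the length of the returned string. What happens when both extractors return short content is not documented; here the alternate's result is simply returned.

```javascript
// Threshold from the parameter description ("less than 200 characters").
const MIN_CONTENT_LENGTH = 200;

/**
 * Run the preferred extractor, then fall back to the alternate one
 * if the first result is shorter than MIN_CONTENT_LENGTH.
 * `readability` and `mercury` are stand-in functions (html -> string),
 * not the real library APIs.
 */
function extractWithFallback(html, useMercuryFirst, readability, mercury) {
  const [first, alternate] = useMercuryFirst
    ? [mercury, readability]
    : [readability, mercury];

  const result = first(html) || "";
  if (result.length >= MIN_CONTENT_LENGTH) {
    return result;
  }
  // First extractor produced too little content: try the other one.
  return alternate(html) || "";
}

// Stub extractors for illustration only.
const shortExtractor = () => "<p>too short</p>";
const longExtractor = () => "<p>" + "x".repeat(300) + "</p>";

// The first extractor (standing in for Readability) returns too little,
// so the alternate's longer result is used.
const chosen = extractWithFallback("<html></html>", false, shortExtractor, longExtractor);
console.log(chosen.length);
```

Passing the extractors in as arguments keeps the selection logic independent of either library, which makes the threshold behavior easy to test in isolation.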
Returns
any
The extracted content and citation information.
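The parameter table mentions that the original document's URL is used for "absolutify-ing" URLs, meaning that relative links and image sources are resolved against that URL. A minimal sketch of that resolution step, using the standard `URL` class available in browsers and Node.js, might look like the following. The helper name `absolutifyUrl` and the example URLs are assumptions for illustration; how the module itself performs this step is not shown in this reference.

```javascript
/**
 * Resolve `href` against `baseUrl` using the standard URL class.
 * Returns the original `href` unchanged if it cannot be resolved
 * (for example, a relative href with no base URL available).
 */
function absolutifyUrl(href, baseUrl) {
  try {
    return new URL(href, baseUrl).href;
  } catch {
    return href;
  }
}

// Hypothetical base URL of the original document.
const base = "https://example.com/articles/post.html";

console.log(absolutifyUrl("../images/figure.png", base));
console.log(absolutifyUrl("https://other.example/a", base));
console.log(absolutifyUrl("figure.png", undefined));
```

Already-absolute URLs pass through unchanged, and a relative URL with no base is left as-is rather than raising an error.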