Extracts the author, date, source, and title from HTML using meta tags
and common class names. The author string is checked against a common list
of 90k first names, last names, and organizations to infer whether it is a human name.
Human names are reversed to start with the author's last name (accounting for
affixes and titles); organization names are not reversed.
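
The behavior described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the tiny `FIRST_NAMES`, `ORGANIZATIONS`, `TITLES`, `SUFFIXES`, and `PARTICLES` sets are hypothetical stand-ins for the 90k-entry list, the `META_KEYS` mapping is an assumed choice of meta tag names, and the function names `extract_metadata` and `format_author` are made up for illustration. Only meta tags are handled here; the class-name fallback is omitted.

```python
from html.parser import HTMLParser

# Hypothetical stand-ins for the 90k-entry name and organization list.
FIRST_NAMES = {"jane", "john", "maria"}
ORGANIZATIONS = {"reuters", "associated press"}
TITLES = {"dr", "dr.", "prof", "prof.", "mr", "mr.", "ms", "ms."}
SUFFIXES = {"jr", "jr.", "sr", "sr.", "ii", "iii", "phd"}
PARTICLES = {"van", "von", "de", "der", "la", "le"}

# Assumed mapping from common meta keys to output fields.
META_KEYS = {
    "author": "author",
    "article:author": "author",
    "date": "date",
    "article:published_time": "date",
    "og:site_name": "source",
    "og:title": "title",
}


class MetaParser(HTMLParser):
    """Collects the first value seen for each known <meta> key."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = (attr.get("name") or attr.get("property") or "").lower()
        field = META_KEYS.get(key)
        content = attr.get("content")
        if field and content and field not in self.fields:
            self.fields[field] = content.strip()


def format_author(author):
    """Return 'Last, First' for human names; leave other strings unchanged."""
    if author.lower() in ORGANIZATIONS:
        return author  # organizations are not reversed
    tokens = author.replace(",", " ").split()
    # Strip leading titles; set trailing suffixes aside.
    while tokens and tokens[0].lower() in TITLES:
        tokens.pop(0)
    suffix = []
    while tokens and tokens[-1].lower() in SUFFIXES:
        suffix.insert(0, tokens.pop())
    # Only reverse when the first token looks like a known first name.
    if len(tokens) < 2 or tokens[0].lower() not in FIRST_NAMES:
        return author
    # The last name is the final token plus any preceding particles.
    split = len(tokens) - 1
    while split > 1 and tokens[split - 1].lower() in PARTICLES:
        split -= 1
    last = " ".join(tokens[split:])
    first = " ".join(tokens[:split] + suffix)
    return f"{last}, {first}"


def extract_metadata(html):
    """Parse meta tags and normalize the author field."""
    parser = MetaParser()
    parser.feed(html)
    fields = dict(parser.fields)
    if "author" in fields:
        fields["author"] = format_author(fields["author"])
    return fields
```

With these assumed lists, `format_author("Dr. Maria van der Berg")` drops the title and keeps the particles with the last name, while `format_author("Reuters")` is returned as-is because it matches an organization. A real list would need to handle ambiguous entries that are both a first name and a last name.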
📚💎 Extract Expert Excerpt
Article-extraction-benchmark