In general, taxonomy is the practice and science of classifying things or concepts, including the principles that underlie such classification. In our case, a taxonomy is a list of text categories. We currently support the following taxonomies:

IAB-2 taxonomy

‘Automotive’, ‘Books_and_Literature’, ‘Business_and_Finance’, ‘Careers’, ‘Education’, ‘Events_and_Attractions’, ‘Family_and_Relationships’, ‘Fine_Art’, ‘Food_&_Drink’, ‘Healthy_Living’, ‘Hobbies_&_Interests’, ‘Home_&_Garden’, ‘Medical_Health’, ‘Movies’, ‘Music_and_Audio’, ‘News_and_Politics’, ‘Personal_Finance’, ‘Pets’, ‘Pop_Culture’, ‘Real_Estate’, ‘Religion_&_Spirituality’, ‘Science’, ‘Shopping’, ‘Sports’, ‘Style_&_Fashion’, ‘Technology_&_Computing’, ‘Television’, ‘Travel’, ‘Video_Gaming’

Documents taxonomy

  1. ADVE - advertisements, brochures.
  2. Email
  3. Form
  4. Letter
  5. Memo - memorandums.
  6. News - articles, including news articles.
  7. Invoice
  8. Report
  9. Resume
  10. Scientific - scientific papers.
  11. Other - any other class of document, or cases where the classifier is not confident.
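
As an illustration of how the "Other" class can act as a fallback, here is a minimal Python sketch. It assumes the classifier returns a probability per class; the function name `pick_document_class` and the 0.5 threshold are hypothetical and not part of the actual service, which may decide "not sure" differently.

```python
# Class labels copied from the Documents taxonomy list above.
DOCUMENT_CLASSES = [
    "ADVE", "Email", "Form", "Letter", "Memo", "News",
    "Invoice", "Report", "Resume", "Scientific", "Other",
]


def pick_document_class(probabilities, threshold=0.5):
    """Return the most probable class, or "Other" when confidence is low.

    `probabilities` maps class labels to scores in [0, 1].
    `threshold` is an assumed cut-off, chosen here only for illustration.
    """
    unknown = set(probabilities) - set(DOCUMENT_CLASSES)
    if unknown:
        raise ValueError(f"labels not in the Documents taxonomy: {sorted(unknown)}")
    # Take the label with the highest score.
    label, score = max(probabilities.items(), key=lambda item: item[1])
    # Fall back to "Other" when the best score is below the threshold.
    return label if score >= threshold else "Other"
```

For example, `pick_document_class({"Email": 0.9, "Memo": 0.1})` returns `"Email"`, while a flat distribution with no score above the threshold falls back to `"Other"`.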

Sentiment taxonomy

Binary sentiment classification. The probability of the Positive class can also be used on its own as the main result of the classification, that is, as a single sentiment score.

  1. Negative
  2. Positive
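
The idea of treating the Positive class probability as the main result can be sketched in Python as follows. This is only an illustration: the function name `sentiment_from_positive_probability` and the 0.5 decision threshold are assumptions, not part of the actual API.

```python
def sentiment_from_positive_probability(p_positive, threshold=0.5):
    """Turn the Positive class probability into a (label, score) pair.

    With only two classes, the Negative probability is 1 - p_positive,
    so p_positive alone carries the full result. The 0.5 threshold is
    an assumed default for illustration.
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must be a probability in [0, 1]")
    label = "Positive" if p_positive >= threshold else "Negative"
    return label, p_positive
```

A caller that only needs a score can ignore the label and use the probability directly, for instance to rank texts from most negative to most positive.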