R/data.R
sfutils-reuters.Rd
690 fingerprinted documents (fps_train) and 189 fingerprinted documents (fps_test) belonging to nine categories, taken from the 'reuters21578' dataset in the 'tm.corpus.Reuters21578' package. The data were processed so that a) only articles belonging to exactly one class are retained, and b) only articles belonging to one of the following topics are retained: grain, corn, crude, livestock, wheat, coffee, sugar, gold, copper, cocoa. The label_binomial variable recodes the topics as 'crude' if the topic is 'crude' and as 'other' if the topic is any of the other classes.
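The filtering and recoding described above can be sketched in base R as follows. This is a minimal illustration on made-up data, not the code that produced the package data: the `articles` list, its `id` and `topics` fields, and the 'earn' topic are hypothetical, and only the topic list and the 'crude'/'other' recoding come from the description.

```r
# Hypothetical toy input: each article carries one or more topic tags.
articles <- list(
  list(id = 1, topics = c("crude")),
  list(id = 2, topics = c("grain", "wheat")),  # more than one class: dropped
  list(id = 3, topics = c("gold")),
  list(id = 4, topics = c("earn")),            # topic not in the kept list: dropped
  list(id = 5, topics = c("coffee"))
)

# Topics listed in the dataset description.
kept_topics <- c("grain", "corn", "crude", "livestock", "wheat",
                 "coffee", "sugar", "gold", "copper", "cocoa")

# (a) keep articles with exactly one topic, (b) whose topic is in kept_topics.
keep <- vapply(articles, function(a) {
  length(a$topics) == 1 && a$topics %in% kept_topics
}, logical(1))
selected <- articles[keep]

# Original class labels of the retained articles.
label_multinomial <- vapply(selected, function(a) a$topics, character(1))

# Binary recoding: 'crude' stays 'crude', every other topic becomes 'other'.
label_binomial <- ifelse(label_multinomial == "crude", "crude", "other")
```

With this toy input, articles 1, 3 and 5 are retained, and only article 1 keeps the 'crude' label in `label_binomial`.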
fps_train fps_test
A list with three entries:
A vector called 'label_binomial' with two class labels: 'crude' and 'other'
A vector called 'label_multinomial' with the nine original class labels
A list of 690 (train) or 189 (test) fingerprinted documents of S4 class 'Document'
original document
fingerprint of document
unique id
type of the document that was fingerprinted
https://archive.ics.uci.edu/ml/datasets/Reuters-21578+Text+Categorization+Collection