The Document class is one of the four core classes in the sfutils package. A Document is a (large) body of text.

Document(text, ...)

Arguments

text

text to be fingerprinted

...

other arguments passed to the constructor, such as uuid and fingerprint

Details

(From http://documentation.cortical.io/working_with_text.html) The functionality we offer for text is a little more elaborate than for terms, given the more complex nature of texts. Besides getting a semantic fingerprint (semantic representation) for a given text (the /text endpoint), one can also get a list of keywords extracted from the text, or get the text split up into smaller consecutive chunks, based on information content. We also provide functionality for extracting terms from a text based on part of speech tags. There is also a bulk endpoint for merging several /text requests into just one http request. Finally there is a detect_language endpoint capable of detecting 50 languages.

Slots

text

text to be fingerprinted

fingerprint

numeric vector of the fingerprint

See also

See the Cortical documentation for more information about semantic fingerprinting and working with text.

Examples

# NOT RUN {
# Get data
data("company_descriptions")

# Get a single text
txt <- company_descriptions$unilever$desc

# Fingerprint the document
txt_fp_1 <- do_fingerprint_document(txt)

# Calling the constructor directly is equivalent, but
# do_fingerprint_document() is more convenient because it can
# fingerprint documents in bulk
txt_fp_2 <- Document(txt)

# You can also pass an existing fingerprint to the Document constructor,
# in which case the API is not called
txt_fp_3 <- Document(txt, fingerprint = fingerprint(txt_fp_1))
# }
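
The examples above need the package data and access to the API. As a rough, self-contained illustration of the two slots and of why passing a fingerprint avoids an API call, here is a minimal sketch in base R. Everything in it is hypothetical: `ToyDocument`, `fake_api_fingerprint()` and the `api_calls` counter are stand-ins invented for this illustration and are not the sfutils implementation, and the "fingerprint" is a made-up set of positions rather than a real semantic fingerprint.

```r
library(methods)

# Hypothetical stand-in for the Document class: same two slots as documented
setClass("ToyDocument",
         slots = c(text = "character", fingerprint = "numeric"))

# Counter so we can see how often the (fake) API is used
api_calls <- 0

# Hypothetical stand-in for the API request; it derives some
# deterministic positions from the character codes of the text
fake_api_fingerprint <- function(text) {
  api_calls <<- api_calls + 1
  as.numeric(sort(unique(utf8ToInt(text) %% 128)))
}

# Constructor: only "calls the API" when no fingerprint is supplied
ToyDocument <- function(text, fingerprint = NULL) {
  if (is.null(fingerprint)) {
    fingerprint <- fake_api_fingerprint(text)
  }
  new("ToyDocument", text = text, fingerprint = as.numeric(fingerprint))
}

txt <- "Unilever makes consumer goods."

# No fingerprint supplied, so the fake API is used once
doc_1 <- ToyDocument(txt)

# Fingerprint supplied, so the fake API is not used again
doc_2 <- ToyDocument(txt, fingerprint = doc_1@fingerprint)

identical(doc_1@fingerprint, doc_2@fingerprint)
api_calls
```

Reusing a stored fingerprint in this way is useful when the same text is wrapped in a Document more than once, since each avoided request saves a round trip to the API.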