Excerpt |
---|
The NGramTransformation extracts n-grams (single words or short word sequences) from free-text fields such as descriptions. These n-grams can then be used in a LookupCache transformation to find the correct tags. For example, take the IPTC description and extract words to check against a keyword list for tagging. |
Tip |
---|
Input: "this is the new Breithorn Release of Picturepark" NGramTransformation:
Output: Breithorn Release Picturepark |
Tip |
---|
Input: "this is the new Breithorn Release of Picturepark" NGramTransformation:
Output: Breithorn Release |
Specific Definitions
Select the condition.
Property | Value |
---|---|
kind | NGramTransformation |
Size | The maximum n-gram size. A value of 3 produces unigrams, bigrams, and trigrams. The size is counted in words (as separated by punctuation), not in characters, so set it to the number of words in the longest entry of your keyword list. |
Minimum word length | The minimum length a word must have to be considered for n-gram production. |
Maximum word length | The maximum length a word can have to be considered for n-gram production. |
...
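To illustrate how the Size, Minimum word length, and Maximum word length properties might interact, here is a minimal Python sketch. It is not Picturepark's implementation. The function name `produce_ngrams`, its parameters, the tokenization rule (words are runs of letters or digits), and the rule that an n-gram is only kept when every word in it passes the length filter are all assumptions made for illustration. The keyword set stands in for a hypothetical keyword list that a LookupCache transformation would check against.

```python
import re


def produce_ngrams(text, size, min_word_length=1, max_word_length=None):
    """Return n-grams of 1..size consecutive words from text.

    Assumption: a window of words is kept only if every word in it
    satisfies the minimum and maximum word length.
    """
    # Assumption: words are runs of letters/digits; anything else separates words.
    words = re.findall(r"\w+", text)

    def passes(word):
        if len(word) < min_word_length:
            return False
        return max_word_length is None or len(word) <= max_word_length

    ngrams = []
    for n in range(1, size + 1):
        # Slide a window of n consecutive words over the text.
        for i in range(len(words) - n + 1):
            window = words[i:i + n]
            if all(passes(w) for w in window):
                ngrams.append(" ".join(window))
    return ngrams


text = "this is the new Breithorn Release of Picturepark"
grams = produce_ngrams(text, size=2, min_word_length=4)

# Hypothetical keyword list; in Picturepark this lookup would be
# handled by a LookupCache transformation.
keywords = {"Breithorn", "Release", "Picturepark"}
matches = [g for g in grams if g in keywords]
print(matches)
```

With these assumed settings, short words such as "is" and "of" are dropped before the lookup, so only n-grams built from longer words are compared against the keyword list.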