Picturepark search offers three search modes. The sections below explain when to use each one.

AND search
The AND search finds content that contains all of the search terms entered. For example, if you search for “Stock shot”, Picturepark translates it to “Stock AND shot” and finds content that contains both values.

OR search
The OR search translates the search term “Stock shot” into “Stock OR shot” and finds content that contains one or more of the search terms entered.

Advanced search
The advanced search supports a variety of exact, fuzzy, and replacement searches. The advanced search cheat sheet provides search examples. These queries only work in "Advanced Mode". They let you search for specific values in specific fields on specific layers. Check the individual syntax for each field.

Search Analyzers

Simple Search Analyzer

access in search queries: simple

The simple search analyzer is a custom Picturepark implementation that does not use the Elasticsearch defaults. The custom analyzer uses a regex:

/"([^\p{L}\d]+)|(?<=\D)(?=\d)|(?<=\d)(?=\D)|(?<=[\p{L}&&[^\p{Lu}]])(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}[\p{L}&&[^\p{Lu}]])"/

Outcome:
- Lowercase / Uppercase
- Digit / non-digit
- Stemming
- HTML Strip
Examples
- Picturepark = Picturepark, picturepark
- Case Study = Case, Study, case, study

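To test the simple search analyzer, check your terms in a regex tester and look at where the pattern splits them. The pattern uses Java regex syntax (for example the && character class intersection), so select the Java flavor if the tester offers one.

1. Open a regex checker, for example https://regex101.com/ or https://regexr.com/
2. Add your term as a test string
3. Check the outcome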
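
The splitting rules of the regex can also be tried locally. The following is a minimal sketch in Python, not the Picturepark implementation: Python's built-in re module supports neither \p{L} / \p{Lu} nor the && intersection, so the sketch substitutes ASCII classes and therefore only approximates the behavior for unaccented Latin text. The names SPLIT_PATTERN and simple_tokens are hypothetical, and the final lowercasing is an assumption based on the "Lowercase / Uppercase" outcome listed above.

```python
import re

# ASCII-only approximation of the analyzer pattern (assumption: [A-Za-z],
# [A-Z] and [a-z] stand in for \p{L}, \p{Lu} and "letter but not uppercase").
SPLIT_PATTERN = re.compile(
    r"[^A-Za-z\d]+"               # runs of characters that are neither letters nor digits
    r"|(?<=\D)(?=\d)"             # boundary: non-digit followed by a digit
    r"|(?<=\d)(?=\D)"             # boundary: digit followed by a non-digit
    r"|(?<=[a-z])(?=[A-Z])"       # boundary: lowercase followed by uppercase
    r"|(?<=[A-Z])(?=[A-Z][a-z])"  # boundary: end of a run of uppercase letters
)


def simple_tokens(text):
    """Split text at the pattern's boundaries and lowercase each token."""
    # Splitting on zero-width matches requires Python 3.7 or later.
    parts = SPLIT_PATTERN.split(text)
    return [part.lower() for part in parts if part]


print(simple_tokens("Case Study"))
print(simple_tokens("PictureparkDAM2024"))
```

A term without digits, separators, or inner capitals, such as "Picturepark", stays a single token.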

No Diacritics Analyzer

access in search queries: no-diacritics

The no diacritics analyzer:

- only works for text fields
- strips diacritic characters, so when the text value is “Kovačić Mateo” you can search for “Kovačić Mateo” or “Kovacic Mateo”

An example can be found in the Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-asciifolding-tokenfilter.html
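
The effect of stripping diacritics can be illustrated with a short Python sketch. It is an approximation using Unicode decomposition from the standard library, not the Elasticsearch filter itself: characters that do not decompose into a base letter plus a combining mark (for example "ø") are left unchanged here. The function name strip_diacritics is hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks such as the caron.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Kovačić Mateo"))  # Kovacic Mateo
```

Applying the same folding to both the indexed value and the search term is what lets either spelling match.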

Path Hierarchy Analyzer

access in search queries: pathHierarchy

The path hierarchy analyzer takes a path found in a field (for example picturepark\platform\manual) and emits one token for each level of the path.

Examples
- picturepark\platform\manual = picturepark, picturepark\platform, picturepark\platform\manual
- Products/Family/Industry = Products, Products/Family, Products/Family/Industry
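
The token generation can be sketched in a few lines of Python. This is an illustration of the idea only, assuming a single-character delimiter and none of the tokenizer's other options; the function name path_hierarchy_tokens is hypothetical.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return one token per path level, from the root down to the full path."""
    parts = path.split(delimiter)
    tokens = []
    for depth in range(1, len(parts) + 1):
        token = delimiter.join(parts[:depth])
        if token:  # skip the empty token produced by a leading delimiter
            tokens.append(token)
    return tokens


print(path_hierarchy_tokens("Products/Family/Industry"))
print(path_hierarchy_tokens("picturepark\\platform\\manual", delimiter="\\"))
```

Because every token contains the delimiter except the first, most of them include a special character.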

Only configure this analyzer if it is used via the API. The simple search in Picturepark escapes special characters, so you won't find assets when searching for some of the tokens generated by this analyzer.
An example can be found in the Elasticsearch documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pathhierarchy-tokenizer.html

Language Analyzer

access in search queries: language

Several language analyzers are available for Elasticsearch. Language analyzers apply language-specific stemming and remove language-specific stop words.
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html
The current implementation uses the default Elasticsearch language analyzers listed at the link above. We use the default stop words and stemming rules, without any custom adaptation.

Ngram Analyzer

access in search queries: ngram

Ngram tokenizing was the starting point for exact substring matches: it indexes all substrings of a term up to length n. The drawback of ngram tokenizing is the large amount of disk space it uses.
Best practice:

Use ngram only if required. Use it carefully and not for every string.

The settings define the minimum and maximum gram length created on indexing, and token_chars, the character classes to keep in the tokens. Elasticsearch splits on characters that don't belong to any of these classes.
Example: Search "Raven"

NGrams (the term is split into substring tokens, moving one character at a time):
Rav
Rave
Raven
ave
aven
ven
...

Example: Search "Pegasus"

Possible matches (the values share ngrams such as "egas"):
Pegasus
Degas

Examples are in the Elasticsearch documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-ngram-tokenizer.html
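
The way ngrams are generated can be reproduced with a small Python sketch. It works on a single term and ignores token_chars; the function name ngrams and the chosen min and max values (3 and 5) are assumptions picked to match the "Raven" example, not the configured Picturepark settings.

```python
def ngrams(term, min_gram, max_gram):
    """Return all substrings of term with a length from min_gram to max_gram."""
    tokens = []
    for start in range(len(term)):
        for length in range(min_gram, max_gram + 1):
            end = start + length
            if end > len(term):
                break  # longer grams would run past the end of the term
            tokens.append(term[start:end])
    return tokens


print(ngrams("Raven", 3, 5))
# ['Rav', 'Rave', 'Raven', 'ave', 'aven', 'ven']

# Two different values can share ngrams, which is why one may match the other.
shared = set(ngrams("Pegasus", 3, 7)) & set(ngrams("Degas", 3, 7))
print(sorted(shared))
# ['ega', 'egas', 'gas']
```

The number of tokens grows quickly with the term length and the gap between min and max, which is where the extra disk space comes from.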

Edge NGram Analyzer

access in search queries: edgeNGram

This tokenizer is very similar to nGram but only keeps n-grams that start at the beginning of a token. The settings define the minimum and maximum gram length created on indexing, and token_chars, the character classes to keep in the tokens. Elasticsearch splits on characters that don't belong to any of these classes.

Examples are in the Elasticsearch documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html
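
The difference to the plain ngram approach can be sketched in Python as follows. The sketch handles a single token and ignores token_chars; the function name edge_ngrams and the min and max values (2 and 5) are hypothetical and chosen only for illustration.

```python
def edge_ngrams(term, min_gram, max_gram):
    """Return the prefixes of term with a length from min_gram to max_gram."""
    longest = min(max_gram, len(term))
    return [term[:length] for length in range(min_gram, longest + 1)]


print(edge_ngrams("Raven", 2, 5))
# ['Ra', 'Rav', 'Rave', 'Raven']
```

Only prefixes are produced, so a search for "ave" would not match "Raven" here, while a search for "Rav" would.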
