Why did I get these results?


In this topic we will try to understand how text search (keyword search) works in Oracle Endeca Commerce. The term covers two types of search: record search and dimension search.

As its name implies, record search is the process of searching for data (part data, catalog data, document data, and so on) using search keywords entered by the user. Dimension search, on the other hand, searches within the dimension values defined in an application using the same keywords. At its most basic level, a record search consists of user-entered keywords and a search key (a property, a dimension, or a search interface) to search within.
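
The difference between the two search types can be shown with a small Python toy. This is only an illustration of the idea, not Endeca code: the sample records, the dimension names, and the functions `record_search` and `dimension_search` are all invented for this example.

```python
# Toy data; the field names and values are invented for this example.
RECORDS = [
    {"id": 1, "name": "red wine glass", "color": "red"},
    {"id": 2, "name": "white dinner plate", "color": "white"},
]
DIMENSIONS = {
    "color": ["red", "white"],
    "product type": ["wine glass", "dinner plate"],
}


def record_search(keyword, search_key):
    """Return ids of records whose search_key field contains the keyword."""
    keyword = keyword.lower()
    return [
        r["id"]
        for r in RECORDS
        if keyword in str(r.get(search_key, "")).lower().split()
    ]


def dimension_search(keyword):
    """Return (dimension, value) pairs whose value contains the keyword."""
    keyword = keyword.lower()
    return [
        (dim, value)
        for dim, values in DIMENSIONS.items()
        for value in values
        if keyword in value.lower().split()
    ]


print(record_search("red", "name"))   # [1]
print(dimension_search("red"))        # [('color', 'red')]
print(dimension_search("plate"))      # [('product type', 'dinner plate')]
```

The same keyword returns records in one case and dimension values in the other, which is the distinction drawn above.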

To understand the results of a search, we should know what process the Endeca engine follows when it performs a search.

The steps are as follows:

  1. Any filters supplied in the query (for example, record filters) are applied first, so that the “searchable” dataset is smaller.
  2. The query string sent to the engine is tokenized to extract the distinct search terms.
  3. Spell correction (if enabled) and automatic phrasing (if enabled) are applied.
  4. The original terms and any terms generated by spell correction are compared to the terms in the Thesaurus, and alternative terms are added to the list of terms to match.
  5. Stemming is applied to the terms to account for plural forms of nouns.
  6. The terms gathered above are used to search through the documents and return matches containing the term or phrase.
  7. Any alternate terms are gathered to be returned as “Did You Mean?” suggestions.
  8. Further filtering is done based on the current navigation state of the application.
  9. The results gathered from the query are ordered using the Relevance Ranking strategy that has been defined.
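
The order of the steps above can be illustrated with a small toy pipeline in Python. This is only a sketch of the sequence, not Endeca code: the records, the `SPELLING` and `THESAURUS` tables, and every function name are made up for illustration. Stemming is reduced to a crude plural rule, ranking is a stand-in, and the "Did You Mean?" and navigation-state steps are left out.

```python
# Toy illustration of the processing order; none of these names are Endeca APIs.
RECORDS = [
    {"id": 1, "category": "computers", "name": "laptop bag"},
    {"id": 2, "category": "computers", "name": "notebook laptops"},
    {"id": 3, "category": "phones", "name": "laptop style phone case"},
]
SPELLING = {"laptpo": "laptop"}        # hypothetical spell-correction table
THESAURUS = {"laptop": {"notebook"}}   # hypothetical thesaurus entries


def tokenize(query):
    """Step 2: split the query into distinct lowercase terms, keeping order."""
    seen = []
    for term in query.lower().split():
        if term not in seen:
            seen.append(term)
    return seen


def expand(term):
    """Steps 3-5: spell-correct, add thesaurus terms, add simple plural forms."""
    variants = {term, SPELLING.get(term, term)}
    for v in list(variants):
        variants |= THESAURUS.get(v, set())
    for v in list(variants):
        variants.add(v[:-1] if v.endswith("s") else v + "s")
    return variants


def search(query, record_filter=None):
    # Step 1: apply record filters first to shrink the searchable set.
    candidates = [r for r in RECORDS if record_filter is None or record_filter(r)]
    term_sets = [expand(t) for t in tokenize(query)]
    # Step 6: a record matches when every query term (or a variant) appears in it.
    hits = []
    for r in candidates:
        words = set(r["name"].lower().split())
        if all(words & variants for variants in term_sets):
            hits.append(r)
    # Step 9: stand-in ranking, shorter names first.
    return sorted(hits, key=lambda r: len(r["name"]))


results = search("laptpo", record_filter=lambda r: r["category"] == "computers")
print([r["id"] for r in results])  # [1, 2]
```

Even in this toy, the misspelled query finds record 2 only because spell correction, the thesaurus, and the plural rule each widened the term list before matching, which is often the answer to "why did I get these results?".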

Many features affect search results. A few of them are listed below.

Match Modes
Relevance Ranking
Search Interface
Stop Words
Stemming
DidYouMean
AutoCorrect
Spelling Correction

Go ==> Match Modes

Go ==> Relevance Ranking

Search Interface

A search interface is a collection of properties and/or dimensions that are enabled for search, grouped under one name. This allows multiple properties and/or dimensions to be searched in one query to the engine. A Relevance Ranking strategy can be defined for each search interface, as well as other attributes such as cross-field matching.
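
To make the idea concrete, here is a rough Python model of a search interface as a named group of searchable fields. It is an illustration only: the `SearchInterface` class, its attributes, and the sample record are hypothetical and do not correspond to Endeca configuration files or APIs. It shows how allowing cross-field matches changes whether a multi-term query matches a record.

```python
from dataclasses import dataclass


@dataclass
class SearchInterface:
    """Hypothetical model: a named group of searchable fields."""
    name: str
    fields: tuple
    cross_field: bool = False  # allow terms to match across different fields

    def matches(self, record, terms):
        per_field = [
            set(str(record.get(f, "")).lower().split()) for f in self.fields
        ]
        if self.cross_field:
            # Every term must appear somewhere in the combined fields.
            combined = set().union(*per_field)
            return all(t in combined for t in terms)
        # Otherwise all terms must be found together in a single field.
        return any(all(t in words for t in terms) for words in per_field)


record = {"title": "wireless mouse", "brand": "acme"}
strict = SearchInterface("All", ("title", "brand"))
loose = SearchInterface("All", ("title", "brand"), cross_field=True)

print(strict.matches(record, ["mouse", "acme"]))  # False
print(loose.matches(record, ["mouse", "acme"]))   # True
```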

Stemming

Stemming is a query expansion method. Site users can enter a search term like “laptop” and find records that do not exactly match “laptop” but do match “laptops”, because it is another form of the search term. Stemming is language-specific: in order to find words that are related to a search term, the MDEX Engine needs to know which language the search term belongs to.
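
The language dependence can be sketched with a toy lookup in Python. The `WORD_FORMS` tables and the `stem_expand` function are hypothetical and exist only for this example; they are not how the MDEX Engine stores or applies its stemming data.

```python
# Hypothetical per-language word-form tables, invented for this example.
WORD_FORMS = {
    "en": {"laptop": {"laptops"}, "mouse": {"mice"}},
    "de": {"kind": {"kinder"}},
}


def stem_expand(term, language):
    """Return the term plus every known form of it in the given language."""
    forms = {term}
    for base, variants in WORD_FORMS.get(language, {}).items():
        if term == base or term in variants:
            forms |= {base} | variants
    return forms


print(sorted(stem_expand("laptop", "en")))  # ['laptop', 'laptops']
print(sorted(stem_expand("mice", "en")))    # ['mice', 'mouse']
print(sorted(stem_expand("mice", "de")))    # ['mice']
```

The last call shows why the language matters: the same term expands to nothing extra when it is looked up under the wrong language.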

Go ==> DidYouMean

AutoCorrect

If the spelling correction feature is enabled and triggered, spelling suggestions are created by enumerating a set of alternatives for each query term and considering some of the combinations of term alternatives as whole-query alternatives. Each of these whole-query alternatives is subject to thesaurus expansion and stemming. For example, if the tokenized query is employee moral, then employee may generate the set of alternatives (employer, employee, employed), while moral may generate the set of alternatives (moral, morale).
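
The combination step can be sketched in Python with `itertools.product`. The per-term alternatives come from the example above; the function name is made up, and this sketch enumerates every combination, whereas the engine considers only some of them according to rules not covered here.

```python
from itertools import product

# Per-term alternatives taken from the example in the text.
ALTERNATIVES = {
    "employee": ["employer", "employee", "employed"],
    "moral": ["moral", "morale"],
}


def whole_query_alternatives(terms):
    """Combine per-term alternatives into candidate whole queries."""
    options = [ALTERNATIVES.get(t, [t]) for t in terms]
    return [" ".join(combo) for combo in product(*options)]


candidates = whole_query_alternatives(["employee", "moral"])
print(len(candidates))  # 6
for candidate in candidates:
    print(candidate)
```

Three alternatives for the first term and two for the second give six whole-query candidates, among them "employee morale", which is probably what the user meant.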

The Endeca MDEX Engine supports three complementary forms of spelling correction:
• Auto-correction for dimension search.
• Auto-correction for record search.
• Explicit spelling suggestions for record search (the “Did you mean?” dialog box).
Any or all of these features can be used in a single application, and all are supported by the same underlying spelling engine and spelling correction modules.

Spelling Correction

The Endeca MDEX Engine spelling correction (or auto-correction) feature enables search queries to return the expected results when the spelling used in the query terms does not match the spelling used in the result text. The Endeca spelling correction features include a number of tuning parameters to control performance, behavior, and result presentation. This section describes the steps necessary to enable spelling correction for record and/or dimension search, and provides a reference to the tuning parameters that allow applications to obtain various behavior and performance trade-offs from the spelling engine.

Spelling modes
Endeca spelling features compute contextual suggestions at the full query level. That is, suggestions may include one or more corrected query terms, which can depend on context such as the other words used in the query. To determine these full query suggestions, the MDEX Engine relies on low-level spelling modules to compute single-word suggestions, that is, words that are similar to a given user query term and contained within the application-specific dictionary. The MDEX Engine supports two internal spelling modules, either or both of which can be used by an application.

Aspell
Aspell is the default module. It supports sound-alike corrections. It does not support corrections to non-alphabetic/non-ASCII terms.
Espell
Espell supports non-phonetic correction of any term.

Generally, applications that only need to correct normal English words can enable just the default Aspell module. Applications that need to correct international words, or other non-English/non-word terms (such as part numbers), should enable the Espell module. In certain cases both Aspell and Espell can be enabled. Module selection is performed at index time through the selection of a spelling mode.

Supported spelling modes are:

aspell
Use only Aspell module. Default mode.
espell
Use only Espell module.
aspell_OR_espell
Use both modules, segmenting the dictionary so that Aspell is loaded with all ASCII alphabetic words and Espell is loaded with the other terms.
aspell_AND_espell
Use both modules, each loaded with the full application dictionary. Consult both modules to correct any word, selecting best suggestions from the union of the results.
disable
Disable spelling correction.
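
The difference between the two combined modes is easiest to see in how the dictionary is divided. The Python sketch below models only that division as described above. The `split_dictionary` function and the sample words are invented for illustration, and the single-module modes are left out because the text does not say how their dictionaries are loaded.

```python
def split_dictionary(words, mode):
    """Return (aspell_words, espell_words) for the two combined modes."""
    words = set(words)
    if mode == "aspell_OR_espell":
        # Segment: ASCII alphabetic words go to Aspell, everything else to Espell.
        aspell = {w for w in words if w.isascii() and w.isalpha()}
        return aspell, words - aspell
    if mode == "aspell_AND_espell":
        # Both modules receive the full dictionary.
        return set(words), set(words)
    raise ValueError(f"mode not modelled in this sketch: {mode}")


dictionary = ["laptop", "morale", "café", "AB-1234"]

aspell, espell = split_dictionary(dictionary, "aspell_OR_espell")
print(sorted(aspell))  # ['laptop', 'morale']
print(sorted(espell))  # ['AB-1234', 'café']

aspell, espell = split_dictionary(dictionary, "aspell_AND_espell")
print(aspell == espell)  # True
```

With aspell_OR_espell each word is handled by exactly one module, while aspell_AND_espell lets both modules propose corrections for every word, at the cost of loading the full dictionary twice.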
