Oracle Endeca Common Performance Problems

Info from MDEX Engine Performance Testing Best Practices

Engine Feature/Query function	Comments
setNavAllRefinements / allgroups=1	Automatically expands all available dimensions and calculates the available values for each one. The operation is very computationally intensive: the MDEX Engine must calculate not only which dimension values are available across all dimensions but also, depending on the index configuration, the number of available records for each dimension value. The problem shows up in the HotSpots section of the MDEX Engine statistics page, where the Navigation Query Total line has the highest percentage of the total time spent inside the engine. To confirm whether this is the issue, create a new version of the ENEPerf query log with allgroups=1 replaced by allgroups=0. If allgroups is the cause, the new ENEPerf log should show a significant improvement in MDEX Engine performance.
Exact, Phrase, and Proximity relevance ranking modules	Exact, Phrase, and Proximity relevance ranking modules can be expensive. The lower in your strategy you place them, the better.
Large response sizes	Large response sizes can expose network bottlenecks or cause responses to take longer to travel over the network to the application server. A rule of thumb is that average response sizes >500k in your Cheetah reports may be cause for concern, especially in a high traffic site or 100 megabit Ethernet environment.
Complex analytics queries	Complex analytics queries can be computationally expensive.
Text searching against the entire record set	Text searching against a very large (> 30 million) record set can be expensive. It is better to apply a record filter or force the user to navigate to a more manageable set before allowing text searches.
Large Flat Dimensions (no hierarchy or sifting)	Dimensions that have a flat list of thousands of dimension values can degrade MDEX Engine performance, especially if the dimensions are expanded. Create a hierarchy or configure precedence rules to reduce the number of calculations required and to ensure that thousands of values are not returned at once.
Wildcarding	Wildcarding, especially on large text properties or very broad queries (such as ab) can be extremely computationally expensive.
Interactions of large thesaurus + spelling + stemming on large datasets	The interaction of the various query expansion modules when the dataset and thesaurus are large can be expensive.
Frequent Partial Updates	Applying partial updates flushes the MDEX Engine cache forcing subsequent queries to be fully recalculated. Very frequent partial updates can essentially cause the MDEX Engine to operate without a cache, which can dramatically impact performance.
Geospatial range filters on the entire record set	Geospatial filtering is an expensive operation depending on the number of records it is applied against. Whenever possible, use record filters before applying geospatial filters. For example, first filter to a region (e.g. state, country, or range of postal codes) prior to allowing geospatial filtering within that region.
Not enough physical RAM on server	Not enough physical RAM can cause a dramatic degradation in performance due to OS paging.
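
The allgroups check described in the first row can be scripted. The following is a minimal sketch, assuming the ENEPerf query log is a plain text file with one query URL per line; the function name and the sample lines are made up for illustration and are not taken from a real log.

```python
import re

# Match allgroups=1 only as a whole parameter, so a value such as
# allgroups=10 or a longer parameter name ending in "allgroups" is untouched.
_ALLGROUPS_ON = re.compile(r"(?<![A-Za-z0-9_])allgroups=1(?![0-9])")


def disable_allgroups(lines):
    """Return a copy of the query log lines with allgroups=1 set to 0."""
    return [_ALLGROUPS_ON.sub("allgroups=0", line) for line in lines]


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "/graph?node=0&allgroups=1&offset=0",
        "/graph?node=8021&allgroups=0",
    ]
    for line in disable_allgroups(sample):
        print(line)
```

Running both the original and the rewritten log through the same test, and comparing the results, isolates the cost of expanding all refinements.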

Analyze Cheetah Output

Cheetah will produce a lot of useful metrics and information. The complete list is available in the Cheetah documentation that is provided with the Cheetah installation package. It is important to review the entire report for any problem areas, although a few common issues are mentioned below.

To ensure that the overall test is valid, first check that the average number of records returned is close to the expected value. This is especially important when multiple tests are run with a query log that is modified for each test. The average number of records can be found in the Response Profiling section of the Cheetah output under “Number of Base Records in Result Set”. Also verify that the “Number of Requests Returning Zero Results” did not change significantly from one test to another.

It is also possible to find potential network issues by comparing engine-only processing time to total response time. The “Response Differential” section of the Cheetah report contains statistics on the difference between the total round-trip time and the MDEX Engine processing time. If the Average Differential is significant (>200ms), that could be a sign of a network bottleneck.
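
The response differential is simply round-trip time minus engine processing time, averaged over all requests. Cheetah reports this figure directly; the sketch below only illustrates the arithmetic for cases where the timings are available as raw numbers. The function names and the input shape (pairs of milliseconds) are assumptions for illustration, and the 200 ms threshold is the rule of thumb quoted above.

```python
from statistics import mean

# Rule-of-thumb threshold from the text; tune it for your environment.
DIFFERENTIAL_WARN_MS = 200


def average_differential(timings):
    """timings: iterable of (roundtrip_ms, engine_ms) pairs."""
    return mean(roundtrip - engine for roundtrip, engine in timings)


def looks_network_bound(timings, threshold_ms=DIFFERENTIAL_WARN_MS):
    """True when the average differential exceeds the threshold."""
    return average_differential(timings) > threshold_ms


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    sample = [(300, 100), (500, 100)]
    print(average_differential(sample), looks_network_bound(sample))
```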

Analyze MDEX Engine Stats Page

If you have reached this stage of the analysis and performance looks sufficient for your requirements, this step may not be required. However, if the previous results indicate that performance is not sufficient and the cause is not environmental, the MDEX Engine stats page is a useful tool for understanding how the MDEX Engine spends its processing time. It can help determine which features cause slower than desired performance.

There are a number of questions that this tool can help answer:

· Is the MDEX Engine Total Processing Time too high?

Details -> Server -> Scheduler: Processing Time

This section will help identify if the performance issues are uniform and affect all of the queries (where Average Processing Time is high and Standard Deviation is low) or if the performance numbers are skewed due to high variance in the individual query performance (where Average Processing Time is high and Standard Deviation is high). If the latter is true, it is useful to check the Most Expensive Queries section. This section lists the queries that take the longest to execute. In this situation, it is worthwhile to consider another test case in which the most expensive queries are removed from the ENEPerf query log and the test is rerun. This test will help determine whether the longest running queries have any significant impact on the average performance of the MDEX Engine. Sometimes the list of Top 10 queries provided by the MDEX Engine stats page is not enough. It is always useful to load the resulting MDEX Engine request log into a spreadsheet and sort the queries by their engine processing times.

If the average performance is indeed affected by the few long running queries, the next step is to understand what causes their slow performance. For example, the queries can be extracted out into a separate ENEPerf query log and a short performance test can be run using that log.
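
The spreadsheet analysis described above can also be done in a few lines of code. This is a minimal sketch, assuming the engine processing times have already been extracted from the request log into (query, milliseconds) pairs; parsing the log itself is not shown, and the function and key names are hypothetical.

```python
from statistics import mean, pstdev


def summarize(queries, top_n=10):
    """queries: list of (query_string, engine_ms) pairs."""
    # Sort by engine processing time, slowest first.
    ranked = sorted(queries, key=lambda q: q[1], reverse=True)
    times = [ms for _, ms in ranked]
    # Times with the slowest top_n removed; fall back to all times
    # if removing them would leave nothing to average.
    rest = times[top_n:] or times
    return {
        "mean_ms": mean(times),
        "stdev_ms": pstdev(times),
        "top": ranked[:top_n],
        "mean_without_top_ms": mean(rest),
    }


if __name__ == "__main__":
    # Hypothetical data: three fast queries and one slow outlier.
    sample = [("q1", 10), ("q2", 10), ("q3", 10), ("q4", 970)]
    print(summarize(sample, top_n=1))
```

A high mean with a high standard deviation, together with a large gap between `mean_ms` and `mean_without_top_ms`, suggests that a few expensive queries are skewing the average.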

· Are MDEX Engine requests queuing?

Details -> Server -> Scheduler: Queue Time Before Processing

“Queue time before processing” reports how long requests spend in the MDEX Engine’s queue waiting to be picked up by an MDEX Engine thread to be processed. If requests are queuing, the easiest resolution would be to expand CPU and MDEX Engine thread capacity. However, the performance impact should be analyzed since increasing the number of queues may actually cause performance degradation after a certain point.

· Which MDEX Engine feature(s) are requiring the most processing time?

Details -> Hotspots

The “Hotspots” section of the MDEX Engine stats page shows which MDEX Engine features require the most processing time. Note that “Navigation/RecSearch query total” can be expanded further by clicking the triangles to the right of the Total column. The expanded view shows details that can help identify the root cause of a performance issue. The Total column is the important one: it shows which MDEX Engine feature consumes the largest share of the total time spent.

See below for an explanation of the various features listed in the HotSpots section:

Navigation/RecSearch query total	Time taken by MDEX Engine to perform Refinements calculations and keyword search functionality combined
Record Search Performance	Time taken by MDEX Engine to perform Record Search (Ntk/Ntt parameters, excluding relevance ranking)
Navigation Performance	Time taken by MDEX Engine to perform Navigation (N parameter)
Navigation binlist	Time taken by MDEX Engine to calculate which records to return based on the intersection of search results, navigation results, range filters, etc.
Navigation Refinements (LCA/topdown/bottomup)	Time taken by MDEX Engine to calculate which dimension values to display
Navigation Refinements Multi-OR	Time taken by MDEX Engine to calculate which dimvals to display for Multi-OR dimensions
Navigation Refinements Multi-AND	Time taken by MDEX Engine to calculate which dimvals to display for Multi-AND dimensions
Refinement Record Count	Time taken by MDEX Engine to calculate refinement statistics
Clustering performance	Time taken by MDEX Engine to perform record clustering
Record Filter performance	Time taken by MDEX Engine to perform record filters (Nr parameter)
Range Filter performance	Time taken by MDEX Engine to perform range filters (Nf parameter)
Content spotlighting performance	Time taken by MDEX Engine to apply Merchandising Rules, including sorting of Merch Rules results
Dimension Search Performance	Time taken by MDEX Engine to perform dimension search (D parameters)
Spell Engine Performance	Time taken by MDEX Engine to calculate spelling variations and come up with Did You Mean (DYM) or autocorrect options
Page render total	Time taken by MDEX Engine to package up the page of results, including aggregation (e.g. applying rollup, paging, and relevance ranking sort)
Page render/record list	The above time minus packaging the result into a binary structure
Snippeting performance	Time taken by MDEX Engine to find and tag text that matched the keywords

· What was the cache hit ratio?

Cache -> Main Cache

The MDEX Engine cache statistics provide information about the cache Hit/Miss rates, the number and size of evictions, and the size of entries, broken down by functional category. If the results of the performance test show surprisingly good performance, and the MDEX Engine Stats page shows a high Cache Hit Rate (greater than 80%), ensure the performance query log does not contain a high frequency of repeated queries.
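
One way to check a query log for repeats is to count how many entries duplicate an earlier one. This is a minimal sketch, assuming one query per line and treating two queries as identical only when their text matches exactly; the function name is hypothetical.

```python
from collections import Counter


def repeat_fraction(lines):
    """Fraction of non-empty log lines that repeat an earlier line."""
    queries = [line.strip() for line in lines if line.strip()]
    if not queries:
        return 0.0
    distinct = len(Counter(queries))
    return (len(queries) - distinct) / len(queries)


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    print(repeat_fraction(["q=a", "q=b", "q=a", "q=a"]))
```

A fraction far above what production traffic shows suggests the test is measuring cache hits rather than engine work.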

If the total number of evictions (# of evictions) is close to, or just a couple of orders of magnitude smaller than, the total number of lookups (# of lookups), and the “Size of Entries” is close to the size of the MDEX Engine cache (--cmem), this may be a sign of cache thrashing. Cache thrashing is a state in which multiple threads compete for the same cache, and data inserted by one thread is almost immediately evicted by another, resulting in a high number of misses and evictions. Side effects of cache thrashing include low CPU utilization, caused by constant cache synchronization, and degraded MDEX Engine performance. Although the MDEX Engine implements various caching algorithms that should minimize thrashing, it can still occur, so it is important to watch for the symptoms.
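
The two symptoms above can be turned into a rough screening check. This is a sketch under stated assumptions: the numbers are read manually from the stats page, "a couple of orders of magnitude" is taken as a factor of 100, and "close to the cache size" is taken as 90% full. Both cutoffs and the function name are assumptions, not values from the MDEX Engine documentation.

```python
def may_be_thrashing(lookups, evictions, entries_mb, cmem_mb,
                     max_orders_of_magnitude=2, fill_ratio=0.9):
    """Heuristic flag for possible cache thrashing (assumed cutoffs)."""
    if lookups <= 0 or cmem_mb <= 0:
        return False
    # Evictions within N orders of magnitude of lookups.
    evictions_high = evictions * 10 ** max_orders_of_magnitude >= lookups
    # Size of entries close to the configured cache size.
    cache_full = entries_mb / cmem_mb >= fill_ratio
    return evictions_high and cache_full


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(may_be_thrashing(1_000_000, 50_000, 980, 1024))
```

A positive result is only a prompt to look closer, for example at CPU utilization during the test, not a diagnosis.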

The MDEX Engine allows you to tune the amount of RAM allocated to the cache, which can be modified via the MDEX Engine --cmem flag. A balance must be found between the size of the MDEX Engine cache and the amount of RAM left for the operating system. Often, the best configuration is found experimentally.
