Off-Heap FieldCache Faceting and Sorting


Lucene/Solr background

Lucene has a segmented architecture – when a small amount of documents are added to an existing index, this will often just add an additional small segment to the index.

Caching data structures at the segment level (e.g. field values used for sorting) is often desirable so that when a new view of the index is opened, additional segment caches only need to be created for those new segments. To date, Heliosearch’s off-heap nCache has been all segment-level to enable good near real-time performance.

However some search operations require data structures across the entire index to operate efficiently, and faceting by string fields is one of those. Doing more work when a new view of the index is opened (such as computing global ords) saves work for every faceting request that will use that view.

FieldCache Insanity

The term “FieldCache insanity” refers to the same logical data being cached more than once in a different form, taking up twice as much memory as needed. Apache Solr currently suffers from FieldCache insanity when the same field is sometimes used to facet and sometimes used to sort. Faceting uses a top-level FieldCache entry and sorting uses a per-segment FieldCache entry.

nCache top-level off-heap string cache

Top-level string support has just been added to Heliosearch’s nCache to enable fast faceting and other operations that benefit from global ords.

  • Off-heap data lowers garbage collection pauses and GC overhead.
  • Sorting and faceting on the same field does not cause insanity!
  • Speeds up faceting
  • Slightly speeds up sorting
  • Can save memory (string values only appear once instead of being duplicated across segments)
  • enables future native-code optimizations (Unlike Java arrays, off-heap data can be transparently accessed from native code)

Sorting with a top-level FieldCache entry

You can force Heliosearch to use a top-level FieldCache entry via the top() function.
For example, instead of specifying

sort=myfield_s desc

use

sort=top(myfield_s) desc

NOTE: Although the top() function exists in Solr, the functionality was removed and it is currently a no-op.

Forcing a top-level FieldCache entry for sorting is not something one would normally explicitly need to do. If a top-level entry for a string field already exists, it will be used even if top() was not specified. Faceting on a single-valued string field will automatically use a top-level string cache, there is nothing you need to specify.

Heliosearch avoids insanity by using a slice/view of the top-level string cache if a per-segment string cache is requested. This avoids the overhead of data duplication.

Sorting performance

Benchmark details:

  • Ubuntu Linux 13.10, Java 1.7, quad-core CPU
  • 10M document index
  • Documents consist of an ID field, and 6 different single-valued string fields with varying numbers of unique values ranging from 10 to 1 Million
  • Client had 4 request threads
  • Each individual client request uses a random field to make the test realistic and to avoid the JVM overspecializing the code for any given field
  • Solr versions: Apache Solr 4.8.1, Heliosearch/Solr snapshot (based on Solr 4.9)

 
sorted_query_latency

 
Although the performance increase going from a per-segment to a top-level cache may be small for sorting, it’s essentially free if that top-level cache is needed for something else like fast faceting. We also previously covered the performance difference between Solr and Heliosearch for string sorting.

Faceting Performance

UPDATE: native code faceting has been implemented and results in even better performance than shown below.

Using the same set of fields, we also tested the performance of faceted search on the single valued string fields (which now automatically use the new off-heap nCache when on Heliosearch). The result was a 23% increase in request throughput! The chart below shows faceted request latency broken out by percentiles.

 
faceted_request_latency
 

Memory Performance

The chart below shows the memory usage of Solr and Heliosearch after running both a faceting and sorting test concurrently. The “Max Process Size” was observed via “top” during the entire test, and “Min JVM Heap Size” was obtained by waiting for the tests to finish, then attaching jconsole to the server and forcing garbage collections until the smallest in-use heap size was obtained.

 
memory_for_facet_sort

 

Conclusion

We started developing off-heap data structures for Solr (via Heliosearch) with the goal of solving many people’s JVM garbage collection problems, and enabling future native code optimizations. The performance increases to both sorting and faceting that we’ve now seen are a very nice added bonus!

If you try out Heliosearch on your own project, drop by the user list and let us know how it went, or stop in at the dev list to help further development!