yonik


Solr 4.10 and Heliosearch 0.07 have added a terms query (or terms filter) to more efficiently match many terms in a single field. Matching a large number of terms is often useful for things like access control lists or security filters. Previously, the only way to do this was a large […]

Solr Terms Query for matching many terms
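
As a minimal sketch of what such a filter can look like on the client side, the snippet below builds a terms-query filter string using the `{!terms f=<field>}` local-params form with comma-separated values. The helper name `terms_filter` and the field name `acl_group` are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode


def terms_filter(field, values):
    """Build a Solr terms-query string of the form {!terms f=<field>}v1,v2,...

    Hypothetical helper for illustration only; it is not part of any Solr client.
    """
    values = list(values)
    for value in values:
        # This sketch relies on the comma as the separator between terms,
        # so it rejects values that contain a comma themselves.
        if "," in value:
            raise ValueError("value contains the separator: %r" % value)
    return "{!terms f=%s}%s" % (field, ",".join(values))


# Example: restrict results to documents visible to any of three groups.
# "acl_group" is an assumed field name.
params = urlencode({
    "q": "*:*",
    "fq": terms_filter("acl_group", ["eng", "ops", "admin"]),
})
```

The resulting `params` string can be appended to a select URL; the filter matches a document if the field contains any one of the listed terms.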


Native code faceting for Solr has just been added to Heliosearch, and benchmarks show an impressive 2x performance increase! This is faceting code written in C++ and statically compiled for maximum performance, and loaded into the JVM via JNI (Java Native Interface). nCache, Heliosearch’s off-heap version of the Lucene/Solr FieldCache, […]

Native Code Faceting


Solr needs a flexible cross-datacenter architecture that can handle both a variety of application needs and a variety of infrastructure resources. Design Goals: accommodate 2 or more data centers; accommodate active/active uses; accommodate limited-bandwidth cross-datacenter connections; minimize coupling between peer clusters to increase reliability; support both full […]

Solr Cross Data Center Replication



The filter caching features in Solr allow for precise control over how filter queries are handled in order to maximize performance. Solr lets you specify whether a filter is cached, the order in which filters are evaluated, and whether a filter runs as a post filter. Solr Filter Queries: Adding a filter expressed as […]

Solr Filter Caching
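
The controls mentioned in the excerpt above are set per filter through local params. As a rough sketch, assuming the `cache` and `cost` local params on an `fq` parameter, the hypothetical helper below prefixes a filter query with those settings. The field names in the example are made up.

```python
def filter_query(query, cache=True, cost=None):
    """Prefix a filter query with cache/cost local params when they are set.

    Hypothetical helper for illustration; it only builds the string.
    """
    local_params = []
    if not cache:
        # cache=false asks Solr not to store this filter in the filter cache.
        local_params.append("cache=false")
    if cost is not None:
        # Among non-cached filters, lower cost values are evaluated first.
        local_params.append("cost=%d" % cost)
    if not local_params:
        return query
    return "{!%s}%s" % (" ".join(local_params), query)


# A cached filter, and a non-cached one that should run after cheaper filters.
fqs = [
    filter_query("inStock:true"),
    filter_query("price:[10 TO 100]", cache=False, cost=50),
]
```

Each string in `fqs` would be sent as a separate `fq` request parameter.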


Lucene/Solr background: Lucene has a segmented architecture. When a small number of documents are added to an existing index, this will often just add an additional small segment to the index. Caching data structures at the segment level (e.g. field values used for sorting) is often desirable so that […]

Off-Heap FieldCache Faceting and Sorting


I’ve often seen mistaken descriptions of Solr as just “an HTTP wrapper around Lucene”. Unfortunately that mischaracterization was never nipped in the bud and has continued to be repeated in many places such as press articles (where it is picked up and repeated again). Of course people who […]

A History of Lucene and Solr



Macro Expansion is a new Solr 5.1 feature that does parameter substitution across all request parameters. The macro expansion is done at the same point in time that default parameters are applied (i.e. when the request reaches the correct Solr request handler). This means that request handler defaults, appends, and […]

Parameter Substitution / Macro Expansion


Solr 4.8 has been released. Here’s an overview of how to use some of the new features. Also see Solr download links and upcoming features of the next Solr release. Complex Phrase Queries: The complexphrase query parser can produce phrase queries with embedded wildcards and boolean queries. It works via […]

Solr 4.8 Features


Heliosearch’s off-heap FieldCache was previously introduced and benchmarked for integer fields. Support for all numeric field types as well as string fields has now been completed, and this post will focus on the performance of string fields. A review of nCache (n is for “native”) features and goals: nCache has […]

Heliosearch/Solr Off-Heap FieldCache Performance



Solr 4.7 has been released! Here’s a slightly more in-depth overview of some selected features. Deep Paging: Both single-node and distributed deep paging have been added to Solr! I previously created an example of how to use Solr’s deep paging, and Hoss has a great set of benchmarks showing […]

Solr 4.7 Features