Here’s an overview of some of the new features in Solr 5.2. Also see Solr download links and upcoming features of the next Solr release. Solr caches can be limited by memory use: caches using the LRUCache implementation can specify a new parameter, maxRamMB, that will evict based on RAM […]

Solr 5.2 Features


The percentile aggregation function was just added to the new Solr Facet Module. This allows one to calculate one or more percentiles for each facet bucket (i.e. each group of documents produced by faceting), and even sort facet buckets by any given percentile. The percentile aggregation even works with distributed […]

Percentiles for Solr Faceting
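
As a rough illustration of the per-bucket percentiles described above, here is a minimal sketch that builds a request body for the Facet Module's JSON syntax. The field names `cat` and `price`, the facet label `buckets`, and the helper `percentile_facet_request` are all hypothetical, chosen only for this example; the sketch builds the JSON and does not contact a Solr server.

```python
import json


def percentile_facet_request(bucket_field, value_field, percentiles, sort_percentile=None):
    """Build a JSON facet body with one percentile aggregation per bucket.

    Hypothetical helper for illustration; field names are assumptions.
    """
    # One named aggregation per percentile, e.g. "p50": "percentile(price,50)"
    stats = {f"p{p}": f"percentile({value_field},{p})" for p in percentiles}
    facet = {"type": "terms", "field": bucket_field, "facet": stats}
    if sort_percentile is not None:
        # Sort buckets by one of the named percentile aggregations
        facet["sort"] = f"p{sort_percentile} desc"
    return {"buckets": facet}


body = percentile_facet_request("cat", "price", [50, 99], sort_percentile=99)
print(json.dumps(body, indent=2))
```

The resulting JSON would typically be sent as the value of the `json.facet` request parameter.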


The Solr 5 Tutorial is here. Getting Started with Solr: a Simple Solr Tutorial. Note: this tutorial is for Solr 4. 1. Download Solr: download Apache Solr 4. You only need to download the single .ZIP or .TGZ file and extract it anywhere you like; no installation is required! […]

Getting Started with Solr



Noggit is the world’s fastest streaming JSON parser for Java. Noggit is the streaming JSON parser used in Solr. It lives here on GitHub. JSON features and extensions: Noggit supports a number of extensions to the JSON grammar. All of these extensions are optional and may be disabled. Comments Unquoted […]

Noggit, the JSON Streaming Parser


Lucene/Solr trunk (the future 6.0 release) is now on Java8, while version 5.x is still on Java7. Linux and Windows allow one to install a JDK anywhere in the filesystem, and I use the convention of installing in /opt/jdk7 and /opt/jdk8. Things are a little more difficult on Mac […]

Switching between Java7 and Java8 in Lucene/Solr


Solr 4.10 and Heliosearch .07 have added a terms query (or terms filter) to more efficiently match many terms in a single field. A large number of terms are often useful for things like access control lists or security filters. Previously, the only way to do this was a large […]

Solr Terms Query for matching many terms
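
For readers who want to see what such a filter looks like, here is a minimal sketch that formats a list of values as a terms query using the `{!terms f=...}` local-params syntax. The field name `id`, the sample values, and the helper `terms_filter` are hypothetical; the sketch only builds the parameter string and does not send a request.

```python
from urllib.parse import urlencode


def terms_filter(field, values):
    """Format values as a terms query on a single field (hypothetical helper)."""
    terms = [str(v) for v in values]
    for term in terms:
        # The comma is the separator in this sketch, so it cannot appear inside a term
        if "," in term:
            raise ValueError(f"term contains the separator: {term!r}")
    return "{!terms f=%s}%s" % (field, ",".join(terms))


fq = terms_filter("id", [10, 20, 30])
print(fq)  # {!terms f=id}10,20,30

# URL-encode the parameters before placing them in a request URL
query_string = urlencode({"q": "*:*", "fq": fq})
print(query_string)
```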



Native code faceting for Solr has just been added to Heliosearch, and benchmarks show an impressive 2x performance increase! This is faceting code written in C++ and statically compiled for maximum performance, and loaded into the JVM via JNI (Java Native Interface). nCache, Heliosearch’s off-heap version of the Lucene/Solr FieldCache, […]

Native Code Faceting


Solr needs a flexible cross-datacenter architecture that can handle both a variety of application needs and a variety of infrastructure resources. Design goals: accommodate 2 or more data centers; accommodate active/active uses; accommodate limited-bandwidth cross-datacenter connections; minimize coupling between peer clusters to increase reliability; support both full […]

Solr Cross Data Center Replication


The filter caching features in Solr allow for precise control over how filter queries are handled in order to maximize performance. Solr can specify whether a filter is cached, the order in which filters are evaluated, and whether a filter is applied as a post filter. Solr Filter Queries: Adding a filter expressed as […]

Solr Filter Caching
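
As a small illustration of the controls mentioned above, here is a sketch that prefixes a filter query with the `cache` and `cost` local params. The query `inStock:true`, the cost value, and the helper `filter_query` are hypothetical, and the helper assumes the query does not already start with its own `{!...}` prefix.

```python
def filter_query(query, cache=True, cost=None):
    """Prefix a filter query with cache/cost local params (hypothetical helper)."""
    if query.startswith("{!"):
        # Merging into an existing local-params block is out of scope for this sketch
        raise ValueError("query already has a local-params prefix")
    params = []
    if not cache:
        params.append("cache=false")  # do not store this filter in the filter cache
    if cost is not None:
        params.append(f"cost={cost}")  # lower-cost non-cached filters are evaluated first
    if not params:
        return query
    return "{!%s}%s" % (" ".join(params), query)


print(filter_query("inStock:true"))                        # inStock:true
print(filter_query("inStock:true", cache=False))           # {!cache=false}inStock:true
print(filter_query("inStock:true", cache=False, cost=50))  # {!cache=false cost=50}inStock:true
```

Each returned string would be passed as an `fq` parameter. Post filtering additionally depends on the query type supporting it, which this sketch does not check.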



Lucene/Solr background: Lucene has a segmented architecture. When a small number of documents is added to an existing index, this will often just add an additional small segment to the index. Caching data structures at the segment level (e.g. field values used for sorting) is often desirable so that […]

Off-Heap FieldCache Faceting and Sorting