The filter caching features in Solr allow for precise control over how filter queries are handled in order to maximize performance. Solr has the ability to specify if a filter is cached, specify the order filters are evaluated, and specify post filtering. Solr Filter Queries Adding a filter expressed as […]
lucene
I’ve often seen mistaken descriptions of Solr as just “a http wrapper around Lucene”. Unfortunately that mischaracterization was never nipped in the bud early enough and has continued to be repeated in many places such as press articles (where it is picked up and repeated again). Of course people who […]
A History of Lucene and Solr
The filter caching features in Solr allow for precise control over how filter queries are handled in order to maximize performance. Solr has the ability to specify if a filter is cached, specify the order filters are evaluated, and specify post filtering. Solr Filter Queries Adding a filter expressed as […]
Advanced Filter Caching in Solr
Solr took another step toward increasing it’s NoSQL datastore capabilities, with the addition of realtime get. Background As readers probably know, Lucene/Solr search works off of point-in-time snapshots of the index. After changes have been made to the index, a commit (or a new Near Real Time softCommit) needs to […]
Solr’s Realtime Get
Lucene’s default ranking function uses factors such as tf, idf, and norm to help calculate relevancy scores. Solr has now exposed these factors as function queries. docfreq(field,term) returns the number of documents that contain the term in the field. termfreq(field,term) returns the number of times the term appears in the […]
Solr relevancy function queries
I previously introduced Solr’s Result Grouping, also called Field Collapsing, that limits the number of documents shown for each “group”, normally defined as the unique values in a field or function query. Since then, there have been a number of bug fixes, performance improvements, and feature enhancements. You’ll need a […]
Solr Result Grouping / Field Collapsing Improvements
Solr 1.4 contains a new feature that allows range queries or range filters over arbitrary functions. It’s implemented as a standard Solr QParser plugin, and thus easily available for use any place that accepts the standard Solr Query Syntax by specifying the frange query type. Here’s an example of a […]
Ranges over Functions in Solr 1.4
One of the many performance improvements in the upcoming Solr 1.4 release involves improved filtering performance. Solr 1.4 filters are both faster (anywhere from 30% to 80% faster to calculate intersections, depending on configuration), take less memory (40% smaller), and are more efficiently applied to the query during a search. […]
Filtered query performance increases for Solr 1.4
With CPU cores constantly increasing, there has been some major work done in Lucene/Solr to increase the scalability under multi-threaded load. Read-only IndexReaders One bottleneck was synchronization around the checking of deleted docs in a Lucene IndexReader. Since another thread could delete a document at any time, the IndexReader.isDeleted() call […]
Solr scalability improvements
Having performance issues with Solr’s faceted search and certain types of fields? Help has arrived in the form of a new Solr faceting algorithm! This new faceting implementation dramatically improves the performance of faceted search, making it suitable for a much wider range of applications. The existing multivalued field faceting […]