yonik

nCache: Heliosearch has a new replacement for the Lucene FieldCache currently used by Solr for sorting, faceting, and function queries. Introducing nCache (n is for “native”): nCache has off-heap data structures, just like the Off-Heap Filters, to lower garbage collection pauses and GC overhead. nCache is a managed cache, meaning […]

nCache: Heliosearch/Solr Off-Heap FieldCache

February 3, 2014 in heliosearch / off-heap data / performance / search / solr tagged fieldCache / nCache / off-heap / off-heap FieldCache / solr garbage collection / solr off-heap by yonik (updated on April 28, 2015)

Off-Heap Native Filters is the first feature we added to Heliosearch, a new open source project designed to bring Solr performance to the next level. Big JVM heaps can be Big Trouble: JVMs have never been good at dealing with large heaps. Large heaps mean lots of garbage collection work, […]

Heliosearch/Solr Off-Heap Filters

January 13, 2014 in filters / heliosearch / native code / performance / search / solr tagged garbage collection / GC pauses / off-heap / off-heap filters / solr garbage collection by yonik (updated on April 28, 2015)

The Solr 5 Tutorial is Here. Getting Started with Solr: a Simple Solr Tutorial. Note: this tutorial is for Solr 4. 1. Download Solr: Download Apache Solr 4. You only need to download the single .ZIP or .TGZ file and extract it anywhere you like – no installation is required! […]

Getting Started with Solr

April 18, 2013 in solr tagged solr 4 examples / solr 4 guide / solr 4 tutorial by yonik (updated on April 28, 2015)

The filter caching features in Solr allow for precise control over how filter queries are handled in order to maximize performance. Solr lets you specify whether a filter is cached, the order in which filters are evaluated, and post filtering. Solr Filter Queries: Adding a filter expressed as […]

Advanced Filter Caching in Solr

February 10, 2013 in lucene / search / solr tagged filters / frange / function queries / geo search / lucene / post filter / solr / solr 4.0 / solr performance / spatial search by yonik (updated on April 28, 2015)

Background: I needed a really good hash function for the distributed indexing in SolrCloud. Because it is used for partitioning documents, it needed to be really high quality (well distributed), as we don’t want uneven shards. It also needed to be cross-platform, so a client could calculate this hash […]

MurmurHash3 for Java

September 15, 2011 in analytics / java / solr tagged murmur3 / MurmurHash3 / murmurhash3 128 / murmurhash3 64 / murmurhash3 java by yonik (updated on May 18, 2015)

Solr took another step toward increasing its NoSQL datastore capabilities with the addition of realtime get. Background: As readers probably know, Lucene/Solr search works off of point-in-time snapshots of the index. After changes have been made to the index, a commit (or a new Near Real Time softCommit) needs to […]

Solr’s Realtime Get

September 7, 2011 in lucene / search / solr tagged NoSQL / realtime / solr / solr 4.0 by yonik (updated on April 28, 2015)

Lucene’s default ranking function uses factors such as tf, idf, and norm to help calculate relevancy scores. Solr has now exposed these factors as function queries. docfreq(field,term) returns the number of documents that contain the term in the field. termfreq(field,term) returns the number of times the term appears in the […]

Solr relevancy function queries

March 10, 2011 in lucene / search / solr tagged function query / lucidworks / Similarity / solr / solr 4.0 by yonik (updated on April 28, 2015)
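
As a rough illustration of the two functions named in the excerpt above, here is a minimal Python sketch over a toy in-memory corpus. It is not Solr's implementation: the `docs` structure and the explicit `doc_id` argument are assumptions made for the example (in Solr, termfreq(field,term) is evaluated for each document being scored, so no document id is passed).

```python
from collections import Counter

# Hypothetical toy corpus: doc id -> field name -> list of already-tokenized terms.
docs = {
    1: {"text": ["solr", "search", "solr"]},
    2: {"text": ["lucene", "search"]},
    3: {"text": ["solr", "cache"]},
}


def docfreq(field, term):
    """Count the documents whose field contains the term at least once."""
    return sum(1 for fields in docs.values() if term in fields.get(field, []))


def termfreq(doc_id, field, term):
    """Count how many times the term occurs in the field of one document."""
    return Counter(docs[doc_id].get(field, []))[term]


if __name__ == "__main__":
    print(docfreq("text", "solr"))      # documents containing "solr"
    print(termfreq(1, "text", "solr"))  # occurrences of "solr" in document 1
```

The sketch keeps the two counts separate on purpose: docfreq is a property of the whole index, while termfreq depends on the individual document.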

I previously introduced Solr’s Result Grouping, also called Field Collapsing, that limits the number of documents shown for each “group”, normally defined as the unique values in a field or function query. Since then, there have been a number of bug fixes, performance improvements, and feature enhancements. You’ll need a […]

Solr Result Grouping / Field Collapsing Improvements

December 17, 2010 in lucene / search / solr / Uncategorized tagged field collapsing / lucidworks / result grouping / solr / solr 4.0 by yonik (updated on April 28, 2015)

Solr has been able to produce JSON results for a long time, by adding wt=json to any query. A new capability has recently been added to allow indexing in JSON, as well as issuing other update commands such as deletes and commits. All of the functionality that was available through […]

Indexing JSON in Solr 3.1

December 8, 2010 in Uncategorized by yonik (updated on April 28, 2015)

Result Grouping, also called Field Collapsing, has been committed to Solr! This functionality limits the number of documents for each “group”, usually defined by the unique values in a field (just like field faceting). You can think of it like faceted search, except instead of just getting a count, you […]

Solr Result Grouping / Field Collapsing

September 16, 2010 in search / solr tagged field collapsing / geo search / result grouping / solr / solr 4.0 / spatial search by yonik (updated on April 28, 2015)

Solr 'n Stuff

Open source search and analytics