Solr 5.2 Features


Here’s an overview of some of the new features in Solr 5.2.
Also see Solr download links and upcoming features of the next Solr release.

Solr caches can be limited by memory use

Caches using the LRUCache implementation can specify a new parameter, maxRamMB, that evicts entries based on RAM use rather than the number of elements in the cache. Least recently used items are evicted until RAM use is brought under the limit.
RAM use calculations do not currently cover the cache keys, so using this for the query cache and caching large queries can still lead to greater memory use than expected.
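
To make the eviction rule concrete, here is a minimal Python sketch of an LRU cache bounded by an estimated memory budget instead of an entry count. It is an illustration only, not Solr's LRUCache code: the class name RamBoundedLRU and the use of sys.getsizeof as a size estimate are assumptions made for this example. Like the behavior described above, it counts only the values, not the keys.

```python
import sys
from collections import OrderedDict


class RamBoundedLRU:
    """Toy LRU cache that evicts by estimated value size (hypothetical, not Solr code)."""

    def __init__(self, max_ram_bytes):
        self.max_ram_bytes = max_ram_bytes
        self.ram_used = 0
        self._items = OrderedDict()  # least recently used entry first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key][0]

    def put(self, key, value):
        if key in self._items:
            # Replacing an entry: give back the old value's estimated size first.
            self.ram_used -= self._items.pop(key)[1]
        size = sys.getsizeof(value)  # rough estimate; keys are deliberately not counted
        self._items[key] = (value, size)
        self.ram_used += size
        # Evict least recently used entries until usage is back under the limit.
        while self.ram_used > self.max_ram_bytes and self._items:
            _, (_, evicted_size) = self._items.popitem(last=False)
            self.ram_used -= evicted_size

    def __len__(self):
        return len(self._items)
```

Because keys are left out of the estimate, a cache keyed by very large objects would use more memory than ram_used suggests, which is the same caveat that applies to the query cache.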

Restoring an index backup

To make a backup, we can send a request to the replication handler:

curl -XPOST "http://localhost:8983/solr/demo/replication?command=backup&name=my_backup100"

This creates a backup of the index, named snapshot.my_backup100, in Solr’s data directory (the directory can be changed via the location parameter).

This index snapshot can later be restored with the following command:

curl -XPOST "http://localhost:8983/solr/demo/replication?command=restore&name=my_backup100"
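
The two requests differ only in the command parameter, so they are easy to build from a script. The following Python sketch only constructs the URLs and does not contact Solr. The helper name replication_url is hypothetical, and the host, core name, and backup name are taken from the curl commands above.

```python
from urllib.parse import urlencode

SOLR_CORE_URL = "http://localhost:8983/solr/demo"  # core URL from the commands above


def replication_url(command, name, location=None):
    """Build a replication handler URL for a backup or restore request."""
    params = {"command": command, "name": name}
    if location is not None:
        # Optional directory, i.e. the location parameter mentioned above.
        params["location"] = location
    return SOLR_CORE_URL + "/replication?" + urlencode(params)


backup_url = replication_url("backup", "my_backup100")
restore_url = replication_url("restore", "my_backup100")
print(backup_url)
print(restore_url)
```

Either URL can then be sent as a POST request with any HTTP client, as the curl commands do.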

Flatter request structure for the JSON Facet API

Here’s an example of a terms facet in Solr 5.1:

top_authors : { terms : {
  field : author,
  limit : 5
}}

In the Solr 5.2 JSON Facet API, the “type” can optionally be specified in the same object as the facet arguments:

top_authors : {
  type : terms,
  field : author,
  limit : 5
}
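
The relationship between the two forms can be sketched as a small conversion function. This is a hypothetical client-side helper written for illustration, not part of Solr; it handles a single facet object and does not recurse into nested sub-facets.

```python
def flatten_facet(facet):
    """Convert {type: {args}} (5.1 style) into {"type": type, **args} (flat 5.2 style)."""
    if "type" in facet:
        return dict(facet)  # already in the flat form
    if len(facet) != 1:
        raise ValueError("expected exactly one facet type key")
    (facet_type, args), = facet.items()
    return {"type": facet_type, **args}


nested = {"terms": {"field": "author", "limit": 5}}
flat = flatten_facet(nested)
print(flat)
```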

Range facet mincount

Range facets now support the mincount parameter to screen out range facet buckets that don’t meet a minimum document count.

prices:{
  type:range,
  field:price,
  mincount:1,
  start:0, end:100, gap:10
}
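
The effect of mincount can be shown with a small in-memory simulation. This is a sketch of the idea, not Solr's implementation: the function name range_buckets is hypothetical, and it assumes each bucket includes its lower bound and excludes its upper bound.

```python
def range_buckets(values, start, end, gap, mincount=0):
    """Count values into [low, low + gap) buckets and drop buckets below mincount."""
    counts = {low: 0 for low in range(start, end, gap)}
    for v in values:
        if start <= v < end:
            low = start + ((v - start) // gap) * gap
            counts[low] += 1
    return [{"val": low, "count": c} for low, c in counts.items() if c >= mincount]


prices = [5, 12, 15, 99]  # hypothetical price values
all_buckets = range_buckets(prices, start=0, end=100, gap=10)
non_empty = range_buckets(prices, start=0, end=100, gap=10, mincount=1)
print(len(all_buckets), len(non_empty))
```

Without mincount, all ten buckets come back, most of them empty; with mincount set to 1, only the buckets that contain at least one price remain.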

unique() support for numerics and dates

The unique facet function now works on numeric and date fields.
Example:

json.facet={
  num_codes : "unique(error_code)"
}

Multi-select Faceting

Multi-select faceting is a powerful faceting style that allows users to see and select multiple facet constraints (facet values) for a facet. For example, a user may want to select multiple price ranges or multiple colors.

The new Facet Analytics Module / JSON Facet API now supports multi-select faceting via filter exclusions. A new excludeTags parameter makes a facet disregard any top-level filters with matching tags.

Here’s a Multi-Select Faceting Example, using the JSON Facet API.
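
As a rough illustration of filter exclusion, the Python sketch below builds the parameters for a request in which a tagged filter restricts the results while the facet on the same field ignores that filter. The field name color and the tag name COLOR are assumptions chosen for the example; the code only assembles the query string and does not send it.

```python
import json
from urllib.parse import urlencode

facet = {
    "colors": {
        "type": "terms",
        "field": "color",        # hypothetical field name
        "excludeTags": "COLOR",  # ignore filters tagged COLOR when counting
    }
}

params = [
    ("q", "*:*"),
    ("fq", "{!tag=COLOR}color:blue"),  # top-level filter, tagged so it can be excluded
    ("json.facet", json.dumps(facet)),
]

query_string = urlencode(params)
print(query_string)
```

With this setup the document list contains only blue items, while the colors facet still reports counts for every color, so the user can pick additional values.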

HyperLogLog based distributed cardinality

Both the older Stats component and the new Facet Analytics Module now support HyperLogLog-based statistical cardinality estimates.
For the JSON Facet API, a new hll facet function was added as an alternative to the existing faster (but less accurate for high cardinality) unique function. Example:

json.facet={ numProducts : "hll(product_id)" }

See Solr Count Distinct functionality for examples that calculate the number of distinct values in a given field per facet bucket.
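
A per-bucket distinct count can be requested by nesting the hll function under a terms facet. The Python sketch below assembles such a json.facet parameter; the category field cat and the facet names are assumptions made for the example, and nothing is sent to Solr.

```python
import json
from urllib.parse import urlencode

facets = {
    "categories": {
        "type": "terms",
        "field": "cat",  # hypothetical bucketing field
        "facet": {
            # Estimated number of distinct product ids within each bucket.
            "numProducts": "hll(product_id)",
        },
    }
}

params = {"q": "*:*", "rows": 0, "json.facet": json.dumps(facets)}
query_string = urlencode(params)
print(query_string)
```

Replacing hll with unique in the nested facet requests the other distinct-count function with the same request structure.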

“facet.range.method” (traditional query-parameter API)

A new “facet.range.method” parameter lets users choose how range faceting is performed: with an implementation based on filters (the previous algorithm, selected with “facet.range.method=filter”) or one based on DocValues (“facet.range.method=dv”). Both methods accept the same input parameters and produce the same output.
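
Since only one parameter changes between the two implementations, switching methods is a one-line change in a client. The Python sketch below builds both parameter sets for a range facet on a price field; the field name and range values are assumptions borrowed from the earlier range facet example, and the helper name range_facet_params is hypothetical.

```python
from urllib.parse import urlencode


def range_facet_params(method):
    """Query parameters for a price range facet using the given facet.range.method."""
    if method not in ("filter", "dv"):
        raise ValueError("facet.range.method must be 'filter' or 'dv'")
    return {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.range": "price",
        "facet.range.start": 0,
        "facet.range.end": 100,
        "facet.range.gap": 10,
        "facet.range.method": method,
    }


filter_params = range_facet_params("filter")
dv_params = range_facet_params("dv")
# Collect the parameter names whose values differ between the two requests.
changed = {k for k in dv_params if dv_params[k] != filter_params[k]}
print(urlencode(dv_params))
```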

Raw JSON/XML DocTransformers

If you have a field value that consists of well-formed XML or JSON, you can return those raw values in the appropriate response writer.
Example: ?fl=id,name,json_s:[json],xml_s:[xml]

Rule-Based Replica Assignment

This new SolrCloud feature allows the specification of rules which govern placement of replicas in the cluster.
Rules are specified during collection creation and persisted in ZooKeeper.

See the blog post from LucidWorks for further details and examples.

Streaming Expressions

Solr Streaming Expressions add an expression-based interface to the Streaming API introduced in Solr 5.1.

Some examples:

// merge two distinct searches together on common fields
merge(
  search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
  on="a_f asc, a_s asc")

// find top 20 unique records of a search
top(
  n=20,
  unique(
    search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
    over="a_f desc"),
  sort="a_f desc")

See the Solr Reference Guide for more documentation.
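
For readers new to the streaming model, the merge, unique, and top expressions in this section can be approximated in memory with Python's standard library. This is an analogy under stated assumptions, not how Solr executes the expressions: the sample tuples are invented, and top is simplified to taking the first n tuples of a stream that is already sorted.

```python
import heapq
from itertools import groupby, islice

# Hypothetical tuples standing in for two search results, each sorted by (a_f, a_s).
stream1 = [
    {"id": 0, "a_s": "hello0", "a_f": 0.0},
    {"id": 3, "a_s": "hello3", "a_f": 3.0},
    {"id": 4, "a_s": "hello4", "a_f": 4.0},
]
stream2 = [
    {"id": 1, "a_s": "hello1", "a_f": 1.0},
    {"id": 2, "a_s": "hello2", "a_f": 2.0},
]


def sort_key(t):
    """Mirror the sort "a_f asc, a_s asc" used by both searches."""
    return (t["a_f"], t["a_s"])


# merge(...): interleave two sorted streams without re-sorting everything.
merged = list(heapq.merge(stream1, stream2, key=sort_key))


def unique_over(tuples, field):
    """Keep the first tuple of each run of equal values; input must be sorted on field."""
    for _, group in groupby(tuples, key=lambda t: t[field]):
        yield next(group)


def top(n, tuples):
    """Take the first n tuples of an already sorted stream (simplified)."""
    return list(islice(tuples, n))


by_a_f_desc = sorted(merged, key=lambda t: t["a_f"], reverse=True)
top_two = top(2, unique_over(by_a_f_desc, "a_f"))
print([t["id"] for t in merged], [t["id"] for t in top_two])
```

The point to notice is that merge and unique both rely on their inputs already being sorted on the relevant fields, which is why each search in the expressions above specifies a sort matching the on and over parameters.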

Solr Security

Solr 5.2 adds an authentication framework and a Kerberos authentication module. See the Security section of the Solr Reference Guide.