Solr Subfacets


Subfacets (also called Nested Facets) are a generalization of Solr’s pivot faceting: they let you attach additional facets to every bucket produced by a parent facet.

Subfacet advantages over pivot faceting:

  • Subfacets work with facet functions (statistics), enabling powerful real-time analytics
  • A subfacet can be added to any facet type (field, query, range)
  • A subfacet can be of any type (field/terms, query, range)
  • A given facet can have multiple subfacets
  • Just like top-level facets, each subfacet can have its own configuration (e.g. offset, limit, sort, stats)

Subfacet Syntax

Subfacets are part of the new Facet Module, and are naturally expressed in the JSON Facet API. Every facet command is actually a subfacet, since there is an implicit top-level facet bucket (the domain) defined by the documents matching the main query and filters. To add a subfacet, simply add a facet section to the parameters of any existing facet command.

For example, a terms facet on the “genre” field looks like:

  top_genres:{ 
    type: terms,
    field: genre,
    limit: 5
  }

To add a subfacet that finds the top 4 authors for each genre bucket:

  top_genres:{
    type: terms,
    field: genre,
    limit: 5,
    facet:{
      top_authors:{
        type: terms,
        field: author,
        limit: 4
      }
    }
  }
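
When requests are assembled in client code, the same nesting can be built programmatically. The following is a minimal Python sketch, not part of Solr itself: the helper name terms_facet is hypothetical, and the only Solr-specific parts are the keys already shown above (type, field, limit, facet).

```python
import json


def terms_facet(field, limit=None, subfacets=None):
    """Build a terms facet command as a plain dict (hypothetical helper)."""
    facet = {"type": "terms", "field": field}
    if limit is not None:
        facet["limit"] = limit
    if subfacets:
        # Subfacets live under the "facet" key of the parent facet command.
        facet["facet"] = subfacets
    return facet


# Top 5 genres, and for each genre bucket the top 4 authors.
top_genres = terms_facet(
    "genre",
    limit=5,
    subfacets={"top_authors": terms_facet("author", limit=4)},
)

# The string to send as the value of the json.facet request parameter.
json_facet = json.dumps({"top_genres": top_genres})
print(json_facet)
```

Because every facet command is itself a dict, further levels can be nested by passing another subfacets mapping to the inner call.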

Complex Subfacet Examples

Assume we want to make the following complex faceting request:

  • Facet on the “genre” field and find the top buckets
  • For every “genre” bucket generated above, find the top 7 authors
  • For every “genre” bucket, create a bucket of high popularity items (defined by a popularity of 8 to 10) and call it “highpop”
  • For every “highpop” bucket generated above, find the top 5 publishers

In short, this request finds the top authors for each genre and the top publishers for high popularity books in each genre.
Using the JSON Facet API, the full request (using curl) would look like the following:

$ curl http://localhost:8983/solr/query -d 'q=*:*&
json.facet=
{
  top_genres:{ 
    type: terms,
    field: genre,
    facet:{
      top_authors: {
        type : terms,  // nested terms facet
        field: author,
        limit: 7
      },
      highpop:{
        type : query,               // nested query facet
        q: "popularity:[8 TO 10]",  // lucene query string
        facet:{
          publishers:{
            type: terms,   // nested terms facet under the nested query facet
            field: publisher,
            limit: 5
          }
        }
      }
    }
  }
}
'
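
For readers who prefer to issue the request from code, here is a rough Python equivalent of the curl command, using only the standard library. It is a sketch under assumptions: the URL is copied from the curl example and may need a collection name in your deployment, the function names build_request and run are made up for illustration, and the response is assumed to be JSON with a top-level "facets" key as in the example response in this article.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint, copied from the curl example; adjust for your deployment.
SOLR_QUERY_URL = "http://localhost:8983/solr/query"

FACETS = {
    "top_genres": {
        "type": "terms",
        "field": "genre",
        "facet": {
            # nested terms facet
            "top_authors": {"type": "terms", "field": "author", "limit": 7},
            # nested query facet with its own nested terms facet
            "highpop": {
                "type": "query",
                "q": "popularity:[8 TO 10]",
                "facet": {
                    "publishers": {"type": "terms", "field": "publisher", "limit": 5}
                },
            },
        },
    }
}


def build_request(url=SOLR_QUERY_URL, query="*:*", facets=FACETS):
    """Return a POST request carrying q and json.facet as form parameters."""
    params = {"q": query, "json.facet": json.dumps(facets)}
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(url, data=body)


def run(request):
    """Send the request and return the "facets" section of the response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["facets"]


if __name__ == "__main__":
    # Print the encoded form body without contacting a server.
    print(build_request().data.decode("utf-8"))
```

Calling run(build_request()) against a live Solr instance would perform the actual query; building the request separately makes it easy to inspect what is sent.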

An example response would look like the following:

[...]
  "facets":{
    "top_genres":{
      "buckets":[{
          "val":"Fantasy",
          "count":5432,
          "top_authors":{  // these are the top authors in the "Fantasy" genre
            "buckets":[{
                "val":"Mercedes Lackey",
                "count":121},
              {
                "val":"Piers Anthony",
                "count":98}]},
          "highpop":{  // bucket for books in the "Fantasy" genre with popularity between 8 and 10
            "count":876,
            "publishers":{  // top publishers in this bucket (highpop fantasy)
              "buckets":[{
                  "val":"Bantam Books",
                  "count":346},
                {
                  "val":"Tor",
                  "count":217}]}}},

        {
          "val":"Science Fiction",  // the next genre bucket
          "count":4188,
[...]
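
The response mirrors the nesting of the request, so client code can walk it with the same facet names. The sketch below is illustrative only: the sample dict repeats the "Fantasy" bucket shown above, the function name summarize is hypothetical, and the .get() calls are a defensive assumption in case a nested facet is missing from a bucket.

```python
# Sample data copied from the example response (first genre bucket only).
facets = {
    "top_genres": {
        "buckets": [
            {
                "val": "Fantasy",
                "count": 5432,
                "top_authors": {
                    "buckets": [
                        {"val": "Mercedes Lackey", "count": 121},
                        {"val": "Piers Anthony", "count": 98},
                    ]
                },
                "highpop": {
                    "count": 876,
                    "publishers": {
                        "buckets": [
                            {"val": "Bantam Books", "count": 346},
                            {"val": "Tor", "count": 217},
                        ]
                    },
                },
            }
        ]
    }
}


def pairs(facet):
    """Return (value, count) pairs for the buckets of a terms facet result."""
    return [(b["val"], b["count"]) for b in facet.get("buckets", [])]


def summarize(facets):
    """Flatten the nested buckets into one summary dict per genre."""
    rows = []
    for genre in facets["top_genres"]["buckets"]:
        highpop = genre.get("highpop", {})
        rows.append({
            "genre": genre["val"],
            "count": genre["count"],
            "authors": pairs(genre.get("top_authors", {})),
            "highpop_count": highpop.get("count", 0),
            "highpop_publishers": pairs(highpop.get("publishers", {})),
        })
    return rows


for row in summarize(facets):
    print(row["genre"], row["count"], row["authors"], row["highpop_publishers"])
```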

 
All of the reporting and sorting above used document counts (i.e. the number of books). If we instead wanted to find the top authors by total revenue (assuming we had a “sales” field), we could change the author facet from the previous example as follows:

      top_authors:{ 
        type: terms,
        field: author,
        limit: 7,
        sort: "revenue desc",
        facet:{
          revenue: "sum(sales)"
        }
      }
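
If facet commands are built in client code, the same change can be applied with a small helper. This is a sketch only: sort_by_stat is a hypothetical name, and it does nothing more than add the sort parameter and the named facet function shown above to a copy of an existing facet command.

```python
import copy


def sort_by_stat(facet, name, expression, direction="desc"):
    """Return a copy of a facet command sorted by a named facet function."""
    result = copy.deepcopy(facet)
    # The statistic is declared as a subfacet so that "sort" can refer to it by name.
    result.setdefault("facet", {})[name] = expression
    result["sort"] = f"{name} {direction}"
    return result


top_authors = {"type": "terms", "field": "author", "limit": 7}
by_revenue = sort_by_stat(top_authors, "revenue", "sum(sales)")
print(by_revenue)
```

The original top_authors dict is left untouched, so the count-sorted and revenue-sorted variants can be used side by side.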

 

Try it out

Facet functions and Subfacets are in Solr 5.1 and later, but the syntax used on this page requires Solr 5.3 or later. Download the latest release and give it a spin!