Unofficial Solr Guide


This is the Unofficial Solr Guide

What is Solr?

Apache Solr is a popular, blazing-fast, open-source NoSQL search platform that originated in the Apache Lucene project.

Getting Started

Solr is a snap to install – simply download and extract the .zip file!

Try this simple Solr Tutorial

 

Historical Note

This document was written a long time ago, before Solr’s documentation was well organized, although its links have been updated recently. Thanks to the hard work of many Solr devs, there is now very good official documentation in the form of the Apache Solr Reference Guide!

Solr Features


Document Oriented

Document-oriented storage enables high scalability. Solr is data-format agnostic and is not tied to any particular serialization. Documents can be added to Solr in JSON, XML, CSV, or binary format.
See this Solr Tutorial for an example of updating Solr with both JSON and CSV.
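
As a minimal sketch of what format-agnostic input means in practice, the snippet below serializes the same two documents as a JSON body and as a CSV body. The document ids and field names (title, price) are made up for illustration, and nothing is sent to a server here.

```python
import csv
import io
import json

# Hypothetical documents; the field names are illustrative only.
docs = [
    {"id": "book-1", "title": "Solr in Practice", "price": 29.5},
    {"id": "book-2", "title": "Search Basics", "price": 15.0},
]

# JSON body: a JSON array of documents.
json_body = json.dumps(docs)

# CSV body: a header row naming the fields, then one row per document.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["id", "title", "price"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(docs)
csv_body = buffer.getvalue()

print(json_body)
print(csv_body)
```

Either body would typically be posted to a collection's /update handler with a matching Content-Type header; the exact URL depends on your installation.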

Distributed

Split a big index across multiple machines and query it as if it were a single document collection.

Fault Tolerant

There are no single points of failure. Documents are replicated to multiple nodes for fault-tolerance, high availability, and increased query scalability.

Atomic Updates

Atomic field modifiers change individual fields of a document without resending the whole document, for highly scalable document modification.
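
A rough sketch of an atomic update body follows. The modifiers set, inc, and add are standard Solr atomic update operations; the document id and the field names (price, copies_sold, tags) are assumptions made up for this example.

```python
import json

# Hypothetical document id and field names.
atomic_update = [
    {
        "id": "book-1",
        "price": {"set": 24.99},       # replace the stored value
        "copies_sold": {"inc": 1},     # increment a numeric field
        "tags": {"add": "bestseller"}, # append to a multi-valued field
    }
]

body = json.dumps(atomic_update)
print(body)
```

Only the listed fields are changed; fields that are not mentioned keep their current values.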

Optimistic Concurrency

Versioning and conditional updates based on document versions.
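
Solr's conditional updates are driven by the _version_ field sent with the update. The function below is only a local model of the documented rules, not Solr's own code; the name update_allowed is hypothetical.

```python
from typing import Optional


def update_allowed(requested_version: int, stored_version: Optional[int]) -> bool:
    """Model of the _version_ rules; stored_version is None if the doc is absent."""
    exists = stored_version is not None
    if requested_version > 1:
        # The stored version must match exactly.
        return exists and stored_version == requested_version
    if requested_version == 1:
        # The document must already exist (any version).
        return exists
    if requested_version < 0:
        # The document must not exist yet.
        return not exists
    # A value of 0 means no version check at all.
    return True


print(update_allowed(1632740120218042368, 1632740120218042368))
print(update_allowed(-1, None))
```

When a check like this fails on the server, Solr rejects the update with a version conflict (HTTP 409), and the client is expected to re-read the document and retry.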

Faceted Search

Dynamic category counting for search results.
Solr lets you slice-and-dice on the fly!
Also called guided navigation.
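
For illustration, a facet request is an ordinary query with a few extra parameters. This sketch only builds the query string; the field names category and author are assumed, not taken from any real schema.

```python
from urllib.parse import urlencode

# A list of pairs is used because facet.field is repeated once per field.
facet_params = [
    ("q", "*:*"),                 # match every document
    ("rows", "0"),                # return counts only, no documents
    ("facet", "true"),            # turn faceting on
    ("facet.field", "category"),  # hypothetical field
    ("facet.field", "author"),    # hypothetical field
]

facet_query = urlencode(facet_params)
print(facet_query)
```

The response would then carry a count for each distinct value of the listed fields, which is what drives guided navigation in a search UI.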

Hit Highlighting

Also called “keyword in context”, this feature returns snippets of documents with matching query terms highlighted.
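
A small sketch of the request parameters that switch highlighting on. The hl, hl.fl, and hl.snippets parameters are standard; the content field and the query term are placeholders.

```python
from urllib.parse import urlencode

highlight_params = {
    "q": "content:lucene",  # hypothetical field and term
    "hl": "true",           # enable highlighting
    "hl.fl": "content",     # field(s) to build snippets from
    "hl.snippets": "2",     # at most two snippets per field
}

highlight_query = urlencode(highlight_params)
print(highlight_query)
```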

Spatial Search

Find documents within a certain distance from a given point on Earth.
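
One common way to express such a distance filter is the geofilt query parser, sketched below. The field name location and the coordinates are assumptions; d is the radius in kilometers.

```python
from urllib.parse import urlencode

field = "location"         # hypothetical spatial field
lat, lon = 45.15, -93.85   # center point (latitude, longitude)
radius_km = 5

# Filter query that keeps only documents within radius_km of the point.
geo_filter = f"{{!geofilt sfield={field} pt={lat},{lon} d={radius_km}}}"

spatial_query = urlencode({"q": "*:*", "fq": geo_filter})
print(geo_filter)
print(spatial_query)
```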

Full-Text Search

Solr uses Lucene as its primary index format to provide world-class full-text search capabilities.

Pseudo-Join

Although Solr is primarily document oriented, we recognize that certain database operations like JOIN
can be the right tool for the job in some circumstances. This functionality selects a set of documents based on their
relationship to a second set of documents. See also the related block-join support for nested objects.
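
As a hedged example, the join query parser takes a from field, a to field, and an inner query. The schema here is invented: book documents carry an author_id field that points at the id of author documents.

```python
from urllib.parse import urlencode

from_field = "author_id"    # hypothetical field on book documents
to_field = "id"             # field on author documents
inner_query = "title:solr"  # selects the first set (books)

# Returns documents whose to_field matches a from_field value
# found in the documents selected by inner_query.
join_q = f"{{!join from={from_field} to={to_field}}}{inner_query}"

join_query = urlencode({"q": join_q})
print(join_q)
```

Read it as: find the books matching title:solr, collect their author_id values, and return the documents whose id is one of those values.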

Grouping

This feature limits the number of documents shown per category. For example, one could limit the number of website pages shown per domain or the number of pages shown per book.
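
The website example could be expressed with request parameters along these lines. The domain field is an assumption; group, group.field, and group.limit are standard grouping parameters.

```python
from urllib.parse import urlencode

group_params = {
    "q": "solr",
    "group": "true",          # enable result grouping
    "group.field": "domain",  # hypothetical field: one group per domain
    "group.limit": "3",       # show at most three pages per domain
}

group_query = urlencode(group_params)
print(group_query)
```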

Streaming Expressions

Streaming Expressions in Solr are powerful building blocks for arbitrary distributed computation. They are also used to implement Solr’s Parallel SQL capabilities.
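
To give a feel for the syntax, here is a sketch that builds one simple expression, a search stream source, and encodes it as the expr parameter. The collection name books and its fields are hypothetical.

```python
from urllib.parse import urlencode

# A single stream source: search a collection and emit sorted tuples.
expr = 'search(books, q="title:solr", fl="id,title", sort="id asc")'

# The expression travels as the "expr" parameter of the request.
stream_body = urlencode({"expr": expr})
print(expr)
print(stream_body)
```

Expressions like this are sent to a collection's /stream handler, and larger computations are formed by wrapping stream sources in stream decorators.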