Apache Solr resources

Elasticsearch and Apache Solr are open source search engines, and they are the most widely used search servers. This post provides resources about Apache Solr.

Apache Solr is a fast open-source Java search server.

Solr enables you to easily create search engines that search websites, databases and files.

Solr (pronounced “solar”) is an open source enterprise search platform, written in Java, from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance. Solr is the second-most popular enterprise search engine after Elasticsearch.

Solr runs as a standalone full-text search server. It uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable from most popular programming languages. Solr’s external configuration allows it to be tailored to many types of application without Java coding, and it has a plugin architecture to support more advanced customization.

An Elasticsearch / Apache Solr index is the equivalent of a SQL table.

An Elasticsearch or Solr server (aka Solr instance, aka Solr engine) can maintain several indexes.

(Elasticsearch index configuration is done with HTTP / JSON commands. No files required. You define types, mappings, analysis with simple commands.)

In Apache Solr, each index is defined by a schema.xml file (it’s not mandatory in Solr 5/6, but recommended in production), and a solrconfig.xml file. The index schema is equivalent to a SQL table schema definition.  (See this post for Solr Schema related resources.)

An index contains several documents, equivalent to SQL table rows. Each document contains fields, equivalent to SQL table columns.
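To make the table/row/column analogy concrete, here is a minimal Python sketch that turns SQL-style rows into search documents. The column names and the sample rows are hypothetical and only serve the illustration; a real index would use the fields declared in its own schema.

```python
import json

# Hypothetical columns and rows, as they might come out of a SQL table.
columns = ("id", "title", "author")
rows = [
    (1, "First title", "Author A"),
    (2, "Second title", "Author B"),
]


def row_to_document(columns, row):
    """Map one SQL row to one document: each column becomes a field."""
    return dict(zip(columns, row))


# One document per row, one field per column.
documents = [row_to_document(columns, row) for row in rows]

# Documents are commonly exchanged with the search server as JSON.
print(json.dumps(documents, indent=2))
```

Each dictionary plays the role of one document, and each key plays the role of one field.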

When an index document is inserted/updated/deleted, we say it is “indexed”.

To retrieve documents from an index, Elasticsearch (JSON) and Apache Solr (XML, JSON) each provide an HTTP API with its own query syntax.

Elasticsearch and Apache Solr are web applications. A client uses their HTTP API to query or store data.
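As a sketch of what "query or store data over HTTP" can look like from a client, the Python code below builds (but does not send) a Solr update request and a Solr select URL using only the standard library. The core name "books" and the helper names are assumptions made for this example; the base URL is the one used elsewhere in this post, so adjust host and port to your installation.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

SOLR_BASE = "http://localhost:8983/solr"  # adjust host/port to your install
CORE = "books"  # hypothetical index (core) name


def build_update_request(docs, commit=True):
    """Build a POST request that would index a list of documents as JSON."""
    query = urlencode({"commit": "true" if commit else "false"})
    url = f"{SOLR_BASE}/{CORE}/update?{query}"
    body = json.dumps(docs).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def build_select_url(query, rows=10):
    """Build a GET URL that would search the index and ask for JSON output."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{SOLR_BASE}/{CORE}/select?{urlencode(params)}"


update_request = build_update_request([{"id": 1, "title": "First title"}])
select_url = build_select_url("title:first")

print(update_request.get_method(), update_request.full_url)
print(select_url)
```

Passing `update_request` or `select_url` to `urllib.request.urlopen` would actually contact the server; the sketch stops short of that so it can run without a Solr instance.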

A full-text search engine is built from the ground up to tackle problems that a SQL search finds difficult or impossible. The list of such features is long: multi-language support, dedicated plugins to extend the engine, synonyms, stop words, facets, boosts, …

The core search engine of Elasticsearch and Apache Solr is Apache Lucene. The relationship between Elasticsearch / Apache Solr and Lucene is like that between a car and its engine.

You can access Solr admin from your browser: http://localhost:8983/solr/

Use the port number chosen during installation if it differs from 8983.
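If you are unsure which port your instance listens on, a small Python sketch like the following can build the admin URL for a given port and check whether anything answers there. The function names are made up for this example, and the check only tells you that some HTTP server replied, not that it is Solr.

```python
from urllib.error import URLError
from urllib.request import urlopen


def solr_admin_url(host="localhost", port=8983):
    """Return the admin URL for the given host and port."""
    return f"http://{host}:{port}/solr/"


def is_reachable(url, timeout=2.0):
    """Return True if an HTTP server answers at the URL without an error status."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False


if __name__ == "__main__":
    url = solr_admin_url(port=8983)
    print(url, "reachable" if is_reachable(url) else "not reachable")
```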

See below for some useful Solr related resources:

Check out his Unofficial Solr Guide (e.g., Solr 6.5 Features)

Configuring

Integrating Solr
