Apache Solr schema explained

Elasticsearch and Apache Solr are open source search engines and two of the most widely used search servers. This page explains the Apache Solr schema. (See this post for Solr-related resources.)

Let us first look at what an (XML) schema means: an XML schema is a way to define the structure, the content, and, to some extent, the semantics of XML documents.

(For comparison, Elasticsearch index configuration is done with HTTP/JSON commands, and no files are required: you define types, mappings, and analysis with simple commands.)

Solr index configuration is done through two files: schema.xml and solrconfig.xml.

  • schema.xml: defines the schema of the documents that are indexed (ingested) into Solr, i.e. the set of fields they contain and the data type of each field. A news article, for example, may contain a title, body, tags, and an article date. In other words, it configures the document structure (a document is made of fields, and each field has a field type) and how field types are processed during indexing and querying.
  • solrconfig.xml: contains the request handlers and other configuration options. Handlers are URL endpoints that execute plugins (Java code) with their default configuration.
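
To make the role of schema.xml more concrete, here is a minimal sketch for the news article example. The schema fragment is illustrative only: the field names (title, body, tags, article_date), the type names (string, pdate, text_general), and the analyzer chain are assumptions, not taken from any particular Solr installation. The Python code merely parses the fragment to show how each field points to a field type, which in turn points to a Java class.

```python
import xml.etree.ElementTree as ET

# Illustrative schema.xml fragment. All field and type names are assumptions
# chosen to match the news article example; a real schema will differ.
SCHEMA_XML = """
<schema name="news" version="1.6">
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="pdate" class="solr.DatePointField"/>
  <fieldType name="text_general" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="title" type="text_general" indexed="true" stored="true"/>
  <field name="body" type="text_general" indexed="true" stored="true"/>
  <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
  <field name="article_date" type="pdate" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>
"""


def summarize_schema(xml_text):
    """Map each field name to a (field type name, implementing class) pair."""
    root = ET.fromstring(xml_text.strip())
    # Field types are declared once and referenced by name from the fields.
    type_classes = {ft.get("name"): ft.get("class") for ft in root.iter("fieldType")}
    return {
        field.get("name"): (field.get("type"), type_classes[field.get("type")])
        for field in root.iter("field")
    }


if __name__ == "__main__":
    for name, (type_name, cls) in summarize_schema(SCHEMA_XML).items():
        print(f"{name}: type={type_name}, class={cls}")
```

The point of the sketch is the indirection: a field only names its type, and the type carries the class and, for text, the analysis steps applied at index and query time.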

See below for a good explanation of basic Solr concepts, including the Solr schema.

This section discusses how Solr organizes its data into documents and fields, as well as how to work with a schema in Solr.

This section includes the following topics:

Overview of Documents, Fields, and Schema Design: An introduction to the concepts covered in this section.

Solr Field Types: Detailed information about field types in Solr, including the field types in the default Solr schema.

Defining Fields: Describes how to define fields in Solr.

Copying Fields: Describes how to populate fields with data copied from another field.

Dynamic Fields: Information about using dynamic fields in order to catch and index fields that do not exactly conform to other field definitions in your schema.

Schema API: Use curl commands to read various parts of a schema or create new fields and copyField rules.

Other Schema Elements: Describes other important elements in the Solr schema.

Putting the Pieces Together: A higher-level view of the Solr schema and how its elements work together.

DocValues: Describes how to create a docValues index for faster lookups.

Schemaless Mode: Automatically add previously unknown schema fields using value-based field type guessing.
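
As a rough illustration of the Schema API, copy field, and dynamic field topics listed above, the sketch below builds the JSON body that such a request would carry. It assumes a Solr instance on localhost at the default port 8983, a hypothetical collection named "news", and field types named text_general and string; the field names are likewise made up for illustration. The request is only constructed, not sent, so the code runs without a Solr server.

```python
import json
import urllib.request

# Assumptions: a local Solr on the default port and a hypothetical
# collection called "news". Adjust both for a real installation.
SOLR_URL = "http://localhost:8983/solr"
COLLECTION = "news"


def build_schema_request(commands, collection=COLLECTION, base_url=SOLR_URL):
    """Build (but do not send) a POST request for the Schema API endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{collection}/schema",
        data=json.dumps(commands).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One request body can carry several schema commands.
# Field and type names here are illustrative assumptions.
COMMANDS = {
    # Two explicit fields: "title" and a catch-all search field.
    "add-field": [
        {"name": "title", "type": "text_general", "stored": True},
        {"name": "text_all", "type": "text_general", "stored": False,
         "multiValued": True},
    ],
    # Any field whose name ends in "_s" is treated as a string field.
    "add-dynamic-field": {"name": "*_s", "type": "string", "stored": True},
    # Copy the content of "title" into "text_all" at index time.
    "add-copy-field": {"source": "title", "dest": ["text_all"]},
}


if __name__ == "__main__":
    request = build_schema_request(COMMANDS)
    print(request.get_method(), request.full_url)
    print(json.dumps(COMMANDS, indent=2))
    # To actually apply the changes against a running Solr:
    # urllib.request.urlopen(request)
```

The same body could be sent with curl, as the Schema API topic suggests; Python is used here only so that the payload can be inspected before anything is sent.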

Check out his Unofficial Solr Guide for more useful tutorials and resources (e.g., Solr 6.5 Features)
