Elasticsearch and Apache Solr are open source search engines and the most widely used search servers. This page explains the Apache Solr schema. (See this post for Solr-related resources.)
Let us first look at what an (XML) schema is: a way to define the structure, content, and, to some extent, the semantics of XML documents.
(Elasticsearch index configuration is done with HTTP/JSON commands, with no files required. You define types, mappings, and analysis with simple commands.)
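As a rough illustration of what such an HTTP/JSON command can look like, here is a minimal Python sketch that builds (but does not send) a create-index request. The node address `localhost:9200`, the index name `news`, the analyzer name `lowercase_standard`, and the field names are all assumptions made for this example. It also assumes the typeless mapping format used by recent Elasticsearch versions, so check it against the version you run.

```python
import json
import urllib.request

# Assumptions: a local node at localhost:9200 and a hypothetical index "news".
ES_INDEX_URL = "http://localhost:9200/news"

# Analysis settings and field mappings are sent together as one JSON body.
index_definition = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Hypothetical analyzer name; "standard" and "lowercase" are built in.
                "lowercase_standard": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "lowercase_standard"},
            "body": {"type": "text", "analyzer": "lowercase_standard"},
            "tags": {"type": "keyword"},
            "article_date": {"type": "date"},
        }
    },
}


def build_create_index_request(url, definition):
    """Return a PUT request carrying the index definition as JSON."""
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_create_index_request(ES_INDEX_URL, index_definition)
print(request.get_method(), request.full_url)
# Sending it needs a running node: urllib.request.urlopen(request)
```

The point of the sketch is only that the whole index configuration travels in a single request body, with no configuration file on disk.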
Solr index configuration is done through two files: schema.xml and solrconfig.xml.
- schema.xml – defines the schema of the documents that are indexed (ingested) into Solr, i.e. the set of fields they contain and the datatype of each field. A news article, for example, may contain a title, body, tags, and article date. It configures the document structure (a document is made of fields, and each field has a field type) and how each field type is processed during indexing and querying.
- solrconfig.xml – contains the request handlers and other configuration options. Handlers are URLs that execute plugins (Java code) with their default configuration.
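To make the two files more concrete, here is a small sketch for the news-article example. The XML fragments are hypothetical and heavily trimmed: the field names, the choice of field types, and the handler defaults are assumptions for illustration, not a recommended configuration. The Python code just parses them with the standard library to show how fields point to field types and how a handler is a URL path bound to a plugin class with defaults.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed schema.xml for a news article (field names are assumptions).
SCHEMA_XML = """\
<schema name="news" version="1.6">
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="pdate" class="solr.DatePointField"/>
  <fieldType name="text_general" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="title" type="text_general" indexed="true" stored="true"/>
  <field name="body" type="text_general" indexed="true" stored="true"/>
  <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
  <field name="article_date" type="pdate" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>
"""

# Hypothetical solrconfig.xml fragment: one handler reachable at the path /select.
SOLRCONFIG_XML = """\
<config>
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="df">body</str>
      <int name="rows">10</int>
    </lst>
  </requestHandler>
</config>
"""


def describe_fields(schema_xml):
    """Return (field name, field type name, field type class) for every field."""
    root = ET.fromstring(schema_xml)
    type_classes = {ft.get("name"): ft.get("class") for ft in root.iter("fieldType")}
    return [
        (f.get("name"), f.get("type"), type_classes[f.get("type")])
        for f in root.iter("field")
    ]


def describe_handlers(config_xml):
    """Map each handler path to its plugin class and default parameters."""
    root = ET.fromstring(config_xml)
    handlers = {}
    for handler in root.iter("requestHandler"):
        defaults = handler.find("lst[@name='defaults']")
        params = {} if defaults is None else {p.get("name"): p.text for p in defaults}
        handlers[handler.get("name")] = (handler.get("class"), params)
    return handlers


for name, type_name, type_class in describe_fields(SCHEMA_XML):
    print(f"{name}: {type_name} ({type_class})")
print(describe_handlers(SOLRCONFIG_XML))
```

In this sketch the analyzer inside `text_general` is what "how field types are processed during indexing and querying" refers to: text is tokenized and lowercased, while `string` fields such as `tags` are kept verbatim.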
The resources below give good explanations of basic Solr concepts, including the Solr schema.
- schema.xml (pdf) – a very good, concise explanation of schema.xml in the Solr configuration.
- Basic Solr Concepts (pdf) – a concise introduction that covers indexing and the schema.
- Solr Concept and Architecture (pdf) – covers the Solr schema and configuration.
- Solr Tutorial – a pretty good Solr tutorial; see the website for more useful posts about Solr.
- Getting Started with Solr (pdf)
- Cores, Collections and Clusters
- Setting Up Your Index
- Adding Documents
- Querying the Index
This section discusses how Solr organizes its data into documents and fields, as well as how to work with a schema in Solr.
This section includes the following topics:
Overview of Documents, Fields, and Schema Design: An introduction to the concepts covered in this section.
Solr Field Types: Detailed information about field types in Solr, including the field types in the default Solr schema.
Defining Fields: Describes how to define fields in Solr.
Copying Fields: Describes how to populate fields with data copied from another field.
Dynamic Fields: Information about using dynamic fields in order to catch and index fields that do not exactly conform to other field definitions in your schema.
Schema API: Use curl commands to read various parts of a schema or create new fields and copyField rules.
Other Schema Elements: Describes other important elements in the Solr schema.
Putting the Pieces Together: A higher-level view of the Solr schema and how its elements work together.
DocValues: Describes how to create a docValues index for faster lookups.
Schemaless Mode: Automatically add previously unknown schema fields using value-based field type guessing.
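The field, dynamic field, copyField, and Schema API topics listed above can be tied together in one request. The sketch below uses Python instead of curl and only builds the request without sending it. The Solr address `localhost:8983`, the collection name `news`, and the field names are assumptions for this example; `text_general` and `string` are assumed to exist as field types in the collection's schema. The Schema API only modifies a managed schema, so it does not apply to a hand-edited schema.xml.

```python
import json
import urllib.request

# Assumptions: Solr on localhost:8983 and a hypothetical collection "news".
SCHEMA_API_URL = "http://localhost:8983/solr/news/schema"

# One POST body may carry several Schema API commands.
schema_commands = {
    # Explicitly defined fields.
    "add-field": [
        {"name": "title", "type": "text_general", "stored": True},
        {"name": "body", "type": "text_general", "stored": True},
        {"name": "all_text", "type": "text_general", "stored": False, "multiValued": True},
    ],
    # Any otherwise undefined field ending in "_s" is indexed as a string.
    "add-dynamic-field": {"name": "*_s", "type": "string", "stored": True},
    # Copy title and body into one catch-all field at index time.
    "add-copy-field": [
        {"source": "title", "dest": "all_text"},
        {"source": "body", "dest": "all_text"},
    ],
}


def build_schema_request(url, commands):
    """Return a POST request carrying the Schema API commands as JSON."""
    body = json.dumps(commands).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


schema_request = build_schema_request(SCHEMA_API_URL, schema_commands)
print(schema_request.get_method(), schema_request.full_url)
print(json.dumps(schema_commands, indent=2))
# Sending it needs a running Solr: urllib.request.urlopen(schema_request)
```

With a setup like this, a document field such as `author_s` would need no explicit definition, and a query against `all_text` would search title and body together.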
- How does SOLR work? What is an explanation for the principle in layman’s terms? (pdf) – gives a pretty good explanation of how Solr works, including the schema.
- What is an Elasticsearch / Apache Solr index ? (pdf)
- What are Elasticsearch and Apache Solr ? (pdf)
- Top 10 Performance Tips for Apache SOLR (pdf)
- Solr in 5 minutes (pdf) (this website has more Solr resources in its right navigation bar; some that I selected are listed below.)