Understanding Crawl and Index Component Interactions in SharePoint 2013 – Part 1


In this post I describe the main components of SharePoint 2013 Search
and the databases they rely on.
As you may know, the Search components in SharePoint 2013 differ somewhat
from those in SharePoint 2010, because Microsoft re-architected the search
components and their dependencies.

I will explain the purpose of the 4 main components (marked with an asterisk below).
Understanding them will help when you plan your SharePoint farm deployment,
since Search is a crucial component to scale for SharePoint collaboration and
user adoption.

Main SharePoint 2013 Search Components:

1)      Crawl component*.

2)      Content Processing component*.

3)      Analytics processing component.

4)      Index component*.

5)      Query Processing component*.

6)      Search Administration component.

*These components are covered in this article.

SharePoint 2013 Databases:

Crawl database.

Link database.

Analytics reporting database.

Search Administration database.

First, let's look at how the crawl-side components interact. Here I will
talk about 3 main components:

Crawl Component

Content Processing component

Crawl database

Crawl component: The crawl component is responsible for crawling content sources. It
delivers crawled items, both the actual content and the associated
metadata, to the content processing component.

The crawl component invokes
connectors or protocol handlers that interact with content sources to retrieve
data. Multiple crawl components can be deployed to crawl simultaneously.

Note: The crawl component uses one
or more crawl databases to temporarily store information about crawled items
and to track crawl history.

Crawl database: The crawl database contains detailed tracking and historical information
about crawled items. This database holds information such as the last crawl
time, the last crawl ID and the type of update during the last crawl.
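
To make the tracking idea concrete, here is a minimal sketch of the kind of record described above. It is not the real crawl database schema: the `CrawlRecord` and `CrawlLog` names, the `content_hash` field and the update type labels are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict


@dataclass
class CrawlRecord:
    """Hypothetical tracking row; NOT the actual SharePoint crawl database schema."""
    url: str
    last_crawl_time: datetime
    last_crawl_id: int
    update_type: str   # assumed labels: "add", "modify", "unchanged"
    content_hash: str  # assumed way to detect a change between crawls


class CrawlLog:
    """In-memory stand-in for a crawl-tracking store."""

    def __init__(self) -> None:
        self._records: Dict[str, CrawlRecord] = {}

    def record(self, url: str, crawl_id: int, content_hash: str,
               now: datetime) -> CrawlRecord:
        previous = self._records.get(url)
        if previous is None:
            update_type = "add"        # first time this URL is seen
        elif previous.content_hash != content_hash:
            update_type = "modify"     # content changed since the last crawl
        else:
            update_type = "unchanged"  # nothing new to process
        rec = CrawlRecord(url, now, crawl_id, update_type, content_hash)
        self._records[url] = rec
        return rec
```

Keeping the last crawl ID and update type per item is what lets a later crawl decide whether an item needs to be sent for processing again.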

Content processing component: The content processing component is placed between the
crawl component and the index component. It processes crawled items and feeds
these items to the index component.

The content processing component transforms
crawled items into artifacts that can be included in the search index by
carrying out operations such as document parsing and property mapping.

Both the content processing component and
the query processing component perform linguistics processing. Examples of
linguistics processing during content processing are language detection and
entity extraction.

The content processing component writes information about
links and URLs to the link database.
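
The steps above (parsing, property mapping, language detection and writing links) can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: `PROPERTY_MAP`, `detect_language` and `process_item` are hypothetical names, the language detector is a toy, and the link database is modeled as a plain list.

```python
import re
from typing import Dict, List

# Hypothetical mapping from crawled property names to managed property names.
PROPERTY_MAP = {"dc:title": "Title", "dc:creator": "Author"}


def detect_language(text: str) -> str:
    """Toy detector: looks for a few common English words only."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return "en" if words & {"the", "and", "is"} else "unknown"


def process_item(item: Dict, link_db: List[str]) -> Dict:
    """Turn one crawled item into an artifact ready for indexing."""
    # Link handling: record every href found in the raw content.
    link_db.extend(re.findall(r'href="([^"]+)"', item["raw"]))

    # Document parsing: strip markup and collapse whitespace.
    text = " ".join(re.sub(r"<[^>]+>", " ", item["raw"]).split())

    # Property mapping: keep only properties that have a managed counterpart.
    managed = {PROPERTY_MAP[k]: v
               for k, v in item["metadata"].items() if k in PROPERTY_MAP}

    return {"url": item["url"], "text": text,
            "language": detect_language(text), **managed}
```

The real component does much more than this, but the shape is the same: a crawled item goes in, and an index-ready artifact plus link information comes out.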

The data flow between the crawl and index components looks like this:

Crawl component → store → Crawl DB

Crawl component → crawled items → Content processing component → content processing & extraction → Index component

Second, once the data has been processed and handed to the index component,
three other components interact to provide the search
capability in SharePoint:

Index Component

Index Partition

Query Processing Component

Index component: An index component is the logical representation of an index replica.

In the search architecture, you have to
provision one index component for each index replica.

The index component receives processed
items from the content processing component and writes those items to an index file.

The index component receives queries from
the query processing component and provides result sets in return.

Queries are sent to the index replicas through
the query processing component. The system routes and load balances the
incoming queries to the index replicas.
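
Here is a minimal sketch of the replica idea: every replica receives the same writes, and incoming queries are spread across replicas. The post does not say how SharePoint balances queries, so the round-robin policy below is my assumption, and `IndexReplica` and `ReplicaGroup` are hypothetical names.

```python
from itertools import cycle
from typing import Dict, List, Set


class IndexReplica:
    """Toy replica holding a term -> document IDs mapping."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.postings: Dict[str, Set[str]] = {}
        self.queries_served = 0

    def write(self, doc_id: str, terms: List[str]) -> None:
        for term in terms:
            self.postings.setdefault(term, set()).add(doc_id)

    def query(self, term: str) -> List[str]:
        self.queries_served += 1
        return sorted(self.postings.get(term, set()))


class ReplicaGroup:
    """All replicas hold the same data; queries rotate between them."""

    def __init__(self, replicas: List[IndexReplica]) -> None:
        self.replicas = replicas
        self._next = cycle(replicas)  # assumed round-robin policy

    def write(self, doc_id: str, terms: List[str]) -> None:
        for replica in self.replicas:  # every replica gets every item
            replica.write(doc_id, terms)

    def query(self, term: str) -> List[str]:
        return next(self._next).query(term)
```

Because each replica can answer any query for its data, adding replicas adds both query capacity and fault tolerance.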

Index partition: An index partition is a logical portion of the entire search index.

The search index is the aggregation of all
index partitions.
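
To illustrate "the search index is the aggregation of all index partitions", here is a small sketch where each document lives in exactly one partition and a search fans out to all of them. Assigning documents by a hash of the ID is my assumption for the example, not a statement of how SharePoint does it, and `partition_for` and `PartitionedIndex` are hypothetical names.

```python
import zlib
from typing import Dict, List, Set


def partition_for(doc_id: str, partition_count: int) -> int:
    """Stable hash-based assignment (an assumption made for this sketch)."""
    return zlib.crc32(doc_id.encode("utf-8")) % partition_count


class PartitionedIndex:
    """Each partition holds a disjoint slice of the documents."""

    def __init__(self, partition_count: int) -> None:
        self.partitions: List[Dict[str, Set[str]]] = [
            {} for _ in range(partition_count)
        ]

    def add(self, doc_id: str, terms: List[str]) -> None:
        partition = self.partitions[partition_for(doc_id, len(self.partitions))]
        for term in terms:
            partition.setdefault(term, set()).add(doc_id)

    def search(self, term: str) -> List[str]:
        # The full index is the union of what every partition returns.
        hits: Set[str] = set()
        for partition in self.partitions:
            hits |= partition.get(term, set())
        return sorted(hits)
```

Splitting the index this way lets it grow beyond what a single server can hold, at the cost of every query touching every partition.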

Query processing component: The query processing component is between the search
front-end and the index component.

The query processing component analyzes and
processes search queries and results.

Both the query processing component and the
content processing component perform linguistics processing. Examples of
linguistics processing during query processing are word-breaking and stemming.

When the query processing component
receives a query from the search front-end, it analyzes and processes the query
to attempt to optimize precision, recall, and relevancy. The processed query is
then submitted to the index component.

The index component returns a result set
based on the processed query back to the query processing component, which in
turn processes that result set before sending it back to the search front-end.
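
The round trip described above can be sketched as follows. This is a toy under stated assumptions: the word breaker and the suffix-stripping stemmer are deliberately naive, `index_lookup` is a hypothetical stand-in for the index component, and result processing is reduced to de-duplication and trimming.

```python
import re
from typing import Callable, List


def break_words(query: str) -> List[str]:
    """Naive word breaker: lowercase, split on non-word characters."""
    return re.findall(r"\w+", query.lower())


def stem(word: str) -> str:
    """Naive stemmer: strip a few English suffixes, keep at least 3 letters."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def process_query(query: str,
                  index_lookup: Callable[[str], List[str]],
                  limit: int = 10) -> List[str]:
    """Process the query, ask the index, then process the result set."""
    terms = [stem(word) for word in break_words(query)]

    results: List[str] = []
    seen = set()
    for term in terms:
        for doc_id in index_lookup(term):  # submit to the index
            if doc_id not in seen:         # drop duplicate hits
                seen.add(doc_id)
                results.append(doc_id)
    return results[:limit]                 # trim before returning to the front-end
```

For example, with a dictionary standing in for the index:

```python
index = {"crawl": ["d1", "d2"], "index": ["d2", "d3"]}
hits = process_query("Crawling Indexes!", lambda term: index.get(term, []))
```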

These are my thoughts on the main interacting Search components in SharePoint 2013. Drop me a line if you have any additions or questions.