Indexes, Content Sources, Source Groups, Search Scopes… Oh My!

I guess all of these cute little critters that make up the search experience aren’t all that straightforward.  Of course, if you already have them licked you can Mooooooove along.  This is just a basic overview of how the pieces of the puzzle go together.  If you want the details on the gatherer and search, check out the SharePoint Resource Kit, starting with chapter 21.

 

Let’s start at the beginning.  The index is the file that holds all of the information that is collected during a crawl.  When you first create a portal you get a portal_content and a non_portal_content index.  If you enable advanced search administration mode you can create other indexes if that makes sense in your environment.  Two things to keep in mind when creating indexes: the index name cannot contain a space, and you can specify an alternate location on the file system for the new index.
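
If you script or document your index names ahead of time, a tiny check like the one below can catch the space problem before you hit it in the admin pages.  This is only a sketch in Python: the function name is made up, and it checks nothing but the one rule mentioned above, so SharePoint may well enforce other naming rules that it does not cover.

```python
def is_valid_index_name(name: str) -> bool:
    """Check the single rule from the text: an index name has no spaces."""
    return bool(name) and " " not in name


print(is_valid_index_name("file_shares_index"))  # True
print(is_valid_index_name("file shares index"))  # False
```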

 

Just as a couple of side notes: if you want to move the default indexes you can use this blog entry from Berry Schreuder, http://bes.xs4all.nl/blog/archive/2004/07/18/258.aspx.  With catutil.exe you can also move the property store and log files.  More information from Microsoft is at http://support.microsoft.com/?kbid=825484.

 

The index is filled with data by the crawl, which is run against all of the content sources that you specify for that index.  Content sources can be SharePoint sites or portals, regular web sites, file shares, and even Exchange public folders.  The relationship between indexes and content sources is one index can have multiple content sources but a content source can only belong to one index.  So any time you create a new content source you must choose the index that will hold the data.
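
If it helps to see that one-to-many relationship written down, here is a minimal Python sketch.  It is a conceptual model only, not the SharePoint object model: the `Index` and `ContentSource` classes, the source names, and the addresses are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContentSource:
    name: str
    address: str
    index_name: str  # a content source belongs to exactly one index


@dataclass
class Index:
    name: str
    content_sources: list = field(default_factory=list)

    def add_content_source(self, name: str, address: str) -> ContentSource:
        # The index is chosen when the source is created, so record it there.
        source = ContentSource(name, address, self.name)
        self.content_sources.append(source)
        return source


# Hypothetical sources; the addresses are placeholders.
non_portal = Index("non_portal_content")
share = non_portal.add_content_source("Team Share", r"\\fileserver\team")
site = non_portal.add_content_source("Company Site", "http://www.example.com")

print([s.name for s in non_portal.content_sources])
print(share.index_name)
```

One index ends up holding several content sources, while each content source carries a single index name.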

 

Now that you have chosen an index and your options for the crawl configuration (we are skipping that part), you need to choose a source group to put your content source in.  Once again the relationship is one-to-many: a content source can only belong to one source group, but a source group can contain multiple content sources.  You will notice that when you fill out the address for the content source, the description and the source group are filled in automatically with information from your address (both fields are editable).  You can use this as a new source group (which will be created when you click Finish), change the default name to anything you like (which also creates a new source group), or choose one of the existing source groups from the list.

 

OK, now we have these indexes filled up with data from our content sources and divided up into source groups.  Why did we go through all of that effort?  Well, now we are going to create search scopes.  Search scopes are basically a way of defining a subset of sources you want to search.  If you look at your fresh portal you will see your search scopes next to the search box.  If you haven’t added a search scope yet you will only see “All Sources” in that box.  This search scope does what it says and returns results from all of your indexes.

To build our own search scope we go to Manage Search Scopes and click New Search Scope at the top of the screen.  Here we give the scope a name and choose either Include No Topic Or Area In This Scope or Limit The Search Scope To Items In The Following Topics Or Areas.  Once we select either of these options, the bottom section becomes available for choosing content sources.  There we can include none or all of our content sources in the scope, or we can take what is behind door number three and limit the scope to any of the source groups we would like.

One last tidbit on search scopes.  Sometimes they appear right away in the drop-down box where you execute searches; other times you have to do an IISreset to get them to show up.  So before you start banging your head on the desk when you create one and nothing happens, try the IISreset first.  :)  Those few brain cells you have left will thank you.
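
A rough way to picture a scope that is limited to source groups is as a filter over your content sources.  The Python sketch below is just that mental model, not how SharePoint implements scopes; the function name, the source names, and the "Intranet Sites" group are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentSource:
    name: str
    source_group: str  # each content source sits in exactly one source group


def sources_in_scope(all_sources, scope_groups=None):
    """Return the content sources a scope would search.

    scope_groups=None stands in for "All Sources"; otherwise only content
    sources whose source group is listed in the scope are kept.
    """
    if scope_groups is None:
        return list(all_sources)
    return [s for s in all_sources if s.source_group in scope_groups]


sources = [
    ContentSource("Company Site", "Intranet Sites"),
    ContentSource("File Share 1", "File Share Source Group"),
    ContentSource("File Share 2", "File Share Source Group"),
]

everything = sources_in_scope(sources)
shares_only = sources_in_scope(sources, {"File Share Source Group"})
print([s.name for s in shares_only])
```

Here `everything` contains all three sources, while `shares_only` contains just the two file shares.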

 

Now this kind of makes sense, right?  Well let’s do a quick example to drive it home.

 

My client wants to index 3 large file shares and make the results available in SharePoint through a search scope called Network Drives.  Here is a quick rundown of what I would do.

 
  1. Enable advanced search administration mode
  2. Create a new index named file_shares_index (this isn’t possible without advanced mode)
  3. Now I would create a new content source called File Share 1 and put it in the file_shares_index
  4. I would also create a new source group called File Share Source Group and put the content source in there
  5. I would now repeat step 3 for File Share 2 and File Share 3
  6. I would add both of those content sources to the File Share Source Group
  7. Now I would create a new search scope called Network Drives and I would choose to Include no topic or area in this scope
  8. I would also choose to Limit the scope to the following groups of content sources and would only select File Share Source Group
  9. Now I would do an IISreset (when nobody was on, of course) and my search scope would appear
  10. Also, so my search scope could have something to return, I would run a full update on my file_shares_index from the manage indexes screen.
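
To double check the plan before clicking through the admin pages, the end state of the steps above can be written out as plain data.  This Python sketch is only a paper model of the configuration (no SharePoint API is involved), and the dictionary layout and the `sources_for` helper are my own invention for illustration.

```python
# End state of the walkthrough, written as plain data.
index_name = "file_shares_index"

content_sources = {
    "File Share 1": {"index": index_name, "source_group": "File Share Source Group"},
    "File Share 2": {"index": index_name, "source_group": "File Share Source Group"},
    "File Share 3": {"index": index_name, "source_group": "File Share Source Group"},
}

# A scope is modeled as the set of source groups it is limited to.
search_scopes = {"Network Drives": {"File Share Source Group"}}


def sources_for(scope_name):
    """List the content sources a scope reaches through its source groups."""
    groups = search_scopes[scope_name]
    return sorted(
        name
        for name, settings in content_sources.items()
        if settings["source_group"] in groups
    )


# Sanity checks that mirror the rules described in the post.
assert " " not in index_name  # index names cannot contain spaces
assert all(s["index"] == index_name for s in content_sources.values())

print(sources_for("Network Drives"))
# ['File Share 1', 'File Share 2', 'File Share 3']
```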
 

I hope this makes everything a little less fuzzy for people trying to get a handle on search config in SharePoint.

 

Can you guess what I did at my client today?  :)

  

Shane – The Farmer on the Dell

35 thoughts on “Indexes, Content Sources, Source Groups, Search Scopes… Oh My!”

  1. " The relationship between indexes and content sources is one index can have multiple content sources but a content source can only belong to one index. " While technically correct, that statement can be misleading. You can put the same content URL into two different SharePoint content source objects, and have the same content in two different indexes. That is one way I test new approaches to indexing the dotText blogs. I have one procedure that works quite well in one content source object, and the same source URL in the trial/test content source, and I don’t lose my working index when I reset the trial/test index. It seems like we bandy these words around like magicians’ handkerchiefs and white doves, which come and go as we manipulate the user. But keeping track of the real objects is very useful.

  2. So is there a way to verify the index when it says it’s got lots of documents in it (e.g. 5,000,000, from a Notes Database crawled via HTTP), but doesn’t return a single result in SharePoint?

