Monday, February 27, 2006

SP2003: Third Search Engine?

I've blogged several times before about the 2 different search engines in SharePoint 2003. One search engine is used when you are on the main SharePoint Portal, while the other is used when you are on the Windows SharePoint Sites. This alone is frustrating, as users will get different results depending on where they are doing the search.

Yesterday a user discovered what appears to be a third search engine...


When you are on a Windows SharePoint Site (WSS), browse down to a Shared Document Library, and run a search on that page, you get yet another set of results. From what I can tell, the idea behind this search is that it is focused just on what is in that Shared Document Library. This also works if you browse to a specific list and run a search: that search covers only that list's contents. That would be fine if it were just a filtered version of the site search, but it appears to be yet another search engine. For instance, take a site that has only one Shared Document Library. If you run a search on the main site page, and then run the same search on the Shared Document Library page, you would expect the same results. Unfortunately you do not get the same results.

I looked around and could only find this:
http://www.codecomments.com/archive355-2005-8-520421.html

I followed the suggested quick fix, to just modify the templates so that the Shared Document Library page just runs the site search engine. This is somewhat frustrating as it is not a filtered search, but just an overall site search.

In
web server extensions\60\TEMPLATE\1033\STS\LISTS\DOCLIB\AllItems.aspx

replace
<SharePoint:ViewSearchForm ID="L_SearchView" Prompt="Search this view" Go="Go" runat="server"/>

with
<SharePoint:ViewSearchForm ID="L_SearchView" Prompt="Search on this Site" Go="Start" Action="searchresults.aspx" runat="server"/>

A number of questions are still outstanding on this one:
1) Why?
2) Where is this search algorithm? Is it a stored procedure in SQL Server?
3) Is it possible to modify the algorithm to be the same as the site search, or to focus the site search on just the document library when it is run from that page?

Tuesday, February 14, 2006

SP2003: Advanced Searching

With some work you can improve SharePoint's search features to include Boolean, Wildcard, and Phrase Searching.


SharePoint Portal Server 2003 has some very frustrating search quirks. The first stems from the fact that there are actually 2 search engines running on the site. I've discussed this before in the context of searching for PDFs:
http://sharepointblog.com/?p=6

The search engines also respond differently when trying to run advanced searches that include things like boolean commands (AND, OR, NOT) and enclosing phrases in quotes "" to find exact matches.

The two search engines are the main SharePoint Portal Server search and the Windows SharePoint Sites search. Both can handle a fairly basic search, but have limitations when it comes to advanced searching. Here are some differences between them:

SharePoint Portal Server search (from main SharePoint home page):
This search engine allows the use of quotes "" to enclose a phrase when searching for an exact match.
It does not allow Boolean searches (AND, OR, NOT)
There is an Advanced Search option that can be reached by clicking on the magnifying glass. The Advanced Search allows you to narrow your search by searching the properties of documents, lists, and other types of items. This feature does allow AND and OR Boolean searches, but is limited to file properties and will not work on file contents.

Windows SharePoint Sites search (from a team site):
This search engine will not use quotes "" to search for a phrase. Instead matches to any word in the phrase will be included in the search results.
It does not allow Boolean searches (AND, OR, NOT)
There is no Advanced Search page.

Here are some more specifics on the WSS search from the Help file:

The search engine automatically includes variations of words based upon the base stem, such as plurals. For example, searching for the word "page" also returns results for "pages."

You cannot use the asterisk "*" character.

The search engine does not support Boolean functions such as AND and OR.

The search engine automatically ignores common words such as "the," "it," and "by" as well as single-digit numbers.

The search engine is not case sensitive.

Attachments to list items do not appear in search results.

You cannot search for information in items (rows) that are not included in the current view. For example, if you search in a view that shows only items created by you, any items created by another user are not searched. However, searches include all information in the items that are not filtered, including information in columns that are hidden. Searches also include information in items that exceed the current view's item limit.

There may be an internal error on the search server. Contact your server administrator for more information.


The lack of Boolean capabilities in particular is very frustrating. It appears that the search uses a FREETEXT function. FREETEXT automatically stems all of the words (a search for "fish" will include "fishes" but not "fishbowl") and puts an OR between them. This makes it impossible to run Boolean searches with the standard search engines.
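To make the FREETEXT behaviour concrete, here is a toy JavaScript model of "stem every word, then OR them together". It is only a sketch: the function names `toyStem` and `freeTextLikeMatch` are made up for illustration, and the stemmer is deliberately naive compared to what a real full-text engine does.

```javascript
// Toy stemmer (an assumption for illustration): strips simple plural endings.
// Real full-text stemming is far more sophisticated than this.
function toyStem(word) {
  return word
    .toLowerCase()
    .replace(/(sh|ch|x|z|ss)es$/, "$1") // fishes -> fish, boxes -> box
    .replace(/s$/, "");                 // pages -> page
}

// FREETEXT-like matching: the text matches if ANY query word, after
// stemming, equals one of the stemmed words in the text (an implicit OR).
function freeTextLikeMatch(query, text) {
  var docStems = text.split(/\s+/).map(toyStem);
  return query.split(/\s+/).some(function (word) {
    return docStems.indexOf(toyStem(word)) !== -1;
  });
}
```

With this model, a query for "fish" matches text containing "fishes" but not "fishbowl", and a two-word query matches as soon as either word is present, which is why an AND search cannot be expressed.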

Looking around, I found this useful article:
http://wss.collutions.com/Lists/FAQ/DispForm.aspx?ID=420

It details the stored procedures used to perform the search... at least the Windows SharePoint Site search. The main procedure is called "proc_FetchDocSearchResults", and it contains the FreeTextTable function that is causing the problems. This article describes several ways to circumvent it and instead use the Boolean-friendly "ContainsTable" function.

There are 3 solutions provided, but each is lacking. The first destroys the existing procedure in favor of just using the ContainsTable function. The second uses the clever idea of just pushing Boolean searches to a new procedure with the ContainsTable function, while keeping all other searches in the standard procedure. The only thing lacking here is that it only works on AND and OR Boolean searches. The third option provides a lot more functionality, including NOT, quotes "" for phrase searching, and * wildcard searching. The only problem here is that it requires that you type "FT" in front of your search if you want to use the default WSS FreeText search; otherwise all searches are done in this new procedure.

I wanted one that defaulted to the standard WSS FreeText search, unless one of a variety of Advanced search features were used, including quotes "", Boolean (AND, OR, NOT) and wildcards *. To do this, I took the samples and created the following:
proc_FetchDocSearchResults
proc_FetchListItemSearchResults
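
The routing rule itself is simple enough to sketch outside of SQL. The JavaScript below is not the stored procedure code; it only mirrors the decision described above. The names `usesAdvancedSyntax` and `chooseSearchPath` are hypothetical, and treating only upper-case AND, OR and NOT as operators is an assumption made for this example.

```javascript
// Hypothetical helper: true if the query uses any of the advanced features
// (quoted phrase, Boolean operator, or * wildcard).
function usesAdvancedSyntax(query) {
  var hasQuotedPhrase = /"[^"]+"/.test(query);
  // Assumption: operators count only as upper-case whole words, so that
  // ordinary text such as "salt and pepper" stays on the default path.
  var hasBoolean = /\b(AND|OR|NOT)\b/.test(query);
  var hasWildcard = query.indexOf("*") !== -1;
  return hasQuotedPhrase || hasBoolean || hasWildcard;
}

// Default to the FreeText-style search; switch to the Contains-style
// search only when an advanced feature is detected.
function chooseSearchPath(query) {
  return usesAdvancedSyntax(query) ? "contains" : "freetext";
}
```

A plain query like "budget report" would stay on the FreeText path, while "budget AND report", a quoted phrase, or "fish*" would be sent to the Contains path.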

Wednesday, February 8, 2006

SP2003: Searching for PDF files, and other file types

SharePoint Portal Server can be configured to search PDF files, with some work...

SharePoint Portal Server 2003 has some strange search engine issues. The main issue is that there are actually 2 search engines.

Search Engine 1: SharePoint Portal Server search. This is the search that is run when you are on the main SharePoint home page. It is based on the old SharePoint Portal Server software that used to store everything in flat files, and it uses the SharePoint server indexing client to prepare files for searching.

Search Engine 2: Windows SharePoint Sites search. This is the search that runs when you are on a site, like a Team Site, or a Document Workspace. It is based on the old Team Services software which stored everything in SQL server. It uses the SQL server indexing client to prepare files for searching.

This is somewhat annoying as search results can be different depending on where you are in the system. It also requires that changes be made in 2 places.

By default, SharePoint can index Microsoft Office documents and several other file types. It can also be extended via "iFilters" to index other file types. The most common file type you may want to add for indexing is the Adobe Acrobat PDF.

You can download the PDF iFilter for free from Adobe's site. The link to version 6.0 is:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611

Adobe provides a nice README with installation instructions, but unfortunately it's packed in the installer, so you must install the iFilter before you can read how to install it. The key things to know from this document are the server requirements:

PDF iFilter 6.0 requires one of the following environments:

  • Microsoft Windows 2000 Professional, Service Pack 2

  • Microsoft Windows XP Professional, Service Pack 1

  • Microsoft Windows 2000 Server, Service Pack 3

  • Microsoft Windows 2003 Server

    Each of these environments must also contain Microsoft Indexing Services.
    In addition, PDF iFilter 6.0 has been tested in the following environments:

  • Microsoft Windows 2000 Server (Service Pack 3) with Microsoft Office SharePoint Portal Server 2001

  • Microsoft Windows 2003 Server with Microsoft Office SharePoint Portal Server 2003

    Once you've downloaded the PDF iFilter, and your server has all of the required service packs, you must first install the iFilter on the SharePoint server. Microsoft has provided detailed instructions on how to do this:
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;832809

    The only thing that document isn't clear on is the exact location of the PDF icons...
    So far I've discovered the following are needed:
    In web server extensions\60\TEMPLATE\IMAGES
    16x16 gif titled icpdf.gif
    16x16 gif titled PDF16.gif
    32x32 gif titled PDF32.gif

    Right-click on these images and choose "Save Picture As..." to download them.

    After you've finished that installation, searches on the main SharePoint home page should be successfully showing PDF files.

    Next you must install the iFilter on the SQL server. Make sure your SQL server has the latest service packs, or you may run into this issue:
    http://support.microsoft.com/default.aspx?scid=kb;en-us;323040
    This install is similar to the SharePoint server install, with a few differences.
    First stop the Indexing Service
    Next install the software
    Next register the dll
    Finally restart the Indexing Service

    NOTE: This may not work on SQL 2005. Apparently there are additional "security" measures that make it very very difficult to install iFilters on SQL Server 2005. I've found several solutions online, but so far none work... hope to update this soon with a real solution.

    If you've already installed the icons for the SharePoint install then you don't need to worry about installing the icons for the SQL install.
    You will need to re-index the existing files on the site, and this is done on the SQL server.
    The easiest way to do this is to use the SQL Server Enterprise Manager.
    Open the SharePoint Databases, and find any that have any Catalogs listed under "Full-Text Catalogs". For our site there was only one catalog, and it was under the Server_SITE database. Next, right-click on the "Full-Text Catalogs" icon and choose "Repopulate All Catalogs". This can take several hours as the server will re-index all existing files.

    Once the SQL server indexing is finished then PDF files will show up in searches done on the site pages.

    There are many iFilters out there for different file types. Many companies like Adobe provide free iFilters for their document types. Others can be purchased. One site for purchasing iFilters is: http://www.ifiltershop.com/



    Monday, February 6, 2006

    SP2003: Creating top level sites

    SharePoint has a strange way of organizing new sites. Typically new sites are added at a URL such as servername/sites/sitename. This is the default setup, and if you want to allow users to create sites then they will all follow this format. The problem is that servername/sites is not a site itself. This can be confusing and annoying. It is possible to make a top level site like this by following these steps:

    In Windows SharePoint Services Central Administration go to
    Virtual Server Configuration / Create a top-level Web site

    In the Virtual Server List, Click on the server name

    On the Create Top-level Web Site page, under Web Site Address, click on the link that appears in the sentence "To add a new URL Path go to the Define Managed Paths page".

    On the Define Managed Paths page, scroll to the bottom under Add a New Path
    Path: Type in the name for this top level site. If you want the final URL to be "servername/portals", then type in "portals".
    Type: Choose "Included path"
    Type: Choose Explicit inclusion
    Click OK

    This adds your new path (in this case "portals") to the list of Included Paths.

    Go back up to Windows SharePoint Services Central Administration by clicking the WSS quick link in the left column

    Go back to Virtual Server Configuration / Create a top-level Web site

    Click the server name again in the Virtual Server List

    This time in the Web Site Addresses section, choose "Create site at this URL:"
    URL path = choose the path you just set up (example: "portals")

    Fill in the Site Collection Owner, Secondary Owner, Quota Template and Site Language fields as needed, and click OK

    Your new top level site has been created. You are given a link where you can go to the site and choose a template. You can access the site via the URL servername/sitename (example: servername/portals).

    There are a few things to keep in mind when setting up sites like this:

    Microsoft's instructions warn: "Note: Web server performance declines linearly with the number of inclusions and exclusions. You can minimize the performance impact by using wildcard inclusions rather than many explicit inclusions, and by putting as many excluded applications under the same excluded path as possible."

    The list of inclusions also gets unwieldy after too many additions, as it doesn't sort in alphabetical order.

    Also, if you are on the main SharePoint site, and you choose "Sites" in the top menu bar, and then "Create Site" in the Actions list in the left hand quick link bar, you will only be allowed to create sites under paths created with Wildcard inclusions, such as the default "sites". You won't see paths created with Explicit inclusions. This is fairly annoying, as paths created with Wildcard inclusion are not sites themselves. They only have sites under them, so servername/sites is not a site, while servername/sites/sitename is a site.

    To create subsites to a new Explicit inclusion site, simply go to that site (servername/portals), choose Create in the top menu, and then choose Sites and Workspaces.
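
The difference between Wildcard and Explicit inclusions can be modelled in a few lines of JavaScript. This is only a sketch of the concept as described above, not how SharePoint resolves URLs internally; the `managedPaths` data shape and the `isTopLevelSite` function are assumptions for illustration, and it only handles single-segment paths.

```javascript
// Hypothetical model of the Included Paths list.
var managedPaths = [
  { path: "sites", type: "wildcard" },   // default: sites live one level below
  { path: "portals", type: "explicit" }  // the path itself is a site
];

// Returns true if the server-relative URL is the root of a top level site.
function isTopLevelSite(url, paths) {
  var segments = url.split("/").filter(function (s) { return s !== ""; });
  return paths.some(function (p) {
    if (segments[0] !== p.path) return false;
    if (p.type === "explicit") return segments.length === 1;
    return segments.length === 2; // wildcard: exactly one name under the path
  });
}
```

Under this model, /sites is not a site, /sites/sitename is, and /portals is a site in its own right. Anything deeper, such as /portals/sub, would be a subsite rather than a top level site.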

    Wednesday, February 1, 2006

    Finding Styles

    When editing the CSS for a site, I've found the "Style Under Cursor" by Todd Bleeker to be very helpful. You simply paste this code into a content editor webpart, and it will show you the styles applied to an item as you hover over it.
    http://mindsharpblogs.com/todd/archive/2005/10/25/798.aspx


    Below is the code:


    <script language="JavaScript">
    function elementInfo()
    {
    //Output CSS Class Hierarchy
    var currElement = window.event.srcElement;
    var classTree = "";
    var n = 50;

    //Show first n characters of tagName in TAG cell
    if(currElement.tagName != null)
    {
    ststag.innerText = "<" + currElement.tagName + ">";
    if(ststag.innerText.length > n)
    ststag.innerText = ststag.innerText.substring(0,n) + "...";
    }
    else
    ststag.innerText = "";

    //Show first n characters of id in ID cell
    if(currElement.id != null)
    {
    stsid.innerText = currElement.id;
    if(stsid.innerText.length > n)
    stsid.innerText = stsid.innerText.substring(0,n) + "...";
    }
    else
    stsid.innerText = "";

    //Show first n characters of name in NAME cell
    if(currElement.name != null)
    {
    stsname.innerText = currElement.name;
    if(stsname.innerText.length > n)
    stsname.innerText =
    stsname.innerText.substring(0,n) + "...";
    }
    else
    stsname.innerText = "";

    //Show entire class parentage in the CLASS cell
    if(currElement != null)
    {
    do
    {
    if(currElement.className != null &&
    currElement.className != "")
    {
    if(classTree != "")
    classTree = currElement.className + "\n" + classTree;
    else
    classTree = currElement.className;
    }
    currElement = currElement.parentElement;
    } while (currElement != null);
    stsclass.innerText = classTree;
    }
    else
    {
    stsclass.innerText = "";
    }
    }

    //Run code on all mouse over events
    window.document.body.onmouseover = elementInfo;
    </script>

    <table border="1" width="100%" height="220">
    <tr>
    <td valign="top">
    <a href="http://MindsharpBlogs.com/Todd"
    target="_blank" Title="Todd's Blog">
    <img src="/_layouts/images/pagelogo.gif"
    border="0"></img></a>
    </td>
    <td valign="top" width="100%">
    <table>
    <tr>
    <td>TAG:</td>
    <td id="ststag" width="100%"></td>
    </tr>
    <tr>
    <td>ID:</td>
    <td id="stsid"></td>
    </tr>
    <tr>
    <td>NAME:</td>
    <td id="stsname"></td>
    </tr>
    <tr>
    <td valign="top">CLASS:</td>
    <td id="stsclass"></td>
    </tr>
    </table>
    </td>
    </tr>
    </table>

    SP2003: Calendar layout

    Sometimes it is useful to have a calendar on the home page of a site, but the Calendar view of the Events list item is very large, and doesn't fit well on the home page. Read below to see how to change the layout of the Calendar so that it is smaller and fits in with the other web parts.

    I had a request to have a traditional calendar on the main page of a site. I set up an Events list, dragged that webpart to the home page, and then set the view to the Calendar view.
    When using the calendar view for a webpart the month layout takes up a ton of space. It is not possible to adjust the Height using the appearance settings. The user wanted it to take up less space.

    If you drag your mouse over the month view of the calendar you notice that it is structured to display 4 lines of text per day. If there are more than 4 appointments per day then it displays a "more..." in the 4th slot. This site calendar would rarely have 4 appointments, so showing only 3 would make things much smaller.

    The calendar view is hardcoded and difficult to change. It is rendered via JavaScript and the code for it is in the ows.js file.

    Looking through ows.js, I found the following:
    this.SetDate(yr, mon, day):
    if (this.iperiod == 0 ){
    this.cchanMin = 4;
    this.cchanMax = 4;
    }

    Searching online for "cchanMin", I discovered that yes, these are the min and max settings for the number of items that are displayed in the calendar view.
    http://www.sharepointu.com/forums/m_13078/mpage_1/tm.htm

    The "this.iperiod == 0" references the month view. There are also settings for the week and day views.

    First I made a backup of the ows.js file, and then changed the min and max settings both to 3:
    this.SetDate(yr, mon, day):
    if (this.iperiod == 0 ){
    this.cchanMin = 3;
    this.cchanMax = 3;
    }
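
To picture what the slot limit does, here is a small JavaScript sketch of the behaviour described above: a day shows up to a fixed number of lines, and when there are more appointments than lines, the last line becomes "more...". This is not the code from ows.js; the function `visibleSlots` and its parameters are hypothetical.

```javascript
// Hypothetical illustration of the month-view slot limit.
// appointments: array of appointment titles for one day
// maxSlots: number of lines available per day (4 by default, 3 after the change)
function visibleSlots(appointments, maxSlots) {
  if (appointments.length <= maxSlots) {
    return appointments.slice(); // everything fits
  }
  // Too many items: keep the first maxSlots - 1 and use the last line for "more..."
  return appointments.slice(0, maxSlots - 1).concat(["more..."]);
}
```

With five appointments, a limit of 4 would show three titles plus "more...", and a limit of 3 would show two titles plus "more...". Days with only one or two appointments look the same either way, just with less empty space below them.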

    In addition to this change, I also went to work on the CSS to tighten up some of the spacing.
    First I found the correct styles via this tool:
    http://sharepointblog.com/?p=4

    There is way more white space than needed in the calendar layout. Luckily that can all be addressed in the THEME.CSS file. I made the following changes to reduce the font size and the white space:

    .ms-caltop I changed height from 30px to 12px
    .ms-caltop I added font-size:7px;
    .ms-calhead I added font-size:10pt;
    .ms-calmid I changed height from 20px to 12px
    .ms-calmid I added font-size:7pt;
    .ms-calspacer I changed height from 4px to 9px
    .ms-appt I changed height from 18px to 14px
    .ms-apptsingle I changed height from 18px to 12px
    .ms-apptsingle I added font-size:7pt;

    These changes really tighten up the overall layout and make it much more front page friendly. Of course this is a change to a theme, so viewing the change requires that you first set the site to a different theme, then set it back to this theme, then hit F5 to reload the page and see the changes.

    I have yet to find a way to push an updated theme out to all sites. If you know of a way, please post it.