All posts tagged SharePoint 2010

SharePoint Crawl - Content Processing - Indexing

SharePoint crawl types: Full, Incremental, Continuous

Crawling is the process of gathering the content for search. To retrieve information, the crawl component connects to the content sources by using the proper out-of-the-box or custom connectors. After retrieving the content, the Crawl Component passes crawled items to the Content Processing Component.
There are three main types of SharePoint crawl: Full Crawl, Incremental Crawl and Continuous Crawl. Read more…

SharePoint Search is alive and well, as the market changes – by Jeff Fried, CTO of BA Insight

See the first part of this series here: Is SharePoint Search Dead?

Despite all these signals, Microsoft has continued to invest quite heavily in search, and SharePoint search is a remarkably capable and very affordable product. The search market was commoditized for quite some time, due largely to Microsoft and Google.

The market is changing, though. And there are a lot of positive signs for Microsoft in Enterprise Search. Read more…


Is SharePoint Search Dead? – by Jeff Fried, CTO of BA Insight

I’m asked regularly whether Microsoft has abandoned the Enterprise Search market. This was a frequent question in 2015, less frequent in 2016, but there’s been a recent uptick, and I got this question 10 times last month. As a long-standing search nerd who lives close to Microsoft, I know the answer is NO. But I was baffled about why this question keeps coming up.

So I decided to investigate. This blog takes you through what I’ve found and how you can answer the question when it comes up. Search Explained is the perfect place to publish it. Read more…

Time Machines vs. Incremental Crawl

Recently I’ve been working with a customer where my job was to make their SQL-based content management system searchable in SharePoint. Nice challenge. One of the best parts was what I call the “time machine”.

Imagine a nice, big environment where a full crawl takes more than two weeks. There are several points during such a project where we need a full crawl, for example when working with managed properties. But when a full crawl takes this long, it’s always a pain. You know, the kind where you can go on holiday while it’s running 😉

We were getting close to the end of the project, incrementals were scheduled, etc., but it turned out some items had recently been put into the database with an older “last modified date”. How can this happen? With some app, for example, or if users can work offline and upload their docs later (depending on the source system’s capabilities, these docs sometimes get the original time stamp and sometimes the upload time as their “last modified date”). If items arrive with linear “last modified dates”, incremental crawls are easy, but imagine this sequence:

  1. Full crawl, everything in the database gets crawled.
  2. Item1 has been added, last_modified_date = ‘2013-08-09 12:45:27’
  3. Item2 has been modified, last_modified_date = ‘2013-08-09 12:45:53’
  4. Incremental crawl at ‘2013-08-09 12:50:00’. Result: Item1 and Item2 crawled.
  5. Item 3 has been added, last_modified_date = ‘2013-08-09 12:58:02’
  6. Incremental crawl at ‘2013-08-09 13:00:00’. Result: Item3 crawled.
  7. Item4 has been added by an external tool, last_modified_date = ‘2013-08-09 12:45:00’.
    Note that this time stamp is earlier than the previous crawl’s time.
  8. Incremental crawl at ‘2013-08-09 13:10:00’. Result: nothing gets crawled.

The reason: Item4’s last_modified_date is older than the previous crawl’s time stamp, and the crawler assumes every change happened after that (i.e. no time machine is built into the backend 😉 ).

What to do now?

First option is: Full crawl. But:

  1. If a full crawl takes more than two weeks, it’s not always an option. We have to avoid it if possible.
  2. We can assume the very same thing can happen anytime in the future, i.e. docs appearing from the past, even from before the last crawl time. And a full crawl is not an option, see #1.

Obviously, the customer would like to see these “time travelling” items in the search results as well, but it looks like neither a full nor an incremental crawl is an option.

But consider this idea: what if we could trick the incremental crawl into thinking the previous crawl was not 10 minutes ago but a month ago (or two, or a year, depending on how old the newly appearing docs can be)? In that case, the incremental crawl would not check for new/modified items since the last incremental, but for a month (or two, or a year, etc.) back. A time machine, you know… 😉

Guess what? – It’s possible. The solution is not official and not supported, but it works. The “only” thing you have to do is modify the proper time stamps in the MSSCrawlURL table, setting them back to the date you want the next incremental crawl to start from.

Why? – Because the crawler determines the “last crawl time” by this table. If you trick the time stamps back, the crawler thinks the previous crawl was too long ago and goes back in time, to get the changes from that, longer period. And in this case, without doing a full crawl, you’ll get every item indexed, even the “time travelling” ones from the past.
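To make the mechanics concrete, here is a minimal Python simulation of a watermark-based incremental crawl (the item names and dates come from the sequence above; the real crawler keeps its time stamps per URL in the Search Admin database, so this is only the concept, not the implementation):

```python
from datetime import datetime

def incremental_crawl(items, last_crawl_time):
    """Return the items an incremental crawl would pick up: everything
    with a last_modified_date newer than the watermark."""
    return [name for name, modified in items if modified > last_crawl_time]

items = [
    ("Item1", datetime(2013, 8, 9, 12, 45, 27)),
    ("Item2", datetime(2013, 8, 9, 12, 45, 53)),
    ("Item3", datetime(2013, 8, 9, 12, 58, 2)),
    ("Item4", datetime(2013, 8, 9, 12, 45, 0)),  # back-dated by an external tool
]

# Incremental crawl at 13:10, previous crawl at 13:00:
# Item4 is invisible, its time stamp is older than the watermark
print(incremental_crawl(items, datetime(2013, 8, 9, 13, 0)))  # -> []

# "Time machine": set the watermark back a month and the next
# incremental picks up Item4 - without a full crawl
print(incremental_crawl(items, datetime(2013, 7, 9)))
# -> ['Item1', 'Item2', 'Item3', 'Item4']
```

Note the price of the trick: everything modified during the extended window gets re-crawled, so don’t set the watermark back further than you need.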

Ps. Same can be done if you have last_modified_date values in the future. The best docs from the future I’ve seen so far were created in 2127…

The problem in this case is that as soon as you crawl any of these, crawler considers 2127 as the last crawl’s year, and nothing created before (in the present) will get crawled by any upcoming incrementals. Until 2127, of course 😉

Related Posts:


Word Breakers in SharePoint – from version to version

Recently, I got a question about file naming conventions versus Search: which characters can really be used as word breakers, e.g. replacing spaces in file names (see Susan Hanley’s great post about file naming conventions here).

For example, let’s say you have a file named “My Cool Test Document.docx”. If you store this document in a file share or upload it to SharePoint and index it, you’ll be able to search for the words “cool” or “test” or “document”, and you’ll see this document in the result set.

Anyway, using this file name in a SharePoint document library will give you the crazy “%20” characters in the URL (see Susan’s post, again).

If we simply eliminate the spaces and use plain camel case in the file name (MyCoolTestDocument.docx), we won’t get the “%20”s, but the search engine will not be able to separate the words.

The third option is, of course, using some other character instead of spaces, for example underscore, like My_Cool_Test_Document.docx.

Well, I started to do some research and found that this area is not really well documented, so I decided to run some tests myself. I created some “lorem ipsum” documents with different names, including several special characters replacing the space.

My test details:

  • SharePoint versions I tested against
    • MOSS 2007 Enterprise
    • SP 2010 Enterprise, no FAST
    • FS4SP
    • SP2013 Enterprise
    • O365
  • Content sources:
    • File share
    • SharePoint document library
  • Characters used in file names: “-”, “_”, “.”, “&”, “%”, “+”, “#”

The results are absolutely consistent with Susan’s recommendations:

  1. Content source: file share

Character   MOSS 2007   SP2010   FS4SP    SP2013
-           yes         yes      yes      NO
_           NO          yes      yes      yes
.           yes         yes      yes      yes
&           yes         yes      yes      yes
%           yes         yes      yes      yes
+           yes         yes      yes      yes
#           yes         yes      yes      yes

  2. Content source: SharePoint document library

Character   MOSS 2007   SP2010   FS4SP    SP2013   O365
-           yes         yes      yes      yes      yes
_           NO          yes      yes      yes      yes
.           yes         yes      yes      yes      yes
&           invalid     invalid  invalid  invalid  invalid
%           invalid     invalid  invalid  invalid  invalid
+           yes         yes      yes      yes      yes
#           invalid     invalid  invalid  invalid  invalid

In these tables, YES means the character works fine as a word breaker: for example, searching for “cool” finds My_Cool_Test_Document.docx.

NO means the character does NOT work as a word breaker; the engine cannot split words on it. For example, the underscore “_” cannot be used as a word breaker in MOSS 2007, so it’s better to use a different character on this old version of SharePoint.

It’s quite interesting that “-” is not a word breaker in SharePoint 2013 IF the content source is a file share.

Invalid means the character is not allowed in file names in any SharePoint document library.
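SharePoint’s word breakers are language-aware components, but their effect on file names can be approximated with a simple rule: every word-breaker character splits the name into separate index terms. A rough Python sketch (the breaker set below is a hypothetical approximation based on the document-library column of the tables above, not the actual word breaker):

```python
import re

# Hypothetical breaker set, based on the document-library column above:
# space, "-", "_", "." and "+" act as word breakers
BREAKERS = r"[ \-_.+]"

def index_terms(filename):
    """Split a file name into index terms, the way a word breaker would."""
    return [t for t in re.split(BREAKERS, filename) if t]

print(index_terms("My_Cool_Test_Document.docx"))
# -> ['My', 'Cool', 'Test', 'Document', 'docx']
print(index_terms("MyCoolTestDocument.docx"))
# -> ['MyCoolTestDocument', 'docx'] - "cool" is not a separate term
```

This is also why MyCoolTestDocument.docx can’t be found by searching for “cool”: the whole camel-cased name stays one term.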

One last note about MOSS 2007: there are some issues with hit highlighting in MOSS 2007, see an example in the screenshot below:


Four Tips for Index Cleaning

If you’ve ever had fun with SharePoint Search, most likely you’ve seen (or even used) Index Reset. It’s very useful if you want to clear everything from your SharePoint index – but sometimes it’s not good enough:

  1. If you don’t want to clean the full index but one Content Source only.
  2. If you have FAST Search for SharePoint 2010.
  3. Both 🙂

1. Cleaning up one Content Source only

Sometimes you have a lot of content crawled but need to clear only one Content Source. In this case, clearing everything can be very painful – imagine clearing millions of documents, then re-crawling everything that should not have been cleared…

Instead, why not clean just one Content Source?

It’s much easier than it seems:

  1. Open your existing Content Source.
  2. Check that no crawl is running on this Content Source. The status of the Content Source has to be Idle. If not, stop the current crawl and wait until it’s done.
  3. Remove all Start Addresses from your Content Source (don’t forget to note them down before clearing!).
  4. Wait until the index gets cleaned up. (*)
  5. Add back the Start Addresses (URLs) to your Content Source, and save your settings.
  6. Enjoy!

With this, you’ll be able to clear only one Content Source.

Of course, you can use either the SSA UI in Central Administration or PowerShell; the logic is the same. Here is a simple PowerShell script for removing the Start Addresses:

$contentSSA = "FAST Content SSA"
$sourceName = "MyContentSource"

$source = Get-SPEnterpriseSearchCrawlContentSource -Identity $sourceName -SearchApplication $contentSSA

# Save the Start Addresses so we can add them back later
$URLs = $source.StartAddresses | ForEach-Object { $_.OriginalString }

# Remove all Start Addresses - this is what triggers the index clean-up
$source.StartAddresses.Clear()
$source.Update()

Then, as soon as you’re sure the index has been cleaned up (*), you can add back the Start Addresses with this command:

ForEach ($address in $URLs) {
    $source.StartAddresses.Add($address)
}
$source.Update()

2. Index Reset in FAST Search for SharePoint

You most likely know Index Reset on the Search Service Application UI:


Well, in case you’re using FAST Search for SharePoint 2010 (FS4SP), it’s not enough. The steps for a real Index Reset are the following:

  1. Make an Index Reset on the SSA, see the screenshot above.
  2. Open FS4SP PowerShell Management on the FAST Server, as a FAST Admin.
  3. Run the following command: Clear-FASTSearchContentCollection -Name <yourContentCollection>. The full list of available parameters can be found here. This deletes all items from the content collection without removing the collection itself.

3. Cleaning up one Content Source only in FAST Search for SharePoint

Steps are the same as in case of SharePoint Search, see above.

4. Checking the status of your Index

In Step #4 above (*), I mentioned you should wait until the index gets cleaned up, and this always takes time.

The first place to check is the SSA, where there is a number that is a very good indicator:

Searchable Items

In case of FS4SP, you should use PowerShell again, after running the Clear-FASTSearchContentCollection command:

  1. Open FS4SP PowerShell Management on the FAST Server, as a FAST Admin.
  2. Run the following command: Get-FASTSearchContentCollection -Name <yourContentCollection>. The result contains several pieces of information, including DocumentCount:

How to check the clean-up process with this?

First option: if you know how many items should be cleared, just check DocumentCount before you clean the Content Source, and regularly afterwards. If the value of DocumentCount is around the value you’re expecting AND is not decreasing anymore, you’re done.

Second option: if you don’t know how many items will be cleared, just check the value of DocumentCount regularly, e.g. every five minutes. If the value has stopped decreasing AND stays flat for a while (e.g. three checks in a row), you’re done.
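The “stopped decreasing” check of the second option is easy to automate. A small Python sketch (the sample counts are made up; in real life each sample would come from a scheduled Get-FASTSearchContentCollection call):

```python
def cleanup_done(samples, stable_checks=3):
    """True once DocumentCount has stopped decreasing for
    `stable_checks` consecutive checks."""
    if len(samples) < stable_checks + 1:
        return False
    tail = samples[-(stable_checks + 1):]
    # not decreasing anymore: each sample >= the previous one
    return all(b >= a for a, b in zip(tail, tail[1:]))

# Made-up DocumentCount samples, taken every five minutes:
counts = [120000, 80000, 45000, 9000, 9000, 9000, 9000]
print(cleanup_done(counts))      # -> True: flat for the last three checks
print(cleanup_done(counts[:4]))  # -> False: still decreasing
```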

As soon as you’re done, you can add back the Start Addresses to your Content Source, as mentioned above.


Debugging and Troubleshooting the Search UI

Recently, I have been giving several Search presentations, and some of them were focusing on Crawled and Managed Properties. In this post, I’m focusing on the User Experience part of this story, especially on the Debugging and Troubleshooting.

As you know, our content might have one (or more) unstructured part(s) and some structured metadata, properties. When we crawl the content, we extract these properties – these are the crawled properties. And based on the crawled properties, we can create managed properties.

Managed Properties in a Nutshell

Managed Properties are controlled and managed by the Search Admins. You can create them mapped to one or more Crawled Properties.

For example, let’s say your company has different content coming from different source systems: Office documents, emails, database entries, etc., stored in SharePoint, file systems, Exchange or Lotus Notes mailboxes, Documentum repositories, and so on. For each piece of content, there’s someone who created it, right? But the name of this property might be different across systems and/or document types. For Office documents, it might be Author, Created By, Owner, etc. For emails, it’s usually called From.

At this point, we have several different Crawled Properties used for the same thing: tagging the creator of the content. Why not display this in a common way for you, the end user? For example, we can create a Managed Property called ContentAuthor and map each of the Crawled Properties above to it (Author, Created By, Owner, From, etc.). With this, we’ll be able to use this property in a common way on the UI: display it in the Core Results Web Part, use it as a refiner, or as a sorting value in case of FAST.

(NOTE: Of course, you can map each Crawled Property to more than one Managed Property.)
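Conceptually, the mapping is just a many-to-one lookup. Here is a hypothetical Python sketch (the ContentAuthor mapping and the sample items are invented for illustration; in SharePoint you define this in the Metadata Property Mappings UI):

```python
# Hypothetical mapping: several crawled properties feed one managed property
MAPPINGS = {
    "ContentAuthor": ["Author", "Created By", "Owner", "From"],
}

def managed_value(item, managed_property):
    """Return the value of the first mapped crawled property the item has."""
    for crawled in MAPPINGS[managed_property]:
        if crawled in item:
            return item[crawled]
    return None

email = {"From": "jane@contoso.com", "Subject": "Q3 report"}
doc = {"Author": "John Doe", "Title": "Budget.xlsx"}

print(managed_value(email, "ContentAuthor"))  # -> jane@contoso.com
print(managed_value(doc, "ContentAuthor"))    # -> John Doe
```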

On the Search UI

If you check a typical SharePoint Search UI, you can find the Managed Properties in several ways:

Customized Search UI in SharePoint 2010 (with FS4SP)

1. Refiners – Refiners are driven by Managed Properties. You can define several refiner types (text, numeric range, date range, etc.) by customizing this Web Part’s Filter Category Definition property. There are several articles and blog posts describing how to do this; one of my favorites is this one by John Ross.

2. Search Result Properties – The out-of-the-box Search Result Set is something like this:

OOTB Search Results

This UI contains some basic information about your content, but I’ve never seen an environment where it wasn’t customized at least a little – like in the first screenshot above. You can include the Managed Properties you want, and you can customize the way they are displayed too. For this, you’ll have to edit some XMLs and XSLTs, see below…

3. Property-based Actions – If you can customize how properties appear on the Core Results Web Part, why not assign some actions to them? For example, a link to a related item. A link to more details. A link to the customer dashboard. Anything that has a (parameterized) URL and business value for your search users…

4. Scopes and Tabs – Search Properties can be used for creating Scopes, and each scope can have its own Tab on the Search UI.

Core Result Web Part – Fetched Properties

If you want to add some managed properties to the Search UI, the first step is adding the property to the Fetched Properties. This field is a bit tricky, though:

Fetched Properties

Edit the page, open the Core Results Web Part’s properties, and expand Display Properties. Here you’ll see the Fetched Properties field. Take a deep breath and try to edit it – yes, it’s a single-line, crazy-long XML. And don’t try to copy and paste it into your favorite XML editor, because if you break it into lines, tabs, etc. and try to copy it back, you’ll get another surprise – this really is a single-line text editor control. If you paste a multi-line XML here, you’ll get the first line only…

Instead, copy the content to the clipboard and paste it into Notepad++ (a free text editor tool, and really… this is a Notepad++ :)). It looks like this:

Fetched Properties in Notepad++

Open the Language menu and select XML. Your XML will be still one-line, but at least, formatted.

Open the Plugins / XML Tools / Pretty Print (XML only – with line breaks) menu, and here you go! Here is your well formatted, nice Fetched Properties XML:

Notepad++ XML Tools Pretty print (XML Only - with line breaks)

So, you can enter your Managed Properties, by using the Column tag:

<Column Name="ContentAuthor"/>

Ok, you’re done with editing, but as I’ve mentioned, it’s not a good idea to copy this multi-line XML and paste it into the Fetched Properties field of the Core Results Web Part. Instead, use the Linarize XML menu of XML Tools in Notepad++, and your XML becomes one loooooooooong line immediately. From this point, it’s an easy copy-paste again. Do you like it? 🙂
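If you’d rather script the round-trip than click through Notepad++, the same pretty-print/linearize steps can be done in a few lines of Python (the <Columns>/<Column> sample below is a simplified stand-in for the real Fetched Properties XML):

```python
import re
import xml.dom.minidom as minidom

one_line = ('<root xmlns="urn:example"><Columns>'
            '<Column Name="ContentAuthor"/></Columns></root>')

# Pretty print: one tag per line, comfortable to edit
pretty = minidom.parseString(one_line).toprettyxml(indent="  ")
print(pretty)

# Linearize: collapse the whitespace between tags and drop the XML
# declaration before pasting it back into the single-line field
linear = re.sub(r">\s+<", "><", pretty)
linear = linear.replace('<?xml version="1.0" ?>', "").strip()
print(linear)  # -> one long line again
```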

NOTES about the Fetched Properties:

  • If you enter a property name that doesn’t exist, this error message will be displayed:

Property doesn't exist or is used in a manner inconsistent with schema settings.

  • You’ll get the same(!) error if you enter the same property name more than once.
  • You’ll also get the same error if you enter invalid property names in the Refinement Panel Web Part!

Debugging the Property Values

Once you’ve entered the proper Managed Property names into the Fetched Properties field, you’re technically ready to use them. But first, you should be able to check their values without too much effort. Matthew McDermott has published a very nice way to do this: use an empty XSL on the Core Results Web Part, so that you get the plain XML results. You can find the full description here.

In summary: if you create a Managed Property AND add it to the Fetched Properties, you’re ready to display (and use) it on the Result Set. For debugging the property values, I always create a test page with Matthew’s empty XSL, and begin to work with the UI customization only afterwards.



Event-Driven Crawl Schedule

Recently I’ve been working for a customer with an interesting requirement: they had several content sources and wanted to crawl them one by one, one after the other. Scheduling the incrementals for fixed times was not a good solution, as their incremental load was very hectic: an incremental crawl of the same content source took 5 minutes one time, then 1.5 hours the next. And of course, they didn’t want idle time.

But we cannot define these kinds of rules from the UI, so the ultimate solution was PowerShell.

First, we need to be able to start the crawl. Let’s talk about Incremental Crawl only this time. Here is the PowerShell script for this:

$SSA = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"
$ContentSourceName = "My Content Source"
$ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity $ContentSourceName
$ContentSource.StartIncrementalCrawl()


It’s an easy one, isn’t it?

The next step is checking the status of this content source. We need this for several reasons: for example, we want to start the crawl only if it’s in Idle status, or we want to display the current status of the crawl every minute, etc.

Here is the PowerShell command you need:

$ContentSource.CrawlStatus
What values can it have? Here is the list of crawl statuses:

  • Idle
  • CrawlStarting
  • CrawlingIncremental / CrawlingFull
  • CrawlPausing
  • Paused
  • CrawlResuming
  • CrawlCompleting
  • CrawlStopping

Ok, we can determine the status now, and we can start a crawl. How do we make it event-driven? Here is the logical sequence to follow:

  1. Start the crawl of a content source.
  2. Wait until it’s done.
  3. Take the next content source and repeat steps 1 and 2 until you’re done with each content source.
  4. Repeat this sequence.

The first step is creating a function, if we want nice code. Here you go, my first one:

function Crawl {
    # Start crawling
    $ContentSourceName = $args[0]
    $ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity $ContentSourceName
    $CrawlStarted = Get-Date

    # Start the crawl only if the Content Source is idle
    if ($ContentSource.CrawlStatus -eq "Idle") {
        $ContentSource.StartIncrementalCrawl()
        Start-Sleep 1
        Write-Host $ContentSourceName " - Crawl Starting..."

        do {
            Start-Sleep 60    # Display the crawl status every 60 seconds
            $Now = Get-Date
            $Duration = $Now.Subtract($CrawlStarted)    # Duration of the current crawl
            $Speed = $ContentSource.SuccessCount / $Duration.TotalSeconds    # Speed of the current crawl, docs/sec
            Write-Host $ContentSourceName " - " $ContentSource.CrawlStatus (Get-Date).ToString() "-" $ContentSource.SuccessCount "/" $ContentSource.WarningCount "/" $ContentSource.ErrorCount "(" ("{0:N2}" -f $Speed) " doc/sec)"
        } while (($ContentSource.CrawlStatus -eq "CrawlStarting") -or ($ContentSource.CrawlStatus -eq "CrawlCompleting") -or ($ContentSource.CrawlStatus -eq "CrawlingIncremental") -or ($ContentSource.CrawlStatus -eq "CrawlingFull"))

        Write-Host $ContentSourceName " - Crawling Finished"
        Write-Host ""
    }
}

This is how you can call this function:

Crawl "My Content Source"

Some additional steps you might need:

  • If you want to run this script once a day (you need daily incrementals only, but would like them done as quickly as possible), just schedule this script as a Windows task.
  • If you want to run this script during the day only (and release the resources for other jobs at night, for example), you can use the start-in-the-morning / pause-in-the-evening logic. I made a simple example of this in a blog post a few months ago.
  • If you want to run this sequence all day long, you might put this logic into an infinite loop. (But be careful: sometimes you’ll need to run a full crawl, and then you have to stop this script.)
  • You can insert other steps into this script too. If you want to do something (logging, sending alerts, etc.) when the crawl starts/stops, just do it here. It’ll be your custom event handler for the crawl events.
  • You can even write the output of this script to a file, so that you’ll have your own crawl log.

The scripts above work fine with both SharePoint Search and FAST Search for SharePoint. Enjoy!

Why are some Refiner values hidden?

Refiners are cool whether you use SharePoint Search or FAST, that’s not a question. I really like them; they give so many options and so much power to the end users.

But there’s a very common question around them: deep vs. shallow behavior. You may know the definitions very well: FAST Search for SharePoint has deep refiners, which means every result in the result set is processed and used when calculating the refiners. SharePoint Search uses shallow refiners, where the refiner values are calculated from the first 50 results only.

These definitions are easy, right? But let’s think a bit further and try to answer the question that pops up at almost every conference: Why are some refiner values not visible when searching? Moreover: why are they visible when running Query1 and hidden when running Query2?

For example, let’s say you have a lot of documents crawled, and you enter a query whose result set contains many, many items: thousands, tens of thousands or even more.

Let’s say there is an Excel workbook in the result set that might be relevant for you, but this file is not boosted in the result set at all; say the first Excel result is at position 51 (you have a lot of Word, PowerPoint, PDF, etc. files in positions 1-50).

What happens if you use FAST Search? – As the refiners are deep, every result gets processed, including your Excel workbook. In the Result Type refiner you’ll see the Excel file type as well as Word, PowerPoint and PDF. Easy: you can click on the Excel refiner and get what you’re looking for immediately.


But what’s the case if you don’t have FAST Search, only SharePoint Search? – As only the first 50 results are processed for the refiner calculation, your Excel workbook won’t be included. This means the Result Type refiner displays the Word, PowerPoint and PDF refiners but doesn’t display Excel at all, as your Excel file is not amongst the top results. You’ll see the Result Type refiner as if there weren’t any Excel results at all!


Conclusion: the difference between shallow and deep refiners doesn’t seem that important at first sight. But you have to be aware there’s a huge difference in a real production environment, as you and your users might have some hidden refiners, and sometimes it’s hard to understand why.

In other words, if a refiner value shows up on your Refinement Panel, that means:

  • In case of FAST Search for SharePoint (deep refiner): There’s at least one item matching this refiner value in the whole result set. The exact number of items matching the refiner value is displayed as well.
  • In case of SharePoint Search (shallow refiner): There’s at least one item matching this refiner value in the first 50 results.

If you cannot see a specific value on the Refiner Panel, that means:

  • In case of FAST Search for SharePoint (deep refiner): There’s no result matching this refiner value at all.
  • In case of SharePoint Search (shallow refiner): There’s no result matching this refiner in the first 50 results.
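The difference is easy to model. A small Python sketch (hedged: the fixed 50-result window and the simple counting are simplifications of what the query processors really do):

```python
from collections import Counter

def refiners(results, deep=True, window=50):
    """Compute a Result Type refiner from a ranked list of file types."""
    scope = results if deep else results[:window]   # shallow: first 50 only
    return Counter(scope)

# 50 Word/PDF hits ranked first; the only Excel hit sits at position 51
results = ["docx"] * 30 + ["pdf"] * 20 + ["xlsx"]

print(refiners(results, deep=True)["xlsx"])      # -> 1: deep refiner shows Excel
print("xlsx" in refiners(results, deep=False))   # -> False: shallow refiner hides it
```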


Q&As of my Recent Webinar – Part 1.

Recently, I did a webinar with Dave Coleman and MetaVis. Thanks to the great people attending, I got many more questions than I was able to answer live, so as the ultimate solution I promised to answer them on my blog. Here are my answers, part #1:

  1. What is the motivation behind preventing folders under Document Sets – is it search-related or just a business-level decision? – Unfortunately, I don’t know the exact answer. As far as I know, it was a technical decision made at design time. Property management and all the inheritance logic could be much more complicated if we could have a hierarchy inside a Document Set.
  2. Will you provide the PPTs after the webinar? – Yes, the PPTs have been uploaded to SlideShare:
  3. Can the content uploaded and processed by content organizer be moved to different libraries in different site collections? – Out of the box, the Content Organizer Rules can move the documents inside a Site Collection. If you need to move them out, you need some custom solution.
  4. Is SharePoint 2010 search based on FAST? – No. FAST Search for SharePoint is based on the FAST Search product line, while SP2010 Search is based on the previous SharePoint products’ code.
  5. Can we import term sets from an external system, like external data in other software running in the company, as we can via BCS? – Yes, you can import term sets in CSV format. You can find more information here:
  6. Have we been seeing any BA Insight Longitude Search additions to FAST Search, or was it all only FAST Search? – No, all my demos contained out-of-the-box capabilities only (both SP2010 and FAST Search).
  7. Is Search also limited to a single site (server), or can it run over multiple sites (servers) at the same time? – You can scale out your crawler and query servers to multiple servers in case of both SP2010 Search and FAST Search.
  8. How do you activate FAST Search? – FAST Search Server 2010 for SharePoint (FS4SP) is a separate product. The steps for activating FS4SP are the following:
    1. You must have a SharePoint 2010 Enterprise farm.
    2. Install the FAST Search server(s). Depending on your environment, you can have one or more FS4SP servers; logically it’s a farm too.
    3. Configure your SP2010 farm and the FS4SP farm to work together, and create the Search Service Applications in SharePoint.
    4. If you’re interested in much more details, here are some useful links and troubleshooting steps for you:
  9. How can we better integrate search into custom applications (not from inside SharePoint)? – SharePoint has both an API (object model) and a collection of web services. You can call those web services from any remote application (with proper permissions, of course), while the object model can be used on the SharePoint server only. You can find the QueryService reference here:

More to come soon (sooner than this one, I can tell you)…

SP2010 SP1 issues – Config Wizard!

Recently, I have seen several issues after installing SharePoint 2010 SP1. The fix is very easy, but first let me describe the symptoms I have seen.

1. The search application ‘Search Service Application’ on server MYSERVER did not finish loading. View the event logs on the affected server for more information.

This error appears on the Search Service Application. Event Log on the server contains tons of errors like this:

Log Name:      Application
Source:        Microsoft-SharePoint Products-SharePoint Server
Date:          7/19/2011 5:01:22 PM
Event ID:      6481
Task Category: Shared Services
Level:         Error
User:          MYDOMAIN\svcuser
Computer:      myserver.mydomain.local

Application Server job failed for service instance Microsoft.Office.Server.Search.Administration.SearchServiceInstance (898667e4-126e-45d2-bb52-43f613669084).

Reason: The device is not ready.

Technical Support Details:

System.IO.FileNotFoundException: The device is not ready.
   at Microsoft.Office.Server.Search.Administration.SearchServiceInstance.Synchronize()
   at Microsoft.Office.Server.Administration.ApplicationServerJob.ProvisionLocalSharedServiceInstances(Boolean isAdministrationServiceJob)
2. On a different server, SharePoint needed to be upgraded from Standard to Enterprise (right after SP1 had been installed). After entering the license key, we got an Unsuccessful error on the user interface, and this in the Event Log:

The synchronization operation of the search component: d289abde-9641-46d0-8d32-0345f1885704 associated to the search application: Search Service Application on server: BAIQATEST01 has failed. The component version is not compatible with the search database: Search_Service_Application_CrawlStoreDB_a737e7614f034544a8c1da6fe4a24f7b on server: BAIQATEST01. The possible cause for this failure is The database schema version is less than the minimum backwards compatibility schema version that is supported for this component. To resolve this problem upgrade this database.

The reason for these errors was the same in both cases: the SharePoint 2010 Config Wizard should have been run after installing SP1. Once you run the Config Wizard, these errors disappear and everything starts to work fine – again!

How to check the Crawl Status of a Content Source

As you know, I work (and play) with SharePoint/FAST Search a lot. I have a lot of tasks where I have to sit on the F5 button while crawling and check the status: has it started? is it still crawling? is it finished yet?…

I had to hit F5 every minute. I’m too lazy for that, so I decided to write a PowerShell script that does nothing but check the crawl status of a Content Source and write it to the console. Meanwhile, I can work on my second screen while it’s working and working and working – without touching F5.

The script is pretty easy:

$SSA = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"
$ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity "My Content Source"

do {
    # Re-fetch the content source so the status and counters are up to date
    $ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity "My Content Source"
    Write-Host $ContentSource.CrawlStatus (Get-Date).ToString() "-" $ContentSource.SuccessCount "/" $ContentSource.WarningCount "/" $ContentSource.ErrorCount
    Start-Sleep 5
} while ($true)

Yes, it works fine for FAST (FS4SP) Content Sources too.

How to Schedule Crawl Start/Pause in SharePoint 2010 Search by PowerShell

If your hardware isn't strong enough, there's a pretty common request: start the crawl in the evening and pause it the next morning, before the work day starts. Scheduling the start of a Full/Incremental Crawl is pretty easy from the admin UI, but you need a trick if you want to schedule the pause too. Here is my favorite trick: use PowerShell!

Here is what I do here:

  1. Create a script to start/resume the crawl (CrawlStart.ps1).
  2. Create a script to pause the crawl (CrawlPause.ps1).
  3. Schedule the script CrawlStart.ps1 to run in the evening (like 6pm).
  4. Schedule the script CrawlPause.ps1 to run in the morning (like 6am).

Simple, right? 😉

Here are some more details.

First, we have to know how to add the SharePoint SnapIn to PowerShell. Here is the command we need: Add-PSSnapin Microsoft.SharePoint.PowerShell.

Second, we have to get the Content Source from our Search Service Application:

$SSA = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"
$ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity "My Content Source"

Then we have to know how to check the status of this content source’s crawl: $ContentSource.CrawlStatus. Here are the available values:

  • Idle
  • CrawlStarting
  • CrawlingIncremental / CrawlingFull
  • CrawlPausing
  • Paused
  • CrawlResuming
  • CrawlCompleting
  • CrawlStopping

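These values make it easy to build small helpers. For example, here's a minimal sketch (reusing the $SSA and content source name from above) that simply blocks until the crawl returns to Idle:

```powershell
# Poll the content source until the crawl finishes (status returns to Idle)
do {
    Start-Sleep 10
    # Re-fetch so we see the current status, not a cached snapshot
    $ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity "My Content Source"
} while ($ContentSource.CrawlStatus -ne "Idle")
Write-Host "Crawl finished at" (Get-Date)
```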
Finally, we have to know how to start/pause/resume the crawling:

  • Start Full Crawl: $ContentSource.StartFullCrawl()
  • Start Incremental Crawl: $ContentSource.StartIncrementalCrawl()
  • Pause the current crawl: $ContentSource.PauseCrawl()
  • Resume the crawl: $ContentSource.ResumeCrawl()

That’s it. Here are the final scripts:

1. CrawlStart.ps1

Add-PSSnapin Microsoft.SharePoint.PowerShell

$SSA = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"
$ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity "My Content Source"

if ($ContentSource.CrawlStatus -eq "Idle") {
    $ContentSource.StartIncrementalCrawl()
    Write-Host "Starting Incremental Crawl"
}

if ($ContentSource.CrawlStatus -eq "Paused") {
    $ContentSource.ResumeCrawl()
    Write-Host "Resuming Incremental Crawl"
}

2. CrawlPause.ps1

Add-PSSnapin Microsoft.SharePoint.PowerShell

$SSA = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"
$ContentSource = $SSA | Get-SPEnterpriseSearchCrawlContentSource -Identity "My Content Source"

Write-Host $ContentSource.CrawlStatus

if (($ContentSource.CrawlStatus -eq "CrawlingIncremental") -or ($ContentSource.CrawlStatus -eq "CrawlingFull")) {
    $ContentSource.PauseCrawl()
    Write-Host "Pausing the current Crawl"
}

Write-Host $ContentSource.CrawlStatus

And finally, you have to schedule these scripts as Windows tasks, using these actions: powershell -command "& 'C:\Scripts\CrawlStart.ps1'" to start and powershell -command "& 'C:\Scripts\CrawlPause.ps1'" to pause your crawl.
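For reference, the two tasks could be created from an elevated PowerShell prompt with schtasks; the task names and times below are just examples, and the doubled single quotes are how the inner quotes survive the trip to schtasks:

```powershell
# Evening task: start/resume the crawl at 6pm
schtasks /Create /TN "SP Crawl Start" /SC DAILY /ST 18:00 /TR 'powershell -command "& ''C:\Scripts\CrawlStart.ps1''"'

# Morning task: pause the crawl at 6am
schtasks /Create /TN "SP Crawl Pause" /SC DAILY /ST 06:00 /TR 'powershell -command "& ''C:\Scripts\CrawlPause.ps1''"'
```

Note that without /RU and /RP the tasks run as the creating user and only while that user is logged on; in a farm you'll normally want them to run under a service account that has permissions on the Search Service Application.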

P.S.: These scripts work fine for FAST Content Sources in SharePoint too; in that case you have to use the FAST Content SSA.


Troubleshooting: FAST Admin DB

The environment:

A farm with three servers: SharePoint 2010 (all roles), FAST admin, FAST non-admin. SQL is on the SharePoint box too.

The story:

Recently, I had to reinstall the box with SP2010 and SQL. Everything seemed to be fine: installing SQL, SP2010, configuring the FAST Content and Query Service Applications, crawl, search… It was like a dream, almost unbelievable. But then I started to get an error on the BA Insight Longitude Connectors admin site when I started to play with the metadata properties: Exception configuring search settings: … An error occurred while connecting to or communicating with the database…

I went to the FAST Query / FAST Search Administration / Managed Properties, and got this error: Unexpected error occurred while communicating with Administration Service

Of course, I went to the SQL Server’s event log, where I found this error: Login failed for user ‘MYDOMAIN\svc_user’. Reason: Failed to open the explicitly specified database. On the Details tab I could see ‘master’ as the related DB.

I went to SQL Server Profiler, but the trace showed the same thing.

Of course, I checked everything around FAST: the user was in the FASTSearchAdministrators group, permission settings were correct on SQL, etc.

Finally, I found what I was looking for: the Event Log on the FAST admin server contained this error: System.Data.SqlClient.SqlException: Cannot open database “FASTSearchAdminDatabase” requested by the login. The login failed. Login failed for user ‘MYDOMAIN\svc_user’

The solutions:

Yes, it was what I was looking for: I had indeed forgotten to restore the FASTSearchAdminDatabase. But what do you do if you don’t have a backup of it?

Never mind, here is the PowerShell command for you:

Install-FASTSearchAdminDatabase -DbServer SQLServer.mydomain.local -DbName FASTSearchAdminDatabase
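To avoid being in this situation again, it's worth backing the admin DB up after any configuration change. One way is a plain T-SQL backup via sqlcmd; the server name and backup path below are examples from this environment:

```powershell
# Back up the FAST admin DB so a future reinstall can restore it
# (server name and backup path are examples)
sqlcmd -S SQLServer.mydomain.local -Q "BACKUP DATABASE [FASTSearchAdminDatabase] TO DISK = N'D:\Backup\FASTSearchAdminDatabase.bak'"
```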

Voilá, it’s working again! 🙂

PowerShell script for exporting Crawled Properties (FS4SP)

Recently, I was working with FAST Search Server 2010 for SharePoint (FS4SP) and had to provide a list of all crawled properties in the category MyCategory. Here is my pretty simple script that provides the list in a .CSV file:

$outputfile = "CrawledProperties.csv"

if (Test-Path $outputfile) { Clear-Content $outputfile }

foreach ($crawledproperty in (Get-FASTSearchMetadataCrawledProperty)) {
    $category = $crawledproperty.CategoryName
    if ($category -eq "MyCategory")
    {
        # Get the name and type of the crawled property
        $name = $crawledproperty.Name
        $type = $crawledproperty.VariantType

        switch ($type) {
            20 {$typestr = "Integer"}
            31 {$typestr = "Text"}
            11 {$typestr = "Boolean"}
            64 {$typestr = "DateTime"}
            default {$typestr = "Other"}
        }

        # Build the output: $name and $typestr separated by a space
        $msg = $name + " " + $typestr

        Write-Output $msg | Out-File $outputfile -Append
    }
}

$msg = "Crawled properties have been exported to the file " + $outputfile
Write-Output ""
Write-Output $msg
Write-Output ""
Write-Output ""
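Since the output is space-separated, it can be pulled back into objects with Import-Csv whenever you need to post-process the list. A quick sketch (the header names here are my own, not part of the exported file):

```powershell
# Read the exported list back as objects and summarize by type
$props = Import-Csv "CrawledProperties.csv" -Delimiter " " -Header "Name","Type"
$props | Group-Object Type | Sort-Object Count -Descending | Format-Table Name, Count
```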

Load More