Marc Anderson, Co-Founder and President of Sympraxis Consulting LLC, provides a good overview of an issue (challenge?) in Office 365 (and SharePoint) search: how to identify and manage duplicates in the result set, especially when you don’t realize that duplicates are the cause of your problems. It is a very good post, with a lot of great details and suggestions.
“For years, I’ve been skeptical about search indexing in SharePoint, especially in SharePoint Online in Office 365. The fact that we can’t know when a search crawl has run – thus updating the indices – is a huge part of the problem. In the early days, before Content Search Web Parts (CSWPs) were available in SharePoint Online, we routinely saw delays between content creation and that content showing up in search results of days or even weeks. Later the CSWP was enabled on SharePoint Online, and it is a fantastically powerful tool, far better than the Content Query Web Part (CQWP) which it nominally replaced.
But the value of any search-driven mechanisms in SharePoint is directly tied to the recency and frequency of updates to the search index. While the CQWP is quite inefficient – since it actually goes out to look for content at the source every time it runs (though there may be some caching) – the CSWP uses the search index and can thus return results using fewer server resources in some cases. (One downside is that you can only retrieve up to 50 results with the CSWP.) Since we don’t know when the search crawls run in SharePoint Online, and we often seem to not see the results we expect, we tend to blame indexing for the problem.”
Read the full post on Marc Anderson’s blog.