There are several issues to consider with planned maintenance, reboots, and other index server outages.
I have seen several questions floating around about best practices for server outages during a crawl and how to avoid index corruption. Here's a list of things to consider, along with some recommendations:
First, the following events require a full crawl:
- An SSP administrator stopped the previous crawl.
- A content database was restored from backup.
- A farm administrator has detached and reattached a content database.
- A full crawl of the site has never been done.
- The change log does not contain entries for the addresses that are being crawled. Without entries in the change log for the items being crawled, incremental crawls cannot occur.
- The account assigned to the default content access account or crawl rule has changed.
- The index is corrupted and needs to be repaired. Depending upon the severity of the corruption, the system might attempt a full crawl on its own when it detects corruption in the index.
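To make the list above easier to apply as a pre-maintenance checklist, here is a minimal Python sketch that encodes each condition as a flag and reports which ones would force a full crawl. It does not call any SharePoint API; the `CrawlState` class and its field names are hypothetical, and you would have to fill in the flags yourself from what you know about the farm.

```python
from dataclasses import dataclass


@dataclass
class CrawlState:
    """Hypothetical flags, one per condition in the list above."""
    previous_crawl_stopped: bool = False      # an SSP administrator stopped the last crawl
    content_db_restored: bool = False         # a content database was restored from backup
    content_db_reattached: bool = False       # a content database was detached and reattached
    full_crawl_ever_completed: bool = True    # a full crawl of the site has been done before
    change_log_has_entries: bool = True       # change log covers the addresses being crawled
    access_account_changed: bool = False      # default content access account or crawl rule account changed
    index_corrupted: bool = False             # the index needs to be repaired


def full_crawl_reasons(state: CrawlState) -> list[str]:
    """Return a human-readable reason for every condition that applies."""
    checks = [
        (state.previous_crawl_stopped, "previous crawl was stopped by an administrator"),
        (state.content_db_restored, "content database was restored from backup"),
        (state.content_db_reattached, "content database was detached and reattached"),
        (not state.full_crawl_ever_completed, "no full crawl has ever been done"),
        (not state.change_log_has_entries, "change log has no entries for the crawled addresses"),
        (state.access_account_changed, "content access account or crawl rule account changed"),
        (state.index_corrupted, "index is corrupted"),
    ]
    return [reason for applies, reason in checks if applies]


def requires_full_crawl(state: CrawlState) -> bool:
    """An incremental crawl is only an option when no condition applies."""
    return bool(full_crawl_reasons(state))


if __name__ == "__main__":
    state = CrawlState(previous_crawl_stopped=True, index_corrupted=True)
    for reason in full_crawl_reasons(state):
        print("Full crawl required:", reason)
```

Returning the reasons rather than a bare yes/no is a deliberate choice here: when you are planning an outage, it helps to know which event is going to cost you the full crawl.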
You should always try to avoid a shutdown during a crawl, as there is a potential for corruption of the index file. If possible, always allow a crawl to complete before any server outage.
Some larger environments may have continuous crawls. In this case you should either modify the crawl schedule to give yourself an outage window or simply pause the crawl during the maintenance. Keep in mind that a pause is not guaranteed to be fail-safe.
If you are dealing with a smaller index (a full crawl takes a few hours or less), I would opt to stop the crawl, knowing that a full crawl will be required when the service starts.
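The recommendations above boil down to a small decision tree, sketched here in Python. This is only my reading of the guidance, not an official procedure: the function name, the returned labels, and the 3-hour cutoff standing in for "a few hours" are all assumptions you should adjust for your own farm.

```python
def plan_crawl_for_outage(
    crawl_running: bool,
    continuous_crawls: bool,
    full_crawl_hours: float,
    small_index_hours: float = 3.0,  # assumed cutoff for "a few hours or less"
) -> str:
    """Suggest what to do with the crawl before taking the index server down."""
    if not crawl_running:
        # Nothing is being written to the index, so the outage can go ahead.
        return "proceed"
    if full_crawl_hours <= small_index_hours:
        # Small index: stopping is cheap, but a full crawl is needed afterwards.
        return "stop"
    if continuous_crawls:
        # Large index with back-to-back crawls: there is no natural gap to wait
        # for, so change the schedule or pause (a pause is not guaranteed fail-safe).
        return "reschedule_or_pause"
    # Otherwise let the running crawl finish before the outage.
    return "wait"


if __name__ == "__main__":
    print(plan_crawl_for_outage(crawl_running=True, continuous_crawls=True, full_crawl_hours=20.0))
```

The small-index check comes first because a full crawl that finishes in a few hours is usually a cheaper price than risking a shutdown mid-crawl.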
One documented bug involves rebooting the query server while an index merge is occurring. The query server will resume and serve queries, but crawling will stop until the query server's index is reinitialized from the index server.
The other related bug occurs when the index server is rebooted during a crawl: the index can become corrupt. Recovering from this requires a catalog reset and a full crawl.
I will be updating this post with more information on these bugs and fixes.
Good luck and happy indexing.