December 9, 2010

dtSearch: Step 2, index your pages


The trivial way to index your pages with dtSearch is to start from the homepage and let dtSearch browse X levels of pages. dtSearch will follow the HTML links in your markup.

But the problems begin when you use a LinkButton, paging control, Button, etc., because these do not render a plain HTML link but a postback using JavaScript (like: javascript:__doPostBack('body_1$rightzone_0$lbSearch','')). Of course you may try to adapt your code to use only HTML links, but that is complicated and it does not allow you to use and index the AJAX requests.

The easiest way to index all your pages is to create a sitemap for dtSearch with all your pages, including (if possible) all the possibilities for the wildcard items, and use this URL to index your site at 1 level.
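As a sketch (all names here are hypothetical), such a sitemap page can simply emit one plain HTML link per URL, including the expanded wildcard items, so dtSearch only has to crawl this single page at 1 level of depth:

```csharp
// Sitemap.aspx.cs -- hypothetical page listing every URL as a plain link.
using System;
using System.Collections.Generic;
using System.Web;
using System.Web.UI;

public partial class Sitemap : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Response.ContentType = "text/html";
        Response.Write("<html><body>");
        foreach (string url in GetAllUrls())
        {
            // Plain <a href> links are the only thing the spider needs.
            Response.Write("<a href=\"" + HttpUtility.HtmlAttributeEncode(url)
                + "\">" + HttpUtility.HtmlEncode(url) + "</a><br />");
        }
        Response.Write("</body></html>");
    }

    // Stand-in for your own logic: static pages plus each possible
    // value of the wildcard items (e.g. /product/{id} from a database).
    private static IEnumerable<string> GetAllUrls()
    {
        yield return "/Default.aspx";
        yield return "/product/1";
        yield return "/product/2";
    }
}
```

Point dtSearch at this page's URL and set the crawl depth to 1.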

If you need to deal with a splash page when a cookie is not set, for example, you will probably need to detect dtSearch in order to perform some different operations. To do that you have 2 possibilities:

You may use HttpContext.Current.Request.UserAgent and compare it to the dtSearch user agent.
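A minimal sketch of that check. The exact user-agent string varies by dtSearch Spider version, so the "dtSearch" substring below is an assumption; verify the real value in your server logs:

```csharp
using System;
using System.Web;

public static class DtSearchDetector
{
    // Returns true when the request looks like the dtSearch spider.
    // Assumption: its user-agent string contains "dtSearch".
    public static bool IsDtSearch(HttpRequest request)
    {
        string userAgent = request.UserAgent;
        return userAgent != null &&
               userAgent.IndexOf("dtSearch", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

Usage: if (DtSearchDetector.IsDtSearch(HttpContext.Current.Request)) { /* skip the splash page */ }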



Or you may detect whether the request comes from a crawler, but then you need to define dtSearch as a crawler.

To configure that you need to add this line in your web.config just before </system.web> (note that web.config element and attribute names are case sensitive):
<browserCaps configSource="App_Config\BrowserCaps.config" />

You need a recent BrowserCaps.config (find it on Google or download it here) and add this to it (already included in my file):
<case match="dtSearch*">
    browser=dtsearch
    crawler=true
    Unknown=false
    type=%{browser}
</case>

After this configuration you may detect a crawler (including dtSearch) using this simple code:
if (Request.Browser.Crawler)
 ...
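Putting the two pieces together, here is a minimal sketch of the splash-page scenario from above (the cookie name "visited" and the page Splash.aspx are hypothetical): crawlers are indexed directly, while first-time visitors are redirected.

```csharp
using System;
using System.Web;
using System.Web.UI;

public partial class Default : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Request.Browser.Crawler is true for dtSearch once the
        // BrowserCaps.config entry above is in place.
        bool isCrawler = Request.Browser.Crawler;

        // Redirect only real visitors without the cookie; sending the
        // spider to the splash page would leave it as the only indexed page.
        if (!isCrawler && Request.Cookies["visited"] == null)
        {
            Response.Redirect("~/Splash.aspx");
        }
    }
}
```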
