The Atomz search feature works by creating an index of the HTML text it finds on the pages at the location you give for that account. If your site makes extensive use of graphic images of text, note that those words can't be indexed and therefore won't be searchable. The indexing process begins with the default page supplied by the Web server: almost always the index.html file. It stores a copy of all the text it finds there, follows each internal link it finds, indexes any text content on those pages, and follows any further links. This process continues until every page that sits within the supplied URL and can be found, either as the default served page or through links, has been read.
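The crawl described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Atomz's actual implementation: the names `PageParser`, `crawl`, `fetch` and the tiny in-memory `SITE` are all assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class PageParser(HTMLParser):
    """Collects the text nodes and link targets found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def crawl(base_url, fetch):
    """Breadth-first crawl of every page reachable from base_url.

    fetch(url) is a hypothetical callable that returns the page's HTML,
    or None if there is no such page.
    """
    index = {}
    queue = deque([base_url])
    seen = {base_url}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        index[url] = " ".join(parser.text_parts)
        for href in parser.links:
            target, _fragment = urldefrag(urljoin(url, href))
            # Only follow links that stay inside the supplied URL.
            if target.startswith(base_url) and target not in seen:
                seen.add(target)
                queue.append(target)
    return index


# A made-up two-page site. The image contributes no text to the index.
SITE = {
    "http://www.mysite.com/": '<h1>Home</h1><a href="about.html">About</a>'
                              '<a href="http://www.apple.com/">Apple</a>'
                              '<img src="logo.gif" alt="">',
    "http://www.mysite.com/about.html": '<p>About us</p><a href="/">Home</a>',
}

index = crawl("http://www.mysite.com/", SITE.get)
```

Here the link to www.apple.com is seen but never followed, because it falls outside the base URL, and only text nodes end up in the index.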
This indexing process won't touch pages outside the URL you supply, so if you link to www.apple.com anywhere you don't have to worry about your searches throwing up hundreds of pages from there. If an area of your site falls within the URL you supplied and is linked to from one or more pages, but you don't want it indexed, you can use URL masks to specify pages or complete directories that shouldn't be indexed. URL masks have a few useful options; for example, you can choose to have a page followed but not indexed, or not followed at all. They are found in the Options section of your Atomz account pages.
If you want to have a page indexed but none of its links followed, put its URL into the URL mask field on the site as an included (rather than excluded) URL mask, with nofollow after the address. The page's text content will be read, but any links on it will be ignored. Alternatively, you may want a page's links followed but its text content ignored. A table of contents listing is a typical case: it doesn't make much sense as a search result, but it does provide routes to many more suitable pages. For this, put noindex at the end of the line, as in include http://www.mysite.com/contents.html noindex.
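To make the effect of the mask options concrete, here is a small Python model of how an indexer might interpret them. The line format follows the include example above. Everything else is an assumption made for illustration: the `parse_mask` and `decide` names, the simple prefix matching, the first-match-wins rule, and the `links.html` and `private/` addresses. Atomz's real matching rules may differ.

```python
def parse_mask(line):
    """Split a mask line into (kind, url_prefix, flag)."""
    parts = line.split()
    kind, url = parts[0], parts[1]
    flag = parts[2] if len(parts) > 2 else None
    return kind, url, flag


def decide(url, mask_lines):
    """Return (index_text, follow_links) for url.

    Assumption: the first mask whose address is a prefix of url wins,
    and a page that matches no mask is both indexed and followed.
    """
    for line in mask_lines:
        kind, prefix, flag = parse_mask(line)
        if url.startswith(prefix):
            if kind == "exclude":
                return (False, False)
            return (flag != "noindex", flag != "nofollow")
    return (True, True)


# Hypothetical masks for a hypothetical site.
MASKS = [
    "include http://www.mysite.com/contents.html noindex",
    "include http://www.mysite.com/links.html nofollow",
    "exclude http://www.mysite.com/private/",
]

contents = decide("http://www.mysite.com/contents.html", MASKS)
links = decide("http://www.mysite.com/links.html", MASKS)
private = decide("http://www.mysite.com/private/notes.html", MASKS)
home = decide("http://www.mysite.com/index.html", MASKS)
```

In this model the contents page is followed but not indexed, the links page is indexed but not followed, anything under the excluded directory is skipped entirely, and every other page gets the default treatment.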
Conversely, if you want to index specific pages or areas which aren't linked to from the main entry point, use the URL Entrypoints page to list these addresses. This is also useful if your site uses more than one domain for part of its content.
A number of common words are automatically excluded from the indexing process. It doesn't normally make sense to keep track of words such as a, an, the, is, and so on, so these are listed by default in the Excluded Words list in your Atomz account. You can add to this list any other words that you don't want used in searches, which keeps irrelevant results from being offered.
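The effect of an excluded words list is easy to see in a short Python sketch. The word list here contains only the four examples named above, and the `index_terms` function and its way of splitting text into words are assumptions for the example rather than a description of how Atomz tokenizes pages.

```python
import re

# Illustrative only: the real list is kept in your account settings.
EXCLUDED_WORDS = {"a", "an", "the", "is"}


def index_terms(text, excluded=EXCLUDED_WORDS):
    """Lower-case the text, split it into words, and drop excluded words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [word for word in words if word not in excluded]


terms = index_terms("The index is a list of the words on an HTML page")

# Adding your own words to the list trims the index further.
fewer_terms = index_terms(
    "The index is a list of the words on an HTML page",
    excluded=EXCLUDED_WORDS | {"of", "on"},
)
```

With the default list, the sentence is reduced to index, list, of, words, on, html and page; adding of and on to the list removes those two as well.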
Finally, if your site uses frames you may want to specify a frame target name to be used with links in the search results page. This ensures that pages open in the appropriate frame of your site rather than taking over the whole browser window. If you don't use frames then the default target of _self is almost certainly the most appropriate, although in some cases it may be worth using _blank instead to force a new window to open for each clicked result.
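As a rough picture of what the target setting changes, the Python sketch below builds the kind of link a results page might contain. The `result_link` function and the frame name `content` are hypothetical; the markup Atomz actually generates may look different.

```python
from html import escape


def result_link(url, title, target="_self"):
    """Render one search result as an anchor tag with a target attribute."""
    return '<a href="{}" target="{}">{}</a>'.format(
        escape(url, quote=True), escape(target, quote=True), escape(title)
    )


# Default: the result replaces the page that holds the results.
plain = result_link("http://www.mysite.com/about.html", "About us")

# Framed site: open the result in the frame named "content" (a made-up name).
framed = result_link("http://www.mysite.com/about.html", "About us", target="content")

# Force a new window for each clicked result.
new_window = result_link("http://www.mysite.com/about.html", "About us", target="_blank")
```

The only difference between the three links is the value of the target attribute, which is what the frame target option controls.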