I recently crawled and indexed my website on my local machine. This particular site uses absolute links throughout rather than relative links, so I was surprised that Zoom was able to follow those absolute links on the local machine and index all of the files locally. When the Zoom files were uploaded to the server, the search engine worked accurately. I was pleasantly surprised, but how is this possible?
Absolute links when doing a local crawl
-
Yes, I was in offline mode
Originally posted by wrensoft: If you were using offline mode (rather than spider mode), links are not followed at all. It might just be a happy coincidence that it worked.
-
As David said, offline mode does not rely on links to find the pages of your site. It will simply index all files within a given folder (and its subfolders) which satisfy the scan and skip options in the Configuration window. This means that, yes, it will find all the files for your website (assuming all the files are within the start folder specified), regardless of the links.
This is one of the benefits of offline mode over spider mode: the files do not need to be well linked for the indexer to find them. It is also much faster and uses no internet traffic. The main disadvantage of offline mode is that it cannot index dynamically generated pages (such as PHP or ASP pages), which must be executed by the server before a meaningful page is rendered. For sites with such pages, you would need to use spider mode.
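To illustrate why links do not matter in offline mode, here is a rough Python sketch of folder-based file discovery. It is not Zoom's actual code; the function name `find_indexable_files` and its parameters `scan_extensions` and `skip_terms` are made up for this example and only loosely mirror the scan and skip options in the Configuration window.

```python
from pathlib import Path


def find_indexable_files(start_folder, scan_extensions, skip_terms=()):
    """Collect files under start_folder by walking the folder tree.

    Hypothetical illustration only: files are chosen by location and
    extension, and the links inside them are never read or followed.
    """
    scan = {ext.lower() for ext in scan_extensions}
    found = []
    # rglob("*") visits the start folder and every subfolder beneath it.
    for path in Path(start_folder).rglob("*"):
        if not path.is_file():
            continue
        # "Scan" rule: keep only files with a listed extension.
        if path.suffix.lower() not in scan:
            continue
        relative = path.relative_to(start_folder).as_posix()
        # "Skip" rule: drop any path containing a skip term.
        if any(term in relative for term in skip_terms):
            continue
        found.append(relative)
    return sorted(found)
```

Calling `find_indexable_files("C:/mysite", [".html", ".htm"], skip_terms=["private/"])` (the folder name is just an example) would list every matching file under that folder, including pages that nothing links to. Whether the links inside those pages are absolute or relative makes no difference, because the files are never parsed to discover other files.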