Team,
I am using Zoom Search Engine Enterprise Edition 6.0 (build 1023) to index my Java-based portal application, which uses session-based login. Indexing works fine when I run it from the GUI (i.e. when I click the “Start Indexing” button). I then added this indexing job to the scheduler. The scheduled task is invoked correctly, but it fails to index the whole portal. I also tried running the indexer directly from the command prompt, but it still does not index the complete portal.
Please help me with this issue.
Indexing logs:
1. GUI
10|03/28/11 18:18:31|Start indexing (spider mode) at Mon Mar 28 18:18:31 2011
03|03/28/11 18:28:01|All index files will be written to: E:\Apache2.2\htdocs\search_kb\GB\employee
03|03/28/11 18:28:01|Writing index data for CGI/Win32 search... (Please wait)
03|03/28/11 18:28:01|Created pagedata data file (zoom_pagedata.zdat)
03|03/28/11 18:28:01|Created pagetext data file (zoom_pagetext.zdat)
03|03/28/11 18:28:01|Created pageinfo data file (zoom_pageinfo.zdat)
13|03/28/11 18:28:01|Deleting presaved index data...
13|03/28/11 18:28:01|Deleting pageinfo data...
13|03/28/11 18:28:01|Deleting miscellaneous buffers...
13|03/28/11 18:28:01|Deleting URL history...
13|03/28/11 18:28:01|Deleting duplicate page history...
13|03/28/11 18:28:01|Writing out the dictionary...
03|03/28/11 18:28:01|Created dictionary data file (zoom_dictionary.zdat)
03|03/28/11 18:28:01|Created wordmap data file (zoom_wordmap.zdat)
03|03/28/11 18:28:01|Created script settings file (settings.zdat)
10|03/28/11 18:28:01|Indexing completed at Mon Mar 28 18:28:01 2011
12|03/28/11 18:28:01|INDEX SUMMARY
12|03/28/11 18:28:01|Files indexed: 213
12|03/28/11 18:28:01|Files skipped: 339
12|03/28/11 18:28:01|Files filtered: 0
12|03/28/11 18:28:01|Files downloaded: 213
12|03/28/11 18:28:01|Unique words found: 5063
12|03/28/11 18:28:01|Variant words found: 5948
12|03/28/11 18:28:01|Total words found: 135117
12|03/28/11 18:28:01|Avg. unique words per page: 23.77
12|03/28/11 18:28:01|Avg. words per page: 634
12|03/28/11 18:28:01|Start index time: 18:18:31 (2011/03/2
12|03/28/11 18:28:01|Elapsed index time: 00:09:30
12|03/28/11 18:28:01|Peak physical memory used: 68 MB
12|03/28/11 18:28:01|Peak virtual memory used: 160 MB
12|03/28/11 18:28:01|Errors: 0
12|03/28/11 18:28:01|URLs visited by spider: 230
12|03/28/11 18:28:01|URLs in spider queue: 0
12|03/28/11 18:28:01|Total bytes scanned/downloaded: 9302804
12|03/28/11 18:28:01|File extensions:
12|03/28/11 18:28:01| .htm indexed: 0
12|03/28/11 18:28:01| .html indexed: 0
12|03/28/11 18:28:01| .php indexed: 0
12|03/28/11 18:28:01| .asp indexed: 0
12|03/28/11 18:28:01| .cfm indexed: 0
12|03/28/11 18:28:01| .aspx indexed: 0
12|03/28/11 18:28:01| .php3 indexed: 0
12|03/28/11 18:28:01| .php4 indexed: 0
12|03/28/11 18:28:01| .action indexed: 212
12|03/28/11 18:28:01| .jsp indexed: 0
12|03/28/11 18:28:01| .ssi indexed: 0
12|03/28/11 18:28:01| .shtml indexed: 0
02|03/28/11 18:28:01|Cleaning up memory used for index data... please wait.
13|03/28/11 18:28:01|Deleting wordmap data...
13|03/28/11 18:28:01|Deleting presaved index data...
13|03/28/11 18:28:01|Deleting pageinfo data...
13|03/28/11 18:28:01|Deleting miscellaneous buffers...
13|03/28/11 18:28:01|Deleting URL history...
13|03/28/11 18:28:01|Deleting duplicate page history...
02|03/28/11 18:28:01|Finished cleaning up memory.
03|03/28/11 18:28:01|Copied search script to: E:\Apache2.2\htdocs\search_kb\GB\employee\search.cgi
2. Command line ("E:\Zoom Search Engine 6.0\ZoomIndexer.exe" -c -s "E:\Zoom Search Engine 6.0\conf\company_kb_GB_E.zcfg")
02|03/28/11 18:30:45|Config file loaded: E:\Zoom Search Engine 6.0\conf\<my company>_kb_GB_E.zcfg
10|03/28/11 18:30:47|Start indexing (spider mode) at Mon Mar 28 18:30:47 2011
02|03/28/11 18:30:47|Maximum number of words: 300000
02|03/28/11 18:30:47|Maximum number of files: 200000
02|03/28/11 18:30:47|Will scan files with extensions
02|03/28/11 18:30:47| .htm
02|03/28/11 18:30:47| .html
02|03/28/11 18:30:47| .php
02|03/28/11 18:30:47| .asp
02|03/28/11 18:30:47| .cfm
02|03/28/11 18:30:47| .aspx
02|03/28/11 18:30:47| .php3
02|03/28/11 18:30:47| .php4
02|03/28/11 18:30:47| .action
02|03/28/11 18:30:47| .jsp
02|03/28/11 18:30:47| .ssi
02|03/28/11 18:30:47| .shtml
02|03/28/11 18:30:47|Spider from: <my java portal>/home.action
02|03/28/11 18:30:47|Web site URL: <my java portal>/
02|03/28/11 18:30:47|Estimated RAM required during index process: 666397 KB
13|03/28/11 18:30:47|Retrieving global memory status...
13|03/28/11 18:30:47|Memory calculations completed. Initializing structures...
13|03/28/11 18:30:47|Allocating dictionary...
13|03/28/11 18:30:47|Allocating stemming...
13|03/28/11 18:30:47|Initializing pageinfo file...
13|03/28/11 18:30:47|Allocating buffers for URLs, titles, and descriptions...
13|03/28/11 18:30:47|Allocating wordmap...
13|03/28/11 18:30:47|Allocating main text buffers...
13|03/28/11 18:30:47|Allocating history structures for CRC...
13|03/28/11 18:30:47|Initializing locale settings...
13|03/28/11 18:30:47|Initializing local desc path...
13|03/28/11 18:30:47|Clearing skip page lists per start point...
13|03/28/11 18:30:47|Initializing synonyms...
13|03/28/11 18:30:47|Initializing skip words list...
13|03/28/11 18:30:47|Initializing extensions list...
13|03/28/11 18:30:47|Initialization for data tables completed.
13|03/28/11 18:30:47|Allocating URL list...
13|03/28/11 18:30:47|Initializing assistant thread for robots, etc.
04|03/28/11 18:30:47|Downloading login page at <my java portal>/welcome.action
13|03/28/11 18:30:47|Sending HTTP POST request
13|03/28/11 18:30:47|HTTP POST request sent
03|03/28/11 18:30:48|All index files will be written to: E:\Apache2.2\htdocs\search_kb\GB\employee
03|03/28/11 18:30:48|Writing index data for CGI/Win32 search... (Please wait)
03|03/28/11 18:30:48|Created pagedata data file (zoom_pagedata.zdat)
03|03/28/11 18:30:48|Created pagetext data file (zoom_pagetext.zdat)
03|03/28/11 18:30:48|Created pageinfo data file (zoom_pageinfo.zdat)
13|03/28/11 18:30:48|Deleting presaved index data...
13|03/28/11 18:30:48|Deleting pageinfo data...
13|03/28/11 18:30:48|Deleting miscellaneous buffers...
13|03/28/11 18:30:48|Deleting URL history...
13|03/28/11 18:30:48|Deleting duplicate page history...
13|03/28/11 18:30:48|Writing out the dictionary...
03|03/28/11 18:30:48|Created dictionary data file (zoom_dictionary.zdat)
03|03/28/11 18:30:48|Created wordmap data file (zoom_wordmap.zdat)
03|03/28/11 18:30:48|Created script settings file (settings.zdat)
10|03/28/11 18:30:48|Indexing completed at Mon Mar 28 18:30:48 2011
12|03/28/11 18:30:48|INDEX SUMMARY
12|03/28/11 18:30:48|Files indexed: 3
12|03/28/11 18:30:48|Files skipped: 94
12|03/28/11 18:30:48|Files filtered: 0
12|03/28/11 18:30:48|Files downloaded: 10
12|03/28/11 18:30:48|Unique words found: 442
12|03/28/11 18:30:48|Variant words found: 226
12|03/28/11 18:30:48|Total words found: 1147
12|03/28/11 18:30:48|Avg. unique words per page: 147.33
12|03/28/11 18:30:48|Avg. words per page: 382
12|03/28/11 18:30:48|Start index time: 18:30:47 (2011/03/2
12|03/28/11 18:30:48|Elapsed index time: 00:00:01
12|03/28/11 18:30:48|Peak physical memory used: 46 MB
12|03/28/11 18:30:48|Peak virtual memory used: 118 MB
12|03/28/11 18:30:48|Errors: 0
12|03/28/11 18:30:48|URLs visited by spider: 10
12|03/28/11 18:30:48|URLs in spider queue: 0
12|03/28/11 18:30:48|Total bytes scanned/downloaded: 57941
12|03/28/11 18:30:48|File extensions:
12|03/28/11 18:30:48| .htm indexed: 0
12|03/28/11 18:30:49| .html indexed: 0
12|03/28/11 18:30:49| .php indexed: 0
12|03/28/11 18:30:49| .asp indexed: 0
12|03/28/11 18:30:49| .cfm indexed: 0
12|03/28/11 18:30:49| .aspx indexed: 0
12|03/28/11 18:30:49| .php3 indexed: 0
12|03/28/11 18:30:49| .php4 indexed: 0
12|03/28/11 18:30:49| .action indexed: 2
12|03/28/11 18:30:49| .jsp indexed: 0
12|03/28/11 18:30:49| .ssi indexed: 0
12|03/28/11 18:30:49| .shtml indexed: 0
02|03/28/11 18:30:49|Cleaning up memory used for index data... please wait.
13|03/28/11 18:30:49|Deleting wordmap data...
13|03/28/11 18:30:49|Deleting presaved index data...
13|03/28/11 18:30:49|Deleting pageinfo data...
13|03/28/11 18:30:49|Deleting miscellaneous buffers...
13|03/28/11 18:30:49|Deleting URL history...
13|03/28/11 18:30:49|Deleting duplicate page history...
02|03/28/11 18:30:49|Finished cleaning up memory.
03|03/28/11 18:30:49|Copied search script to: E:\Apache2.2\htdocs\search_kb\GB\employee\search.cgi
03|03/28/11 18:30:49|Successfully created all required files
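To make the difference between the two runs easier to see, here is a minimal Python sketch that pulls the INDEX SUMMARY values out of two logs and lists the ones that differ. It is not part of Zoom. It assumes the log lines have the form code|timestamp|message and that summary lines carry code 12, which is only what the logs above suggest. The function names and the shortened log excerpts are illustrative.

```python
# Sketch only: compares INDEX SUMMARY values from two Zoom indexer logs.
# Assumption: each line is "code|timestamp|message" and summary lines use
# code "12", as seen in the logs pasted above.

GUI_LOG = """12|03/28/11 18:28:01|INDEX SUMMARY
12|03/28/11 18:28:01|Files indexed: 213
12|03/28/11 18:28:01|Files skipped: 339
12|03/28/11 18:28:01|Errors: 0
12|03/28/11 18:28:01|URLs visited by spider: 230
"""

CLI_LOG = """12|03/28/11 18:30:48|INDEX SUMMARY
12|03/28/11 18:30:48|Files indexed: 3
12|03/28/11 18:30:48|Files skipped: 94
12|03/28/11 18:30:48|Errors: 0
12|03/28/11 18:30:48|URLs visited by spider: 10
"""


def parse_summary(log_text):
    """Return the 'name: value' pairs that follow the INDEX SUMMARY line."""
    summary = {}
    in_summary = False
    for line in log_text.splitlines():
        parts = line.split("|", 2)
        # Skip anything that is not a code-12 line with three fields.
        if len(parts) != 3 or parts[0] != "12":
            continue
        message = parts[2].strip()
        if message == "INDEX SUMMARY":
            in_summary = True
            continue
        if in_summary and ": " in message:
            name, value = message.split(": ", 1)
            summary[name] = value.strip()
    return summary


def diff_summaries(first, second):
    """List (name, first_value, second_value) for values that differ."""
    return [
        (name, first[name], second[name])
        for name in first
        if name in second and first[name] != second[name]
    ]


if __name__ == "__main__":
    gui = parse_summary(GUI_LOG)
    cli = parse_summary(CLI_LOG)
    for name, gui_value, cli_value in diff_summaries(gui, cli):
        print(f"{name}: GUI={gui_value} command line={cli_value}")
```

With the full logs saved to files, the same two functions can be applied to the file contents instead of the excerpts.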
Thanks,
Prathap Puppala