Hello. I have a purely file-based intranet search set up using Zoom. We have about 30,000 files, mostly XLS and a few thousand DOCs.
I have the system all set up just how I want it, except it's painfully slow. It's on a P4 2.8 GHz with 1 GB RAM and it still takes 60 seconds or so for a basic query. I'm using the CGI engine, of course.
I have a few questions regarding how we may speed the system up.
First, is there any way, or could you modify the CGI or whatever is needed, to have the engine skip indexing of numeric values? Since we are indexing XLS files, the total "unique" word count comes out to 1.7 million.
While indexing the numbers would be nice if it were extremely fast (for quick employee ID number searches and things like that), it is so slow that it's not worth the effort.
I read the README with xlhtml.zip, and it seems that this is an open source program. I suppose I could modify and recompile it to ignore numeric values, if necessary.
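To show what I mean by ignoring numeric values, here is a rough Python sketch of the kind of filter I have in mind. It is not xlhtml or Zoom code; the `NUMERIC` pattern and the function names are my own assumptions, and it simply drops whitespace-separated tokens that look like plain numbers from text that has already been extracted.

```python
import re

# Assumed definition of "numeric": optional sign, digits with optional
# thousands separators, optional decimal part, optional trailing percent.
NUMERIC = re.compile(r"^[+-]?[\d,]*\.?\d+%?$")

def is_numeric(token: str) -> bool:
    """Return True if the token looks like a plain number."""
    return bool(NUMERIC.match(token))

def strip_numeric(text: str) -> str:
    """Drop purely numeric tokens; mixed tokens such as 'Q3' are kept."""
    return " ".join(t for t in text.split() if not is_numeric(t))

if __name__ == "__main__":
    print(strip_numeric("Employee 10452 salary 52,000.50 Q3 2004 total"))
```

Something like this run over the converted spreadsheet text before indexing should cut the unique word count a lot, at the cost of losing ID number searches.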
Also, would it be any faster if it ran on a true database backend? While I know your proprietary database system (whatever it may be) is probably faster for small sites, is there any chance of getting Zoom connected to MySQL, MSSQL, PostgreSQL, Firebird, etc. in the near future? That way you could pass off much of the programming logic onto the optimized database server.
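To make the database idea concrete, here is a minimal sketch of a word-to-file lookup table with an index on the word column. I'm using Python's built-in SQLite only so the example runs anywhere; it stands in for the servers I listed, and the table, column, and function names are made up for illustration. I have no idea how Zoom's own index is laid out, so I can't say whether this would actually be faster.

```python
import sqlite3

def build_index(docs):
    """Store one (word, path) row per distinct word in each document."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE postings (word TEXT, path TEXT)")
    # The index lets the database find a word without scanning every row.
    conn.execute("CREATE INDEX idx_word ON postings (word)")
    rows = {(w.lower(), path) for path, text in docs.items() for w in text.split()}
    conn.executemany("INSERT INTO postings VALUES (?, ?)", rows)
    conn.commit()
    return conn

def search(conn, word):
    """Return the paths of all documents containing the word."""
    cur = conn.execute(
        "SELECT DISTINCT path FROM postings WHERE word = ? ORDER BY path",
        (word.lower(),),
    )
    return [row[0] for row in cur]

if __name__ == "__main__":
    conn = build_index({"a.xls": "Payroll 2004 Smith", "b.doc": "Smith memo"})
    print(search(conn, "smith"))
```

The point is just that a single-word query becomes one indexed lookup that the database server handles, rather than logic inside the CGI.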
Maybe I'm pushing your software too far and asking too much of a $99 product (when Google sells their Mini for $4000 or so), but it really seems to have a lot of promise, even at the high end, and I appreciate the relatively open nature of it.
Anything you can do will be appreciated.
Thanks,
Chuck