I have a PHP routine that uploads a PDF from a user's system, then uses Ghostscript to split the PDF into separate files (one page per file). For this test, I am taking only the first 8 pages of the main file and putting each page into its own file. That gives 9 files in total: the original plus eight single-page files.
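To make the split step concrete, here is a rough sketch of the kind of Ghostscript call involved. It is written in Python purely for illustration and is not my actual PHP routine: the file names, the output naming scheme, and the function names are hypothetical, and the Ghostscript options shown are just one common way to pull a single page out with the pdfwrite device.

```python
import subprocess
from pathlib import Path


def build_split_commands(source_pdf, out_dir, page_count=8):
    """Return one Ghostscript argument list per page to extract."""
    source = Path(source_pdf)
    commands = []
    for page in range(1, page_count + 1):
        # Hypothetical naming scheme: <original name>_page<N>.pdf
        target = Path(out_dir) / f"{source.stem}_page{page}.pdf"
        commands.append([
            "gs",
            "-q",                     # suppress startup messages
            "-dNOPAUSE",              # do not pause between pages
            "-dBATCH",                # exit when processing is done
            "-dSAFER",                # restrict file access
            "-sDEVICE=pdfwrite",      # write PDF output
            f"-dFirstPage={page}",    # first and last page are the same,
            f"-dLastPage={page}",     # so each output holds one page
            f"-sOutputFile={target}",
            str(source),
        ])
    return commands


def split_pdf(source_pdf, out_dir, page_count=8):
    """Run Ghostscript once per page; assumes `gs` is on the PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command in build_split_commands(source_pdf, out_dir, page_count):
        subprocess.run(command, check=True)
```

Calling `split_pdf("issue.pdf", "pages")` would, under these assumptions, leave the original untouched and write eight single-page PDFs into `pages/`, which matches the 9 files described above.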
Here's the page that is in development, so you can see the PDFs:
http://207.158.22.22/~admin32/CurrentIssue.php
You can see the 8 small single-page files here, but if you do a search, you'll see that it isn't picking up any of them in the index. When I try to index the PDFs, it seems to index the main file but does not index the small files at all. I've tried both Offline mode and Spider mode.
The indexing status reports 4169 unique words found and 9 files indexed, but the only file that actually seems to be indexed is the big file.
In Offline mode, I removed the large file from the directory and indexed again, and it reports 194 unique words in 8 files indexed. I don't know where the 194 words are coming from, because when I search for words that should be in the files, no results are found.
I'm stuck. Any ideas? Could there be something different about the PDFs I've created through Ghostscript?