Indexing php files - does not search PHP file


  • Indexing php files - does not search PHP file

    Hi,

    I am using ZoomSearch for an intranet.
    Indexing PDF, XLS and Word files works fine,
    but the engine does not index a PHP file that gets its content dynamically from a database.
    Each entry is accessed through a URL like this:

    "https://www.my-intranet.com/pro.php?item=1&sub_kat=12&main_kat=3"


    So what can I do to make ZoomSearch work?

    Thanks for your help!

  • #2
    If you are indexing in offline mode, then PHP scripts are not executed by the server (as there is no server involved), so any content they might produce is not indexed.

    You need to use spider mode to index PHP files correctly. In spider mode, Zoom requests the PHP files from the server and the PHP files get executed.

    URLs with parameters (as in your example) are no problem when using spider mode.
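As a rough illustration of why the parameters matter, the following Python sketch splits the example URL from the first post into the script path and its query parameters. It is not part of Zoom; the helper name `describe_url` is made up for this example. A file on disk only has the path, while a request through a web server carries the parameters that select the database entry.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url):
    """Split a URL into the script path and its query parameters."""
    parts = urlsplit(url)
    # parse_qs returns a list of values per key; keep the first value of each
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path, params

path, params = describe_url(
    "https://www.my-intranet.com/pro.php?item=1&sub_kat=12&main_kat=3"
)
print(path)    # /pro.php
print(params)  # {'item': '1', 'sub_kat': '12', 'main_kat': '3'}
```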

    If you are already using spider mode, then maybe the page is being skipped for another reason?


    • #3
      Thanks for your answer,
      but it did not solve my problem.
      I am running the indexer in spider mode, and every single bit of the content can be reached via the menu.

      I am using only one index.php to display all the content; it is called by the items of the menu.
      Maybe the problem is that one first has to click a menu item to get to the submenu, and only after clicking a submenu item does the corresponding content appear.

      I have already tried building a sitemap page on which all the content is displayed. ZoomSearch scans this page completely, but then I have the problem of linking the results to index.php with the right content displayed.
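One possible variation on the sitemap idea, sketched here in Python as an assumption rather than anything the thread confirms: instead of a page that displays all the content, generate a page of ordinary links, one per database entry, so that a spider can follow each link and record the real URL. The parameter names come from the example URL in the first post; the function name `build_link_page` and the layout of `entries` are hypothetical.

```python
from html import escape
from urllib.parse import urlencode

def build_link_page(entries, script="pro.php"):
    """Return a plain HTML page with one ordinary link per database entry.

    `entries` is a list of (title, item, sub_kat, main_kat) tuples; this
    tuple layout is an assumption made for the sketch.
    """
    lines = ["<html><body><ul>"]
    for title, item, sub_kat, main_kat in entries:
        query = urlencode({"item": item, "sub_kat": sub_kat, "main_kat": main_kat})
        # Escape the URL for use inside an HTML attribute (& becomes &amp;)
        href = escape(f"{script}?{query}", quote=True)
        lines.append(f'<li><a href="{href}">{escape(title)}</a></li>')
    lines.append("</ul></body></html>")
    return "\n".join(lines)

page = build_link_page([("Test entry", 1, 12, 3)])
print(page)
```

With a page like this, each search result would point at the entry's own URL rather than at the sitemap page.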

      I would be thankful for another answer.


      • #4
        If you are relying on a Javascript menu to provide the links to navigate your website, then you should refer to this FAQ:
        Q. Why are links in my Javascript menus being skipped?

        I would suspect that is the problem, and the above FAQ will give you more information and provide solutions.

        To clarify the original question, Zoom has no problem indexing and searching PHP files with dynamic content generated from a database. The Spider Mode is designed specifically for this purpose, and this is not the problem here.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine


        • #5
          Hi Raymond,

          there is no Javascript involved.

          And index.php is scanned by ZoomSearch, but only the few words that are always on the page, such as a short description, are indexed.


          • #6
            Maybe the HTML generated by the page is invalid and can't be parsed?

            You said it was an intranet, so we can't see the problem from here.

            Do you have a public web site? Can you upload a few example files to a site where we can see them?


            • #7
              Originally posted by Freeflow:
              "there is no Javascript involved."
              Are you sure? Are there any "onload" or "onclick" attributes in your HTML code? These are Javascript as well, and some users are unaware of this. DHTML (Dynamic HTML) is Javascript too.

              Originally posted by Freeflow:
              "Maybe the problem is, that one first has to click on an item to get to the submenu, and only after clicking on an item of the submenu one gets the depending content."
              This sounds a lot like it depends on Javascript to me. I'm not sure if it is actually possible without Javascript. If you are saying that the submenu appears on the same page after clicking on the first item, then I would say that there is definitely Javascript involved. The only pure CSS menus I know of work by rollover/hover and not by click.
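A quick way to check a page for the kinds of Javascript mentioned above is to scan its HTML for `<script>` tags, event-handler attributes such as "onload" or "onclick", and `javascript:` links. The Python sketch below uses only the standard library; the names `ScriptFinder` and `find_javascript` are made up for this example, and treating every attribute that starts with "on" as an event handler is a heuristic.

```python
from html.parser import HTMLParser

class ScriptFinder(HTMLParser):
    """Collect (what, tag) pairs for anything that looks like Javascript."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case
        if tag == "script":
            self.findings.append(("script", tag))
        for name, value in attrs:
            if name.startswith("on"):
                # Heuristic: onload, onclick, onmouseover, ...
                self.findings.append((name, tag))
            elif name == "href" and value and value.strip().lower().startswith("javascript:"):
                self.findings.append(("javascript: href", tag))

def find_javascript(html_text):
    finder = ScriptFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings

sample = '<body onload="init()"><a href="pro.php?item=1" onclick="show(2)">Item</a></body>'
print(find_javascript(sample))  # [('onload', 'body'), ('onclick', 'a')]
```

An empty result for the generated HTML of a page would support the claim that no Javascript is involved in its links.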
              --Ray


              • #8
                Hi again,

                at http://www.vorneweg.de/hpl/ you can find a small test site of the intranet we are talking about.
                You have to log in with the username 'guest' and the password 'guest'.
                Then please choose 'myproject'; in the last category, called 'sonstiges', you will find a subcategory with some test entries.

                As you can see in the source code, there is 'really' no Javascript in the code, and every single bit of the HTML code is valid.

                It now occurs to me that the problem might be that I am using sessions to keep unwanted guests away.
                What do you think?

                Thanks again for your help.


                • #9
                  Yes, using a login with a password will prevent people and search spiders from viewing your site. You probably should have mentioned the use of passwords to protect your content in your initial post.

                  There are some techniques for dealing with password protected sites in our FAQ on using Zoom with sites requiring authentication.


                  • #10
                    Hi again,

                    "Yes, using a login with a password will prevent people and search spiders from viewing your site. You probably should have mentioned the use of passwords to protect your content in your initial post."

                    Yes, you are absolutely right. I am sorry.
                    But I have to bother you again.
                    I have made an offline copy of the intranet, indexed it in offline mode and put the results online. This works fine. Via the search results I can access all the PDF, DOC and XLS files on the server.
                    Even the search results for the PHP file, which gets its content from a database, are displayed. But the parameters are missing from these links, which means that when I click on a search result I get an empty page.
                    In the configuration of ZoomSearch I rewrote the URL from internal to external.
                    But I do not know what else I have to do to solve that problem.
                    I hope you have not lost your patience yet.


                    • #11
                      Offline Mode will not work for indexing PHP files (i.e. dynamically generated pages). See our first response to your original post at the top of this thread.

                      PHP pages need to be executed by a web server to be indexed correctly (otherwise you would be attempting to index the PHP source code, and the parameters would be ignored).

                      Please see the FAQ on using Zoom with sites requiring authentication (posted in David's previous response, but here it is again) for instructions on how to use Spider Mode to index a password protected site. Your site appears to be using cookie-based/session-based authentication, so that would be the section that is relevant to you.
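To illustrate what cookie-based/session-based authentication means for any client that fetches pages, here is a minimal Python sketch using only the standard library. It is not how Zoom is configured (the FAQ covers that); it only shows the general mechanism: the login request sets a session cookie, and later requests must send that cookie back to receive the protected content. The function names, the login URL and the form field names "username" and "password" are all hypothetical.

```python
import http.cookiejar
import urllib.parse
import urllib.request

def make_session_opener():
    """Create an opener that stores cookies and sends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar

def build_login_request(login_url, username, password):
    """Build a POST request for a login form (field names are hypothetical)."""
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("ascii")
    return urllib.request.Request(login_url, data=body, method="POST")

def fetch_after_login(opener, login_request, page_url):
    """Log in first, then fetch a page with the session cookie attached."""
    opener.open(login_request).close()  # server response sets the session cookie
    with opener.open(page_url) as response:
        return response.read().decode("utf-8", errors="replace")

# Example usage against a hypothetical local test server (not executed here):
# opener, jar = make_session_opener()
# request = build_login_request("http://localhost/hpl/login.php", "guest", "guest")
# html = fetch_after_login(opener, request, "http://localhost/hpl/pro.php?item=1&sub_kat=12&main_kat=3")
```

A client that skips the login step has no session cookie, so it would typically receive the login page or an empty page instead of the entry.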

                      Hope that clears things up.
                      --Ray


                      • #12
                        Hi,

                        I read the FAQ. That's why I decided to set up a copy of the database and website on my localhost using a WAMP server. And because it is on my hard disk, I tried to spider it in offline mode.
                        But after your last post I realized that this was a wrong assumption. So I indexed my local site in spider mode. And it worked!!!

                        So thank you very much for your quick and constant help and for this great software.


                        • #13
                          Glad to hear you have it working now.

                          Yes, all mentions of "spidering" refer to indexing in Spider Mode. It is not actually possible to "spider it in offline mode", so I guess this misconception led to some confusion. What you were doing then was "indexing in offline mode". Hope that clears things up!
                          --Ray
