Can Zoom search a phpBB2 bulletin board?


  • Can Zoom search a phpBB2 bulletin board?

    I would like to search a phpBB2 forum with Zoom. Can this be done? Thanks all. Perry

  • #2
    Yes, see this FAQ page for details.
    http://www.wrensoft.com/zoom/support/msgboards.html

    -----
    David



    • #3
      That was a very good article. I'll try the suggestions so the spider doesn't follow too many links. Which page do I use as the starting point?

      I realize each phpBB2 forum lives in its own folder, with its own path. I haven't decided yet which forum I want to include, but all the ones I am interested in use phpBB2. I tried a couple of them using their full path. A couple of them had this directory for their forum: /phpbb2/

      But Zoom didn't seem to like being pointed at a whole directory, which is understandable. I am pretty sure it wanted a file.

      All the forums I am interested in seem to start on the same file, index.php, so I tried the full path of a few forums that use phpbb2 as their forum directory, that is, the full path plus ......../phpBB2/index.php. None of these worked. Zoom said it was unable to follow the path. Since you have a forum like this, I thought you might know which file to use as the starting point in phpBB2. Thank you.



      • #4
        The start point should be the index.php file in the forum directory.

        http://www.exampledomain.com/forum/index.php

        The /forum/ directory might have a different name on different sites, but the index.php file will always have that name with phpBB2.

        If it doesn't work, then you have some other configuration problem in Zoom (e.g. you are not indexing .php files) or some type of connection problem (e.g. you aren't connected to the internet).

        What was the URL that you tried?

        -----
        David



        • #5
          Thanks for the clues. Got it: it turns out their site capitalized BB2 and I was keeping it lower case.



          • #6
            I notice that even when I tell Zoom to skip a forum file from your list, such as
            /forum/faq.php
            etc., it indexes most of it anyway. I'm wondering what the best approach is before I start adding a bunch of skips. I read the user's guide. I usually avoid it, but if I have to, am I allowed to add whole URLs to the skip list? Or if you have other references on skipping, please let me know.

            I wouldn't want to put in the directory /forum/ because that would keep the whole forum from being searched, as I am sure you know. For now I guess I'll go ahead and do this unless you give me a better suggestion,
            skipping these:
            /forum/faq.php?sid=f84138def680a0fc192b08
            etc.
            etc.
            Otherwise the only other thing I can think of is the whole URL.
            Appreciated.



            • #7
              After some detailed work, I'll tell you what I did in case you want to log it for anyone else. I kept all the entries that your forum reference page above says to skip. Then I had to adjust a few and add more to keep things from being duplicated. For all of the following I had to include the full URL, but that doesn't matter, it works. I tried it the other way, but I needed these too. So all these full URLs are added to the skip list for phpBB2, along with the ones you already mentioned.

              http://etc./forum/login.php?
              " " "/forum/profile.php?
              " " " /forum/faq.php?
              /forum/search.php?
              /forum/viewonline.php?
              /forum/viewtopic.php?
              /forum/index.php?c=

              On the last entry it is important to include it that way because the c entries
              in index.php are duplicates of all the forums, and they are not needed.

              Doing it this way took care of ALL the problems I can see, and now the only things listed in the search results are the posts. Works just fine, thank you.



              • #8
                Glad you got it working.

                Just to clarify. If you enter the URL fragment,
                /forum/faq.php
                into the skip list it will skip every page whose URL contains this text.

                Adding an additional entry like,
                /forum/faq.php?
                is redundant.

                If you saw any other behaviour, then there is another problem somewhere.

                Yes, you can enter URL fragments or entire URLs into the skip list.
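
The matching rule described above (an entry skips every page whose URL contains that text) can be sketched in a few lines of Python. This is only an illustration of plain, case-sensitive substring matching, not Zoom's actual code; the function name `is_skipped` and the `FAQ.php` URL are made up for the example.

```python
def is_skipped(url, skip_list):
    """Return True if any skip entry occurs anywhere in the URL.

    Plain substring test, case sensitive, as described in the post above.
    """
    return any(entry in url for entry in skip_list)


# Two of the fragments mentioned in this thread.
skip_list = ["/forum/faq.php", "/forum/index.php?c="]

urls = [
    "http://www.exampledomain.com/forum/faq.php?sid=f84138def680a0fc192b08",
    "http://www.exampledomain.com/forum/index.php?c=1",
    "http://www.exampledomain.com/forum/index.php",
    "http://www.exampledomain.com/forum/FAQ.php",  # hypothetical, differs only in case
]

results = {url: is_skipped(url, skip_list) for url in urls}
for url, skipped in results.items():
    print("skip " if skipped else "index", url)
```

Under this rule the fragment /forum/faq.php already covers /forum/faq.php?sid=..., which is why a second entry ending in "?" adds nothing. The plain index.php page is still indexed, and a URL that differs only in letter case is not matched.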

                ------
                David



                • #9
                  Two problems I had with phpBB2

                  Hi Guys,

                  I had the exact same problem as Perry, but I think I figured it out. When I copied and pasted the list of URLs to skip, a space was added to the end of each line. So the program was looking for URLs with a space in them, which is never going to happen, so the program never skipped those URLs. The fix is simply to delete those spaces after pasting the list into the program.

                  The other problem I had was that the phpBB2 indexing got into an endless loop because the &sid=9a89e89234.... kept changing in the URL, so the spider kept hitting its limit of 1,000 files, even though I only have about 125 phpBB2 pages. It seems that every time you (or the spider) refresh the main index.php, you get assigned a new SID, which creates a whole new set of links for the spider to check out. I was hoping the CRC option would prevent the same phpBB2 files from being indexed multiple times, but apparently that didn't help.

                  In my case, the problem was that my non-phpBB2 pages were creating their own SID called YourVisitID, which was part of the site's session management. This YourVisitID was also changing when certain pages were viewed. Then the spider would visit phpBB2 again, creating a new SID, and then continue spidering and eventually create a new YourVisitID, and so on. This is often called a "spider trap": the crawl becomes an endless loop. My solution was to add "YourVisitID=" to my Skip Options, preventing this endless loop. Just wanted to share this with anyone who might encounter a similar problem.
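
To see why a changing session ID makes the same page look like an endless supply of new pages, here is a small Python sketch. It is not something Zoom does, as far as this thread shows, and it is not the fix used above (which was a skip entry). It only illustrates that two URLs for one page differ solely in their session parameter. The parameter names `sid` and `YourVisitID` come from the post. The `viewforum.php?f=2` URLs and the sid values are invented for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Session parameter names taken from the post above.
SESSION_PARAMS = {"sid", "YourVisitID"}


def strip_session_params(url):
    """Return the URL with session-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URLs: the same forum page as seen on two different visits.
first_visit = "http://www.exampledomain.com/forum/viewforum.php?f=2&sid=aaa111"
second_visit = "http://www.exampledomain.com/forum/viewforum.php?f=2&sid=bbb222"

print(first_visit == second_visit)  # the raw URLs differ
print(strip_session_params(first_visit) == strip_session_params(second_visit))
```

A spider that compares raw URLs treats the two visits as different pages, so every new session ID opens up a fresh set of links to follow.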

                  Suggestion for the Zoom application:
                  Automatically remove whitespace before and after the URLs in the Skip Options.
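
Until something like that exists, a pasted list can be cleaned up before it goes into the program. The Python sketch below is one way to do it, assuming the list is available as plain text with one entry per line. The helper name `clean_skip_entries` and the sample URL are made up for illustration.

```python
def clean_skip_entries(pasted_text):
    """Strip leading/trailing whitespace from each line and drop empty lines."""
    return [line.strip() for line in pasted_text.splitlines() if line.strip()]


# A pasted list with stray trailing spaces, a blank line, and a trailing tab.
pasted = "/forum/login.php \n/forum/profile.php \n\n/forum/faq.php\t\n"

raw_entries = [line for line in pasted.splitlines() if line]
entries = clean_skip_entries(pasted)

# Hypothetical page URL used to compare the two lists.
url = "http://www.exampledomain.com/forum/login.php?sid=abc"

print(any(entry in url for entry in raw_entries))  # uncleaned entries
print(any(entry in url for entry in entries))      # cleaned entries
```

With the trailing space left in place, "/forum/login.php " never occurs in a real URL, so nothing is skipped. After trimming, the same entry matches.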

                  Thanks,
                  Nick



                  • #10
                    Thanks for the additional information. Your suggestion is noted, and we'll consider it for a future release.

                    For other readers, note that another common mistake is not realizing that the skip list is case sensitive (as URL paths need to be, by protocol).
                    Last edited by Ray; Jan-29-2007, 02:24 AM.
                    --Ray
                    Wrensoft Web Software
                    Sydney, Australia
                    Zoom Search Engine
