PHP Classes

File: overview.txt

Role: Documentation
Content type: text/plain
Description: Overview for spiderClass.php
Class: Spider Class
Crawl a site following and retrieving linked pages
Author: greg jackson
Date: 18 years ago
Size: 1,036 bytes
 

Contents

This class lets you set up a spider and then retrieve one page at a time.

Methods available:

    spiderStart($strStartPage)
    spiderNextPage()
    getPage($pageToGet)
    getLinks($strURL="", $strPageContents, $strScrapeRegExp="")

Note that getPage and getLinks can be used 'standalone', without a spider.

The main use of the class, however, is along these lines:

Open a new object:

    $objSportSpider = new spiderScraper;

[Note: this doesn't start the spider; instead it gives you access to the methods that do start the spider, as well as other methods such as link scraping or page fetching.]

Then start the spider:

    $objSportSpider -> spiderStart($strStartURL);

Set the regexps for the spider [see example for use]:

    $objSportSpider -> arrLinksRegex = $arrLinksRegex;

Set the spider's [max] depth:

    $objSportSpider -> intCrawlDepth = 4;

Then call pages one at a time:

    for ($i = 1; $i <= 250; $i++) {
        $arrFetchedPage = $objSportSpider -> spiderNextPage();
    }

SEE EXAMPLES AND SCRIPT COMMENTS FOR FULL USAGE
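
To make the page-at-a-time idea concrete, here is a minimal sketch of how a depth-limited spider of this kind could work. It is NOT the code in spiderClass.php: the class name SketchSpider, the in-memory $arrSite array (standing in for real HTTP fetches), the example.test URLs, and the shape of the returned array are all assumptions made for illustration. Only the method and property names spiderStart, spiderNextPage, arrLinksRegex and intCrawlDepth are borrowed from the overview above, and treating an empty regexp list as "follow nothing" is likewise a choice of this sketch.

```php
<?php
// Hypothetical stand-in for real page fetching: URL => page contents.
$arrSite = [
    'http://example.test/'  => '<a href="http://example.test/a">A</a> <a href="http://example.test/b">B</a>',
    'http://example.test/a' => '<a href="http://example.test/c">C</a> <a href="http://other.test/x">X</a>',
    'http://example.test/b' => '<a href="http://example.test/a">A again</a>',
    'http://example.test/c' => 'no links here',
];

// Illustrative sketch only; not the spiderScraper class itself.
class SketchSpider
{
    public $arrLinksRegex = [];  // a link is followed only if it matches one of these
    public $intCrawlDepth = 1;   // maximum link depth below the start page
    private $arrQueue = [];      // pending [url, depth] pairs, first in, first out
    private $arrSeen = [];       // URLs already queued, so nothing is fetched twice
    private $arrPages;           // the in-memory "site"

    public function __construct(array $arrPages)
    {
        $this->arrPages = $arrPages;
    }

    // Reset the crawl so that the next page returned is the start page.
    public function spiderStart($strStartPage)
    {
        $this->arrQueue = [[$strStartPage, 0]];
        $this->arrSeen = [$strStartPage => true];
    }

    // Return the next page as an array, or null once the queue is empty.
    public function spiderNextPage()
    {
        if (!$this->arrQueue) {
            return null;
        }
        list($strURL, $intDepth) = array_shift($this->arrQueue);
        $strContents = isset($this->arrPages[$strURL]) ? $this->arrPages[$strURL] : '';

        // Queue this page's links only while still above the depth limit.
        if ($intDepth < $this->intCrawlDepth) {
            preg_match_all('/href="([^"]+)"/i', $strContents, $arrMatches);
            foreach ($arrMatches[1] as $strLink) {
                if (!isset($this->arrSeen[$strLink]) && $this->linkAllowed($strLink)) {
                    $this->arrSeen[$strLink] = true;
                    $this->arrQueue[] = [$strLink, $intDepth + 1];
                }
            }
        }
        return ['url' => $strURL, 'depth' => $intDepth, 'contents' => $strContents];
    }

    // True if the link matches at least one configured regexp.
    private function linkAllowed($strLink)
    {
        foreach ($this->arrLinksRegex as $strRegex) {
            if (preg_match($strRegex, $strLink)) {
                return true;
            }
        }
        return false;
    }
}

$objSketch = new SketchSpider($arrSite);
$objSketch->spiderStart('http://example.test/');
$objSketch->arrLinksRegex = ['#^http://example\.test/#'];
$objSketch->intCrawlDepth = 2;

// Call pages one at a time until the spider runs out of links.
$arrVisited = [];
while (($arrFetchedPage = $objSketch->spiderNextPage()) !== null) {
    $arrVisited[] = $arrFetchedPage['url'];
}
print_r($arrVisited);
```

With the sample data, the sketch visits the start page, then /a and /b, then /c. The link to other.test is skipped because it matches none of the regexps, and /a is not fetched a second time when /b links back to it. The real class may order, filter, or return pages differently, so its own examples and script comments remain the reference.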