PHP Classes

Web Analyzer: Analyze contents of pages retrieved with a browser

Ratings: Not enough user ratings
Unique user downloads: 54 total
Download rankings: 10,587 all time; 89 this week (up)
Version: web-analyzer 1.0.0
License: MIT/X Consortium ...
PHP version: 5
Categories: HTML, PHP 5, Parsers
Description 

This package analyzes the contents of web pages retrieved with a browser.

It provides a class that retrieves the contents of a page from a given URL using a browser such as Firefox.

Another class analyzes the HTML of the retrieved page and extracts the page's JavaScript, CSS, and images, storing them in files.
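
To make the analysis step concrete, here is a minimal sketch of how script, stylesheet, and image references could be collected from retrieved HTML using PHP's bundled DOM extension. It is not the package's Analyzer class; the function name collectAssetUrls and the sample markup are assumptions made for illustration only.

```php
<?php
// Hypothetical helper, not part of the package: collects the URLs of
// scripts, stylesheets, and images referenced by an HTML string.
function collectAssetUrls($html)
{
    $assets = ['js' => [], 'css' => [], 'images' => []];

    $dom = new DOMDocument();
    // Real-world HTML is often malformed, so keep parser warnings internal.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // External scripts: <script src="...">
    foreach ($dom->getElementsByTagName('script') as $node) {
        if ($node->getAttribute('src') !== '') {
            $assets['js'][] = $node->getAttribute('src');
        }
    }
    // Stylesheets: <link rel="stylesheet" href="...">
    foreach ($dom->getElementsByTagName('link') as $node) {
        $isStylesheet = strtolower($node->getAttribute('rel')) === 'stylesheet';
        if ($isStylesheet && $node->getAttribute('href') !== '') {
            $assets['css'][] = $node->getAttribute('href');
        }
    }
    // Images: <img src="...">
    foreach ($dom->getElementsByTagName('img') as $node) {
        if ($node->getAttribute('src') !== '') {
            $assets['images'][] = $node->getAttribute('src');
        }
    }

    return $assets;
}

// Sample markup (assumed for the illustration).
$html = '<html><head><link rel="stylesheet" href="style.css">'
      . '<script src="app.js"></script></head>'
      . '<body><img src="logo.png"></body></html>';
print_r(collectAssetUrls($html));
```

The collected URLs could then be downloaded and written to files, which is the storage step the description mentions; inline scripts and styles are ignored in this sketch.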

Author

Name: Chun-Sheng, Li <contact>
Classes: 29 packages
Country: Taiwan
Age: 30
All time rank: 22036 in Taiwan
Week rank: 25 Up1 in Taiwan Up
Innovation award: Nominee 14x, Winner 1x

Example

<?php

require_once './Analyzer.php';
require_once './Browser.php';

use web\analyzer\Analyzer;
use web\analyzer\Browser;

//exec('bash ./req-url.sh');

$filePath = './url-lists.txt';
$rootPath = '/home/lab223/web-curl/';
$firefoxPath = '/home/lab223/firefox/firefox';

if (file_exists($filePath)) {
    $webFilePath = [
        'root-path' => $rootPath,
        'paths' => [],
    ];

    $handler = fopen($filePath, 'r');
    $urlList = [];
    $lists = [];

    // Each usable line has the form "name(url)"; lines containing '#' are skipped.
    while (!feof($handler)) {
        $str = trim(fgets($handler, 4096));
        if (stristr($str, '#') !== false) {
            continue;
        }
        if (stristr($str, '(') !== false) {
            $strArr = explode('(', $str);
            // One output folder per entry; errors (e.g. folder exists) are suppressed.
            @mkdir($rootPath.$strArr[0]);
            $lists[] = $rootPath.$strArr[0];
            $url = str_replace([')', ' '], '', $strArr[1]);
            $urlList[] = $url;
        }
    }
    fclose($handler);

    // Build the iMacros script: start from the template, then add one
    // "open, wait, save" sequence per URL.
    $index = 0;
    $str = file_get_contents('./marcos.template');

    foreach ($urlList as $reqUrl) {
        $str .= 'URL GOTO='.$reqUrl.PHP_EOL;
        $str .= 'SET !TIMEOUT_TAG 120'.PHP_EOL;
        $str .= 'WAIT SECONDS=20'.PHP_EOL;
        $str .= 'SAVEAS TYPE=HTM FOLDER='.$lists[$index].' FILE=index.html'.PHP_EOL;
        $webFilePath['paths'][$index] = $lists[$index];
        $index++;
    }
    $str .= 'TAB CLOSE';
    file_put_contents('./marcos.iim', $str);

    //system($firefoxPath.' "imacros://run/?m=marcos.iim"');

    //compress image size
    //$analyzer = new Analyzer($webFilePath);
    //$analyzer->analyze('DOM');

    //check CSS2 or CSS3
    //$analyzer = new Analyzer($webFilePath);
    //$analyzer->cssVersion();

    //evaluate time before running command: php -S localhost:8000 -t /path/to/web-curl
    $browser = new Browser($webFilePath, 'firefox', $rootPath, $firefoxPath);
    $browser->eveluateTime();
} else {
    echo 'The file '.$filePath.' does not exist.';
}
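
The list parsing and macro generation in the example can also be read as two small pure functions, which makes the expected input format easier to see. The sketch below is an illustration under assumptions: the function names parseUrlList and buildMacro, the sample entry, and the /tmp/web-curl/ path are hypothetical, and unlike the example script it creates no folders and writes no files. It also keeps everything after the first "(" as the URL, whereas the example keeps only the text up to the next "(".

```php
<?php
// Hypothetical helper: turns lines of the form "name(url)" into [name => url].
function parseUrlList(array $lines)
{
    $entries = [];
    foreach ($lines as $line) {
        $line = trim($line);
        // Skip blank lines and any line containing '#', as the example does.
        // Note that this also skips URLs that contain a '#' fragment.
        if ($line === '' || strpos($line, '#') !== false) {
            continue;
        }
        $open = strpos($line, '(');
        if ($open === false) {
            continue; // no "(url)" part on this line
        }
        $name = substr($line, 0, $open);
        // Drop the closing parenthesis and any spaces from the URL part.
        $url = str_replace([')', ' '], '', substr($line, $open + 1));
        $entries[$name] = $url;
    }
    return $entries;
}

// Hypothetical helper: appends one "open, wait, save" sequence per entry to
// the template text, using the same iMacros commands as the example script.
function buildMacro($template, array $entries, $rootPath)
{
    $macro = $template;
    foreach ($entries as $name => $url) {
        $macro .= 'URL GOTO='.$url.PHP_EOL;
        $macro .= 'SET !TIMEOUT_TAG 120'.PHP_EOL;
        $macro .= 'WAIT SECONDS=20'.PHP_EOL;
        $macro .= 'SAVEAS TYPE=HTM FOLDER='.$rootPath.$name.' FILE=index.html'.PHP_EOL;
    }
    return $macro.'TAB CLOSE';
}

// Usage with an assumed list entry and root path.
$entries = parseUrlList(['# a comment line', 'example(https://example.com/)', '']);
echo buildMacro('', $entries, '/tmp/web-curl/');
```

Keeping file and folder side effects out of these functions means the url-lists.txt format can be checked without a browser or a writable root path.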


Details

web-analyzer

Analyze the HTML contents of web pages


Files

File                          Role      Description
database/                     Folder    1 file
html/                         Folder    8 files, 1 directory
Analyzer.php                  Class     Class source
Browser.php                   Class     Class source
browser.template              Data      Auxiliary data
click-btn.back.js             Data      Auxiliary data
composer.json                 Data      Auxiliary data
composer.lock                 Data      Auxiliary data
index.php                     Example   Example script
INSTALL.sh                    Data      Auxiliary data
marcos.template               Data      Auxiliary data
performance-time.js           Data      Auxiliary data
README.md                     Doc.      Documentation
run-crawler.sh                Data      Auxiliary data

Files / database

File                          Role      Description
Database.php                  Class     Class source

Files / html

File                          Role      Description
chrome-r-summary/             Folder    1 file
generate.txt                  Doc.      Documentation
gen_csv.php                   Aux.      Auxiliary script
html.php                      Aux.      Auxiliary script
inte_chrome_csv.php           Aux.      Auxiliary script
load-page.js                  Data      Auxiliary data
phantomjs.php                 Aux.      Auxiliary script
run.sh                        Data      Auxiliary data
web-defet.js                  Data      Auxiliary data

Files / html / chrome-r-summary

File                          Role      Description
avg_record.php                Aux.      Auxiliary script
