To use this library with headless browsing support (included by default in the Packagist version), you also need Headless Chromium PHP and a Chromium executable.
$ composer require shamanhead/phpporser
This should work on Windows, macOS, and Linux.
Go to the official Chromium download page and download the browser.
Then unpack the archive and move it to the location you want.
Finally, specify the path to the executable in your script:
require_once "vendor/autoload.php";
use HeadlessChromium\Page;
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('file:///home/shamanhead/dev/porser/phpporser-master/test.html');
$dom->setBrowserPath('PATH_TO_CHROME');

If you did everything right, the parser will work. If any errors occur during this step, check here to see whether there is a known solution to your problem. Otherwise, please open a new issue here or on the Headless Chromium PHP page.
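Before calling `setBrowserPath`, it can help to confirm that the path really points to an executable file. The sketch below is one possible way to do that in plain PHP. The helper name `findBrowserPath`, the `CHROME_PATH` environment lookup, and the candidate paths are illustrative assumptions, not part of the library.

```php
<?php
// Hypothetical helper (not part of the library): returns the first candidate
// path that points to an existing, executable file, or null if none does.
function findBrowserPath(array $candidates): ?string
{
    foreach ($candidates as $path) {
        if (is_string($path) && $path !== '' && is_file($path) && is_executable($path)) {
            return $path;
        }
    }
    return null;
}

// Example locations only (assumptions); adjust them to wherever you unpacked Chromium.
$candidates = [
    getenv('CHROME_PATH') ?: '',
    '/usr/bin/chromium',
    '/usr/bin/chromium-browser',
];

$browserPath = findBrowserPath($candidates);

if ($browserPath === null) {
    echo "No Chromium executable found; check the path.\n";
} else {
    echo "Using browser at: {$browserPath}\n";
    // Here you would pass the path on: $dom->setBrowserPath($browserPath);
}
```

Checking the path up front turns a confusing browser start-up failure into a clear message about a wrong or missing path.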
First of all, let's try to get the 'Computer science' string at the top of the page:
<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('https://en.wikipedia.org/wiki/Computer_science');
print_r($dom->tag('h1')->class('firstHeading')->text()->merge());
?>

It works! But how? Let me explain:
<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('href to file');
print_r($dom->tag('h1')->array()); // finds elements by tag name 'h1'
print_r($dom->id('firstHeading')->array()); // finds elements by id 'firstHeading'
print_r($dom->class('wrapper__main')->array()); // finds elements by class name 'wrapper__main'
print_r($dom->custom(['name', 'button'])->array()); // finds elements whose 'name' attribute has the value 'button'
?>

You can combine search methods with each other to narrow down the elements you find:
<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('href to file');
print_r($dom->class('main')->id('firstHeading')->tag('h1')->array());
?>

<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('href to file');
$divText = $dom->tag('div')->id('someDiv')->text();
$divText->contents(); // Returns all text as an array.
$divText->merge('symbol'); // Returns all text as a single string joined with the 'symbol' separator
// ("\n" by default).
$divText->first(); // Returns the first text found.
$divText->last(); // Returns the last text found.
?>
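To make the relationship between the four text accessors above more concrete, here is a rough approximation using PHP's built-in DOM extension rather than this library. It is only a sketch: the sample HTML is invented, the newline default separator is an assumption, and the library's own handling of whitespace and nesting may differ.

```php
<?php
// Sketch only: approximates contents()/merge()/first()/last() with PHP's
// built-in DOM extension. This is not how the library itself is implemented.

// Invented sample markup for illustration.
$html = '<div id="someDiv"><p>Alpha</p><p>Beta</p><span>Gamma</span></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

// Collect every non-empty text node below div#someDiv, in document order.
$contents = [];
foreach ($xpath->query('//div[@id="someDiv"]//text()') as $node) {
    $text = trim($node->nodeValue);
    if ($text !== '') {
        $contents[] = $text;
    }
}

$merged = implode("\n", $contents); // like merge() with the assumed default separator
$joined = implode(', ', $contents); // like merge(', ')
$first  = $contents[0] ?? null;     // like first()
$last   = $contents ? $contents[count($contents) - 1] : null; // like last()

print_r($contents);
echo $joined, PHP_EOL;
```

In this reading, `merge`, `first`, and `last` are all simple views over the same array that `contents` returns.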