phpporser
v0.3
If you want to use this library with headless browsing support (included in the standard Packagist version), you also need Headless Chromium PHP and a Chromium executable.
$ composer require shamanhead/phpporser
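This should work on Windows, macOS, and Linux.
Go to the official Chromium download page and download the browser.
After that, unpack the archive and move it to the location you need.
Then specify the path in your script: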
require_once "vendor/autoload.php";
use HeadlessChromium\Page;
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('file:///home/shamanhead/dev/porser/phpporser-master/test.html');
$dom->setBrowserPath('PATH_TO_CHROME');
If you did everything correctly, the parser will work. If you run into any errors at this step, you can check here whether a solution already exists. Otherwise, open a new issue here or on the Headless Chromium PHP page.
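A wrong browser path is a likely cause of errors at this step, so it may help to check the path before passing it to `setBrowserPath`. The following is a minimal sketch in plain PHP. The helper `findBrowser` and the candidate paths are hypothetical and are not part of phpporser or Headless Chromium PHP.

```php
<?php
/**
 * Hypothetical helper: returns the first candidate path that points to an
 * existing, executable file, or null if none of them qualifies.
 */
function findBrowser(array $candidates): ?string
{
    foreach ($candidates as $path) {
        if (is_file($path) && is_executable($path)) {
            return $path;
        }
    }
    return null;
}

// Example locations only; adjust them to wherever you unpacked Chromium.
$browserPath = findBrowser([
    '/opt/chromium/chrome',
    '/usr/bin/chromium',
    'C:\\chromium\\chrome.exe',
]);

if ($browserPath === null) {
    echo "No Chromium executable found; check the path before calling setBrowserPath()." . PHP_EOL;
} else {
    echo "Using browser at: {$browserPath}" . PHP_EOL;
}
```

If a path is found, it can be passed on as `$dom->setBrowserPath($browserPath)`.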
First, let's try to get the "Computer science" string at the top of the page:
<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('https://en.wikipedia.org/wiki/Computer_science');
print_r($dom->tag('h1')->class('firstHeading')->text()->merge());
?>
It works! But how? Let me explain:
<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('href to file');
print_r($dom->tag('h1')->array()); // finds by tag name 'h1'
print_r($dom->id('firstHeading')->array()); // finds by id 'firstHeading'
print_r($dom->class('wrapper__main')->array()); // finds by class name 'wrapper__main'
print_r($dom->custom(['name', 'button'])->array()); // finds by 'name' attribute with value 'button'
?>
You can combine search methods with each other to narrow down the elements you are looking for:
<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('href to file');
print_r($dom->class('main')->id('firstHeading')->tag('h1')->array());
?>
<?php
require_once "vendor/autoload.php";
use ShamanHead\PhpPorser\App\Dom as Dom;
$dom = new Dom();
$dom->setHref('href to file');
$divText = $dom->tag('div')->id('someDiv')->text();
$divText->contents(); // Returns all text in array form.
$divText->merge('symbol'); // Returns all text as one string, joined with the 'symbol' separator
// ('\n' by default).
$divText->first(); // Returns the first text found.
$divText->last(); // Returns the last text found.
?>
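To make the chained lookup and the text helpers more concrete, here is a minimal sketch of roughly equivalent behavior using PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`) instead of phpporser. It is not the library's implementation. The helper name `collectTexts` and the sample HTML are hypothetical, and the mapping to `contents()`, `merge()`, `first()`, and `last()` is an assumption based on the comments in the example.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension, not phpporser itself.

/**
 * Hypothetical helper: returns the trimmed, non-empty text nodes found
 * inside every element matching the given tag name and id.
 */
function collectTexts(string $html, string $tag, string $id): array
{
    $doc = new DOMDocument();
    // Collect markup warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $texts = [];
    // Match by tag name and id together, then walk the descendant text nodes.
    foreach ($xpath->query("//{$tag}[@id='{$id}']//text()") as $node) {
        $value = trim($node->nodeValue);
        if ($value !== '') {
            $texts[] = $value;
        }
    }
    return $texts;
}

$html = '<html><body><div id="someDiv"><p>Hello</p><p>World</p></div>'
      . '<div id="other">Skip</div></body></html>';
$texts = collectTexts($html, 'div', 'someDiv');

print_r($texts);                          // all text, in array form
echo implode(', ', $texts), PHP_EOL;      // all text as one string, ', ' as separator
echo $texts[0], PHP_EOL;                  // first text found
echo $texts[count($texts) - 1], PHP_EOL;  // last text found
```

The XPath predicate plays the role of combining `tag('div')` with `id('someDiv')`: both conditions must hold for an element to match.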