This is an HTML parser I wrote because I scrape a lot of websites for structured, repetitive data. It lets me easily clean up HTML, split it into chunks, and find the right data in each chunk. It does not use a DOM parser, so it also works on partial or invalid HTML.
You can install the package via composer:
composer require pforret/pf_pageparser
$pp = new PfPageparser(["cacheTtl" => 300]);
$pp->load_from_url("http://www.example.com/products")
->trim("<table","</table>")
->split_chunks('</tr>')
->filter_chunks('product_id')
->parse_from_chunks('|Price: [\d\.]*|',true);
$prices=$pp->results();
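
To make the chain above more concrete, here is a rough plain-PHP sketch of what the trim, split, filter, and parse steps conceptually do. It is not the package's implementation and does not use its API: the sample HTML and the variable names are invented for illustration, the second argument of parse_from_chunks is not modelled, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical sample input; stands in for a page fetched from a URL.
$html = <<<HTML
<html><body><h1>Products</h1>
<table>
<tr><td>product_id: 1</td><td>Price: 9.99</td></tr>
<tr><td>product_id: 2</td><td>Price: 24.50</td></tr>
<tr><td>No products below this line</td></tr>
</table>
<p>footer</p></body></html>
HTML;

// 1. Trim: keep only the text between the first "<table" and the next "</table>".
$start = strpos($html, '<table');
$end   = ($start === false) ? false : strpos($html, '</table>', $start);
$table = ($start === false || $end === false)
    ? ''
    : substr($html, $start, $end - $start);

// 2. Split: cut the remaining text into chunks on the row terminator.
$chunks = explode('</tr>', $table);

// 3. Filter: keep only the chunks that mention "product_id".
$chunks = array_filter($chunks, fn($c) => strpos($c, 'product_id') !== false);

// 4. Parse: pull the first regex match out of each remaining chunk.
$prices = [];
foreach ($chunks as $chunk) {
    if (preg_match('|Price: [\d\.]*|', $chunk, $m)) {
        $prices[] = $m[0];
    }
}

print_r($prices);
```

Because every step is a plain string operation, the sketch never needs the input to be well-formed, which is why the same approach also works on partial or invalid HTML.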
composer test
Please see CHANGELOG for more information on what has changed recently.
Please see CONTRIBUTING for details.
If you discover any security-related issues, please email [email protected] instead of using the issue tracker.
The MIT License (MIT). Please see License File for more information.
This package was generated using the PHP Package Boilerplate.