pf_pageparser

This is an HTML parser I wrote because I scrape a lot of websites for structured, repetitive data. It lets me easily clean up HTML, split it into chunks, and find the right data in each chunk. It does not use a DOM parser, so it also works on partial or invalid HTML.

Installation

You can install the package via composer:

composer require pforret/pf_pageparser

Usage

$pp = new PfPageparser(["cacheTtl" => 300]);

$pp->load_from_url("http://www.example.com/products")
    ->trim("<table","</table>")
    ->split_chunks('</tr>')
    ->filter_chunks('product_id')
    ->parse_from_chunks('|Price: [\d\.]*|',true);

$prices=$pp->results();
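
To make the chunk-based idea concrete, here is a minimal sketch of the same four steps (trim, split, filter, extract) using only built-in PHP string and regex functions. It is not the package's implementation: the `sketch_*` function names are hypothetical, the sample HTML is made up, and the exact behaviour of `trim()` (here assumed to keep everything from the first start marker up to and including the end marker) is an assumption. It needs PHP 7.4 or later for arrow functions.

```php
<?php
// Illustrative sketch only: plain PHP, NOT the actual PfPageparser code.
// All sketch_* names are hypothetical helpers invented for this example.

/** Keep the text from the first $start marker through the next $end marker (assumed semantics). */
function sketch_trim(string $html, string $start, string $end): string
{
    $from = stripos($html, $start);
    if ($from === false) {
        return $html; // start marker missing: leave input untouched
    }
    $to = stripos($html, $end, $from);
    if ($to === false) {
        return substr($html, $from); // end marker missing (partial HTML): keep the rest
    }
    return substr($html, $from, $to - $from + strlen($end));
}

/** Split on a literal delimiter and drop empty pieces. */
function sketch_split(string $html, string $delimiter): array
{
    $chunks = array_map('trim', explode($delimiter, $html));
    return array_values(array_filter($chunks, fn(string $c): bool => $c !== ''));
}

/** Keep only the chunks that contain the given text. */
function sketch_filter(array $chunks, string $needle): array
{
    return array_values(array_filter($chunks, fn(string $c): bool => strpos($c, $needle) !== false));
}

/** Return the first regex match found in each chunk. */
function sketch_parse(array $chunks, string $pattern): array
{
    $results = [];
    foreach ($chunks as $chunk) {
        if (preg_match($pattern, $chunk, $match) === 1) {
            $results[] = $match[0];
        }
    }
    return $results;
}

// Made-up sample page; the last row and the table are deliberately left unclosed.
$html = '<html><body><h1>Shop</h1><table>'
      . '<tr><th>Item</th></tr>'
      . '<tr><td data-id="product_id:1">Widget</td><td>Price: 9.99</td></tr>'
      . '<tr><td data-id="product_id:2">Gadget</td><td>Price: 24.50</td>';

$table  = sketch_trim($html, '<table', '</table>');
$rows   = sketch_split($table, '</tr>');
$items  = sketch_filter($rows, 'product_id');
$prices = sketch_parse($items, '|Price: [\d\.]*|');
// $prices is ['Price: 9.99', 'Price: 24.50']
```

Because every step is plain string matching, the unclosed row and table in the sample do not matter, which is the reason for avoiding a DOM parser in the first place.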

Testing

composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security

If you discover any security-related issues, please email [email protected] instead of using the issue tracker.

Credits

Peter Forret

License

The MIT License (MIT). Please see License File for more information.

PHP Package Boilerplate

This package was generated using the PHP Package Boilerplate.

