A service with an endpoint that extracts data from a URL using CSS selectors

netsi1964/extract-content

extract-content

You can see an example of its use in this pen on CodePen.io.

I have also made a test page where you can try it out: Extract content from any website.

You can read a series of posts about the project on Medium.com: Let’s build a content extract endpoint.

Endpoints

/

Returns a JSON object containing the text fetched for each of the specified selectors.

Example request:

from=https://www.dr.dk&extract=%7B%22overskrift%22%3A%22.dredition-summary%22%7D

The extract value is a JSON object encoded with encodeURIComponent. For example, with the selector p:

Original JSON value:

{"overskrift":"p"}

Same JSON value encoded:

%7B%22overskrift%22%3A%22p%22%7D

| Parameter | Description | Required |
| --- | --- | --- |
| from | The URL to fetch content from | Yes |
| extract | JSON object with name:selector pairs | Yes |
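
As a minimal sketch of the encoding step described above, the following builds the query string for the / endpoint. The helper name `buildQuery` is hypothetical and not part of the service. Only the extract value is passed through `encodeURIComponent`, mirroring the example request; whether the service also expects from to be encoded is not stated here.

```javascript
// Hypothetical helper: builds the query string for the "/" endpoint.
// `from` is left as-is and only `extract` is encoded, as in the example request.
function buildQuery(from, extract) {
  const json = JSON.stringify(extract);
  return `from=${from}&extract=${encodeURIComponent(json)}`;
}

const query = buildQuery("https://www.dr.dk", { overskrift: "p" });
console.log(query);
// from=https://www.dr.dk&extract=%7B%22overskrift%22%3A%22p%22%7D
```

Decoding the extract part with `decodeURIComponent` gives back the original JSON value, which is a quick way to check a hand-written request.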

/html

Returns a JSON object containing either the whole HTML of the page at the from URL (when extract is omitted) or, as with the / endpoint, one entry per selector, but with the matched HTML instead of the text.

| Parameter | Description | Required |
| --- | --- | --- |
| from | The URL to fetch content from | Yes |
| extract | JSON object with name:selector pairs | No |
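
A minimal sketch of how a client might build requests to /html with and without extract. The base URL `https://extract.example` is a placeholder, not the real address of the service, and the helper name `htmlEndpointUrl` is hypothetical. The shape of the JSON response is not documented here, so the sketch stops at building the URL.

```javascript
// Placeholder base URL (an assumption): replace with the address of the deployed service.
const BASE_URL = "https://extract.example";

// Hypothetical helper: builds a request URL for the "/html" endpoint.
// `extract` is optional here; when it is omitted, only `from` is sent.
function htmlEndpointUrl(from, extract) {
  let url = `${BASE_URL}/html?from=${from}`;
  if (extract !== undefined) {
    url += `&extract=${encodeURIComponent(JSON.stringify(extract))}`;
  }
  return url;
}

// Request for the whole page HTML.
console.log(htmlEndpointUrl("https://www.dr.dk"));

// Request for the HTML of the elements matching each selector.
console.log(htmlEndpointUrl("https://www.dr.dk", { overskrift: ".dredition-summary" }));
```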

Created by Sten Hougaard, March 2018. @netsi1964.
