Skip to content

Low output latency streaming HTML parser/rewriter with CSS selector-based API. Written in Rust, with bindings to C and other languages.

License

Notifications You must be signed in to change notification settings

novusnota-forks/lol-html

 
 

Repository files navigation

LOL HTML

The logo is generated from https://openmoji.org/data/color/svg/1F602.svg by Emily Jäger which is licensed under CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/)

Low Output Latency streaming HTML rewriter/parser with CSS-selector based API.

It is designed to modify HTML on the fly with minimal buffering. It can quickly handle very large documents, and operate in environments with limited memory resources. More details can be found in the blog post.

The crate serves as a back-end for the HTML rewriting functionality of Cloudflare Workers, but can be used as a standalone library with a convenient API for a wide variety of HTML rewriting/analysis tasks.

Documentation

https://docs.rs/lol_html/

Bindings for other programming languages

  • C
  • Lua
  • Go (unofficial, not coming from Cloudflare)
  • Ruby (unofficial, not coming from Cloudflare)

Example

Rewrite insecure hyperlinks:

use lol_html::{element, HtmlRewriter, Settings};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut output = vec![];

    let mut rewriter = HtmlRewriter::new(
        Settings {
            element_content_handlers: vec![
                element!("a[href]", |el| {
                    let href = el
                        .get_attribute("href")
                        .expect("href was required")
                        .replace("http:", "https:");

                    el.set_attribute("href", &href)?;

                    Ok(())
                })
            ],
            ..Settings::new()
        },
        |c: &[u8]| output.extend_from_slice(c)
    );

    rewriter.write(b"<div><a href=")?;
    rewriter.write(b"http://example.com>")?;
    rewriter.write(b"</a></div>")?;
    rewriter.end()?;

    assert_eq!(
        String::from_utf8(output)?,
        r#"<div><a href="https://example.com"></a></div>"#
    );
    Ok(())
}

License

BSD licensed. See the LICENSE file for details.

About

Low output latency streaming HTML parser/rewriter with CSS selector-based API. Written in Rust, with bindings to C and other languages.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Rust 86.6%
  • C 13.2%
  • Shell 0.2%