A golang package that "resolves" a given URL by issuing a GET request, following any redirects, canonicalizing the final URL, and attempting to extract the title from the final response body.
A URL is resolved by issuing a GET request and following any redirects until a non-30x response is received.
The final URL is aggressively canonicalized using a combination of PuerkitoBio/purell and some manual heuristics for removing unnecessary query params (e.g. utm_* tracking params) and normalizing case (e.g. twitter.com/Thresholderbot and twitter.com/thresholderbot are the same).
Canonicalization is optimized for URLs that are shared on social media.
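The two heuristics named above (dropping utm_* params and normalizing case for Twitter paths) might look roughly like the sketch below. It is a simplified stand-in that uses only net/url and does not call purell; the function name canonicalize and the exact rules are assumptions for illustration, and the package's real rules are likely more extensive.

```go
package main

import (
	"net/url"
	"strings"
)

// canonicalize (hypothetical helper) applies two illustrative heuristics:
// it removes utm_* tracking params and lowercases Twitter paths.
func canonicalize(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}

	// Hostnames are case-insensitive, so lowercase them unconditionally.
	u.Host = strings.ToLower(u.Host)

	// Drop any query param whose name starts with "utm_".
	q := u.Query()
	for key := range q {
		if strings.HasPrefix(strings.ToLower(key), "utm_") {
			q.Del(key)
		}
	}
	// Encode also sorts the remaining params by key, which makes the
	// output stable regardless of the original param order.
	u.RawQuery = q.Encode()

	// Assumption: only twitter.com paths are treated as case-insensitive.
	if u.Host == "twitter.com" {
		u.Path = strings.ToLower(u.Path)
		u.RawPath = ""
	}
	return u.String(), nil
}
```

With this sketch, https://Twitter.com/Thresholderbot?utm_source=feed&id=1 and https://twitter.com/thresholderbot?id=1 canonicalize to the same string.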
TL;DR: Use safedialer.Control in the transport's dialer to block attempts to resolve URLs pointing at internal, private IP addresses.
Exposing functionality like this on the internet can be dangerous, because it could allow a malicious client to discover information about your internal network by asking the service to resolve URLs whose hostnames resolve to private IP addresses.
The dangers, along with a golang-specific mitigation, are outlined in Andrew Ayer's excellent "Preventing Server Side Request Forgery in Golang" blog post.
To mitigate that danger, users are strongly encouraged to use safedialer.Control as the Control function in the dialer used by the transport given to urlresolver.New.
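To show where a Control function plugs in, here is a minimal standard-library sketch of the technique. The function blockPrivateControl is a hypothetical stand-in for safedialer.Control, written only to illustrate the idea; it is not that function's actual logic, and newSafeTransport is likewise an assumed helper name. It requires Go 1.17 or later for net.IP.IsPrivate.

```go
package main

import (
	"errors"
	"net"
	"net/http"
	"syscall"
	"time"
)

var errUnsafeAddress = errors.New("refusing to dial non-public address")

// blockPrivateControl (hypothetical) has the signature net.Dialer expects for
// its Control field. The dialer calls it after DNS resolution, so address is
// always a literal "ip:port" and cannot hide a private IP behind a hostname.
func blockPrivateControl(network, address string, _ syscall.RawConn) error {
	host, _, err := net.SplitHostPort(address)
	if err != nil {
		return err
	}
	ip := net.ParseIP(host)
	if ip == nil {
		// Not a plain IP literal (e.g. it carries an IPv6 zone): reject.
		return errUnsafeAddress
	}
	if ip.IsLoopback() || ip.IsPrivate() || ip.IsUnspecified() ||
		ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() {
		return errUnsafeAddress
	}
	return nil
}

// newSafeTransport (hypothetical) builds a transport whose dialer runs
// blockPrivateControl before every outbound connection.
func newSafeTransport() *http.Transport {
	dialer := &net.Dialer{
		Timeout: 5 * time.Second,
		Control: blockPrivateControl,
	}
	return &http.Transport{DialContext: dialer.DialContext}
}
```

In real use, safedialer.Control would take the place of blockPrivateControl, and the resulting transport is what would be handed to urlresolver.New. Checking at the dialer level rather than before the request matters because it also covers every redirect hop and guards against DNS answers that change between a check and the actual connection.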
See github.com/mccutchen/urlresolverapi for a productionized example, deployed at https://urlresolver.com.