All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Adds support for parsing the body of the web page (`p`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `blockquote`)
- Adds Ruby 2.7 to Travis CI
- Updates the gem description
- Removes the "no framework detected" warning when Rails could not be loaded
- Removes the useless Rails require directive and `railtie.rb`
- Adds `download_size_limit` configuration to raise `LinkThumbnailer::DownloadSizeLimit` when the body of the request is too big. Defaults to `10 * 1024 * 1024` bytes.
- Adds `favicon_size` configuration to choose which favicon size the gem should prefer. Defaults to the first favicon found otherwise.
- Fixes string encoding in previous versions of Ruby
- Fixes favicon by providing the full path.
- When HTML charset cannot be found in the HTML header, we now try to find it in the body.
- Closes the HTTP connection upon completion
- 401 HTTP errors now raise `LinkThumbnailer::HTTPError`
- Upgrades ImageInfo gem
- Frozen strings #125
- Gem upgrade (json)
- Allows configuring overridden HTTP headers:
LinkThumbnailer.configure do |config|
config.http_override_headers = { 'Accept-Encoding' => 'none', ... }
end
- Fixes #88
- Override User-Agent header properly
- Match xpath nodes if attribute content is present
- Avoid nil urls in image parser
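The `download_size_limit` entry above describes rejecting bodies that are too large. A minimal sketch of that idea, assuming the body arrives in chunks (as it does when `Net::HTTPResponse#read_body` is given a block). The error class and helper name here are hypothetical stand-ins, not the gem's own code:

```ruby
# Hypothetical error class, standing in for LinkThumbnailer::DownloadSizeLimit.
class DownloadSizeLimitExceeded < StandardError; end

# Accumulates body chunks and raises as soon as the running total
# exceeds `limit` bytes, so an oversized body is never fully buffered.
def read_with_limit(chunks, limit: 10 * 1024 * 1024)
  body = +""
  chunks.each do |chunk|
    body << chunk
    raise DownloadSizeLimitExceeded, "body exceeds #{limit} bytes" if body.bytesize > limit
  end
  body
end

# A small body passes through unchanged.
puts read_with_limit(["<html>", "</html>"], limit: 64)

# A body larger than the limit is rejected part-way through.
begin
  read_with_limit(["a" * 40, "b" * 40], limit: 64)
rescue DownloadSizeLimitExceeded => e
  puts "rejected: #{e.message}"
end
```

Checking the size while streaming, rather than after the download completes, is what keeps memory use bounded.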
Makes scrapers configurable by letting you set the scraping strategy:
LinkThumbnailer.configure do |config|
config.scrapers = [:opengraph, :default]
end
`opengraph` uses the Open Graph Protocol.
`default` uses a homemade algorithm.
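One way to picture an ordered scraping strategy is as a fallback chain. The sketch below assumes scrapers are tried in the configured order and the first one that yields a value wins; the `SCRAPERS` table, the page hash layout, and `scrape_title` are hypothetical and only illustrate the idea:

```ruby
# Hypothetical scrapers: each takes a parsed-page hash and returns a title or nil.
SCRAPERS = {
  opengraph: ->(page) { page.dig(:meta, "og:title") },
  default:   ->(page) { page[:title_tag] }
}.freeze

# Tries each configured scraper in order and returns the first non-empty result.
def scrape_title(page, strategy: [:opengraph, :default])
  strategy.each do |name|
    value = SCRAPERS.fetch(name).call(page)
    return value unless value.nil? || value.empty?
  end
  nil
end

with_og    = { meta: { "og:title" => "OG title" }, title_tag: "Plain title" }
without_og = { meta: {}, title_tag: "Plain title" }

puts scrape_title(with_og)
puts scrape_title(without_og)
puts scrape_title(with_og, strategy: [:default])
```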
Allows customizing the ideal description length
Pass the `:ideal_description_length` option to the `Graders::Length` initializer to customize the ideal description length of a website. In the Rails initializer:
LinkThumbnailer.configure do |config|
config.graders = [
->(description) { ::LinkThumbnailer::Graders::Length.new(description, ideal_description_length: 500) },
->(description) { ::LinkThumbnailer::Graders::HtmlAttribute.new(description, :class) },
->(description) { ::LinkThumbnailer::Graders::HtmlAttribute.new(description, :id) },
->(description) { ::LinkThumbnailer::Graders::Position.new(description, weigth: 3) },
->(description) { ::LinkThumbnailer::Graders::LinkDensity.new(description) },
]
end
Defaults to `120` characters. More information about how the gem finds the best description can be found at
http://www.codeids.com/2015/06/27/how-to-find-best-description-of-a-website-using-linkthumbnailer/
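To illustrate what an "ideal length" can mean for scoring, here is a minimal sketch. The linear falloff is an assumption made for illustration and is not the formula used by `Graders::Length`; `length_score` is a hypothetical helper:

```ruby
# Hypothetical length score in 0.0..1.0: 1.0 at the ideal length,
# falling off linearly as the text gets shorter or longer.
def length_score(text, ideal_description_length: 120)
  distance = (text.length - ideal_description_length).abs
  [1.0 - distance.to_f / ideal_description_length, 0.0].max
end

candidates = ["Too short.", "x" * 120, "y" * 500]

# With the default ideal length, the 120-character candidate scores highest.
puts candidates.max_by { |c| length_score(c) }.length

# Raising the ideal length shifts the preference to the longer candidate.
puts candidates.max_by { |c| length_score(c, ideal_description_length: 500) }.length
```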
- Fixes #69
- Upgrade `video_info` gem
- Fix an issue when image sizes could not be retrieved.
- Graders now accept an optional parameter to customize the weight of the grader in the probability computation.
LinkThumbnailer::Graders::Position.new(description, weigth: 3)
This gives the `Position` grader three times the weight of the other graders.
By default all graders have a weight of `1`, except the position grader above, since position should play a bigger role in finding good description candidates.
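The effect of a weight can be sketched as follows, assuming each grader returns a probability between 0 and 1 and the results are combined as a weighted product. This combination rule, the `Grade` struct, and `combined_score` are assumptions for illustration, not a statement of how the gem computes its score:

```ruby
# Hypothetical grader result: a probability in 0.0..1.0 plus a weight.
Grade = Struct.new(:probability, :weight)

# Combines grades as a weighted product: each probability is raised to its
# weight, so a weight of 3 counts that grader three times.
def combined_score(grades)
  grades.reduce(1.0) { |score, g| score * g.probability**g.weight }
end

position = 0.5 # a mediocre position score
length   = 0.8

puts combined_score([Grade.new(position, 1), Grade.new(length, 1)])
puts combined_score([Grade.new(position, 3), Grade.new(length, 1)])
```

Under this rule, a heavier weight makes a mediocre score from that grader pull the overall result down more strongly.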
- Fix an issue when dealing with absolute urls. #68
- Fix an issue with HTTP redirection and the Location header not being present. #70
- Rescue and raise custom LinkThumbnailer exceptions. #71
- Replace the FastImage gem dependency with ImageInfo to improve performance when fetching size information for multiple images. Benchmarks show an order-of-magnitude improvement in response time.
- Fixes #57
- Remove useless dependencies
- Improved description sorting.
- Refactored how graders work. More information here
- Fix: remove useless dependency
- Introduce new `raise_on_invalid_format` option (`false` by default) to raise `LinkThumbnailer::FormatNotSupported` if the HTTP `Content-Type` is invalid. Fixes #61 and #64.
- Fix OpenURI::HTTPError exception raised when video_info gem is not able to parse video metadata. Fixes #60.
- Implement `Set-Cookie` header between HTTP redirections to set cookies when a site requires it. Fixes #55.
- Handles seamlessly `og:image` and `og:image:url`
- Handles seamlessly `og:video` and `og:video:url`
- Handles `og:video:width` and `og:video:height` for one video only (please create a ticket if you want support for multiple videos/images width & height)
- Fix calling `as_json` on `website` to return the `as_json` representation of videos and images, not just their urls
- Gem updates
- Handle connection through proxy automatically using the `ENV['HTTP_PROXY']` variable, thanks to taganaka.
- Fix an issue with vimeo opengraph urls. Fixes #46
- Fix an issue with the link density grader caused by links with image instead of text. Fixes #45
- Add requested favicon scraper #40
Add `:favicon` to `config.attributes` in the LinkThumbnailer initializer:
config.attributes = [:title, :images, :description, :videos, :favicon]
Then
o = LinkThumbnailer.generate('https://github.com')
o.favicon
=> "https://github.com/fluidicon.png"
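The favicon returned above is an absolute URL. A minimal sketch of how a favicon `href` found in a page can be resolved against the page URL using Ruby's standard `URI.join`; the helper name `absolute_favicon_url` is hypothetical and this is not the gem's own code:

```ruby
require "uri"

# Resolves a favicon href found in the page against the page URL.
# Relative hrefs are made absolute; absolute hrefs are returned unchanged.
def absolute_favicon_url(page_url, href)
  URI.join(page_url, href).to_s
end

puts absolute_favicon_url("https://github.com/some/page", "/fluidicon.png")
puts absolute_favicon_url("https://github.com", "https://example.com/icon.ico")
```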
- Fixes #41
- Fixes #41
- Fixes issue when computing link density ratio
- Add support for `og:video`
- Add support for multiple `og:video` as well
For example, LinkThumbnailer will return the following JSON:
{
id: 'x7lni3',
src: 'http://www.dailymotion.com/video/x7lni3',
size: [640, 360],
duration: 136,
provider: 'Dailymotion',
embed_code: '<iframe src="//www.dailymotion.com/embed/video/x7lni3" frameborder="0" allowfullscreen="allowfullscreen"></iframe>'
}
Add `:videos` to the `attributes` config in your `config/initializers/link_thumbnailer.rb` in order to start scraping videos. For example:
config.attributes = [:title, :images, :description, :videos]
- Increased `og:image` scraping performance by parsing the `og:image:width` and `og:image:height` attributes if specified
- Introduced `image_stats` option to allow disabling image size and type parsing, which can cause performance issues. When disabled, size will be `[0, 0]` and type will be `nil`
- Fixes #39
- Fixes #37
- Fixes a couple of issues with the `URI` class namespace
- Fixes issue with image parser (fastimage) when given a URI instance instead of a string
- Fully refactored LinkThumbnailer
- Introduced Graders
- Introduced Scrapers
- Ability to score descriptions
- Ability to fetch multiple `og:image`
- Fixed memoized run-time options
- Fixed some website urls not working
- Refactor ugly code
- More specs
- Removed `PreviewsController` since it does not add much value. Simply create your own and use the `to_json` method.
To update from `1.x.x` to `2.x.x` you need to run `rails g link_thumbnailer:install` to get the new configuration file.
If you used the `PreviewsController` feature, you need to build it yourself since it is not supported anymore.
- Fixes issue with FastImage URLs #31
- Fixes route helper not working under Rails 4.
- Replace RMagick by FastImage
- Rename `rmagick_attributes` config into `image_attributes`
- Fixes issue when Location header used a relative path instead of an absolute path
- Update gemfile to be more flexible when using Hashie gem
- Thanks to juriglx, support for canonical urls
- Bug fixes
- Fixes an issue with the preview controller
- Fixes an issue where setting the `strict` option always returned the OG representation.
- Thanks to phlegx, support for timeout http connection through configurations.
- Fixes issue #7: a nil image was returned when an exception was raised. Nil images are now skipped in results.
- Thanks to phlegx, support for SSL and User Agent customization through configurations.
- Fixes issue #5: URL was incorrect in case of HTTP redirections.
- Bug fix when doing `rails g link_thumbnailer:install` by explicitly specifying the scope of Rails
- User can now set options at runtime by passing valid options to the `generate` method
- Refactor LinkThumbnailer#generate method to have a cleaner code
- Update readme
- Add PreviewController for easy integration with user's app
- Add link_thumbnailer routes for easy integration with user's app
- Refactor some code
- Change 'to_a' method to 'to_hash' in object model
- Update readme
- Add `to_a` to WebImage class
- Add corresponding specs
- Refactor `to_json` for WebImage class
- Bug fix
- Remove `require 'rails'` from spec_helper.rb
- Remove Rails dependencies (`blank?` method) in code
- Spec fix
- Add specs for almost all classes
- Add a `to_json` method for WebImage class to be able to get a usable array of images' attributes
- Add specs for LinkThumbnailer class
- Refactor config system, now using dedicated configuration class
- Added RSpec
- Now checking if attribute is blank for LinkThumbnailer::Object.valid? method
- First release 🎆