Use GNU style flags
mahesh-hegde committed Aug 1, 2023
1 parent 64be733 commit db05789
Showing 5 changed files with 37 additions and 82 deletions.
79 changes: 14 additions & 65 deletions README.md
@@ -1,4 +1,4 @@
## rrip
# rrip - Bulk-download images from subreddits

Program to bulk-download image from reddit subreddits.

@@ -12,7 +12,7 @@ Program to bulk-download image from reddit subreddits.

* Log final download URLs to a file using a custom format string.

* Download images from Reddit preview links instead of source. (Experimental)
* Download images from Reddit preview links instead of source, saving some space.

* Scrape images from links that don't end with media extensions. (Experimental)

@@ -41,69 +41,19 @@ Download from Release section and unpack the binary executable somewhere in your
I wrote this on Linux. May not work well on Windows. A best-effort default option is enabled to sanitize filenames so that they can be saved on Windows / Android. But don't blame me if you face some quirks of Windows OS.

## Usage
```
Usage: rrip <options> <r/subreddit>
-after string
Get posts after the given ID
-allow-special-chars
Allow all characters in filenames except / and \, And windows-special filenames like NUL
-d DryRun i.e just print urls and names (devel)
-download-preview
download reddit preview image instead of posted URL
-entries-limit int
Number of entries to fetch in one API request (devel) (default 100)
-flair-contains string
Download if flair contains substring matching given regex
-flair-not-contains string
Download if flair does not contain substring matching given regex
-folder string
Target folder name
-help
Show this help message
-link-contains string
Download if posted link contains substring matching given regex
-link-not-contains string
Download if posted link does not contain substring matching given regex
-log-links string
Log media links to given file
-log-links-format string
Format of links logged. allowed placeholders: {{final_url}}, {{posted_url}}, {{id}}, {{author}}, {{title}}, {{score}} (default "{{final_url}}")
-max-files int
Max number of files to download (+ve), -1 for no limit (default -1)
-max-size int
Max size of media file in KB, -1 for no limit (default -1)
-max-storage int
Data usage limit in MB, -1 for no limit (default -1)
-min-score int
Minimum score of the post to download
-og-type string
Look Up for a media link in page's og:property if link itself is not image/video (experimental). supported values: video, image, any
-preview-res int
Width of preview to download, eg: 640, 960, 1080 (default -1)
-search string
Search for given term
-sort string
Sort: best|hot|new|rising|top-<all|year|month|week|day>
-title-contains string
Download if title contains substring matching given regex
-title-not-contains string
Download if title does not contain substring matching given regex
-useragent string
UserAgent string (default "rrip / Go CLI Tool")
-v Enable verbose output (devel)
```
Invoke `rrip` without arguments for up-to-date usage output.

## tl;dr

```sh
## Download only <200KB files from r/Wallpaper
rrip -max-size=200 r/Wallpaper
rrip --max-size=200 r/Wallpaper

## Download all time top from r/WildLifePhotography, without exceeding 20MB storage or 50 files
rrip -max-storage=20 -max-files=50 -sort=top-all r/WildlifePhotography
rrip --max-storage=20 --max-files=50 --sort=top-all r/WildlifePhotography

## Search "Neon" on r/AMOLEDBackgrounds and download top 20, sorted by top voted in past one year
rrip -search="Neon" -max-files=20 -sort=top-year r/AMOLEDBackgrounds
rrip --search="Neon" --max-files=20 --sort=top-year r/AMOLEDBackgrounds

## Download memes from r/LogicGateMemes, download reddit previews (640p)
## instead of original image, for space savings.
@@ -116,42 +66,41 @@ rrip -search="Neon" -max-files=20 -sort=top-year r/AMOLEDBackgrounds
## use -prefer-preview instead of -download-preview
## to download original URL if no preview could be found

rrip -download-preview -preview-res=640 -data-output-file=meme.txt -data-output-format="{{.final_url}} {{title}}" r/LogicGateMemes
rrip --download-preview --preview-res=640 --data-output-file=meme.txt --data-output-format="{{.final_url}} {{.title}}" r/LogicGateMemes

## Log all image links from r/ImaginaryLandscape
## without downloading files, using -d (dry run) option.
## (Reddit shows last 600 or so.., not really "all")
rrip -d -data-output-file=imaginary_landscapes.txt -data-output-format="{{score}} {{.final_url}} {{.quoted_title}} {{.author}}" r/ImaginaryLandscapes
rrip -d --data-output-file=imaginary_landscapes.txt --data-output-format="{{.score}} {{.final_url}} {{.quoted_title}} {{.author}}" r/ImaginaryLandscapes
```

### Using template options
Go `text/template` syntax can be used to do versatile filtering. It can also be used to do formatting of logged links.

```sh
## Inspect the JSON of post using `print-post-data`
./rrip -print-post-data -max-files=1 r/AMOLEDBackgrounds
## Inspect the JSON of post using --print-post-data
rrip --print-post-data --max-files=1 r/AMOLEDBackgrounds

## After inspecting the JSON, you can use the field values in `-template-filter` to filter based on any attribute.
## If the template evaluates to "false", "", or "0", the post will be skipped by rrip

## Example: only download gilded posts
./rrip -template-filter='{{gt .gilded 0.0}}' -max-files=20 -sort=top-y
rrip --template-filter='{{gt .gilded 0.0}}' --max-files=20 --sort=top-y
ear r/AMOLEDBackgrounds

## Example: only download posts by a given author, say u/temporary_08
./rrip -template-filter='{{eq .author "temporary_08"}}' -max-files=20 r/AMOLEDBackgrounds
rrip --template-filter='{{eq .author "temporary_08"}}' --max-files=20 r/AMOLEDBackgrounds

## Example: skip potentially unsafe content
./rrip -template-filter='{{not .over_18}}' -max-files=20 r/AMOLEDBackgrounds
rrip --template-filter='{{not .over_18}}' --max-files=20 r/AMOLEDBackgrounds

## Example: Log links to a file with author, upvote ratio, and quoted title.
## Use dry run (-d) to skip download
./rrip -d -data-output-file=amoled.txt -data-output-format='{{.upvote_ratio}} {{.author}} {{.quoted_title}}' r/AMOLEDBackgrounds
rrip -d --data-output-file=amoled.txt --data-output-format='{{.upvote_ratio}} {{.author}} {{.quoted_title}}' r/AMOLEDBackgrounds
```
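
The skip rule described above (a filter that renders to "false", "" or "0" drops the post) can be sketched with Go's standard `text/template` package. This is only an illustration under stated assumptions, not rrip's actual code: the `keepPost` helper, the whitespace trimming, and the sample post fields are hypothetical, and the post is modelled as a decoded JSON object, which is why numeric fields such as `gilded` are compared against `0.0`.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// keepPost (hypothetical helper) renders a filter template against one
// post's decoded JSON fields and reports whether the post should be kept.
// Output of "false", "0" or "" means "skip", as the README describes.
// Trimming surrounding whitespace is an assumption of this sketch.
func keepPost(filter string, post map[string]any) (bool, error) {
	tmpl, err := template.New("filter").Parse(filter)
	if err != nil {
		return false, err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, post); err != nil {
		return false, err
	}
	switch strings.TrimSpace(out.String()) {
	case "false", "0", "":
		return false, nil
	}
	return true, nil
}

func main() {
	// Sample data only; encoding/json decodes JSON numbers as float64.
	post := map[string]any{
		"author":  "temporary_08",
		"gilded":  0.0,
		"over_18": false,
	}
	filters := []string{
		`{{eq .author "temporary_08"}}`,
		`{{gt .gilded 0.0}}`,
		`{{not .over_18}}`,
	}
	for _, f := range filters {
		keep, err := keepPost(f, post)
		fmt.Printf("%-34s keep=%v err=%v\n", f, keep, err)
	}
}
```

Rendering the filter to a string and comparing the result keeps the filter language identical to the one used for `--data-output-format`, so the same field names work in both options.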

## Caveats
* Can't handle crossposts when downloading preview image.
* No support for downloading albums.
* Some options don't work together
* Many other caveats I don't remember.

2 changes: 2 additions & 0 deletions go.mod
@@ -3,3 +3,5 @@ module github.com/mahesh-hegde/rrip
go 1.18

require golang.org/x/net v0.0.0-20220425223048-2871e0cb64e4

require github.com/spf13/pflag v1.0.5
2 changes: 2 additions & 0 deletions go.sum
Original file line number Diff line number Diff line change
@@ -1,2 +1,4 @@
github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
golang.org/x/net v0.0.0-20220425223048-2871e0cb64e4 h1:HVyaeDAYux4pnY+D/SiwmLOR36ewZ4iGQIIrtnuCjFA=
golang.org/x/net v0.0.0-20220425223048-2871e0cb64e4/go.mod h1:CfG3xpIq0wQ8r1q4Su4UZFWDARRcnwPjda9FqA0JpMk=
35 changes: 19 additions & 16 deletions rrip.go
Original file line number Diff line number Diff line change
@@ -1,9 +1,8 @@
package main

import (
"crypto/tls" // For disabling http/2!
"crypto/tls"
"encoding/json"
"flag"
"fmt"
"html"
"io"
@@ -14,8 +13,12 @@ import (
"regexp"
"strings"
"text/template"

flag "github.com/spf13/pflag"
)

// For disabling http/2!

const (
UserAgent = "rrip / Go CLI Tool"
DefaultLimit = 100
@@ -536,20 +539,20 @@ func main() {
var dataOutputFormat, templateFilter string

// option parsing
flag.BoolVar(&options.Debug, "v", false, "Enable verbose output (devel)")
flag.BoolVar(&options.DryRun, "d", false, "DryRun i.e just print urls and names (devel)")
flag.BoolVarP(&options.Debug, "verbose", "v", false, "Enable verbose output (devel)")
flag.BoolVarP(&options.DryRun, "dry-run", "d", false, "DryRun i.e just print urls and names (devel)")
flag.BoolVar(&options.AllowSpecialChars, "allow-special-chars", false,
"Allow all characters in filenames except / and \\, "+
"And windows-special filenames like NUL")
flag.BoolVar(&options.PrintPostData, "print-post-data", false, "Print posts data as JSON. Implies dry run")
flag.BoolVarP(&options.PrintPostData, "print-post-data", "P", false, "Print posts data as JSON. Implies dry run")
flag.StringVar(&options.After, "after", "", "Get posts after the given ID")
flag.StringVar(&options.UserAgent, "useragent", UserAgent, "UserAgent string")
flag.StringVarP(&options.UserAgent, "useragent", "U", UserAgent, "UserAgent string")
flag.Int64Var(&options.MaxStorage, "max-storage", -1, "Data usage limit in MB, -1 for no limit")
flag.Int64Var(&options.MaxSize, "max-size", -1, "Max size of media file in KB, -1 for no limit")
flag.Int64VarP(&options.MaxSize, "max-size", "z", -1, "Max size of media file in KB, -1 for no limit")
flag.StringVar(&options.Folder, "folder", "", "Target folder name")

flag.StringVar(&dataOutputFileName, "data-output-file", "", "Log media links to given file")
flag.StringVar(&dataOutputFormat, "data-output-format", defaultDataOutputFormat, "Template for saving post data")
flag.StringVarP(&dataOutputFileName, "data-output-file", "O", "", "Log media links to given file")
flag.StringVarP(&dataOutputFormat, "data-output-format", "f", defaultDataOutputFormat, "Template for saving post data")
flag.StringVar(&templateFilter, "template-filter", "", "Posts will be ignored if this template evaluates to \"false\", \"0\" or empty string")

flag.StringVar(&options.OgType, "og-type", "", "Look Up for a media link in page's og:property"+
@@ -604,9 +607,9 @@ func main() {

// validate some arguments
toCheck := map[string]int64{
"-max": int64(options.MaxFiles),
"-max-storage": options.MaxStorage,
"-max-size": options.MaxSize,
"--max": int64(options.MaxFiles),
"--max-storage": options.MaxStorage,
"--max-size": options.MaxSize,
}
for option, value := range toCheck {
if value < 1 && value != -1 {
@@ -622,17 +625,17 @@ func main() {

if options.PreviewRes > 0 && !options.DownloadPreview &&
!options.PreferPreview {
fatal("-download-preview or -prefer-preview should be used with " +
"-preview-res")
fatal("--download-preview or --prefer-preview should be used with " +
"--preview-res")
}

if options.PreferPreview && options.DownloadPreview {
fatal("Use only one of -prefer-preview and -download-preview")
fatal("Use only one of --prefer-preview and --download-preview")
}

og := options.OgType
if og != "" && og != "video" && og != "image" && og != "any" {
fatal("Only supported values for -og-type are image, video and any")
fatal("Only supported values for --og-type are image, video and any")
}

// if PrintPostData is enabled, enable dry run
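
The limit check in the rrip.go hunk above (`value < 1 && value != -1`) accepts either a positive number or the sentinel -1 for "no limit". A minimal standalone sketch of that rule follows. The `checkLimits` function, its error wording, and the option labels in the map are assumptions made for illustration; the commit itself reports the problem through its own `fatal` helper.

```go
package main

import (
	"fmt"
	"sort"
)

// checkLimits (hypothetical helper) returns an error for the first option
// whose value is neither positive nor the "no limit" sentinel -1.
// Names are sorted so the reported option is deterministic, since Go map
// iteration order is not.
func checkLimits(limits map[string]int64) error {
	names := make([]string, 0, len(limits))
	for name := range limits {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if v := limits[name]; v < 1 && v != -1 {
			return fmt.Errorf("%s must be positive or -1 (no limit), got %d", name, v)
		}
	}
	return nil
}

func main() {
	// The keys are labels used only in error messages in this sketch.
	ok := map[string]int64{"--max-files": 50, "--max-storage": 20, "--max-size": -1}
	bad := map[string]int64{"--max-files": 50, "--max-storage": 0, "--max-size": -1}
	fmt.Println(checkLimits(ok))
	fmt.Println(checkLimits(bad))
}
```

Returning an error instead of exiting keeps the rule easy to unit-test; a caller can still pass the message to a fatal-exit helper.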
1 change: 0 additions & 1 deletion template_util.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@ import (

func createTemplate(name string, tm string) *template.Template {
tmpl, err := template.New(name).Parse(tm)
log("create template")
check(err, "cannot parse template:", tm)
return tmpl
}