Replace ImageMagick usage with NetPBM and libjpeg-tools programs #61

Open: wants to merge 1 commit into base: main

4 changes: 2 additions & 2 deletions README.md
@@ -11,11 +11,11 @@ Antivirus products can mitigate the risks of malware, but they are imperfect. Th
| Input | Output | Uses | Purpose |
|--------|--------|---------------------------------------------|---------|
| doc(x) | pdf | [LibreOffice](https://www.libreoffice.org/) | Removes any embedded macros etc and turns .doc(x) to portable PDF which can be e.g. embedded in HTML. |
-| jpeg | jpeg | [ImageMagick](https://imagemagick.org/) | Strip away all metadata and extraneous bytes, keep only pixel-by-pixel color data. Conversion performed with intermediate [PPM](https://en.wikipedia.org/wiki/Netpbm) format. |
+| jpeg | jpeg | [Independent JPEG Group libjpeg](https://ijg.org/) | Strip away all metadata and extraneous bytes, keep only pixel-by-pixel color data. Conversion performed with intermediate [PPM](https://en.wikipedia.org/wiki/Netpbm) format. |
| pdf | pdf/a | [Ghostscript](https://www.ghostscript.com/) | Clean up a PDF with conversion to [PDF/A](https://en.wikipedia.org/wiki/PDF/A) for archival purposes. Beware the potentially large file sizes. |
| pdf | jpeg | [Ghostscript](https://www.ghostscript.com/) | Converts the first page to jpeg for thumbnails or previews. |
| pdf | text | [Ghostscript](https://www.ghostscript.com/) | Extract plain text from a PDF. Does **not** perform [OCR](https://en.wikipedia.org/wiki/Optical_character_recognition). |
-| png | png | [ImageMagick](https://imagemagick.org/) | Strip away all metadata and extraneous bytes, keep only pixel-by-pixel color data. Conversion performed with intermediate [PPM](https://en.wikipedia.org/wiki/Netpbm) format. |
+| png | png | [Netpbm](https://netpbm.sourceforge.net/) | Strip away all metadata and extraneous bytes, keep only pixel-by-pixel color data. Conversion performed with intermediate [PPM](https://en.wikipedia.org/wiki/Netpbm) format. |
| xls(x) | pdf | [LibreOffice](https://www.libreoffice.org/) | Removes any embedded macros etc and turns .xls(x) to portable PDF which can be e.g. embedded in HTML. |

The `laundry` HTTP server provides an REST API and online tool to try out the conversions and antivirus scans directly from the browser. Optional API-key-based authorization is available.
3 changes: 2 additions & 1 deletion docker-build/Dockerfile.laundry-programs
@@ -1,7 +1,8 @@
FROM ubuntu:18.04
RUN apt-get update && apt-get -y upgrade && apt-get -y install \
ghostscript \
-imagemagick \
+libjpeg-progs \
+netpbm \
libreoffice-writer \
libreoffice-calc \
&& rm -rf /var/lib/apt/lists/*
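Since this change swaps one package for two, a quick check that the expected programs are actually on `PATH` inside the image may be useful. The following is only a minimal sketch; the helper name `missing_programs` is hypothetical and not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): print the name of every
# listed program that cannot be found on PATH.
missing_programs() {
    for prog in "$@"; do
        command -v "$prog" >/dev/null 2>&1 || printf '%s\n' "$prog"
    done
}
```

Assuming the function is sourced inside the `laundry-programs` container, calling `missing_programs djpeg cjpeg pngtopnm pnmtopng` should print nothing when both new packages installed correctly.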
2 changes: 1 addition & 1 deletion programs/jpeg2jpeg
@@ -15,5 +15,5 @@ docker run \
-i \
--rm \
laundry-programs \
-/bin/bash -c 'convert jpg:- ppm:- |convert -quality 97 ppm:- jpg:-' \
+/bin/bash -c 'djpeg | cjpeg -quality 97' \
< "$INPUT" > "$OUTPUT"
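One rough way to spot-check that the `djpeg | cjpeg` round trip dropped Exif metadata is to search the output for the ASCII identifier `Exif`. This is a heuristic sketch only: the function name `has_exif_tag` is hypothetical, the check does not parse JPEG segments, and compressed image data could contain the same bytes by chance.

```shell
#!/bin/sh
# Hypothetical check (not part of this PR): succeed if the file contains
# the ASCII identifier "Exif" anywhere in its bytes.
# Heuristic only: it does not parse JPEG segment structure.
has_exif_tag() {
    LC_ALL=C grep -a -q 'Exif' "$1"
}
```

For example, `has_exif_tag "$INPUT"` succeeding while `has_exif_tag "$OUTPUT"` fails would suggest the Exif segment was removed by the conversion.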
2 changes: 1 addition & 1 deletion programs/png2png
@@ -15,5 +15,5 @@ docker run \
-i \
--rm \
laundry-programs \
-/bin/bash -c 'convert png:- ppm:- |convert ppm:- png:-' \
+/bin/bash -c 'pngtopnm | pnmtopng' \
< "$INPUT" > "$OUTPUT"
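For PNG output, a similar rough check could look for the names of metadata-carrying chunk types. This is a sketch under assumptions: the function name `png_metadata_chunks` is hypothetical, the chunk list (`tEXt`, `zTXt`, `iTXt`, `eXIf`, `tIME`) is one possible selection, and the search does not parse the chunk structure, so false positives are possible.

```shell
#!/bin/sh
# Hypothetical check (not part of this PR): print each metadata-carrying
# PNG chunk type whose four-letter name occurs somewhere in the file.
# Heuristic only: compressed image data could match by chance.
png_metadata_chunks() {
    for chunk in tEXt zTXt iTXt eXIf tIME; do
        if LC_ALL=C grep -a -q "$chunk" "$1"; then
            printf '%s\n' "$chunk"
        fi
    done
}
```

Running `png_metadata_chunks "$OUTPUT"` after the conversion and getting no output would be consistent with the metadata having been stripped.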