Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary data. #10

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Binary data. #10

wants to merge 2 commits into from

Conversation

kousu
Copy link
Contributor

@kousu kousu commented Mar 31, 2019

The QR spec is supposed to support binary data, but there are this does not currently work. Not even zxing-cpp can handle this:
$ zxing --more --verbose tests/fixtures/amen-02.png
Hybrid binarizer failed: zxing::ReaderException: No code detected
Global binarizer failed: zxing::ReaderException: No code detected

so before this can be committed, zxing-cpp needs to be repaired,
or we need to do #7 first.

The test file is from https://sampleswap.org//samples-ghost/DRUM%20LOOPS%20and%20BREAKS/161%20to%20180%20bpm/128[kb]161_amenvar3.aif.mp3
and it should be public domain, plus it is an extremely short sample
in any case which should make it fall under fair use no matter what.

This does not currently work. Not even [zxing-cpp](https://github.com/glassechidna/zxing-cpp/)
can handle this:
$ zxing --more --verbose tests/fixtures/amen-02.png
Hybrid binarizer failed: zxing::ReaderException: No code detected
Global binarizer failed: zxing::ReaderException: No code detected

so before this can be committed, zxing-cpp needs to be repaired,
or we need to do #7 first.

The test file is from https://sampleswap.org//samples-ghost/DRUM%20LOOPS%20and%20BREAKS/161%20to%20180%20bpm/128[kb]161_amenvar3.aif.mp3
and it should be public domain, plus it is an extremely short sample
in any case which should make it fall under fair use no matter what.
@kousu
Copy link
Contributor Author

kousu commented Apr 3, 2019

This depends on glassechidna/zxing-cpp#80 getting merged.

Then, zxinglight needs to not use ::c_str() to get the data out. It needs to use ::size() and ::data().

@kousu kousu changed the title [WIP] Include a test for binary data. Binary data. Apr 3, 2019
I **changed the API** for this: now it outputs bytes instead of strs.
This probably needs to be thought through further, but, on my system
with glassechidna/zxing-cpp#80 applied underneath,
it makes my new test with the embedded null pass.
@kousu
Copy link
Contributor Author

kousu commented Apr 3, 2019

I've written a patch but it should be considered a first draft. I'd like some feedback on the shape of the API (re: #4 (comment)): should it have two output variables side by side, one for text and one for binary, like nu-book's and jsQR? Should it only be one?

🥳EDIT🥳: wtf? how did that test pass? @glassechidna hasn't merged my PR yet. 🤔

I think it would be nicest and most pythonic if it could only be one. The easiest flow, I think, is if you scan a textual code you get str and if you scan a binary code you get bytes, and you can check which it is just by looking at the type. This is complicated though: you can guess if binary is not text (eg. if it has nulls, or invalid UTF-8 sequences) but you can't be 100% sure (well, except for the null check, that's a dead giveaway), but you can't tell if binary is text for sure. We could tell the reader what type to expect, but that's unlikely to work.

The really tricky thing is that QR codes can mix character sets inline, so the decoder must pay attention as it runs and either canonicalize everything into a master character set (in practice, Unicode) or keep the chunks in separate pieces with their character sets marked. jsQR does both: it puts everything into JS strings (which are..UCS-2, I think?), transcoding them from UTF-8, but if that fails it leaves a blank in the .text output, but it records the source .chunks[]. Over in cozmo/jsQR#129 (comment) I suggest that we just declare as a community that UTF-8 is the only way to encode multilingual text (or any kind of text; it's identical with ISO8859-1 so assuming UTF-8 should be compatible with any ancient QR codes ). This solution would break the spec, but maybe this is a case where the spec needs to change. If we adopted that solution, zxinglight could return bytes up to python, try decoding it as UTF-8 and, if that fails, output the bytes instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant