Implement reading qvd files from Python IO #27

Open. msimmoteit-neozo wants to merge 1 commit into master.

msimmoteit-neozo

Hi,
I'm a big fan of this library. It has really helped me to work with QVD files. Thank you so much.

Currently this library can only read qvd files via a file name, but I have a use case where the data I want to read is not available to my Python interpreter under a file name. I would like to read qvd files either from in-memory bytes or from Python file objects:

  • io.TextIOBase
  • io.BufferedIOBase
  • io.RawIOBase
  • io.IOBase

As a starting point I implemented this myself.
With these changes, qvd files can still be read the current way:

from qvd import qvd_reader
qvd_reader.read("qvd/test_files/AAPL.qvd")

But also via

with open("qvd/test_files/AAPL.qvd", "rb") as fin:
    qvd_reader.read(fin)
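
For the in-memory case mentioned above, one way to accept a path, a bytes object, or a binary file object through a single entry point is to normalize all of them to raw bytes first. The following is only a minimal sketch of that idea, not the code in this PR: `load_raw_bytes` is a hypothetical helper name, and text-mode handling is left out here.

```python
import io
import os


def load_raw_bytes(source):
    """Hypothetical helper (not part of qvd-utils): return the raw bytes
    behind a path, an in-memory bytes object, or a file-like object."""
    # In-memory data: already bytes, just normalize the type.
    if isinstance(source, (bytes, bytearray)):
        return bytes(source)
    # File name: open in binary mode, as the existing path-based API would.
    if isinstance(source, (str, os.PathLike)):
        with open(source, "rb") as fin:
            return fin.read()
    # File-like object: anything exposing read(), e.g. open(..., "rb")
    # or io.BytesIO.
    if hasattr(source, "read"):
        return source.read()
    raise TypeError(f"Unsupported source type: {type(source).__name__}")


# In-memory bytes can also be wrapped in io.BytesIO to get a file object.
payload = b"\x00\xf6 example payload"
buffer = io.BytesIO(payload)
print(load_raw_bytes(buffer) == load_raw_bytes(payload))
```

With a helper like this, the path-based and file-object-based calls shown above would end up going through the same byte-level code path.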

I also added an error message for a common mistake one could make:

>>> with open("qvd/test_files/AAPL.qvd", "r") as fin:
...     qvd_reader.read(fin)
...
Traceback (most recent call last):
  File ".../qvd-utils/qvd/qvd_reader.py", line 18, in read_to_dict
    unpacked_data = file.read()
                    ^^^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 5816: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/.../qvd-utils/qvd/qvd_reader.py", line 7, in read
    data_dict = read_to_dict(file)
                ^^^^^^^^^^^^^^^^^^
  File "/.../qvd-utils/qvd/qvd_reader.py", line 20, in read_to_dict
    raise Exception("Supply a raw file access. Use mode \"rb\" instead of mode \"r\"")
Exception: Supply a raw file access. Use mode "rb" instead of mode "r"
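
The traceback above shows the check happening after `file.read()` has already failed. As an alternative sketch (an assumption on my side, not what this PR does), the text-mode case could be detected before reading by checking for `io.TextIOBase`, which both `open(..., "r")` and `io.StringIO` objects are instances of. `ensure_binary` is a hypothetical name.

```python
import io


def ensure_binary(file):
    """Hypothetical guard: reject text-mode streams before any read."""
    # open(path, "r") returns an io.TextIOWrapper, a subclass of io.TextIOBase;
    # open(path, "rb") returns a buffered binary reader, which is not.
    if isinstance(file, io.TextIOBase):
        raise TypeError(
            'Supply a raw file access. Use mode "rb" instead of mode "r"'
        )
    return file


# A binary stream passes through unchanged.
stream = ensure_binary(io.BytesIO(b"\xf6"))
print(stream.read())

# A text stream is rejected up front, without a UnicodeDecodeError first.
try:
    ensure_binary(io.StringIO("text"))
except TypeError as err:
    print(err)
```

This also catches text-mode files whose contents happen to decode without error. If the try/except approach is kept instead, `raise ... from None` would suppress the "During handling of the above exception" part of the output.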

There is room for improvement here, as there is code duplication between the functions read_qvd and read_qvd_from_buffer. I tried to unify them, but the BufRead of the Cursor over the Vec<u8> and the BufRead on the file behaved differently, so I left it as it is.

Maybe someone else can find a way to make it better?
