Torrents with additional "<blah>.utf-8" fields are unable to be parsed. #14

Open
wkpatrick opened this issue Aug 28, 2021 · 1 comment
Labels
bug Something isn't working postpone This issue will be solved in the future

Comments

@wkpatrick

This is due to :erlang.binary_to_existing_atom not being able to turn "name.utf-8" into an atom (the period is what causes it to trip up).

Stacktrace

** (exit) an exception was raised:
** (ArgumentError) errors were found at the given arguments:

  • 1st argument: invalid UTF8 encoding

    :erlang.binary_to_existing_atom("name.utf-8", :utf8)
    (bento 0.9.2) lib/bento/metainfo.ex:48: anonymous fn/1 in Bento.Metainfo.transform/1
    (elixir 1.12.0) lib/enum.ex:1553: Enum."-map/2-lists^map/1-0-"/2
    (elixir 1.12.0) lib/enum.ex:1553: Enum."-map/2-lists^map/1-0-"/2
    (bento 0.9.2) lib/bento/metainfo.ex:30: Bento.Metainfo.info/1
    (bento 0.9.2) lib/bento/metainfo.ex:41: Bento.Metainfo.info!/1
    (bento 0.9.2) lib/bento.ex:92: Bento.torrent!/1
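
For reference, the failing call can be probed in isolation. This is a minimal sketch, not bento code; `AtomProbe` is a hypothetical helper name. It suggests that the call raises for any binary that has no atom yet, so the period may not be the only factor.

```elixir
defmodule AtomProbe do
  # Hypothetical helper, not part of bento: reports whether the VM already
  # holds an atom for the given binary.
  def existing_atom(key) when is_binary(key) do
    {:ok, :erlang.binary_to_existing_atom(key, :utf8)}
  rescue
    ArgumentError -> :error
  end
end

# Returns :error unless something has already created the atom.
IO.inspect(AtomProbe.existing_atom("name.utf-8"))

# String.to_atom/1 accepts the same binary, period included ...
atom = String.to_atom("name.utf-8")

# ... after which the lookup succeeds.
IO.inspect(AtomProbe.existing_atom("name.utf-8") == {:ok, atom})
```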
    

"name.utf-8" appears to come from clients such as Vuze/Azureus, as a way of handling old torrents that did not write the name/paths in UTF-8 (see here).

I am not sure how to "properly" handle this, but for now I will just filter out any keys containing .utf-8 from the metainfo data before parsing. I'd be happy to merge those changes in, or any different change you would suggest.
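
A rough sketch of that filter, assuming the decoded metainfo is a plain nested structure of maps with binary keys and lists. `Utf8KeyFilter` is a hypothetical name, and matching on the `.utf-8` suffix (rather than anywhere in the key) is my own choice here:

```elixir
defmodule Utf8KeyFilter do
  # Hypothetical workaround module, not part of bento.
  @suffix ".utf-8"

  # Drop "<blah>.utf-8" keys from a map, then recurse into the remaining values.
  def strip(map) when is_map(map) do
    map
    |> Enum.reject(fn {key, _value} ->
      is_binary(key) and String.ends_with?(key, @suffix)
    end)
    |> Map.new(fn {key, value} -> {key, strip(value)} end)
  end

  # Lists (e.g. the "files" list) may contain further dictionaries.
  def strip(list) when is_list(list), do: Enum.map(list, &strip/1)

  # Integers and binaries pass through unchanged.
  def strip(other), do: other
end

decoded = %{
  "info" => %{
    "name" => "example",
    "name.utf-8" => "example",
    "files" => [%{"path" => ["a.txt"], "path.utf-8" => ["a.txt"]}]
  }
}

IO.inspect(Utf8KeyFilter.strip(decoded))
```

The filtered map would then be handed to the parser in place of the original.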

Thanks for the useful library!

@folz
Owner

folz commented Aug 28, 2021

Oh interesting, thanks for the report. I hadn't come across the nonstandard .utf-8 naming before. This seems reasonable to support in bento.

From the issue you linked, it's true that nonstandard key names aren't disallowed by the spec (they aren't mentioned). I guess that implies we should not be converting to atoms, because keys could be named anything - so we should represent key names as strings internally.
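
As a very rough illustration of string keys, here is a sketch that separates the info-dictionary keys named in BEP 3 from everything else without creating any atoms. `MetainfoKeys` and `split_info/1` are hypothetical names, not existing bento functions:

```elixir
defmodule MetainfoKeys do
  # Info-dictionary keys described in BEP 3; anything else is treated as an extra.
  @info_keys ["name", "piece length", "pieces", "length", "files"]

  # Returns {known, extras}; both maps keep their binary keys.
  def split_info(info) when is_map(info), do: Map.split(info, @info_keys)
end

{known, extras} =
  MetainfoKeys.split_info(%{
    "name" => "example",
    "name.utf-8" => "example",
    "piece length" => 16_384
  })

IO.inspect(known)
IO.inspect(extras)
```

Because no key is ever turned into an atom, arbitrary client-specific keys cannot raise here.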

I am not presently doing anything with this library, so I would appreciate a PR if you want to put one up. I think we should make sure that .torrent() continues to validate that the metainfo has a torrent-compliant shape, but now will allow other keys. Please make sure it also includes (a) test case(s) covering this issue, and the results of benchmarking before and after the change.
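
Something along these lines is what I have in mind for the shape check: a sketch only, with `ShapeCheck` as a hypothetical module name and the required keys taken from my reading of BEP 3 (I left "announce" out, which is an assumption). Extra keys such as "name.utf-8" are simply ignored:

```elixir
defmodule ShapeCheck do
  # Keys the info dictionary must always carry, per BEP 3.
  @required ["name", "piece length", "pieces"]

  def validate(%{"info" => info}) when is_map(info) do
    missing = Enum.reject(@required, &Map.has_key?(info, &1))
    has_length = Map.has_key?(info, "length")
    has_files = Map.has_key?(info, "files")

    cond do
      missing != [] -> {:error, {:missing_keys, missing}}
      # Exactly one of "length" (single file) or "files" (multi file) is expected.
      has_length == has_files -> {:error, :length_xor_files}
      true -> :ok
    end
  end

  def validate(_other), do: {:error, :no_info_dictionary}
end

info = %{
  "name" => "example",
  "name.utf-8" => "example",
  "piece length" => 16_384,
  "pieces" => <<0::160>>,
  "length" => 1
}

IO.inspect(ShapeCheck.validate(%{"info" => info}))
```

A test case for this issue could feed a metainfo map with an added "name.utf-8" key through the check and expect it to pass.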

@mogeko mogeko added bug Something isn't working postpone This issue will be solved in the future labels Feb 1, 2023