Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Part2 solutions (code only) #8

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

chopmo
Copy link

@chopmo chopmo commented Apr 29, 2014

No description provided.

@practicingruby
Copy link
Member

Hi @chopmo,

I think that we can discuss more if you'd like once I post my solution, but generally this looks great! A couple minor concerns:

  1. You used fixed size fields when packing things into messagepack format, but you don't have any validations. Since you know the max size of a fixstr, fixmap, etc., it'd probably make sense to at least raise a NotImplementedError for data that won't fit in that space. This kind of validation will guard against unexpected bugs, and serve as a useful reminder if/when you decided to expand the functionality to cover a more complete set of MessagePack formats.

  2. You use pack("C*") for packing fixstr types into Ruby Strings. This will result in strings with binary encoding (i.e. no encoding), but the MessagePack spec calls for UTF-8. We're discussing on A minimal implementation of part 2 : msgpack #6 how to fix that.

Great work all around, hope those comments help.

@chopmo
Copy link
Author

chopmo commented May 9, 2014

Thanks for your comments @sandal!

ad 1) Yes, leaving out the validations was too sloppy. These checks are obviously candidates for further unit testing, but I simply added the checks for now.
ad 2) I really like encoding stuff...when compared to timezone stuff ;) Added a force_encoding call which should fix this problem if I understand your discussion with @eregon correctly.

@eregon
Copy link

eregon commented May 9, 2014

The unpack check is definitely good, but I think the pack one might be improved.
As we have built-in transcoding with String#encode, the simplest would be to utf8 = str.encode("UTF-8") and add utf8.bytes to the stream.
If I interpret correctly, your code checks if the bytes of the String are valid UTF-8 bytes and adds the original bytes to the stream. But the original bytes could be valid UTF-8 bytes, yet their interpretation quite different (the check of valid UTF-8 bytes is not strong enough to guarantee the string is encoded in UTF-8).

@eregon
Copy link

eregon commented May 9, 2014

For instance:

> "駧".encode(Encoding::ISO8859_1).force_encoding(Encoding::UTF_8).valid_encoding?
=> true
> "駧".encode(Encoding::ISO8859_1).force_encoding(Encoding::UTF_8)
=> "駧"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants