Backport of Python 3's csv module #69

Closed
ryanhiebert opened this issue Nov 11, 2015 · 12 comments

@ryanhiebert
Collaborator

The working name for this module is xcsv. Currently, I plan to make it a pure Python implementation. It'll be slow, but it will be API-compatible with Python 3's csv module while also working on Python 2.

The work-in-progress implementation is at https://github.com/ryanhiebert/python-unicodecsv/blob/xcsv/unicodecsv/xcsv/py2.py. That's on the xcsv branch of my fork.

I copied the test suite from Python 3 and got it running on Python 2.6. Right now, most of the tests pass under Python 2.6, except for those involving the reader. For the reader, I plan to write a pure-Python port of the C code from the _csv module in Python 3.


So the question is, should this be a separate package, or start out as a subpackage of unicodecsv? Either way, we could do the separate package later if it was appropriate, and unicodecsv can certainly have a dependency on it if it proves useful.

I think that this could be a solution to #59, which breaks because some encodings produce null bytes. However, it would also be significantly slower, since it would be a pure-Python implementation. We'd probably want to watch for encodings that break the Python 2 version, and swap in the slower implementation only for those.
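
A minimal sketch of what "watching for encodings" could look like, assuming it is enough to check whether plain ASCII CSV text encodes to bytes containing NUL. The names `encoding_emits_null_bytes` and `choose_implementation` are hypothetical, not part of unicodecsv:

```python
import codecs


def encoding_emits_null_bytes(encoding, sample=u"a,b\r\n"):
    """Return True if encoding ASCII-range CSV text yields NUL bytes.

    Hypothetical helper: it only probes a short ASCII sample, so it is
    a heuristic rather than a proof about the whole encoding.
    """
    codecs.lookup(encoding)  # raises LookupError for unknown encodings
    return b"\x00" in sample.encode(encoding)


def choose_implementation(encoding):
    """Pick the slow pure-Python path only when the fast one would break."""
    if encoding_emits_null_bytes(encoding):
        return "pure-python"
    return "stdlib"
```

With this check, UTF-16 and UTF-32 would be routed to the slower implementation, while UTF-8 and Latin-1 would stay on the fast path.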

@ryanhiebert
Collaborator Author

I got the tests passing! I need to add a slightly better test harness, and I need to decide where this will live.

@jdunck : Do you think it would be good to make this part of unicodecsv, or to split it out into a different package? We may end up wishing to (optionally) depend on it so we can implement a fix for #59.

@jdunck
Owner

jdunck commented Nov 16, 2015

This is awesome work, Ryan. :) 👍

I think it makes sense as part of this package, but if you'd prefer to have it in another library, that's fine, too.

Perhaps the guidance on use could be "use unicodecsv if you primarily author py2. Use xcsv if you primarily use py3. If you use both, you probably want xcsv, but be aware that its API differs from py2's csv in the following ways..."

@ryanhiebert
Collaborator Author

Do you think that unicodecsv.xcsv would be the right place for it to live?

@jdunck
Owner

jdunck commented Dec 2, 2015

Is "x" a naming convention I'm not familiar with? What about csv6? Or csv_six?

@ryanhiebert
Collaborator Author

No convention that I can think of. I used x to mean cross, as in cross-compatible. The six names seem like a good tack, though neither of those options jumps out as great to me.

@jdunck
Owner

jdunck commented Dec 2, 2015

csixv? 😁

@ryanhiebert
Collaborator Author

@singingwolfboy suggested that it could be part of the backports namespace package (https://pypi.python.org/pypi/backports), published as backports.csv on PyPI. I kinda like that idea.

I've opened an issue on that repository asking for guidance on whether the package should prefer the stdlib implementation when it's available: https://bitbucket.org/brandon/backports/issues/4/defaulting-to-the-stdlib The README of backports suggests that it would be better not to prefer the stdlib, but preferring it would make testing easier.

What do you think, @jdunck?

@jdunck
Owner

jdunck commented Dec 3, 2015

I think including it in backports makes good sense. Using built-in if available is risky for modules that are likely to change, but good for performance and security reasons. I doubt csv will be changing much. My guess is to use the built-in in this case.
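
For illustration, "use the built-in if available" could be sketched roughly like this. The function name `load_csv_module` and its `prefer_stdlib` flag are hypothetical, and this is not necessarily how backports.csv ends up doing it:

```python
import sys


def load_csv_module(prefer_stdlib=True):
    """Return a csv module with Python 3 semantics.

    Hypothetical helper: on Python 3 the stdlib module already has the
    desired API, so it is preferred unless the caller opts out.
    """
    if prefer_stdlib and sys.version_info[0] >= 3:
        import csv
        return csv
    # On Python 2, or when the stdlib is not preferred, use the backport.
    from backports import csv
    return csv


csv_module = load_csv_module()
```

The trade-off is the one described above: the built-in is faster and maintained with the interpreter, while always using the backport gives identical behavior everywhere.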

@ryanhiebert
Collaborator Author

Thanks for the feedback. I think I'll follow that approach unless I find some reason not to.

@ryanhiebert
Collaborator Author

I've released backports.csv version 1.0. https://pypi.python.org/pypi/backports.csv
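
A small sketch of the Python 3 semantics this gives you: text goes in and text comes out, with no manual byte encoding. The sample rows are made up for illustration, and the import shown is the Python 3 stdlib one:

```python
# -*- coding: utf-8 -*-
import io
import csv  # on Python 2, this line would be: from backports import csv

# Made-up sample data containing non-ASCII text.
rows = [[u"name", u"city"], [u"Zoë", u"München"]]

# newline="" leaves line endings to the csv module, as its docs recommend.
buffer = io.StringIO(newline=u"")
csv.writer(buffer).writerows(rows)

# Read the same text back and compare it with what was written.
buffer.seek(0)
read_back = list(csv.reader(buffer))
```

The same pattern applies to real files by opening them with `io.open(path, newline='', encoding='utf-8')`.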

@jdunck
Owner

jdunck commented Feb 11, 2016

Sweet. I updated python-unicodecsv's project description on github to:

"
Python2's stdlib csv module is nice, but it doesn't support unicode. This
module is a drop-in replacement which does. If you prefer python 3's
semantics but need support in py2, you probably want
https://github.com/ryanhiebert/backports.csv
"

On Wed, Feb 10, 2016 at 9:30 PM, Ryan Hiebert wrote:

I've released backports.csv version 1.0.
https://pypi.python.org/pypi/backports.csv



@ryanhiebert
Collaborator Author

Very cool, thank you. 👍
