Conversation
Thanks a lot for the contribution. A couple of comments:
- Can you please create a quick test for this function? See the existing upload tests for examples.
- Lint CI is failing
Hmm, should we keep the
The existing tests are uploading a 5B file and verifying the upload matches the skylink for a 156KB pdf ... this does not seem correct ... EDIT: learning about
Whew! Confused myself with tests
I've made the interface change you requested, but I'm noting that this changes the interface from how chunked uploading used to work here. I like the way the Go library takes a map of filenames to content: do you think that would work here?
That would be excellent! Ideally the SDKs are consistent with each other, especially the APIs, but I am short on time. So these kinds of fixes are really appreciated!
Looking at this some, I'm noticing that chunked uploading (which wonderfully supports live streaming uploads) works for single-file uploads only with the requests library. Additionally, the Go SDK does not have a feature for chunked uploads. It seems like supporting generic upload, which the Go SDK has normalized, is a separate beast from supporting chunked upload. It might not be hard to upload an iterator as chunked if only one file is provided.
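Roughly, this is the pattern I have in mind; a minimal sketch only, where the portal URL, the chunk size, and the helper name are placeholders rather than the SDK's actual wiring:

```python
import requests

def read_in_chunks(file_object, chunk_size=8192):
    """Yield a file's contents piece by piece (illustrative helper)."""
    while True:
        chunk = file_object.read(chunk_size)
        if not chunk:
            break
        yield chunk

with open("example.bin", "rb") as f:
    # Passing a generator as `data` makes requests send the body with
    # chunked transfer encoding, so the upload streams without a known length.
    response = requests.post("https://siasky.net/skynet/skyfile",
                             data=read_in_chunks(f))
print(response.status_code)
```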
with additional support for chunked uploading.
@m-cat I've squashed, rebased, and simplified this down to the addition of a method comparable to the generic one used by the Go SDK. It takes a map of filenames to file-like objects or bytes data, and if there is only one file it can do chunked uploading via the requests library's feature, provided the file data is an iterator. Does this look good?
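For concreteness, here is a rough sketch of that dispatch; the function name, portal URL, and field name are assumptions for illustration, not the SDK's final API:

```python
import requests

PORTAL_UPLOAD_URL = "https://siasky.net/skynet/skyfile"  # assumed endpoint

def upload(upload_data, fieldname="file"):
    """upload_data maps filenames to bytes, file-like objects, or iterators."""
    if len(upload_data) == 1:
        data = next(iter(upload_data.values()))
        if not isinstance(data, (bytes, str)) and not hasattr(data, 'read'):
            # A lone iterator: stream it as the request body, which requests
            # sends with chunked transfer encoding.
            return requests.post(PORTAL_UPLOAD_URL, data=data)
    # Otherwise build an ordinary multipart upload from all entries.
    ftuples = [(fieldname, (filename, data))
               for filename, data in upload_data.items()]
    return requests.post(PORTAL_UPLOAD_URL, files=ftuples)
```

So a normal call would pass real content, e.g. `upload({"log.txt": open("log.txt", "rb")})`, while passing a generator as the single value streams it.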
Sorry for the delay in reviewing, we were super busy. Taking a look now.
def upload_file_request_with_chunks(self, path, custom_opts=None):
We can't delete any methods because it would break backwards compatibility. I would keep this and just have it call the new method
I can do this, but I think there are compatibility issues elsewhere too; I originally made this PR while porting my code to the newer interfaces. (EDIT: notably, upload_file_request_with_chunks has never functioned in the new API, ever since the path argument, which is actually a data iterator, was treated as a string and normalised.)
not isinstance(data, bytes) and
not isinstance(data, str) and
not hasattr(data, 'read')):
# an iterator for chunked uploading
This looks good but it should be documented so people know this functionality is there
Is the documentation I added to upload_file_request_with_chunks sufficient?
Looks good apart from a few points. Thanks for taking the time to make this conform to the Go SDK. BTW, have you tested uploads to Skynet with this code?
Just ran into an issue with the Go SDK that I think applies here too. If we upload a directory with a single file, it will be treated as a file upload instead of a directory upload. This can be unexpected for the user, and a directory should be uploaded even if it's only a single file. Edit: I'm fixing this in the Go SDK by checking if
@m-cat I believe I've added all the changes you requested. I'm testing this locally, and I discovered that Python's requests library doesn't stream file objects when they are passed as multipart data (it reads the whole content of each file in one go). If you pass a single file as the body of a POST request, it does stream it, reading it in a configurable chunk size defaulting to 8K. Would it be reasonable to change the behavior of python-skynet to pass all single files directly as body content? This would simplify the implementation and provide streaming for all single files. The tests would change to expect this.
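To illustrate the difference (the endpoint URL and filenames here are placeholders):

```python
import requests

with open("large_file.bin", "rb") as f:
    # Multipart upload: requests builds the entire multipart body in memory,
    # reading the whole file up front.
    requests.post("https://siasky.net/skynet/skyfile",
                  files={"file": ("large_file.bin", f)})

with open("large_file.bin", "rb") as f:
    # The same file passed directly as the body: requests streams it from
    # disk in chunks (8 KB by default), so memory use stays roughly constant.
    requests.post("https://siasky.net/skynet/skyfile", data=f)
```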
siaskynet/_upload.py
ftuples = []
for filename, data in upload_data.items():
    ftuples.append((fieldname,
                    (filename, data)))
if len(upload_data) == 1:
if opts['portal_file_fieldname'] == fieldname:
This isn't very elegant. I would change this check
I've changed it. The reason the code is awkward here is that I wanted to keep it similar to the Go code to ease new changes. Let me know if you have more preferences around it.
Looks great! Thanks for your patience! Approving this on its 1-month anniversary 🙂
Published version 2.1.0 which includes these changes 👍
Somehow, in the transitions, it looks like the chunked uploading feature broke.
This change fixes it up again.
Chunked uploading is done by passing an iterator as the body data; see https://requests.readthedocs.io/en/master/user/advanced/#chunk-encoded-requests
I believe you can also pass a file-like object to stream with known size for #30
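For example, here is one way to see that requests sets a Content-Length (rather than chunked transfer encoding) when the body is a regular file-like object; the URL and filename are placeholders:

```python
import requests

with open("video.mp4", "rb") as f:
    req = requests.Request("POST", "https://siasky.net/skynet/skyfile", data=f)
    prepared = req.prepare()
    # For a file-like object requests can determine the size up front, so it
    # sets Content-Length instead of Transfer-Encoding: chunked.
    print(prepared.headers.get("Content-Length"))
```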