-
-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
memoryview is a Sequence but does not implement the full Sequence API #125420
Comments
cc @rhettinger for ABCs and @AlexWaygood for prior discussion on typeshed. |
I think it would be nice to have such addition. If we were to add them in 3.14, I'd like to help! (at least with one method; I think it's better if we don't add them all in one go). |
Go ahead, I haven't started coding yet. |
IIRC, the sequence protocol on |
By the way: >>> reversed(memoryview(b'a'))
<reversed object at 0x7f6ece3fd8b0> works and we actually have tests for that. However, I think there is somewhere an implicit cast or a generic reversed that is being used via indices + length maybe? So I don't know whether we should have a dedicated reversed method or just dispatch to the generic implementation. |
Yes,
|
Well... the |
I took a quick glance at the PRs and they seem fine to me, but I would be very cautious about making sure that e.g. |
That's a good point (though I'm not sure there are untold amounts of code using |
... I forgot to finish the multi-dimensional case. And the fact that it's not only basic memoryviews that should be supported but the entire protocol... I'll need to check the multi-dimensional case carefully.
(I'll try passing on that achievement) |
It appears iteration on multidimensional memoryviews currently isn't implemented at all, so we don't need to support it for these new methods either. |
I've implemented the missing (naive) interface. The performances should be better than just the generic implementations since we have less checks but some methods can be improved by directly accessing the underlying buffer instead of creating iterators. If such performance gain is needed, we'll address those concerns in a separate issue. |
I can look closer at the PRs sometime today. I have a few general comments at a quick glance:
|
Performances should really be addressed separately. We try to avoid using the private API if this is not performance critical or if it is exposed "almost directly" to the user. Before using the private API, we need benchmarks but I think it's first better not to do it in those PRs. We will definitely gain performances by directly accessing the buffers but those should be investigated. For assertions, which place do you have in in mind? I can find some in the |
When writing C code, I like to put assertions everywhere and anywhere. They add no overhead at runtime (they get compiled out on release builds) and make debugging a lot easier. Generally, if you make an assumption, add an assertion. |
That's a dangerous statement; it would be good to run some microbenchmarks on operations like I am going to wait a bit before merging any of the PRs here in case other core devs want to voice their opinion on whether this is a good idea. |
I'm conflicted about this one. On the plus side, this makes the APIs more consistent and will simplify the typeshed stubs. On the minus side, it feels a bit off to implement (and have to test and maintain) an actual concrete implementation for functionality that no user has ever needed. It seems a little like the tail wagging the dog. In the past, we've usually applied a principle of parsimony and resisted API expansions until good use cases have arisen. That has helped avoid cruft accumulating for features than never get used. |
On the plus side, we could also benefit from possible improvements though that's just a gut feeling. For instance materializing the reverse iterator is now slower but iterating over it is faster if the buffer is large enough. This could also serve for the multidimensional case but I have no idea if this is a feature that we plan to have or not. The APIs were simple enough to implement so it didn't bother me btw. |
I feel that we should add I don't feel as strongly about the missing dunders, |
Yeah index and count seem important since they are public and documented. For contains and reversed I think we can just rely on the default dunders. I'm actually wondering but does "being a sequence" means "having a |
Bug report
Bug description:
The
memoryview
builtin is registered as aSequence
, but doesn't implement the full API, because it doesn't inherit fromSequence
at runtime and doesn't implement the mixin methods.Among the methods listed for Sequence in the documentation, memoryview is missing
__contains__
,__reversed__
,index
, andcount
. This is causing problems for typing; in the typeshed stubs we have to either lie one way by claiming that memoryview has methods it doesn't have, or lie another way by claiming it is not a Sequence whenissubclass()
says otherwise at runtime (python/typeshed#12800). To fix this, we should either make memoryview not a Sequence, or add the missing methods. The former has compatibility implications, so I propose to add the methods in 3.14.CPython versions tested on:
CPython main branch
Operating systems tested on:
No response
Linked PRs
Sequence.__contains__
API onmemoryview
objects #125441Sequence.count
API onmemoryview
objects #125443Sequence.index
API onmemoryview
objects #125446Sequence.__reversed__
API onmemoryview
objects #125505The text was updated successfully, but these errors were encountered: