Implement all Ops in PyTorch (help welcome!) #821

Open
34 of 48 tasks
ricardoV94 opened this issue Jun 14, 2024 · 29 comments · Fixed by #926
Labels
help wanted (Extra attention is needed), torch (PyTorch backend)

Comments

@ricardoV94
Member

ricardoV94 commented Jun 14, 2024

Description

If you want to help implement some of these Ops, just leave a comment below saying which ones you are interested in. We'll give you some time to work on them (and then put them back up for grabs).

See the documentation for How to implement PyTorch Ops and tests: https://pytensor.readthedocs.io/en/latest/extending/creating_a_numba_jax_op.html

Example PR: #836

See #821 (comment) for suggestions on equivalent torch functions
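
For orientation, the linked guide describes a single-dispatch pattern: a conversion function is registered per Op class and returns a callable that the backend then runs. The sketch below mimics that pattern with the standard library and torch only. `FakeSoftmax` and `funcify` are hypothetical stand-ins, not PyTensor's actual classes or dispatcher, so check the guide and the example PR for the real names and signatures.

```python
from functools import singledispatch

import torch


class FakeSoftmax:
    """Hypothetical stand-in for an Op that stores its parameters."""

    def __init__(self, axis):
        self.axis = axis


@singledispatch
def funcify(op, **kwargs):
    # Fallback for Ops that have no registered conversion yet.
    raise NotImplementedError(f"No PyTorch conversion for {type(op).__name__}")


@funcify.register(FakeSoftmax)
def funcify_softmax(op, **kwargs):
    # Read the Op's parameters once, outside the returned function.
    axis = op.axis

    def softmax(x):
        return torch.softmax(x, dim=axis)

    return softmax


softmax_fn = funcify(FakeSoftmax(axis=-1))
probs = softmax_fn(torch.tensor([[1.0, 2.0, 3.0]]))
```

Ops without a registered conversion fall through to the base function and raise NotImplementedError, which is the behaviour requested in this thread for unsupported cases.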

Tensor creation Ops

Shape Ops

Math Ops

Indexing Ops

Branching Ops

Linalg Ops

Sparse Ops

  • ... (to be filled)

RandomVariable Ops

  • ... Need to figure out API differences

If you need an Op that's not in this list, comment below and we'll add it!

ricardoV94 added the help wanted (Extra attention is needed) and torch (PyTorch backend) labels Jun 14, 2024
ricardoV94 changed the title from "Torch: Implement other Ops (help welcome!)" to "Implement all Ops in PyTorch (help welcome!)" Jun 14, 2024
@HAKSOAT
Contributor

HAKSOAT commented Jun 15, 2024

Hi there @ricardoV94, I'm attending the hackathon at Pydata London.

I'd like to work on Softmax.

@ricardoV94
Member Author

> Hi there @ricardoV94, I'm attending the hackathon at Pydata London.
>
> I'd like to work on Softmax.

Sure!

@HAKSOAT
Contributor

HAKSOAT commented Jun 17, 2024

Thanks @ricardoV94

I have been reading the docs and noticed that #764 has the initial setup. I believe it contains the groundwork for adding other Ops, so I'll follow that PR, and the Softmax Ops will likely use some code from it.

Am I thinking about this the right way?

@ricardoV94
Member Author

@HAKSOAT yes, you should already be able to start a branch from #764. All the functionality is done, but some tests fail because they unconditionally require a GPU. We should make that optional and get it merged soon, but you need not wait :)

@ricardoV94
Member Author

#764 is merged

@t3chw
Contributor

t3chw commented Jun 20, 2024

Hello @ricardoV94, I would like to work on Reshape

@ricardoV94
Member Author

Go ahead. We'll link and lock the Op when you open a PR

HAKSOAT mentioned this issue Jun 23, 2024
@HAKSOAT
Contributor

HAKSOAT commented Jun 23, 2024

Hi @ricardoV94, I have opened a PR for the Softmax Ops. I see it has been grouped with LogSoftmax and Grads, so I can update the PR to include them.

ricardoV94 pinned this issue Jun 24, 2024
@HangenYuu
Contributor

Hi @ricardoV94, I will work on the Dot Op now.

@ricardoV94
Member Author

If someone wants to look through the codebase and populate the list of Ops above that would also be very helpful :)

@HangenYuu
Contributor

> If someone wants to look through the codebase and populate the list of Ops above that would also be very helpful :)

@ricardoV94 Does something like this work? The corresponding torch function or method is attached to each Op.

  1. pytensor.tensor.elemwise
  • Elemwise:
  • CAReduce:
  • DimShuffle: torch.transpose.
  • Softmax: torch.nn.functional.softmax.
  • SoftmaxGrad: Hand-crafted like JAX or use PyTorch autograd.
  • LogSoftmax: torch.nn.functional.log_softmax.
  2. pytensor.tensor.extra_ops
  • Bartlett: torch.bartlett_window.
  • CumOp: torch.cumsum & torch.cumprod.
  • FillDiagonal: torch.Tensor.fill_diagonal_.
  • FillDiagonalOffset: torch.diagonal_scatter with src parameter set to a vector of identical values with compatible shape.
  • RavelMultiIndex: no equivalent in PyTorch, must be crafted from native ops.
  • Repeat: torch.repeat_interleave.
  • Unique: torch.unique.
  • UnravelIndex: torch.unravel_index.
  • bincount: torch.bincount.
  • broadcast_to: torch.broadcast_to.
  3. pytensor.tensor.nlinalg
  • BatchedDot: torch.matmul with check for batch dimension or torch.bmm.
  • Dot: torch.matmul.
  • MaxAndArgmax: torch.max & torch.argmax.
  • SVD: torch.linalg.svd.
  • Det: torch.linalg.det.
  • Eig: torch.linalg.eig.
  • Eigh: torch.linalg.eigh.
  • MatrixInverse: torch.linalg.inv.
  • MatrixPinv: torch.linalg.pinv.
  • QRFull: torch.linalg.qr.
  • SLogDet: torch.linalg.slogdet.
  4. pytensor.tensor.slinalg
  • BlockDiagonal: torch.block_diag.
  • Cholesky: torch.linalg.cholesky.
  • Solve: torch.linalg.solve.
  • SolveTriangular: torch.linalg.solve_triangular.
  5. pytensor.tensor.random - Sampling from a distribution will have to happen through torch.distributions. The underlying random number generator modules are torch.Generator and torch.random. The documentation for the different families is listed on Probability distributions - torch.distributions — PyTorch 2.2 documentation.
  • RandomStateType(type): Random state or random seed in PyTorch is not exposed as a separate class. It is returned as a torch.Tensor via torch.get_rng_state() and by the get_state() method of torch.Generator.
  • RandomVariable(func): As with JAX, I will have to implement a new class to wrap the random number generator of a specific distribution.
  • Generator: torch.Generator.
  • Generic family
    • BetaRV: torch.distributions.beta.Beta
    • DirichletRV: torch.distributions.dirichlet.Dirichlet
    • PoissonRV: torch.distributions.poisson.Poisson
    • MvNormalRV: torch.distributions.multivariate_normal.MultivariateNormal
  • Loc-scale family
    • CauchyRV: torch.distributions.cauchy.Cauchy
    • GumbelRV: torch.distributions.gumbel.Gumbel
    • LaplaceRV: torch.distributions.laplace.Laplace
    • LogisticRV:
    • NormalRV: torch.distributions.normal.Normal
    • StandardNormalRV: A special case of torch.distributions.normal.Normal
  • No datatype family:
    • BernoulliRV: torch.distributions.bernoulli.Bernoulli
    • CategoricalRV: torch.distributions.categorical.Categorical
  • Uniform density family:
    • RandIntRV: torch.randint
    • IntegersRV:
    • UniformRV: torch.distributions.uniform.Uniform
  • Shape-scale family:
    • ParetoRV: torch.distributions.pareto.Pareto
    • GammaRV: torch.distributions.gamma.Gamma
  • ExponentialRV: torch.distributions.exponential.Exponential
  • StudentTRV: torch.distributions.studentT.StudentT
  • ChoiceRV: torch.multinomial?
  • PermutationRV: torch.randperm
  • BinomialRV: torch.distributions.binomial.Binomial
  • MultinomialRV: torch.distributions.multinomial.Multinomial
  • VonMisesRV: torch.distributions.von_mises.VonMises
  6. pytensor.scalar - still unsure how to convert this to PyTorch because of missing docstrings and my limited knowledge of scalar methods in PyTorch.
  • ScalarOp:
  • Add:
  • Mul:
  • Sub:
  • IntDiv:
  • Mod:
  • Cast:
  • Clip:
  • Composite:
  • Identity:
  • Second:
  • BetaIncInv:
  • Erf:
  • Erfc:
  • Erfcinv:
  • Erfcx:
  • Erfinv:
  • GammaIncCInv:
  • GammaIncInv:
  • Iv:
  • Ive:
  • Log1mexp:
  • Psi:
  • TriGamma:
  • Softplus: torch.nn.functional.softplus
  7. pytensor.scan.op.Scan - still unsure how to convert this to PyTorch
  8. pytensor.sparse - torch.sparse supports sparse matrices. Beyond the methods already implemented for JAX, I can map more methods to Torch (unless I am missing something and the ones currently implemented turn out to be enough to build everything else).
  • SparseTensorType: torch.tensor(layout=torch.sparse_<sparse_type>) via different variations of torch.Tensor.to_sparse.
  • Dot: torch.smm.
  • StructuredDot: torch.sparse.mm() (currently not working for CSR matrix, especially on GPU).
  9. pytensor.tensor.basic
  • Alloc:
  • AllocEmpty:
  • ARange:
  • ExtractDiag:
  • Eye:
  • Join:
  • MakeVector:
  • ScalarFromTensor:
  • Split:
  • TensorFromScalar:
  • Tri:
  • SortOp:
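
As an illustration of the "crafted from native ops" case noted for RavelMultiIndex in the list above, the sketch below computes flat indices from per-dimension indices using C-order strides. It assumes in-bounds indices and row-major order only. The function name and the missing mode/order handling are simplifications of this sketch, not the Op's actual interface.

```python
import torch


def ravel_multi_index(multi_index, shape):
    """Flatten per-dimension indices into C-order flat indices.

    Assumes every index is in bounds; no clipping or wrapping is done.
    """
    # C-order strides: each stride is the product of all later dimension sizes.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]

    # Accumulate index * stride over the dimensions.
    flat = torch.zeros_like(multi_index[0])
    for idx, stride in zip(multi_index, strides):
        flat = flat + idx * stride
    return flat


rows = torch.tensor([0, 1, 2])
cols = torch.tensor([3, 0, 1])
flat = ravel_multi_index((rows, cols), shape=(3, 4))
```

A quick sanity check is that indexing `torch.arange(12).reshape(3, 4)` with `rows, cols` returns the same values as `flat`.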

@twaclaw
Contributor

twaclaw commented Jul 2, 2024

I could help with the remaining Tensor creation ops to begin with. Let me know.

@ricardoV94
Member Author

Thanks @twaclaw, feel free to open a PR

@twaclaw
Contributor

twaclaw commented Jul 4, 2024

I will have a look at Repeat, Unique, etc. during the weekend.

@ricardoV94, regarding Unique:

  • Should we emulate the behaviour of return_index=True, or raise NotImplementedError instead? (torch.unique doesn't have that parameter.)
  • Is the axis parameter required? The current JAX Op implementation doesn't support a non-None axis.
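
If emulating return_index=True turns out to be worthwhile, one possible approach is to combine return_inverse=True with a scatter reduction that keeps the smallest position seen for each unique value. This is a sketch for the flattened (axis=None) case only, and `unique_with_index` is a hypothetical helper name rather than anything in PyTensor or PyTorch.

```python
import torch


def unique_with_index(x):
    """Return sorted unique values and the index of each value's first occurrence."""
    flat = x.flatten()
    values, inverse = torch.unique(flat, sorted=True, return_inverse=True)
    positions = torch.arange(flat.numel(), device=flat.device)
    # Start from an out-of-range position, then keep the minimum position
    # observed for every unique value.
    first_index = torch.full(
        (values.numel(),), flat.numel(), dtype=torch.long, device=flat.device
    )
    first_index = first_index.scatter_reduce(0, inverse, positions, reduce="amin")
    return values, first_index


x = torch.tensor([3, 1, 3, 2, 1])
values, first_index = unique_with_index(x)
```

Indexing the flattened input with `first_index` recovers `values`, which is the property NumPy documents for return_index.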

@ricardoV94
Member Author

ricardoV94 commented Jul 5, 2024

@twaclaw in general we want to support exactly the same functionality as the original Op. When that is not possible or too complicated, raising NotImplementedError is fine.

Regarding JAX, we probably cannot compile (JIT) any function that has unique in it because JAX can't handle dynamic shapes. So it's a bit moot whether we say we support axis or not, although the NotImplementedError could be removed and we could just dispatch to jax.numpy.unique rather straightforwardly. Feel free to open a PR for that if you want.

@HarshvirSandhu
Contributor

I'll be working on the indexing Ops now

@twaclaw
Contributor

twaclaw commented Jul 7, 2024

> @twaclaw in general we want to support exactly the same functionality as the original Op. When that is not possible or too complicated, raising NotImplementedError is fine.
>
> Regarding JAX, we probably cannot compile (JIT) any function that has unique in it because JAX can't handle dynamic shapes. So it's a bit moot whether we say we support axis or not, although the NotImplementedError could be removed and we could just dispatch to jax.numpy.unique rather straightforwardly. Feel free to open a PR for that if you want.

@ricardoV94, regarding the JAX implementation of unique, one option would be to make the size parameter of jax.numpy.unique static. In any case, I think the current implementation might be broken (i.e., lax_numpy is undefined).

@ricardoV94
Member Author

ricardoV94 commented Jul 7, 2024

I can't imagine many cases where I would know the size of the unique elements but not what/where they were? If I knew what/where they are I would just select them instead of using unique.

More importantly we don't have size in our Unique Op and I don't think it makes sense to add it for this edge case. A more general approach will be to be clever about what can and cannot be jitted. I think there's an issue open for that already.

For now we can probably just remove the implementation and let it raise NotImplementedError if, as you say, it's broken anyway.

@twaclaw
Contributor

twaclaw commented Jul 7, 2024

You are right, unique is not compatible with JIT.
You don't need to know the exact size; a guess (perhaps an overestimate) is enough. I was wondering whether it is possible to insert parameters (like a Constant size) into the Op Graph 🤔.
But I agree a NotImplementedError would be appropriate.

@twaclaw
Contributor

twaclaw commented Jul 8, 2024

I can take a look at the linear algebra ops.

@ricardoV94
Member Author

Seems like Reshape is becoming more and more relevant, if anyone wants to tackle it

@twaclaw
Contributor

twaclaw commented Jul 11, 2024

> Seems like Reshape is becoming more and more relevant, if anyone wants to tackle it

Is someone working on this?

@ricardoV94
Member Author

> > Seems like Reshape is becoming more and more relevant, if anyone wants to tackle it
>
> Is someone working on this?

Not yet I think. You can go ahead

@Ch0ronomato
Contributor

Is the checklist at the top up to date on what else is needed?

@ricardoV94
Member Author

More or less up to date, except that linalg and indexing are being worked on

@ricardoV94
Member Author

ricardoV94 commented Jul 17, 2024

If someone is interested we need to check whether we can bridge nicely between PyTensor and Torch random number generator APIs.

We recently added a documentation page explaining how random variables work in PyTensor: #928

As a reminder, we're targeting torch.compile functionality, in case that matters
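
One possible bridge, sketched below, treats the generator state tensor as the value passed between draws: each draw takes a state and returns the next state alongside the sample, mirroring how PyTensor random variables take an RNG and return the updated RNG together with the draw. The `normal_draw` helper is hypothetical, it uses the CPU generator only, and whether this pattern works under torch.compile is untested here.

```python
import torch


def normal_draw(state, loc, scale, size):
    """Draw normal samples from a generator state and return the next state."""
    gen = torch.Generator()
    gen.set_state(state)
    draw = loc + scale * torch.randn(size, generator=gen)
    return gen.get_state(), draw


# Build an initial state from a seed.
seed_gen = torch.Generator()
seed_gen.manual_seed(123)
state0 = seed_gen.get_state()

# The same input state reproduces the same draw; the returned state advances it.
state1, first = normal_draw(state0, loc=0.0, scale=1.0, size=(3,))
_, repeat = normal_draw(state0, loc=0.0, scale=1.0, size=(3,))
_, second = normal_draw(state1, loc=0.0, scale=1.0, size=(3,))
```

Copying the state in and out on every draw keeps the function free of hidden side effects, at the cost of extra state handling that a real implementation may want to avoid.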

@Ch0ronomato
Contributor

@ricardoV94 I'll take a stab after I finish some of the operators in #939. I need to build a bit more familiarity with the PyTensor code first

@danielgerardclaassen

Coming from PyMC, adding a sparse solve would be useful I believe...

@ricardoV94
Member Author

> Coming from PyMC, adding a sparse solve would be useful I believe...

This issue isn't the right place for that request: we first need it in PyTensor itself before we can add it to the PyTorch backend. We also haven't done anything with sparse Ops in the PyTorch backend yet.

8 participants