Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of photons in coffea fails because they don't have mass and charge fields. #498

Open
ikrommyd opened this issue Aug 23, 2024 · 13 comments
Labels
feature New feature or request help wanted Extra attention is needed

Comments

@ikrommyd
Copy link

ikrommyd commented Aug 23, 2024

Vector Version

1.4.1

Python Version

3.11

OS / Environment

macOS but doesn't matter

Describe the bug

Currently in coffea if you do

from coffea.nanoevents import NanoEventsFactory
events = NanoEventsFactory.from_root({"tests/samples/DYto2E.root": "Events"}).events()
events.Photon + events.Photon

you will get an error

ValueError: array does not have temporal coordinates (t/E/e/energy or tau/M/m/mass): seediEtaOriX, cutBased, cutBased_Fall17V2, electronVeto, isScEtaEB, isScEtaEE, mvaID_Fall17V2_WP80, mvaID_Fall17V2_WP90, mvaID_WP80, mvaID_WP90, pixelSeed, seedGain, electronIdx, jetIdx, seediPhiOriY, vidNestedWPBitmap, vidNestedWPBitmap_Fall17V2, energyErr, energyRaw, esEffSigmaRR, esEnergyOverRawE, eta, etaWidth, haloTaggerMVAVal, hoe, hoe_PUcorr, mvaID, mvaID_Fall17V2, pfChargedIsoPFPV, pfChargedIsoWorstVtx, pfPhoIso03, pfRelIso03_all_Fall17V2, pfRelIso03_all_quadratic, pfRelIso03_chg_Fall17V2, pfRelIso03_chg_quadratic, phi, phiWidth, pt, r9, s4, sieie, sieip, sipip, trkSumPtHollowConeDR03, x_calo, y_calo, z_calo, electronIdxG, jetIdxG

This error occurred while calling

    numpy.add.__call__(
        <PhotonArray-typetracer [...] type='## * var * Photon[seediEtaOriX:...'>
        <PhotonArray-typetracer [...] type='## * var * Photon[seediEtaOriX:...'>
    )

If my understanding is correct, it's looking for mass field rather than an accessor. The photons don't have mass and a charge field right now in nanoaod. They used to at some point and it was set to zero. That's why you will not get this failure if you try this with the nano_dy.root file from coffea because it's old and thats why this wasn't caught by tests I assume.
Lindsey commented that this should be solved with behaviors that supply the necessary inputs and and not keep hard-requiring fields to be present.

cc @lgray

Any additional but relevant log output

No response

@ikrommyd ikrommyd added the bug (unverified) The problem described would be a bug, but needs to be triaged label Aug 23, 2024
@lgray
Copy link

lgray commented Aug 24, 2024

I don't think this is a bug, but rather a feature request.

@jpivarski @Saransh-cpp do you think it would be possible to specify behavior functions or properties as sources of four-momentum data in addition to only fields?

@ikrommyd
Copy link
Author

Yeah, I honestly didn't think about the label. Sorry. Someone with permissions please change it.

@jpivarski
Copy link
Member

So, for instance, in lines like

return AzimuthalAwkwardXY(array["x"], array["y"])

use array.x, array.y instead of array["x"], array["y"] (consistently, everywhere).

The problem I see with that is that properties take precedence over fields, and therefore an implementation would infinite-loop between, say, the property x that is implemented in terms of the property phi that is implemented in terms of the property x. It might be possible to ensure that one class has properties x and y defined and another class has rho and phi defined, but we'd have to be very careful to be sure that no class has both.

(This sounds like a big-ish project to make sure that all the ducks are in a row. There are a lot of ducks.)

@jpivarski jpivarski added feature New feature or request and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Aug 25, 2024
@ikrommyd
Copy link
Author

At least a hotfix somewhere soon is necessary for the time being. Diphotons are broken.

@Saransh-cpp
Copy link
Member

If I am not wrong, a hot fix can be overriding the method in coffea and passing it to the behavior dict (instead of copying the four-vector behavior).

@ikrommyd
Copy link
Author

ikrommyd commented Sep 4, 2024

Just a gentle ping on this as it's really a problem.

@lgray
Copy link

lgray commented Sep 4, 2024

@Saransh-cpp could you please make an example of the fix you have suggested?

@Saransh-cpp
Copy link
Member

Saransh-cpp commented Sep 7, 2024

Hi, I thought that the issue was performing addition on the charge field, which is why I suggested overriding the add method in behavior dict, but I was wrong. I looked at the issue wrong, everything is in fact handled well by Candidate.add.

I tried swapping the main pain point ["..."] occurrence with .field and though it works for the example above, it leads to infinite recursion in several other cases (as pointed out by @jpivarski above) -

Quick fix diff:
diff --git a/src/vector/backends/awkward.py b/src/vector/backends/awkward.py
index 5ca6804..6a581bb 100644
--- a/src/vector/backends/awkward.py
+++ b/src/vector/backends/awkward.py
@@ -325,8 +325,8 @@ class TemporalAwkward(CoordinatesAwkward, Temporal):
             return TemporalAwkwardTau(array["M"])
         elif "m" in fields:
             return TemporalAwkwardTau(array["m"])
-        elif "mass" in fields:
-            return TemporalAwkwardTau(array["mass"])
+        elif "mass" in fields or hasattr(array, "mass"):
+            return TemporalAwkwardTau(array.mass)
         else:
             raise ValueError(
                 "array does not have temporal coordinates (t/E/e/energy or tau/M/m/mass): "
Working MWE:
In [1]: from coffea.nanoevents import NanoEventsFactory
   ...: events = NanoEventsFactory.from_root({"DYto2E.root": "Events"}).events()

In [2]: events.Photon + events.Photon
Out[2]: dask.awkward<add, npartitions=1>
Infinite recursion trace:
In [1]: import vector; import awkward as ak

In [2]: vector.register_awkward()

In [3]: ak.zip({"mass": [1], "px": [1], "py": [1], "pz": [1]}, with_name="Momentum4D") ** 2
------------------------------------------------------------------------
RecursionError                         Traceback (most recent call last)
Cell In[3], line 1
----> 1 ak.zip(
      2             {"mass": [1], "px": [1], "py": [1], "pz": [1]}, with_name="Momentum4D"
      3         ) ** 2

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_operators.py:53, in _binary_method.<locals>.func(self, other)
     51 if _disables_array_ufunc(other):
     52     return NotImplemented
---> 53 return ufunc(self, other)

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/highlevel.py:1511, in Array.__array_ufunc__(self, ufunc, method, *inputs, **kwargs)
   1509 name = f"{type(ufunc).__module__}.{ufunc.__name__}.{method!s}"
   1510 with ak._errors.OperationErrorContext(name, inputs, kwargs):
-> 1511     return ak._connect.numpy.array_ufunc(ufunc, method, inputs, kwargs)

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_connect/numpy.py:466, in array_ufunc(ufunc, method, inputs, kwargs)
    458         raise TypeError(
    459             "no {}.{} overloads for custom types: {}".format(
    460                 type(ufunc).__module__, ufunc.__name__, ", ".join(error_message)
    461             )
    462         )
    464     return None
--> 466 out = ak._broadcasting.broadcast_and_apply(
    467     inputs, action, allow_records=False, function_name=ufunc.__name__
    468 )
    470 if len(out) == 1:
    471     return wrap_layout(out[0], behavior=behavior, attrs=attrs)

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_broadcasting.py:1108, in broadcast_and_apply(inputs, action, depth_context, lateral_context, allow_records, left_broadcast, right_broadcast, numpy_to_regular, regular_to_jagged, function_name, broadcast_parameters_rule)
   1106 backend = backend_of(*inputs, coerce_to_common=False)
   1107 isscalar = []
-> 1108 out = apply_step(
   1109     backend,
   1110     broadcast_pack(inputs, isscalar),
   1111     action,
   1112     0,
   1113     depth_context,
   1114     lateral_context,
   1115     {
   1116         "allow_records": allow_records,
   1117         "left_broadcast": left_broadcast,
   1118         "right_broadcast": right_broadcast,
   1119         "numpy_to_regular": numpy_to_regular,
   1120         "regular_to_jagged": regular_to_jagged,
   1121         "function_name": function_name,
   1122         "broadcast_parameters_rule": broadcast_parameters_rule,
   1123     },
   1124 )
   1125 assert isinstance(out, tuple)
   1126 return tuple(broadcast_unpack(x, isscalar) for x in out)

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_broadcasting.py:1086, in apply_step(backend, inputs, action, depth, depth_context, lateral_context, options)
   1084     return result
   1085 elif result is None:
-> 1086     return continuation()
   1087 else:
   1088     raise AssertionError(result)

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_broadcasting.py:1055, in apply_step.<locals>.continuation()
   1053 # Any non-string list-types?
   1054 elif any(x.is_list and not is_string_like(x) for x in contents):
-> 1055     return broadcast_any_list()
   1057 # Any RecordArrays?
   1058 elif any(x.is_record for x in contents):

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_broadcasting.py:623, in apply_step.<locals>.broadcast_any_list()
    620         nextinputs.append(x)
    621         nextparameters.append(NO_PARAMETERS)
--> 623 outcontent = apply_step(
    624     backend,
    625     nextinputs,
    626     action,
    627     depth + 1,
    628     copy.copy(depth_context),
    629     lateral_context,
    630     options,
    631 )
    632 assert isinstance(outcontent, tuple)
    633 parameters = parameters_factory(nextparameters, len(outcontent))

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_broadcasting.py:1068, in apply_step(backend, inputs, action, depth, depth_context, lateral_context, options)
   1061     else:
   1062         raise ValueError(
   1063             "cannot broadcast: {}{}".format(
   1064                 ", ".join(repr(type(x)) for x in inputs), in_function(options)
   1065             )
   1066         )
-> 1068 result = action(
   1069     inputs,
   1070     depth=depth,
   1071     depth_context=depth_context,
   1072     lateral_context=lateral_context,
   1073     continuation=continuation,
   1074     backend=backend,
   1075     options=options,
   1076 )
   1078 if isinstance(result, tuple) and all(isinstance(x, Content) for x in result):
   1079     if any(content.backend is not backend for content in result):

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_connect/numpy.py:378, in array_ufunc.<locals>.action(inputs, **ignore)
    376     custom = find_ufunc(behavior, signature)
    377     if custom is not None:
--> 378         return _array_ufunc_adjust(custom, inputs, kwargs, behavior)
    380 # Do we have any categoricals?
    381 if any(
    382     x.is_indexed and x.parameter("__array__") == "categorical" for x in contents
    383 ):

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_connect/numpy.py:183, in _array_ufunc_adjust(custom, inputs, kwargs, behavior)
    174 def _array_ufunc_adjust(
    175     custom, inputs, kwargs: dict[str, Any], behavior: Mapping | None
    176 ):
    177     args = [
    178         wrap_layout(x, behavior)
    179         if isinstance(x, (ak.contents.Content, ak.record.Record))
    180         else x
    181         for x in inputs
    182     ]
--> 183     out = custom(*args, **kwargs)
    184     if not isinstance(out, tuple):
    185         out = (out,)

File ~/Code/HEP/vector/src/vector/backends/awkward.py:1512, in <lambda>(v, expo)
   1505 behavior[numpy.power, "Momentum2D", numbers.Real] = (
   1506     lambda v, expo: v.rho2 if expo == 2 else v.rho**expo
   1507 )
   1508 behavior[numpy.power, "Momentum3D", numbers.Real] = (
   1509     lambda v, expo: v.mag2 if expo == 2 else v.mag**expo
   1510 )
   1511 behavior[numpy.power, "Momentum4D", numbers.Real] = (
-> 1512     lambda v, expo: v.tau2 if expo == 2 else v.tau**expo
   1513 )
   1515 behavior["__cast__", VectorNumpy2D] = lambda v: vector.Array(v)
   1516 behavior["__cast__", VectorNumpy3D] = lambda v: vector.Array(v)

File ~/Code/HEP/vector/src/vector/_methods.py:3814, in Lorentz.tau2(self)
   3810 @property
   3811 def tau2(self) -> ScalarCollection:
   3812     from vector._compute.lorentz import tau2
-> 3814     return tau2.dispatch(self)

File ~/Code/HEP/vector/src/vector/_compute/lorentz/tau2.py:107, in dispatch(v)
    100 def dispatch(v: typing.Any) -> typing.Any:
    101     function, *returns = _from_signature(
    102         __name__,
    103         dispatch_map,
    104         (
    105             _aztype(v),
    106             _ltype(v),
--> 107             _ttype(v),
    108         ),
    109     )
    110     with numpy.errstate(all="ignore"):
    111         return v._wrap_result(
    112             _flavor_of(v),
    113             function(
   (...)
    120             1,
    121         )

File ~/Code/HEP/vector/src/vector/_methods.py:4381, in _ttype(obj)
   4376 def _ttype(obj: VectorProtocolLorentz) -> type[Coordinates]:
   4377     """
   4378     Determines the Temporal type of a vector for use in looking up a
   4379     dispatched function.
   4380     """
-> 4381     if hasattr(obj, "temporal"):
   4382         for t in type(obj.temporal).__mro__:
   4383             if t in (TemporalT, TemporalTau):

File ~/Code/HEP/vector/src/vector/backends/awkward.py:1259, in MomentumAwkward4D.temporal(self)
   1242 @property
   1243 def temporal(self) -> TemporalAwkward:
   1244     """
   1245     Returns a Temporal type object containing the temporal coordinates.
   1246 
   (...)
   1257         (<Array [1, 3] type='2 * int64'>,)
   1258     """
-> 1259     return TemporalAwkward.from_momentum_fields(self)

File ~/Code/HEP/vector/src/vector/backends/awkward.py:329, in TemporalAwkward.from_momentum_fields(cls, array)
    327     return TemporalAwkwardTau(array["m"])
    328 elif "mass" in fields or hasattr(array, "mass"):
--> 329     return TemporalAwkwardTau(array.mass)
    330 else:
    331     raise ValueError(
    332         "array does not have temporal coordinates (t/E/e/energy or tau/M/m/mass): "
    333         f"{', '.join(fields)}"
    334     )

File ~/Code/HEP/vector/src/vector/_methods.py:4137, in LorentzMomentum.mass(self)
   4135 @property
   4136 def mass(self) -> ScalarCollection:
-> 4137     return self.tau

File ~/Code/HEP/vector/src/vector/_methods.py:3808, in Lorentz.tau(self)
   3804 @property
   3805 def tau(self) -> ScalarCollection:
   3806     from vector._compute.lorentz import tau
-> 3808     return tau.dispatch(self)

File ~/Code/HEP/vector/src/vector/_compute/lorentz/tau.py:113, in dispatch(v)
    106 def dispatch(v: typing.Any) -> typing.Any:
    107     function, *returns = _from_signature(
    108         __name__,
    109         dispatch_map,
    110         (
    111             _aztype(v),
    112             _ltype(v),
--> 113             _ttype(v),
    114         ),
    115     )
    116     with numpy.errstate(all="ignore"):
    117         return v._wrap_result(
    118             _flavor_of(v),
    119             function(
   (...)
    126             1,
    127         )

File ~/Code/HEP/vector/src/vector/_methods.py:4381, in _ttype(obj)
   4376 def _ttype(obj: VectorProtocolLorentz) -> type[Coordinates]:
   4377     """
   4378     Determines the Temporal type of a vector for use in looking up a
   4379     dispatched function.
   4380     """
-> 4381     if hasattr(obj, "temporal"):
   4382         for t in type(obj.temporal).__mro__:
   4383             if t in (TemporalT, TemporalTau):

File ~/Code/HEP/vector/src/vector/backends/awkward.py:1259, in MomentumAwkward4D.temporal(self)
   1242 @property
   1243 def temporal(self) -> TemporalAwkward:
   1244     """
   1245     Returns a Temporal type object containing the temporal coordinates.
   1246 
   (...)
   1257         (<Array [1, 3] type='2 * int64'>,)
   1258     """
-> 1259     return TemporalAwkward.from_momentum_fields(self)

File ~/Code/HEP/vector/src/vector/backends/awkward.py:329, in TemporalAwkward.from_momentum_fields(cls, array)
    327     return TemporalAwkwardTau(array["m"])
    328 elif "mass" in fields or hasattr(array, "mass"):
--> 329     return TemporalAwkwardTau(array.mass)
    330 else:
    331     raise ValueError(
    332         "array does not have temporal coordinates (t/E/e/energy or tau/M/m/mass): "
    333         f"{', '.join(fields)}"
    334     )

File ~/Code/HEP/vector/src/vector/_methods.py:4137, in LorentzMomentum.mass(self)
   4135 @property
   4136 def mass(self) -> ScalarCollection:
-> 4137     return self.tau

File ~/Code/HEP/vector/src/vector/_methods.py:3808, in Lorentz.tau(self)
   3804 @property
   3805 def tau(self) -> ScalarCollection:
   3806     from vector._compute.lorentz import tau
-> 3808     return tau.dispatch(self)

File ~/Code/HEP/vector/src/vector/_compute/lorentz/tau.py:113, in dispatch(v)
    106 def dispatch(v: typing.Any) -> typing.Any:
    107     function, *returns = _from_signature(
    108         __name__,
    109         dispatch_map,
    110         (
    111             _aztype(v),
    112             _ltype(v),
--> 113             _ttype(v),
    114         ),
    115     )
    116     with numpy.errstate(all="ignore"):
    117         return v._wrap_result(
    118             _flavor_of(v),
    119             function(
   (...)
    126             1,
    127         )

    [... skipping similar frames: _ttype at line 4381 (489 times), TemporalAwkward.from_momentum_fields at line 329 (489 times), LorentzMomentum.mass at line 4137 (489 times), Lorentz.tau at line 3808 (489 times), MomentumAwkward4D.temporal at line 1259 (489 times), dispatch at line 113 (488 times)]

File ~/Code/HEP/vector/src/vector/_compute/lorentz/tau.py:113, in dispatch(v)
    106 def dispatch(v: typing.Any) -> typing.Any:
    107     function, *returns = _from_signature(
    108         __name__,
    109         dispatch_map,
    110         (
    111             _aztype(v),
    112             _ltype(v),
--> 113             _ttype(v),
    114         ),
    115     )
    116     with numpy.errstate(all="ignore"):
    117         return v._wrap_result(
    118             _flavor_of(v),
    119             function(
   (...)
    126             1,
    127         )

File ~/Code/HEP/vector/src/vector/_methods.py:4381, in _ttype(obj)
   4376 def _ttype(obj: VectorProtocolLorentz) -> type[Coordinates]:
   4377     """
   4378     Determines the Temporal type of a vector for use in looking up a
   4379     dispatched function.
   4380     """
-> 4381     if hasattr(obj, "temporal"):
   4382         for t in type(obj.temporal).__mro__:
   4383             if t in (TemporalT, TemporalTau):

File ~/Code/HEP/vector/src/vector/backends/awkward.py:1259, in MomentumAwkward4D.temporal(self)
   1242 @property
   1243 def temporal(self) -> TemporalAwkward:
   1244     """
   1245     Returns a Temporal type object containing the temporal coordinates.
   1246 
   (...)
   1257         (<Array [1, 3] type='2 * int64'>,)
   1258     """
-> 1259     return TemporalAwkward.from_momentum_fields(self)

File ~/Code/HEP/vector/src/vector/backends/awkward.py:329, in TemporalAwkward.from_momentum_fields(cls, array)
    327     return TemporalAwkwardTau(array["m"])
    328 elif "mass" in fields or hasattr(array, "mass"):
--> 329     return TemporalAwkwardTau(array.mass)
    330 else:
    331     raise ValueError(
    332         "array does not have temporal coordinates (t/E/e/energy or tau/M/m/mass): "
    333         f"{', '.join(fields)}"
    334     )

File ~/Code/HEP/vector/src/vector/_methods.py:4137, in LorentzMomentum.mass(self)
   4135 @property
   4136 def mass(self) -> ScalarCollection:
-> 4137     return self.tau

File ~/Code/HEP/vector/src/vector/_methods.py:3808, in Lorentz.tau(self)
   3804 @property
   3805 def tau(self) -> ScalarCollection:
   3806     from vector._compute.lorentz import tau
-> 3808     return tau.dispatch(self)

File ~/Code/HEP/vector/src/vector/_compute/lorentz/tau.py:111, in dispatch(v)
    106 def dispatch(v: typing.Any) -> typing.Any:
    107     function, *returns = _from_signature(
    108         __name__,
    109         dispatch_map,
    110         (
--> 111             _aztype(v),
    112             _ltype(v),
    113             _ttype(v),
    114         ),
    115     )
    116     with numpy.errstate(all="ignore"):
    117         return v._wrap_result(
    118             _flavor_of(v),
    119             function(
   (...)
    126             1,
    127         )

File ~/Code/HEP/vector/src/vector/_methods.py:4357, in _aztype(obj)
   4352 def _aztype(obj: VectorProtocolPlanar) -> type[Coordinates]:
   4353     """
   4354     Determines the Azimuthal type of a vector for use in looking up a
   4355     dispatched function.
   4356     """
-> 4357     if hasattr(obj, "azimuthal"):
   4358         for t in type(obj.azimuthal).__mro__:
   4359             if t in (AzimuthalXY, AzimuthalRhoPhi):

File ~/Code/HEP/vector/src/vector/backends/awkward.py:1220, in MomentumAwkward4D.azimuthal(self)
   1202 @property
   1203 def azimuthal(self) -> AzimuthalAwkward:
   1204     """
   1205     Returns an Azimuthal type object containing the azimuthal coordinates.
   1206 
   (...)
   1218         (<Array [1, 2] type='2 * int64'>, <Array [1.1, 2.2] type='2 * float64'>)
   1219     """
-> 1220     return AzimuthalAwkward.from_momentum_fields(self)

File ~/Code/HEP/vector/src/vector/backends/awkward.py:163, in AzimuthalAwkward.from_momentum_fields(cls, array)
    161     return AzimuthalAwkwardXY(array["px"], array["y"])
    162 elif "px" in fields and "py" in fields:
--> 163     return AzimuthalAwkwardXY(array["px"], array["py"])
    164 elif "rho" in fields and "phi" in fields:
    165     return AzimuthalAwkwardRhoPhi(array["rho"], array["phi"])

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/highlevel.py:1066, in Array.__getitem__(self, where)
    636 """
    637 Args:
    638     where (many types supported; see below): Index of positions to
   (...)
   1062 have the same dimension as the array being indexed.
   1063 """
   1064 with ak._errors.SlicingErrorContext(self, where):
   1065     return wrap_layout(
-> 1066         prepare_layout(self._layout[where]),
   1067         self._behavior,
   1068         allow_other=True,
   1069         attrs=self._attrs,
   1070     )

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/contents/content.py:512, in Content.__getitem__(self, where)
    511 def __getitem__(self, where):
--> 512     return self._getitem(where)

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/contents/content.py:529, in Content._getitem(self, where)
    526     return self._getitem((where,))
    528 elif isinstance(where, str):
--> 529     return self._getitem_field(where)
    531 elif where is np.newaxis:
    532     return self._getitem((where,))

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/contents/recordarray.py:458, in RecordArray._getitem_field(self, where, only_fields)
    454 def _getitem_field(
    455     self, where: str | SupportsIndex, only_fields: tuple[str, ...] = ()
    456 ) -> Content:
    457     if len(only_fields) == 0:
--> 458         return self.content(where)
    460     else:
    461         nexthead, nexttail = ak._slicing.head_tail(only_fields)

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/contents/recordarray.py:394, in RecordArray.content(self, index_or_field)
    393 def content(self, index_or_field: str | SupportsIndex) -> Content:
--> 394     out = super().content(index_or_field)
    395     if (
    396         self._length is unknown_length
    397         or out.length is unknown_length
    398         or out.length == self._length
    399     ):
    400         return out

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_meta/recordmeta.py:133, in RecordMeta.content(self, index_or_field)
    132 def content(self, index_or_field: int | str) -> T:
--> 133     if is_integer(index_or_field):
    134         index = index_or_field
    135     elif isinstance(index_or_field, str):

File ~/Code/HEP/vector/.env/lib/python3.11/site-packages/awkward/_regularize.py:27, in is_integer(x)
     26 def is_integer(x) -> bool:
---> 27     return isinstance(x, numbers.Integral) and not isinstance(x, bool)

File <frozen abc>:119, in __instancecheck__(cls, instance)

RecursionError: maximum recursion depth exceeded in comparison

This error occurred while calling

    numpy.power.__call__(
        <MomentumArray4D [{mass: 1, px: 1, py: 1, ...}] type='1 * Momentum4...'>
        2
    )

I am not sure if there is a hotfix for this. (Given that this is not a small fix, I will not be able to work on it in the next couple of months. I'll definitely pick it up if it stays unresolved till then (or if I get spare time in between) ☹️)

@Saransh-cpp Saransh-cpp added the help wanted Extra attention is needed label Sep 7, 2024
@ikrommyd
Copy link
Author

ikrommyd commented Sep 7, 2024

So from the physicist's perspective, one fix I can do is to replace the photon collection. I'm not sure if there are any caveats here I'm not seeing

In [31]: from coffea.nanoevents import NanoEventsFactory

In [32]: events = NanoEventsFactory.from_root({"tests/samples/DYto2E.root": "Events"}).events()

In [33]: events["Photon"] = ak.zip({"pt": events.Photon.pt, "eta": events.Photon.eta, "phi": events.Photon.phi, "mass": ak.zeros_like(events.Photon.pt), "charge": ak.zeros_like(events.Photon.pt)}, with_name="PtEtaPhiMCandidate")

In [34]: events.Photon + events.Photon
Out[34]: dask.awkward<add, npartitions=1>

In [35]: (events.Photon + events.Photon).mass.compute()
Out[35]: <Array [[0.0442, 0], ..., [-0.0442, -0.0625]] type='1000 * var * float32'>

And of course, I should add all the other photon fields in the ak.zip dictionary.

@ikrommyd
Copy link
Author

ikrommyd commented Sep 7, 2024

@Saransh-cpp Should I also do from coffea.nanoevents.methods import candidate and behavior=candidate.behavior within the ak.zip or it doesn't do anything extra?

One more thing I'd like to comment is that if there isn't gonna be a fix quickly, I don't know if we can expect people to know this and replace the photon collection. This issue breaks basically every analysis that wants to calculate an invariant mass between photons and some other object (or themselves). If it is to stay like this for some time I'm wondering if it makes sense to have this photon replacement happen during nanoevents building in coffea as a temporary fix. This would be adding extra nodes to the events graph though from the beginning but the necessary columns optimization should take care of that during optimization. It should at least be announced though so that people know how to bypass this.

@ikrommyd
Copy link
Author

ikrommyd commented Sep 7, 2024

So from the physicist's perspective, one fix I can do is to replace the photon collection. I'm not sure if there are any caveats here I'm not seeing

In [31]: from coffea.nanoevents import NanoEventsFactory

In [32]: events = NanoEventsFactory.from_root({"tests/samples/DYto2E.root": "Events"}).events()

In [33]: events["Photon"] = ak.zip({"pt": events.Photon.pt, "eta": events.Photon.eta, "phi": events.Photon.phi, "mass": ak.zeros_like(events.Photon.pt), "charge": ak.zeros_like(events.Photon.pt)}, with_name="PtEtaPhiMCandidate")

In [34]: events.Photon + events.Photon
Out[34]: dask.awkward<add, npartitions=1>

In [35]: (events.Photon + events.Photon).mass.compute()
Out[35]: <Array [[0.0442, 0], ..., [-0.0442, -0.0625]] type='1000 * var * float32'>

And of course, I should add all the other photon fields in the ak.zip dictionary.

Update 1: there is an issue with that fix. Photons are losing attributes like

FAILED tests/test_tag_and_probe_nanoaod.py::test_tag_and_probe_photons_trigger[True] - AttributeError: no field named 'matched_electron'
FAILED tests/test_tag_and_probe_nanoaod.py::test_tag_and_probe_photons_id[True] - AttributeError: no field named 'matched_electron'

Update 2: Oh I can do with_name="Photon" looks like.

@Saransh-cpp
Copy link
Member

Copying comment from the PR linked above -


Hi! It should be behavior=nanoaod.behavior. candidate.behavior is included within nanoaod.behavior. coffea does not update the global awkward behavior (which is a good thing), which is why you need that extra behavior= arg.

Alternatively, to skip the behavior keyword, you could do something like -

>>> ak.behavior.update(nanoaod.behavior)
>>> a = ak.zip({"x": [1], "y": [2], "z": [3], "m": [4]}, with_name="Photon")
>>> a
<PhotonArray [{x: 1, y: 2, z: 3, m: 4}] type='1 * Photon[x: int64, y: int64...'>

Makes me wonder if coffea should have something like vector.register_awkward to make it easy for the users.

@lgray
Copy link

lgray commented Sep 9, 2024

No, it was a conscious choice to make the behaviors contained within coffea objects, it's less confusing when you start using distributed environments to make sure everything is synced.

As for a fix that doesn't break everything else, we can hack NanoEvents to just make all-zero fields for specified columns without adding extra nodes. We already do a variant of this for trigger names where we produce chunks of Nones when the trigger table changes in a single dataset. Just have to do the same thing with zeros for Photon or other objects.

Can add in a _default_columns private variable or something like that. If the column doesn't exist on disk, generate it on the fly with a given value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants