Some users frame their evals as a single Task with many Samples. We currently include only the task name in the annotations (which end up as annotations on Pods etc.). When you see a failed Pod, this can make it hard to understand what it relates to.
The task name is all we include because that is what Inspect provides us with. However, Inspect also provides the Sample metadata.
Two ideas:
1. Suggest that Inspect adds a sample id to the Inspect SandboxEnvironment interface (4 methods). This is quite a faff, as it's a change in the core of Inspect which lots of environment providers depend on. And a sample id might not be useful to some people.
2. Read an optional HELM_ANNOTATIONS value from the Sample.metadata dict (which Inspect gives us already). This would itself be a dict, and we'd put all of it into the release annotations so you'd see any values you set there alongside the inspectTaskName.
We could also log this additional metadata when, for example, a Helm install fails.
You could also use this to inject more useful info, such as the eval set name or other identifiers.
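A minimal sketch of the second idea, not the actual implementation: `build_release_annotations` is a hypothetical helper name, and the choice to stringify values and to let the built-in `inspectTaskName` win over a user-supplied key of the same name are assumptions for illustration only.

```python
from typing import Any, Optional

# Key name proposed in this issue.
HELM_ANNOTATIONS_KEY = "HELM_ANNOTATIONS"


def build_release_annotations(
    task_name: str, sample_metadata: Optional[dict[str, Any]]
) -> dict[str, str]:
    """Merge optional user annotations from Sample.metadata with the task name.

    Hypothetical helper: the real provider code may build annotations differently.
    """
    extra = (sample_metadata or {}).get(HELM_ANNOTATIONS_KEY)
    if extra is None:
        extra = {}
    if not isinstance(extra, dict):
        raise TypeError(
            f"{HELM_ANNOTATIONS_KEY} must be a dict, got {type(extra).__name__}"
        )
    # Annotation values are strings, so stringify whatever the user supplied.
    annotations = {str(key): str(value) for key, value in extra.items()}
    # Assumption: the built-in annotation takes precedence over user values.
    annotations["inspectTaskName"] = task_name
    return annotations
```

A Sample whose metadata contains `{"HELM_ANNOTATIONS": {"evalSet": "nightly"}}` would then yield a release annotated with both `evalSet` and `inspectTaskName`.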
Note: We'd have to sort out the fact that the metadata parameter is typed as dict[str, str], whereas I think it ought to be dict[str, Any] as per Sample.metadata. If it really can't be dict[str, Any], then we'd have to define a convention like HELM_ANNOTATION_*.
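If the parameter has to stay dict[str, str], the fallback convention could look roughly like the sketch below. The helper name `annotations_from_flat_metadata` is hypothetical, and stripping the prefix to form the annotation key is an assumption about how the convention would work.

```python
# Prefix for the fallback convention mentioned in the note above.
PREFIX = "HELM_ANNOTATION_"


def annotations_from_flat_metadata(metadata: dict[str, str]) -> dict[str, str]:
    """Collect entries whose key starts with HELM_ANNOTATION_, dropping the prefix.

    Hypothetical helper: keys that consist of the bare prefix alone are ignored.
    """
    return {
        key[len(PREFIX):]: value
        for key, value in metadata.items()
        if key.startswith(PREFIX) and len(key) > len(PREFIX)
    }
```

With this convention, a metadata entry `HELM_ANNOTATION_evalSet` would become an `evalSet` annotation, and all other metadata keys would be left alone.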