Make dynascore metric weights settable in task owner interface #763
Conversation
api/models/task.py
Outdated
default_metric_weights = db.orm.relationship(
    "DefaultMetricWeight", backref="task", lazy="dynamic"
)
Just curious - is there a reason why we don't use orm.relationship() to allow parents to access their children in a one-to-many/many-to-many relationship? E.g.:
dynabench/api/controllers/tasks.py
Lines 188 to 190 in 79981d5
dm = DatasetModel()
dataset_list = []
datasets = dm.getByTid(tid)
dynabench/api/models/dataset.py
Lines 68 to 70 in 79981d5
def getByTid(self, task_id):
    try:
        return self.dbs.query(Dataset).filter(Dataset.tid == task_id).all()
This is in contrast to the examples in SQLAlchemy's relationship documentation (which I temporarily implemented here as an example, and which allows one to access the default_metric_weights of a specific task via some_task.default_metric_weights). I tried to look up performance differences between the two approaches but couldn't find anything, so thought I'd ask.
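A minimal sketch of the two access patterns being compared, using simplified stand-in models (the Task/Dataset columns here are illustrative, not the actual Dynabench schema):

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Task(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # relationship(): the parent can reach its children as an attribute
    datasets = relationship("Dataset", backref="task", lazy="dynamic")

class Dataset(Base):
    __tablename__ = "datasets"
    id = Column(Integer, primary_key=True)
    tid = Column(Integer, ForeignKey("tasks.id"))
    name = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

task = Task(name="nli")
session.add(task)
session.flush()  # populate task.id before inserting children
session.add_all([Dataset(tid=task.id, name=f"round{i}") for i in range(3)])
session.commit()

# Style used by getByTid(): an explicit filtered query.
via_query = session.query(Dataset).filter(Dataset.tid == task.id).all()

# Style enabled by relationship(): attribute access on the parent.
# With lazy="dynamic" the attribute is itself a query object.
via_relationship = task.datasets.all()

assert [d.name for d in via_query] == [d.name for d in via_relationship]
```

Both issue essentially the same SELECT against the child table; the relationship form just moves the join condition into the model definition instead of repeating it at each call site.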
I'm not aware of a good reason for this. We could ask Douwe when he gets back.
tl;dr for the word vomit above - in my head the options to implement the feature are: […] and I'm not sure which one you'd prefer I go with. I'm thinking 2 or 3.
I think it could make the most sense to implement this feature in two steps: enable the task owner to modify the aggregation metric part of the annotation config whenever they want (the thing at, e.g., task-owner-interface/nli#advanced). […]
Force-pushed from 2ae914e to 79981d5 (…okrui/dynabench into task_owner_set_default_weights)
This PR allows the […] Known issue: the Task Configuration UI element above the Edit Task Configuration card requires a reload for the […]. I'll implement the dataset weight setter in a separate PR, or later here once any kinks have been worked out for the metric weight setter.
This PR is looking great! But I think that a decent amount of the affected code will change when we modularize the aggregation_metric code. Let's just make this PR as basic as possible, for now. Instead of adding a new UI component for the aggregation_metric, we could just make the annotation_config_json editable, and return an error message if someone tries to modify anything besides the aggregation_metric. Does that sound ok? Also, I left a comment below. Thanks!
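The reviewer's suggestion - make annotation_config_json editable but reject any change outside aggregation_metric - could be sketched as a small diff check. The helper name and config fields below are illustrative assumptions, not the actual Dynabench API:

```python
import json

def validate_config_edit(old_json, new_json):
    """Return an error message, or None if only aggregation_metric changed.

    Hypothetical helper: compares top-level fields of the old and new
    annotation config and rejects edits to anything else.
    """
    old, new = json.loads(old_json), json.loads(new_json)
    changed = {
        key
        for key in set(old) | set(new)
        if old.get(key) != new.get(key)
    }
    disallowed = changed - {"aggregation_metric"}
    if disallowed:
        return "only aggregation_metric may be modified (got: %s)" % sorted(disallowed)
    return None

# Example configs (field names are assumptions for illustration).
old = json.dumps({"aggregation_metric": {"type": "dynascore"}, "inputs": ["context"]})
ok = json.dumps(
    {"aggregation_metric": {"type": "dynascore", "weights": {"accuracy": 4}},
     "inputs": ["context"]}
)
bad = json.dumps({"aggregation_metric": {"type": "dynascore"}, "inputs": ["context", "extra"]})

assert validate_config_edit(old, ok) is None       # only aggregation_metric changed
assert validate_config_edit(old, bad) is not None  # other field changed: rejected
```

This keeps the endpoint generic (one JSON blob in, one out) while still constraining what task owners can actually touch, which matches the "keep this PR basic" goal.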
@@ -417,6 +417,52 @@ def update(credentials, tid):
    return util.json_encode({"success": "ok"})


@bottle.put("/tasks/update_annotation_config/<tid:int>")
@_auth.requires_auth
def update_annotation_config(credentials, tid):
I think this would be more scalable if we just allowed the user to update the object with the field "aggregation_metric" in annotation_config_json. So you don't need to do anything here with metric_default_weights or aggregation_metric type. Before saving the update, we could still check to make sure it's valid with "verify_annotation_config".
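The flow being described - merge only the "aggregation_metric" field into the stored config, then validate before saving - might look like this. `verify_annotation_config` is stubbed here; the real validation lives in the Dynabench codebase, and the field names are assumptions:

```python
import json

def verify_annotation_config(config):
    # Stub standing in for Dynabench's real verify_annotation_config;
    # here it only checks that the field we care about is present.
    return "aggregation_metric" in config

def update_aggregation_metric(stored_config_json, incoming):
    """Merge the incoming aggregation_metric into the stored config.

    Raises ValueError if the merged config fails verification,
    so nothing invalid is ever persisted.
    """
    config = json.loads(stored_config_json)
    config["aggregation_metric"] = incoming["aggregation_metric"]
    if not verify_annotation_config(config):
        raise ValueError("invalid annotation config")
    return json.dumps(config)

stored = json.dumps({"aggregation_metric": {"type": "dynascore"}, "inputs": []})
updated = update_aggregation_metric(
    stored,
    {"aggregation_metric": {"type": "dynascore", "weights": {"fairness": 5}}},
)
assert json.loads(updated)["aggregation_metric"]["weights"]["fairness"] == 5
```

Because only one field is merged, the endpoint never needs to know about metric_default_weights or the aggregation metric type, which is the scalability point being made above.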
Replaced by #825
Created a draft PR first because I'm sort of at a crossroads and wanted to get some opinions on a few things @douwekiela @TristanThrush. A big part of my approach for this PR was to try to introduce the functionality while ensuring that this change adds zero extra effort to the process of adding new metrics (since ease of adding metrics was mentioned as a strength in the Dynaboard paper).
I suppose, if one were to ignore that, the simplest implementation would be to add 5 columns in `task`, one for each existing metric, à la the `Score` model. But what I tried to do (and what you can see on this PR so far) was to implement a `DefaultMetricWeight` model that shares a many-to-one relationship with `Task`. Would appreciate your thoughts on what I've got so far - is it too clunky/over-engineered?

I was thinking that if a long-term goal is to modularise metrics (since we're planning to modularise Aggregated Metrics anyway, as per #761) and create a `TaskMetric` model or something (where, as a corollary, a task owner could customise which metrics they want considered at all per round, rather than having to manually set the weights of a hypothetically far larger N possible metrics to 0), I might as well try to jump straight to that instead of having this `DefaultMetricWeight` model.

Thanks in advance for the help and hope this makes sense - in the meantime I'll work on the frontend interface + other issues.
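For concreteness, the `DefaultMetricWeight` idea described above - one row per (task, metric) pair, many-to-one with `Task` - could be sketched as follows. Column names and metric names here are assumptions for illustration, not the exact schema from this PR:

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Task(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # One task owns many weight rows; adding a new metric means adding
    # a row, not a column, so the schema never changes per metric.
    default_metric_weights = relationship("DefaultMetricWeight", backref="task")

class DefaultMetricWeight(Base):
    __tablename__ = "default_metric_weights"
    id = Column(Integer, primary_key=True)
    tid = Column(Integer, ForeignKey("tasks.id"))
    metric = Column(String)  # e.g. "accuracy", "fairness" (illustrative)
    weight = Column(Float)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

task = Task(name="nli")
task.default_metric_weights = [
    DefaultMetricWeight(metric="accuracy", weight=4.0),
    DefaultMetricWeight(metric="fairness", weight=1.0),
]
session.add(task)
session.commit()

weights = {w.metric: w.weight for w in task.default_metric_weights}
assert weights == {"accuracy": 4.0, "fairness": 1.0}
```

This is why the row-per-metric design trades a little indirection for the zero-extra-effort-per-new-metric property, versus the 5-fixed-columns alternative.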