null filling and coalescing for mv agg query #6088
@@ -205,6 +205,7 @@ message MetricsViewSpec {
  string format_d3 = 7;
  google.protobuf.Struct format_d3_locale = 13;
  bool valid_percent_of_total = 6;
  string treat_nulls_as = 14; // TODO what should the type, using string values will not work when coalescing numeric cols
I think we should just consider this to be a SQL expression to be templated into the query literally. For example:
treat_nulls_as: 0
treat_nulls_as: CAST(0 AS HUGEINT)
treat_nulls_as: "'Not available'"
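To make the suggestion concrete, here is a minimal sketch of templating the configured expression into a measure. `coalesceMeasure` is a hypothetical helper, not code from this PR, and it assumes the value comes from trusted project config, since it is inserted into the SQL without escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// coalesceMeasure wraps a measure expression in COALESCE when a
// treat_nulls_as expression is configured. The fill expression is
// templated into the SQL literally, so it must come from trusted config.
// Hypothetical helper for illustration only.
func coalesceMeasure(measureExpr, treatNullsAs string) string {
	fill := strings.TrimSpace(treatNullsAs)
	if fill == "" {
		// No fill configured: leave the measure expression untouched.
		return measureExpr
	}
	return fmt.Sprintf("COALESCE(%s, %s)", measureExpr, fill)
}
```

With this shape, `treat_nulls_as: 0` and `treat_nulls_as: "'Not available'"` both work without the spec needing to know the column type; the database does the type checking.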
runtime/metricsview/ast.go
Outdated
@@ -689,7 +699,87 @@ func (a *AST) buildSpineSelect(alias string, spine *Spine, tr *TimeRange) (*Sele
	}

	if spine.TimeRange != nil {
		return nil, errors.New("time_range not yet supported in spine")
		if a.dialect == drivers.DialectDruid {
Nowhere else in ast.go or astsql.go is there a hard-coded reference to a dialect. It would be very nice to avoid adding that now. Look at the various places where a.dialect is used: each one pushes the dialect-specific handling into the dialect implementation. I think something similar should be possible here.
Moving this to drivers/olap.go causes a cyclic dependency because of metricsview.TimeGrain: the metricsview package already needs the drivers package.
runtime/metricsview/ast.go
Outdated
for _, cjs := range cpy.CrossJoinSelects {
	for _, f := range cjs.DimFields {
		s.DimFields = append(s.DimFields, FieldNode{
			Name:        f.Name,
			DisplayName: f.DisplayName,
			Expr:        a.sqlForMember(cpy.Alias, f.Name),
		})
	}

	if len(cjs.UnionAllSelects) > 0 {
		// All dimensions will be same across UNION ALL SELECTS so we can just pick the first one
		for _, f := range cjs.UnionAllSelects[0].DimFields {
			s.DimFields = append(s.DimFields, FieldNode{
				Name:        f.Name,
				DisplayName: f.DisplayName,
				Expr:        a.sqlForMember(cpy.Alias, f.Name),
			})
		}
	}
}
Why is this behavior different than the handling of LeftJoinSelects and JoinComparisonSelect? I think this might relate to the previously mentioned issue of not setting FromSelect?
Ideally after the wrap, these can just be set to nil since they are handled in the nested select.
@@ -124,6 +124,7 @@ type TimeSpine struct {
	Start time.Time `mapstructure:"start"`
	End   time.Time `mapstructure:"end"`
	Grain TimeGrain `mapstructure:"grain"`
	Alias string    `mapstructure:"alias"`
What does this do? Ideally it should apply only to the time dimension if it's requested, and in that case, the time dimension already has an alias provided in Dimension
This is used while constructing the inline select statement for time ranges, and also for removing the time dim from the dim select node that will be cross joined with the range select. Yes, it's present in the query dimension; however, to use it here it would need to be passed down separately, since only AST dimensions are available in this context.
if q.TimeRange == nil || q.TimeRange.Start == nil || q.TimeRange.End == nil {
	return nil, fmt.Errorf("time range is required for null fill")
}
I think it might still work if there's a relative time range (like q.TimeRange.IsoDuration), and this might be needed e.g. for alerts/reports that use a time spine.
It would be resolved to a fixed time range before the query is turned into an AST. It happens here:
func (e *Executor) rewriteQueryTimeRanges(ctx context.Context, qry *Query, executionTime *time.Time) error {
	Start: s,
	End:   e,
	Grain: timeDim.Compute.TimeFloor.Grain,
	Alias: timeDim.Name,
I think it should be handled inside the metricsview package by applying it to any dimension that has a TimeFloor applied (and erroring if zero or multiple dimensions have a TimeFloor).
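A minimal sketch of that lookup, assuming simplified stand-ins for the query types. The field path mirrors `timeDim.Compute.TimeFloor.Grain` from the diff, but the type definitions and the `timeFloorDimension` helper are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the metricsview query types (assumed shapes).
type TimeFloor struct{ Grain string }

type DimensionCompute struct{ TimeFloor *TimeFloor }

type Dimension struct {
	Name    string
	Compute *DimensionCompute
}

// timeFloorDimension returns the single dimension that has a TimeFloor
// applied, and errors if there are zero or more than one.
func timeFloorDimension(dims []Dimension) (*Dimension, error) {
	var found *Dimension
	for i := range dims {
		d := &dims[i]
		if d.Compute == nil || d.Compute.TimeFloor == nil {
			continue
		}
		if found != nil {
			return nil, fmt.Errorf("null fill requires exactly one time-floored dimension, found %q and %q", found.Name, d.Name)
		}
		found = d
	}
	if found == nil {
		return nil, errors.New("null fill requires a dimension with a time floor")
	}
	return found, nil
}
```

The spine's grain and alias could then be derived from the returned dimension, so the caller would not need to pass an alias separately.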
WhereSpine is created in the query package, so I did the same here. Do you mean this should be created inside NewAST?
As a separate comment, can we add some tests for this feature for all of DuckDB, Druid and ClickHouse in runtime/resolvers/testdata?
This PR adds support for two features:
fill_missing - A metrics view aggregation query-level config. When a computed time dimension is used, it fills in the missing time buckets in the response.
treat_nulls_as - A measure-level config that sets the value to fill in for missing time buckets. It also applies more generally as a COALESCE over non-empty time buckets.
Closes https://github.com/rilldata/rill-private-issues/issues/786