Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement]: time_bucket_gapfill limit for how far back and forward it will look per row #7618

Open
angus-cummings opened this issue Jan 24, 2025 · 0 comments
Labels
enhancement An enhancement to an existing feature for functionality

Comments

@angus-cummings
Copy link

What type of enhancement is this?

Other

What subsystems and features will be improved?

Gapfill

What does the enhancement do?

I am working to replace a piece of legacy Java software that received timeseries data from a variety of sources and produced a merged dataset with values that were interpolated to a specified interval (normally 5 seconds). It was also configurable with a MaxInterpolationInterval (default 24), which is the number of samples of gappy data that it is allowed to interpolate over.

I've got our data being written to TimescaleDB hypertables, I've got continuous aggregates for each table that are producing 5s mean-averaged values, but I need a way to manage intepolation for instruments that report less frequently than 5s.

I would love to be able to call time_bucket_gapfill('5 seconds', t1, 24) , or time_bucket_gapfill('5 seconds', t1, '120 seconds') and have interpolate only consider values that are within 120 seconds of each other.

For context, this is the sort of query I'm looking at running. The 'merged' schema is the existing continuous aggregates. I may actually be able to do away with the merged 'schema' in the future,

`WITH
tbg0 AS (
SELECT time_bucket_gapfill('5 seconds', merged.airsamplinginlet.bucket) as timestamp,

INTERPOLATE(AVG(merged.airsamplinginlet.windbearing)) as rawTrackingBearing 
    FROM merged.airsamplinginlet
    WHERE merged.airsamplinginlet.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg1 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.dl850_doppler.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.dl850_doppler.longitudinalgroundspeed)) as rawLongitudinalGroundSpeed,
INTERPOLATE(AVG(merged.dl850_doppler.longitudinalwaterspeed)) as rawLongitudinalWaterSpeed,
INTERPOLATE(AVG(merged.dl850_doppler.transversegroundspeed)) as rawTransverseGroundSpeed 
    FROM merged.dl850_doppler
    WHERE merged.dl850_doppler.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg2 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.dropkeel.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.dropkeel.portpos)) as rawPortKeelExtension,
INTERPOLATE(AVG(merged.dropkeel.stbdpos)) as rawStarboardKeelExtension 
    FROM merged.dropkeel
    WHERE merged.dropkeel.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg3 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.gyro1.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.gyro1.headingtrue)) as rawGyroHeading 
    FROM merged.gyro1
    WHERE merged.gyro1.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg4 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.maap.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.maap.airflowrate)) as rawAirFlow,
INTERPOLATE(AVG(merged.maap.barometricpressure)) as rawAtmPressure,
INTERPOLATE(AVG(merged.maap.blackcarbonconcentration)) as rawBlackCarbonConc 
    FROM merged.maap
    WHERE merged.maap.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg5 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.met.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.met.anemrelativewinddir)) as rawUltrasonicRelWindDir,
INTERPOLATE(AVG(merged.met.anemrelativewindspeed)) as rawUltrasonicRelWindSpeed,
INTERPOLATE(AVG(merged.met.portcumulativehourrain)) as rawPortRain,
INTERPOLATE(AVG(merged.met.porthumidity)) as rawPortHumidity,
INTERPOLATE(AVG(merged.met.portparradiation)) as rawPortPAR,
INTERPOLATE(AVG(merged.met.portpirincomewave)) as rawPortRadiometer,
INTERPOLATE(AVG(merged.met.portpspincomewave)) as rawPortPyranometer,
INTERPOLATE(AVG(merged.met.portrelativewinddir)) as rawPortRelWindDir,
INTERPOLATE(AVG(merged.met.portrelativewindspeed)) as rawPortRelWindSpeed,
INTERPOLATE(AVG(merged.met.porttemperature)) as rawPortAirTemp,
INTERPOLATE(AVG(merged.met.stbdcumulativehourrain)) as rawStbdRain,
INTERPOLATE(AVG(merged.met.stbdhumidity)) as rawStbdHumidity,
INTERPOLATE(AVG(merged.met.stbdparradiation)) as rawStbdPAR,
INTERPOLATE(AVG(merged.met.stbdpspincomewave)) as rawStbdPyranometer,
INTERPOLATE(AVG(merged.met.stbdrelativewinddir)) as rawStbdRelWindDir,
INTERPOLATE(AVG(merged.met.stbdrelativewindspeed)) as rawStbdRelWindSpeed,
INTERPOLATE(AVG(merged.met.stbdtemperature)) as rawStbdAirTemp 
    FROM merged.met
    WHERE merged.met.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg7 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.pco2.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.pco2.condensertemperature)) as rawCondTemp,
INTERPOLATE(AVG(merged.pco2.equilibratorpressure)) as rawEquPressure,
INTERPOLATE(AVG(merged.pco2.equilibratortemperature)) as rawEquTemp,
INTERPOLATE(AVG(merged.pco2.labpressure)) as rawLicorPressure,
INTERPOLATE(AVG(merged.pco2.licorflow)) as rawLicorFlow,
INTERPOLATE(AVG(merged.pco2.ventflow)) as rawVentFlow,
INTERPOLATE(AVG(merged.pco2.waterflow)) as rawWaterFlow,
INTERPOLATE(AVG(merged.pco2.watervapour)) as rawWaterVapour,
INTERPOLATE(AVG(merged.pco2.xco2)) as rawXCO2 
    FROM merged.pco2
    WHERE merged.pco2.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg8 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.picarro.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.picarro.ch4dry)) as rawCh4Dry,
INTERPOLATE(AVG(merged.picarro.co2dry)) as rawCo2Dry,
INTERPOLATE(AVG(merged.picarro.h2o)) as rawH2O 
    FROM merged.picarro
    WHERE merged.picarro.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg9 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.seapath.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.seapath.coursetrue)) as rawCourseOG,
INTERPOLATE(AVG(merged.seapath.headingtrue)) as rawShipHeading,
INTERPOLATE(AVG(merged.seapath.latitude)) as rawLatitude,
INTERPOLATE(AVG(merged.seapath.longitude)) as rawLongitude,
INTERPOLATE(AVG(merged.seapath.speedoverground)) as rawSpeedOG 
    FROM merged.seapath
    WHERE merged.seapath.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1), 
tbg10 AS (
    SELECT time_bucket_gapfill('5 seconds', merged.underway_seawater.bucket) as timestamp,
            
INTERPOLATE(AVG(merged.underway_seawater.branchflow)) as rawLabBranchFlow,
INTERPOLATE(AVG(merged.underway_seawater.flowmeter)) as rawLabMainFlow,
INTERPOLATE(AVG(merged.underway_seawater.fluorescein)) as rawFluorescenceConcentration,
INTERPOLATE(AVG(merged.underway_seawater.fluorescence)) as rawFluorescence,
INTERPOLATE(AVG(merged.underway_seawater.fluoroflowrate)) as rawFluorometerFlow,
INTERPOLATE(AVG(merged.underway_seawater.fluorovoltage)) as rawFluorescenceVolts,
INTERPOLATE(AVG(merged.underway_seawater.optodeairsaturation)) as rawDoSaturation,
INTERPOLATE(AVG(merged.underway_seawater.optodeblueexcitationphase)) as rawBPhase,
INTERPOLATE(AVG(merged.underway_seawater.optodeoxygenconcentration)) as rawDo,
INTERPOLATE(AVG(merged.underway_seawater.optoderawtemperature)) as rawOptodeWaterTemp,
INTERPOLATE(AVG(merged.underway_seawater.ssttemp)) as rawWaterTemp,
INTERPOLATE(AVG(merged.underway_seawater.tsgflowrate)) as rawTsgFlow,
INTERPOLATE(AVG(merged.underway_seawater.tsgtemp)) as rawTsgSensorTemp,
INTERPOLATE(AVG(merged.underway_seawater.watersalinity)) as rawSalinity 
    FROM merged.underway_seawater
    WHERE merged.underway_seawater.bucket BETWEEN '2024-04-17 22:00:00' AND '2024-04-17 23:00:00'
    GROUP BY 1) SELECT tbg0.timestamp , rawTrackingBearing, rawLongitudinalGroundSpeed, rawLongitudinalWaterSpeed, rawTransverseGroundSpeed, rawPortKeelExtension, rawStarboardKeelExtension, rawGyroHeading, rawAirFlow, rawAtmPressure, rawBlackCarbonConc, rawUltrasonicRelWindDir, rawUltrasonicRelWindSpeed, rawPortRain, rawPortHumidity, rawPortPAR, rawPortRadiometer, rawPortPyranometer, rawPortRelWindDir, rawPortRelWindSpeed, rawPortAirTemp, rawStbdRain, rawStbdHumidity, rawStbdPAR,  rawStbdPyranometer, rawStbdRelWindDir, rawStbdRelWindSpeed, rawStbdAirTemp, rawCondTemp, rawEquPressure, rawEquTemp, rawLicorPressure, rawLicorFlow, rawVentFlow, rawWaterFlow, rawWaterVapour, rawXCO2, rawCh4Dry, rawCo2Dry, rawH2O, rawCourseOG, rawShipHeading, rawLatitude, rawLongitude, rawSpeedOG, rawLabBranchFlow, rawLabMainFlow, rawFluorescenceConcentration, rawFluorescence, rawFluorometerFlow, rawFluorescenceVolts, rawDoSaturation, rawBPhase, rawDo, rawOptodeWaterTemp, rawWaterTemp, rawTsgFlow, rawTsgSensorTemp, rawSalinity FROM tbg0 
JOIN tbg1 on tbg0.timestamp = tbg1.timestamp
JOIN tbg2 on tbg0.timestamp = tbg2.timestamp
JOIN tbg3 on tbg0.timestamp = tbg3.timestamp
JOIN tbg4 on tbg0.timestamp = tbg4.timestamp
JOIN tbg5 on tbg0.timestamp = tbg5.timestamp
JOIN tbg7 on tbg0.timestamp = tbg7.timestamp
JOIN tbg8 on tbg0.timestamp = tbg8.timestamp
JOIN tbg9 on tbg0.timestamp = tbg9.timestamp
JOIN tbg10 on tbg0.timestamp = tbg10.timestamp
;`

Thanks,
Angus

Implementation challenges

No response

@angus-cummings angus-cummings added the enhancement An enhancement to an existing feature for functionality label Jan 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An enhancement to an existing feature for functionality
Projects
None yet
Development

No branches or pull requests

1 participant