ClSlaid changed the title from "[cdc] MongoDB CDC source dealing with canonical timestamps and other structural data types" to "Bug: MongoDB CDC source dealing with canonical timestamps and other structural data types" on Jan 21, 2025
Looks related to #19982
While the JSON implementation may be as raw as possible (See #17650 (comment) for discussion on DynamoDB, but it is not merged), the strongly typed syntax in #19982 shall handle these to provide a better experience.
@xiangjinwu Do you think we need to convert JSON with relaxed BSON fields to a more friendly JSON?
No. These extra annotations existed for a reason, and stripping them is a lossy conversion. In other words, this "friendly json" can be a third option after strong types and raw json. Essentially we are inventing another format - why not use "released": "2024-12-19T11:23:41.137Z" instead of "released": "1734607421137"?
Update:
Essentially we are inventing another format - why not use "released": "2024-12-19T11:23:41.137Z" instead of "released": "1734607421137"?
If MongoDB has formally defined such an annotation-free format already, then yes we can support converting to it.
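To illustrate why stripping the annotations is lossy, here is a minimal Python sketch. It assumes the wrappers follow the canonical MongoDB Extended JSON shapes (`{"$date": {"$numberLong": "..."}}` for a date, `{"$numberLong": "..."}` for a 64-bit integer). `strip_annotations` is a hypothetical helper written for this example, not something RisingWave or MongoDB provides.

```python
def strip_annotations(value):
    """Hypothetical 'friendly JSON' conversion: drop Extended JSON type wrappers.

    Only the two wrappers discussed here ($date, $numberLong) are handled.
    """
    if isinstance(value, dict):
        if len(value) == 1:
            key, inner = next(iter(value.items()))
            if key in ("$date", "$numberLong"):
                # Discard the type tag and keep only the payload.
                return strip_annotations(inner)
        return {k: strip_annotations(v) for k, v in value.items()}
    if isinstance(value, list):
        return [strip_annotations(v) for v in value]
    return value


# The same payload, once typed as a date and once as a plain 64-bit integer.
as_date = {"released": {"$date": {"$numberLong": "1734607421137"}}}
as_long = {"released": {"$numberLong": "1734607421137"}}

# Both collapse to {"released": "1734607421137"}; the type is gone.
print(strip_annotations(as_date))
print(strip_annotations(as_long))
```

After stripping, a consumer can no longer tell whether `released` was a date or a long integer, which is the information the annotations carried.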
Describe the bug
The current MongoDB CDC implementation doesn't work well with timestamps and datetimes in MongoDB.
MongoDB struct:
Run the following query:
Error message/log
To Reproduce
For deployment details, see the "How did you deploy RisingWave?" section.
Expected behavior
MongoDB records its data as BSON, which encodes each value together with its type:
It is then encapsulated in Debezium messages as:
RisingWave works well with those naive strings. Unfortunately, we currently parse and store the data as JSON and let users access those fields in materialized views. This will surely lead to printing out JSON objects instead of timestamps, datetimes, long integers...
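As a rough illustration of the typed decoding a user would expect, the Python sketch below unwraps canonical Extended JSON values into native types. The wrapper shapes (`$date`, `$numberLong`) are assumed from the MongoDB Extended JSON format, the `views` field is made up for the example, and `decode_canonical` is a hypothetical helper, not RisingWave's implementation.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_canonical(value):
    """Hypothetical decoder: map Extended JSON wrappers to native Python types."""
    if isinstance(value, dict):
        keys = set(value)
        if keys == {"$numberLong"}:
            # 64-bit integers are carried as strings to avoid precision loss.
            return int(value["$numberLong"])
        if keys == {"$date"}:
            # Canonical dates carry milliseconds since the Unix epoch.
            millis = decode_canonical(value["$date"])
            return EPOCH + timedelta(milliseconds=millis)
        return {k: decode_canonical(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode_canonical(v) for v in value]
    return value


raw = """
{
  "released": {"$date": {"$numberLong": "1734607421137"}},
  "views": {"$numberLong": "42"}
}
"""
decoded = decode_canonical(json.loads(raw))

# Without decoding, accessing "released" yields a nested JSON object;
# with decoding it is a timezone-aware datetime.
print(decoded["released"].isoformat(timespec="milliseconds"))
print(decoded["views"])
```

The design choice here is to keep the type information and convert it, rather than discard it, so a date stays distinguishable from a long integer.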
How did you deploy RisingWave?
This could be reproduced with a docker compose project.
The project is largely edited from https://github.com/risingwavelabs/risingwave/tree/711fedf72865c5a799ae0b6d1c892c84d1aff4de/integration_tests/debezium-mongo .
keyfile is a MongoDB key file; you can simply generate it using base64.
The version of RisingWave
No response
Additional context
No response