data_groups #5

johanvonboer · 2023-08-31T14:01:25Z

In some cases, notably dendro and ceramics but also abundance counting datasets, I use a concept I call "data_groups". These are groupings of datasets, needed because each key/value pair in e.g. a dendro analysis is considered a "dataset" of its own. That is impractical to work with directly, so we need something that binds the datasets belonging to the same sample together. That is the basic concept of a data_group, if I remember correctly.

Anyway, the point is that this needs to be looked over. These "data groups" need to be as consistent as possible across data types, and I am not currently sure they are. They are also currently output in parallel with the regular datasets array from the JAS server, which is inefficient since much of the same data ends up being sent twice. Perhaps it would be possible to create clever bindings/references across the data structure to avoid most of this duplication?
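To make the bindings/references idea concrete, here is a minimal sketch in which each data group stores only the IDs of its member datasets and the full objects are looked up in the regular datasets array on demand. The field names (`dataset_id`, `sample_id`, `key`, `value`, `dataset_ids`) and the sample values are assumptions for illustration, not the actual server output format.

```python
# Hypothetical payload: field names and values are illustrative only,
# not the real schema of the server response.
datasets = [
    {"dataset_id": 101, "sample_id": 7, "key": "Tree species", "value": "Oak"},
    {"dataset_id": 102, "sample_id": 7, "key": "Tree rings", "value": "85"},
    {"dataset_id": 103, "sample_id": 8, "key": "Tree species", "value": "Pine"},
]


def build_group_refs(datasets):
    """Group datasets by sample, keeping only dataset IDs in each group."""
    groups = {}
    for ds in datasets:
        group = groups.setdefault(
            ds["sample_id"], {"sample_id": ds["sample_id"], "dataset_ids": []}
        )
        group["dataset_ids"].append(ds["dataset_id"])
    return list(groups.values())


def resolve_group(group, datasets_by_id):
    """Look up the full dataset objects that a group refers to."""
    return [datasets_by_id[dataset_id] for dataset_id in group["dataset_ids"]]


data_groups = build_group_refs(datasets)
datasets_by_id = {ds["dataset_id"]: ds for ds in datasets}
```

With this shape the dataset content is sent once, and a consumer that needs the grouped view calls something like `resolve_group(data_groups[0], datasets_by_id)`.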

All of this of course also raises the question: why do we even need this "data groups" construct to begin with? Can't we just re-arrange our data so that a "dataset" becomes the more intuitive grouping that the "data group" is trying to be? The answer is probably yes, but that requires structural changes in the database.

johanvonboer · 2023-09-01T09:50:15Z

I have found out that a 'data_group' can be quite a different thing depending on context. Wonderful.

For example, a dendro data_group has 'datasets' attached to it, while a C14 data_group has 'data_points'.
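Until the shapes are unified, one stopgap could be a small accessor that hides the difference from consuming code. This is only a sketch: the keys `datasets` and `data_points` come from the observation above, while `group_members` and everything inside the example groups is made up for illustration.

```python
# Keys under which a data group may carry its members, per the comment above.
MEMBER_KEYS = ("datasets", "data_points")


def group_members(data_group):
    """Return a data group's member entries regardless of which key holds them."""
    for key in MEMBER_KEYS:
        if key in data_group:
            return data_group[key]
    return []


# Hypothetical example groups; the contents are placeholders, not real data.
dendro_group = {"datasets": [{"key": "Tree species", "value": "Oak"}]}
c14_group = {"data_points": [{"label": "Age", "value": "placeholder"}]}
```

Code that iterates over groups can then call `group_members(group)` instead of checking the data type first.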

johanvonboer · 2023-09-01T11:49:13Z

Also, have a look at postProcessSiteData in the MeasuredValuesModule. There we have something we're calling "datasets" inside data groups, but they are quite different from what we normally call datasets. This should be corrected.

johanvonboer · 2023-09-01T12:06:27Z

We need to keep the data group concept until the database is fixed, but we should perhaps rename it to "dataset groups", because that is what they are. The core issue is that ceramics and dendro store their data so that each key/value pair in an analysis of a sample becomes a separate dataset containing just one analysis entity.
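As a rough illustration of what a "dataset group" does with that storage layout, the sketch below collapses single-value datasets into one record per sample. The input structure, the field names, and `collapse_by_sample` are all hypothetical and stand in for whatever the database actually returns.

```python
from collections import defaultdict

# Hypothetical rows: one dataset per key/value pair, as described above.
single_value_datasets = [
    {"dataset_id": 201, "sample_id": 7, "key": "Tree species", "value": "Oak"},
    {"dataset_id": 202, "sample_id": 7, "key": "Tree rings", "value": "85"},
    {"dataset_id": 203, "sample_id": 8, "key": "Tree species", "value": "Pine"},
]


def collapse_by_sample(datasets):
    """Merge one-value datasets into a single key/value record per sample."""
    merged = defaultdict(dict)
    for ds in datasets:
        merged[ds["sample_id"]][ds["key"]] = ds["value"]
    return [
        {"sample_id": sample_id, "values": values}
        for sample_id, values in merged.items()
    ]


dataset_groups = collapse_by_sample(single_value_datasets)
```

If a sample can carry the same key more than once, the later value overwrites the earlier one here, so a real version would need to decide how to handle that.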
