Michael Luggen edited this page Jun 25, 2021 · 27 revisions

Cube Creator Context

Why should you create Cubes?

A lot of data in organisations is organised in tables with multiple columns, typically Excel or CSV tables. This data often has a dimension which describes time, sometimes other dimensions for classifying the data, and in most cases some actual observed values or counts.

Such tables hold the data in a structured form, but most of the time the information needed to understand the dimensions is missing, as is the metadata that would enable us to create useful visualisations based on this data.

With Cube-Creator we allow you, as a data provider, to augment and annotate your data with everything necessary to understand the input data directly inside the dataset, and also to visualise your data with tools like visualize.admin.ch.

How can we create Cubes?

With the cube-creator.

What do you need to create a Cube?

Another way of thinking about these datasets is as multi-dimensional cubes.

image cube

In the image above we have a cube with 3 dimensions.
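
As a rough illustration of what a cube with 3 dimensions means, the sketch below stores each value under a coordinate made of one value per dimension. The dimension names (year, canton, category), the numbers and the helper `slice_cube` are all invented for this example; they are not taken from the image or from Cube Creator.

```python
# Hypothetical example data: every name and number here is made up.
DIMENSIONS = ("year", "canton", "category")

# Each cell of the cube is addressed by one value per dimension.
cube = {
    (2019, "ZH", "A"): 10,
    (2019, "ZH", "B"): 4,
    (2019, "BE", "A"): 7,
    (2020, "ZH", "A"): 12,
}


def slice_cube(cube, dimension, value):
    """Return the cells whose coordinate equals `value` on `dimension`."""
    index = DIMENSIONS.index(dimension)
    return {coord: v for coord, v in cube.items() if coord[index] == value}


# Fixing one dimension leaves a two-dimensional "slice" of the cube.
zurich = slice_cube(cube, "canton", "ZH")
```

Fixing the value of one dimension, as in the last line, is what turns the cube back into something that looks like an ordinary table.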

What is missing

Fabian: I am no cube specialist, and no statistics specialist either, so I would need some help here, but I will start with some questions and discussion points that seem needed, IMHO. The goal would certainly be to describe a Cube as simply as possible, so that "Excel" users will understand the tool, and statistical experts, who are familiar with complex cubes, will also understand it.

I did not find a specific description of what a cube is on Zazuko's cube page, and the definition given for the W3C Data Cube seems too complex.

Would it be possible to simplify the description, for instance:
"A cube is a collection of observations.
Within this tool, a cube can be seen as a table or matrix (similar to a spreadsheet), where each line is an observation and each column is a dimension. All observations of a cube have the same structure (i.e. same dimensions)"
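
To make the proposed definition concrete, here is a minimal sketch that reads a small table into a list of observations and checks that all of them share the same dimensions. The CSV content, the column names and the function names are hypothetical and only serve the illustration; this is not how Cube Creator itself works.

```python
import csv
import io

# Hypothetical CSV input; column names and values are made up.
CSV_TEXT = """year,canton,count
2019,ZH,10
2019,BE,7
2020,ZH,12
"""


def read_observations(text):
    """Parse CSV text into a list of observations (one dict per line)."""
    return list(csv.DictReader(io.StringIO(text)))


def has_uniform_structure(observations):
    """True if every observation has exactly the same set of dimensions."""
    if not observations:
        return True
    expected = set(observations[0])
    return all(set(obs) == expected for obs in observations)


observations = read_observations(CSV_TEXT)
```

Each dict in `observations` corresponds to one line of the table, and its keys correspond to the columns, which the proposal above would call dimensions.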

That proposal would reduce the concept of "Cube" (a multi-dimensional representation) to the concept of "Table" (a two-dimensional representation).
-> Would that be OK? Maybe yes, as the main definition of the cube is the "Observation table".
Maybe we should also take into account here the explanation, coming further on, of the "literal" vs "link to another table" situation.
This representation of a main table with links to "secondary" tables should be clear for end users, as already discussed with Véronique (it is similar to databases, spreadsheets, etc.).

This vision can make sense if, compared to the W3C RDF Data Cube, which distinguishes between 3 different components ("dimensions", "attributes" and "measures"), Zazuko designed its cube with a much "simpler" concept where those 3 components are now just "dimensions". Is that the case?

My current understanding is that "column" refers to the columns of the CSV files, and "dimension" can be used for each line of an Output table (for both literal values and links to another table). This understanding matches the wording in the Cube Designer, where each icon is labeled "edit dimension metadata" (again for both literal values and links to another table), and it also matches the current page for the Cube Designer.

"Mapping as Literal Property vs. Mapping as Dimension" makes a distinction between "literal" and "dimension", which is also not clear to me (nor to Véronique, see her comment there). All those terms need clarification now.

Was any decision taken about this? Is there still a need to go further with the issue "Implement naming concept / consistent wording"?

Question: do we reuse the former glossary? If yes, it will need clean-up and clarification (it still mentions "pipeline", "rdf", etc.).
