print_tree of parent can freeze shape of child #1031

Open
ddkohler opened this issue Oct 26, 2021 · 2 comments
Contributor

ddkohler commented Oct 26, 2021

e.g.

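import numpy as np
import WrightTools as wt

root = wt.Collection(name="root")
data = root.create_data("data")

root.print_tree()  # looks up data.shape while data is still empty

data.create_variable(name="x", values=np.arange(4))
data.transform("x")

print(data.shape)  # prints (); expected (4,)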

data.shape is (). If root.print_tree() is removed, shape is (4,).

This seems to apply only when print_tree is called on the parent; e.g. the shape remains (4,) if data.print_tree() is called before or after create_variable.

@ddkohler ddkohler added the bug label Oct 26, 2021
Member

ksunden commented Oct 26, 2021

This is only indirectly due to the print tree call:

The shape is cached when you look it up, so even just a print(data.shape) will do the same.
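
To illustrate the caching behavior described above, here is a minimal sketch using a toy class. ToyData and its attributes are hypothetical stand-ins and are not the WrightTools implementation; the only assumption carried over from this thread is that the shape is computed on first lookup and stored in self._shape.

```python
class ToyData:
    """Hypothetical stand-in for a data object; not the WrightTools implementation."""

    def __init__(self):
        self.variables = {}  # name -> shape tuple
        self._shape = None   # cache, filled on first lookup

    @property
    def shape(self):
        # Compute once, then always return the cached value.
        if self._shape is None:
            shapes = list(self.variables.values())
            self._shape = shapes[0] if shapes else ()
        return self._shape

    def create_variable(self, name, shape):
        # The cache is NOT invalidated here, which mirrors the reported behavior.
        self.variables[name] = tuple(shape)


fresh = ToyData()
fresh.create_variable("x", (4,))
fresh_shape = fresh.shape  # first lookup happens after the variable exists

stale = ToyData()
_ = stale.shape            # early lookup, as a parent's print_tree would trigger
stale.create_variable("x", (4,))
stale_shape = stale.shape  # still the value cached while the object was empty
```

In this toy model any early read of shape, not just print_tree, is enough to pin the empty value.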

In most cases it is not a valid operation to change the shape of the data by adding variables. Cases where it is valid include:

  • First variable/channel added (here); this is the only place where adding ndim (without generating a new data object) is valid, as far as I'm concerned
  • adding a broadcasting axis for an existing ndim

We currently do nothing to enforce these constraints.

My proposal is to put these checks into the create_x functions and invalidate the shape cache (self._shape = None) accordingly. This will also help prevent creation of broken data objects with inconsistent shapes.
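
A rough sketch of what this proposal could look like, again with a toy class rather than WrightTools code. CheckedData, _joint_shape, and the exact rules enforced are assumptions made for illustration: the first variable may set any shape, and later variables must keep the same ndim and be broadcast-compatible.

```python
class CheckedData:
    """Hypothetical sketch of the proposal; names are illustrative only."""

    def __init__(self):
        self.variables = {}  # name -> shape tuple
        self._shape = None   # cached joint shape

    @staticmethod
    def _joint_shape(shapes):
        shapes = list(shapes)
        if not shapes:
            return ()
        # All shapes share one ndim, so the joint shape is the per-axis maximum.
        return tuple(max(dims) for dims in zip(*shapes))

    @property
    def shape(self):
        if self._shape is None:
            self._shape = self._joint_shape(self.variables.values())
        return self._shape

    def create_variable(self, name, shape):
        shape = tuple(shape)
        if self.variables:  # the first variable is allowed to set ndim freely
            current = self._joint_shape(self.variables.values())
            if len(shape) != len(current):
                raise ValueError(f"ndim mismatch: {shape} vs {current}")
            for new, old in zip(shape, current):
                # Each axis must match or be a length-1 broadcasting axis.
                if new != old and 1 not in (new, old):
                    raise ValueError(f"cannot broadcast {shape} with {current}")
        self.variables[name] = shape
        self._shape = None  # invalidate the cache after every successful change
```

With this in place, an early shape lookup on an empty object no longer freezes the result, and a variable with an incompatible shape is rejected at creation time instead of producing an inconsistent object.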

@darienmorrow
Member

This flavor of problem (poorly formed data object creation causing wonky downstream issues) is similar to a problem I had with units: #1010.

I am wondering if we should think carefully about what constraints we are assuming, and then explicitly check those on data object creation.

Explicit is better than implicit.

3 participants