Using node features (featureless=False in first conv-layer) #19

Open · JulianNeuberger opened this issue Nov 3, 2020 · 0 comments
Hi @tkipf,

First of all, thank you for your work on GCNs. I'm currently researching their application in my domain and really like the results so far.

Sadly, I'm stuck on a problem I'm not sure how to solve. I'm trying to apply the RGCN in the following setup: there are multiple, relatively small graphs of variable size, which contain directed edges and nodes with an optional feature vector (*). These graphs have to be classified into 2 separate classes, which I do by introducing a global node -- the "hacky" solution you proposed in Issue #4 on the original gcn code. If I use your code in "featureless" mode, everything works pretty well and I get about 80% accuracy. I suspect I could improve on that by using the node features mentioned above.

As soon as I change the featureless flag in the first gc-layer, the net no longer learns a useful estimator; instead it produces a constant output regardless of input (the ratio between the target classes, to be precise).

I did some digging to figure out where I went wrong and saw that you use the square identity matrix as dummy features in the featureless case. I then set featureless=False and passed in the square identity matrix explicitly, which resulted in roughly the same ~80% accuracy on the global node. But if I replace the identity matrix with a matrix of the same dimensions that has ones in the first column instead of on the diagonal, training fails again.
Did I miss an assumption you made in your paper? Are feature vectors simply not allowed, so that I have to use scalars along the diagonal of X instead?
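To make the comparison concrete, here is a minimal NumPy sketch of the single propagation step H = A_hat @ X @ W (hypothetical small dimensions, identity in place of the normalized adjacency -- this is not the repo's code, just my mental model of the two feature matrices I tried):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                             # number of nodes
A_hat = np.eye(n)                 # stand-in for the normalized adjacency
W = rng.standard_normal((n, 3))   # first-layer weights

# Featureless case: X = I. Each node selects its own row of W,
# so W effectively acts as a free per-node embedding table.
X_id = np.eye(n)
H_id = A_hat @ X_id @ W           # identical to A_hat @ W

# Ones in the first column: every node maps to the SAME vector
# (the first row of W), so all nodes become indistinguishable
# before propagation even starts.
X_ones = np.zeros((n, n))
X_ones[:, 0] = 1.0
H_ones = A_hat @ X_ones @ W       # all rows identical
```

If this sketch is right, the ones-in-first-column matrix destroys all per-node information, which would explain why training degenerates to a constant output.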

I realize that you get a lot of questions regarding your code and work, but I'd be very grateful for any hints or ideas.

Cheers,
Julian

(*) Those optional features are latent vectors from an embedding, but sometimes there is nothing to embed. I worked around that by using the zero vector in the latter case. Maybe there is a better way?
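For reference, this is roughly how I build the feature matrix (hypothetical helper, not from the repo): nodes with an embedding get their vector, all others fall back to zeros.

```python
import numpy as np

def build_features(embeddings, n_nodes, dim):
    """Assemble a dense feature matrix from a partial mapping
    {node_index: embedding_vector}; missing nodes stay zero."""
    X = np.zeros((n_nodes, dim))
    for i, vec in embeddings.items():
        X[i] = vec
    return X

# Node 0 has an embedding, node 1 does not.
X = build_features({0: np.ones(3)}, n_nodes=2, dim=3)
```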
