Seems like an "extend_features" option for CrabNet could be useful for several people #17
Comments
Hi Sterling, I agree! I can try to work on this in the next few days, but I probably will not be able to finish until after the 9th.
That sounds great. Thanks @anthony-wang! Let me know what I can do to help.
I forgot that CrabNet has an existing […] (embedded code snippet: Lines 170 to 174 in a5be06f).

After looking back at this, the […]
@anthony-wang I'm guessing a quick implementation to extend features would be directly wherever the element descriptors get loaded, correct? I.e., just patch in an extra column to the element descriptors immediately after loading.
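The "patch in an extra column" idea above can be sketched with pandas. Everything here is a stand-in: the descriptor table, the element subset, and the `extra_feat` column are illustrative, not CrabNet's actual descriptor files.

```python
import numpy as np
import pandas as pd

# Hypothetical element-descriptor table (in CrabNet these come from CSV
# lookup files such as mat2vec/oliynyk; random values are stand-ins).
elem_props = pd.DataFrame(
    np.random.rand(3, 4),
    index=["H", "He", "Li"],
    columns=[f"feat_{i}" for i in range(4)],
)

# Hypothetical extra per-element feature to splice in (e.g. a CGCNN-derived
# descriptor), aligned on the same element index.
extra = pd.Series([0.1, 0.2, 0.3], index=["H", "He", "Li"], name="extra_feat")

# "Patch in" the extra column immediately after loading.
elem_props_ext = pd.concat([elem_props, extra], axis=1)

print(elem_props_ext.shape)  # → (3, 5)
```

Because `pd.concat` aligns on the index, elements missing from the extra-feature table would get `NaN` and would need to be filled or dropped before training.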
I took an initial stab at an `extend_features` implementation. @truptimohanty @hasan-sayeed, do either of you want to give it a try on a dataset with a state variable or additional features?
Hi! I was looking at the code now and wondering how you implemented the `extend_features` option.
@fedeotto great question! I worked with @AndrewFalkowski on refactoring CrabNet and implementing an extended-features capability. We had a basic implementation that splices the state variable (e.g. load, temperature) in between the end of the transformer and the beginning of the residual network, so the information goes through the final neural network without any self-attention.

CrabNet vs. XGBoost
I tried out the basic `extend_features` implementation. In particular, see […]. XGBoost did somewhat better on this dataset. It's quite possible there's just not enough data to justify the transformer architecture, as there are only something like 300-500 unique compositions represented in the dataset.

RMSE/MAE comparison (CrabNet with basic `extend_features` vs. XGBoost): […]
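The basic splicing approach described above (state variable entering between the transformer and the residual network, bypassing self-attention) can be sketched with NumPy. The dimensions are toy values, and this is a shape-level illustration rather than CrabNet's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the transformer's pooled per-compound representation,
# shape (batch, d_model); d_model is shrunk here for the sketch.
batch, d_model = 4, 8
transformer_out = rng.random((batch, d_model))

# One state variable per compound (e.g. temperature or load), pre-scaled.
state_var = rng.random((batch, 1))

# Basic approach: concatenate the state variable onto the representation,
# so it reaches the final prediction head without any self-attention.
head_input = np.concatenate([transformer_out, state_var], axis=1)

print(head_input.shape)  # → (4, 9)
```

One consequence of this design is that the state variable cannot interact with the per-element tokens inside the attention layers; it only modulates the final mapping from representation to property.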
Also, @AndrewFalkowski tried out some more sophisticated implementations. IIRC these weren't panning out, and he had some ideas for future work. @AndrewFalkowski, could you shed some additional light here?
@lantunes recently put out a codebase and preprint for a version of CrabNet that supports extended features.
@sgbaird thanks for including me on this thread! There appears to be a lot of interest in adding extended-features support to CrabNet. I'd be happy to help where I can. I'm not sure if anyone has tried the approach I adopted, but basically it involves a learned non-linear transformation of the extended-features vector, followed by a tiling operation, which results in a matrix that can be added element-wise to the output of the last Transformer block. In my experiments this method was far superior to simply concatenating the features to the flattened residual-net input (the other approach I tried). The approach is described in the preprint you refer to.
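The transform-tile-add approach described above can be sketched with NumPy. The random weight matrix and `tanh` stand in for the learned non-linear transformation, and all dimensions are toy values, not the preprint's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model, n_extra = 5, 8, 2  # toy sizes

# Output of the last Transformer block: one row per element in the formula.
transformer_out = rng.random((seq_len, d_model))

# Extended-features vector for the whole compound (e.g. temperature, load).
extra = rng.random(n_extra)

# Learned non-linear transformation (random linear map + tanh as a stand-in
# for the trained layer) projects the extras up to d_model.
W = rng.random((n_extra, d_model))
projected = np.tanh(extra @ W)            # shape (d_model,)

# Tile across the sequence dimension and add element-wise, so every
# element token is shifted by the same feature-derived vector.
tiled = np.tile(projected, (seq_len, 1))  # shape (seq_len, d_model)
fused = transformer_out + tiled

print(fused.shape)  # → (5, 8)
```

Unlike concatenation at the head, this injects the extended features while the per-element representations still exist, so subsequent layers can condition each element's contribution on the state variables.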
@sgbaird Here is what I do. The data set is a pandas dataframe:

[…]

Code with extend_features:

[…]

Both runs result in exactly the same predictions. What could be the reason? Perhaps I am making some silly mistake in passing the extend_features argument. Thanks in advance!
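Since the original code blocks were lost from this comment, here is a hedged sketch of the kind of dataframe layout the thread describes. The column names (`formula`, `target`, `temperature`) are illustrative, and the final check is just a debugging idea: identical predictions with and without the extra column suggest the column never reaches the model's input pipeline.

```python
import pandas as pd

# Hypothetical CrabNet-style training data with one extra "state variable"
# column; names are illustrative, not a confirmed API contract.
train_df = pd.DataFrame(
    {
        "formula": ["Al2O3", "SiO2", "Fe2O3"],
        "target": [10.2, 7.5, 9.1],
        "temperature": [300, 600, 900],  # the feature to extend with
    }
)

# Sanity check before training: confirm which columns beyond formula/target
# are actually present to be passed as extended features.
extend_cols = [c for c in train_df.columns if c not in ("formula", "target")]
print(extend_cols)  # → ['temperature']
```

If `extend_cols` comes back empty, or the extended column is dropped during preprocessing, the two runs would trivially produce identical predictions.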
Similar to the CBFV package's extend_features option. Does this seem feasible? Marianne and I are trying to incorporate CGCNN features. Trupti wants to incorporate temperature into the model. Hasan would be able to use his custom mat2vec/robocrystallographer feature vectors.