Add Layers module #30

Open · abheesht17 opened this issue Mar 18, 2022 · 3 comments

abheesht17 commented Mar 18, 2022

I think we should have a folder inside src/modules called layers. Please see the example below to get a better idea:

Consider the DCAN code (https://agit.ai/jsx/DCAN/src/branch/master/models.py). It has multiple "building blocks", namely WordRep, TemporalBlock, etc. (let's just consider these two layers for now).

Now, the aim of our library is to allow the user to reuse these "building blocks". So, say the user wants to build another model (not DCAN) and wants to use TemporalBlock: they can load it directly from layers. We will have to generalise these building blocks, though, to allow all kinds of inputs. I think this is where the strength of our library will lie: enabling the user to use these layers directly without copying the code over. For example, DCAN uses the following conv block:

```python
# From TemporalBlock.__init__ in DCAN's models.py:
self.conv1 = weight_norm(nn.Conv1d(n_inputs, n_outputs, kernel_size,
                                   stride=stride, padding=padding,
                                   dilation=dilation))
self.chomp1 = Chomp1d(padding)
self.relu1 = nn.ReLU()
self.dropout1 = nn.Dropout(dropout)
```

The above code snippet is repeated twice here: https://agit.ai/jsx/DCAN/src/branch/master/models.py#L84

We can convert it to:

```python
import torch.nn as nn
from torch.nn.init import xavier_uniform_
from torch.nn.utils import weight_norm


class ConvTemporalSubBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride,
                 padding, dilation, dropout=0.2, use_weight_norm=True,
                 activation="relu"):
        super().__init__()
        self.conv_layer = nn.Conv1d(in_channels=in_channels,
                                    out_channels=out_channels,
                                    kernel_size=kernel_size, stride=stride,
                                    padding=padding, dilation=dilation)
        # A bool flag, named `use_weight_norm` so it does not shadow the
        # nn.utils.weight_norm function imported above.
        if use_weight_norm:
            self.conv_layer = weight_norm(self.conv_layer)
        self.chomp1d = Chomp1d(padding)
        # Resolve the activation class from our ConfigMapper registry.
        self.activation = ConfigMapper.get_object("activations", activation)()
        self.dropout = nn.Dropout(dropout)

        self._init_weights()

    def _init_weights(self):
        xavier_uniform_(self.conv_layer.weight)

    def forward(self, x):
        x = self.conv_layer(x)
        x = self.chomp1d(x)
        x = self.activation(x)
        x = self.dropout(x)
        return x
```

and reuse it without duplicating code. Moreover, users can use this conv layer if they want to build a DCAN-like model.
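For instance, reuse might look like this (a hypothetical sketch: the import path assumes the proposed src/modules/layers package, and the layer's Chomp1d and ConfigMapper dependencies are assumed to resolve inside that module):

```python
import torch

# Hypothetical import path, assuming the proposed layers package.
from src.modules.layers.conv_temporal_sub_block import ConvTemporalSubBlock

# DCAN's TemporalBlock stacks two of these sub-blocks; with a shared layer,
# neither copy of the conv/chomp/activation/dropout code has to be duplicated.
sub_block1 = ConvTemporalSubBlock(in_channels=100, out_channels=100,
                                  kernel_size=3, stride=1, padding=2,
                                  dilation=1)
sub_block2 = ConvTemporalSubBlock(in_channels=100, out_channels=100,
                                  kernel_size=3, stride=1, padding=2,
                                  dilation=1)

x = torch.randn(8, 100, 512)     # (batch, channels, sequence length)
out = sub_block2(sub_block1(x))  # same shape, thanks to Chomp1d
```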

What do you guys think, @dalgu90, @SuhasShanbhogue?

abheesht17 commented:

In fact, another advantage of this is cross-model sharing. Consider the MultiResCNN code: https://github.com/foxlf823/Multi-Filter-Residual-Convolutional-Neural-Network/blob/master/models.py#L11, and the DCAN code: https://agit.ai/jsx/DCAN/src/branch/master/models.py#L12. These two classes are literally the same.
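
Assuming the class at both links is WordRep (named above as one of the building blocks), a shared version could live in the proposed layers package. A minimal sketch, simplified relative to the actual class (which also loads pretrained embedding files):

```python
import torch
import torch.nn as nn

# Hypothetical shared layer for src/modules/layers/word_rep.py.
class WordRep(nn.Module):
    def __init__(self, vocab_size, embed_dim, dropout=0.2, embed_weights=None):
        super().__init__()
        if embed_weights is not None:
            # Initialise from a pretrained (vocab_size, embed_dim) tensor.
            self.embed = nn.Embedding.from_pretrained(embed_weights,
                                                      freeze=False)
        else:
            self.embed = nn.Embedding(vocab_size, embed_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> (batch, seq_len, embed_dim)
        return self.dropout(self.embed(token_ids))
```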

abheesht17 commented:

Another example which comes to mind is Label-wise Attention. The same Label Attention technique is used across different models and approaches.
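
A minimal sketch of what such a shared layer might look like (hypothetical names, and simplified relative to the per-label attention these models actually use):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical shared layer for src/modules/layers/.
class LabelWiseAttention(nn.Module):
    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        # One attention query vector per label.
        self.u = nn.Linear(hidden_dim, num_labels, bias=False)

    def forward(self, h):
        # h: (batch, seq_len, hidden_dim) token representations.
        alpha = F.softmax(self.u(h), dim=1)  # attention over tokens, per label
        m = alpha.transpose(1, 2) @ h        # (batch, num_labels, hidden_dim)
        return m                             # one representation per label


# Example usage:
att = LabelWiseAttention(hidden_dim=128, num_labels=50)
m = att(torch.randn(4, 256, 128))  # -> (4, 50, 128)
```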


abheesht17 commented Mar 18, 2022

Another advantage of this: each model file will contain only the main model class. For DCAN, for example, src/modules/models/dcan.py will have only class DCAN; all layers, building blocks, etc. will live in src/modules/layers/... (see the layout sketch below).
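
As an illustrative sketch of the resulting layout (file names are hypothetical):

```
src/modules/
├── layers/
│   ├── conv_temporal_sub_block.py  # ConvTemporalSubBlock, Chomp1d, ...
│   ├── word_rep.py                 # WordRep, shared by DCAN and MultiResCNN
│   └── label_wise_attention.py     # label-wise attention, shared across models
└── models/
    ├── dcan.py                     # only class DCAN
    └── multirescnn.py              # only the main MultiResCNN class
```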
