
feat(attention): add Bi-Directional MLM attention model #721

Draft · wants to merge 1 commit into main
Conversation

TamirFriedman-RecoLabs

I want to implement this kind of mask in xformers, to support bidirectional masked-language-model style attention:

from typing import Tuple

import torch
from xformers.ops import fmha


class BlockDiagoNULLMask(fmha.attn_bias.BlockDiagonalMask):
    """
    Modification of `BlockDiagonalMask` where each block is fully connected
    internally except for the diagonal elements: every token is masked from
    attending to itself.
    """

    def _create_block_mask(
        self,
        shape: Tuple[int, ...],
        dtype: torch.dtype,
        device: str | torch.device,
    ) -> torch.Tensor:
        # Per-block bias: 0 everywhere and -inf on the diagonal, so each
        # token cannot attend to itself within its block.
        return torch.zeros(shape, dtype=dtype, device=device).fill_diagonal_(-torch.inf)
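For context, here is a minimal usage sketch (not part of the PR) of how such a mask could be built and applied. It assumes the subclass inherits `from_seqlens` and `materialize` unchanged from `BlockDiagonalMask`; the sequence lengths and tensor shapes are illustrative, and PyTorch's `scaled_dot_product_attention` is used instead of the fused xformers kernels, since those dispatch on the bias type and may not recognize a custom subclass.

import torch
import torch.nn.functional as F

# Hypothetical usage of the subclass above; sequence lengths are illustrative.
seqlens = [3, 5, 4]
mask = BlockDiagoNULLMask.from_seqlens(seqlens)  # constructor inherited from BlockDiagonalMask

total = sum(seqlens)
q = torch.randn(1, 4, total, 64)  # (batch, heads, tokens, head_dim)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Materialize the dense (total, total) bias: 0 within each block, -inf across
# blocks and on the diagonal, then add it to the attention scores.
bias = mask.materialize((total, total), dtype=q.dtype, device=q.device)
out = F.scaled_dot_product_attention(q, k, v, attn_mask=bias)

Materializing the bias sidesteps the kernel-dispatch question but gives up the memory savings of the block-diagonal fast path, which is presumably why native support in xformers is being proposed here.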

@yiakwy-xpu-ml-framework-team

Hi @TamirFriedman-RecoLabs, are you working on an encoder stack? For example, a generative model for video, music, and so on.

Are you still working on this branch? Happy to hear from you soon!
