maximal

A Python library built on TensorFlow 2, providing models and layers for implementing custom Transformer neural networks.

SelfAttention()

Implements Scaled Dot-Product Attention as in the original Transformer paper ("Attention Is All You Need", Vaswani et al., 2017), in the self-attention setting where Q, K, and V are all the same tensor: Attention(Q, K, V) = softmax(Q Kᵀ / √d_k) V.

Inherits from tensorflow.keras.layers.Layer.
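The argument lists aren't reproduced here, but the description above pins down the computation: with Q = K = V = x, the layer returns softmax(Q Kᵀ / √d_k) V. The following is a minimal, illustrative sketch of that mechanism in plain TensorFlow, not maximal's actual implementation; the class name NaiveSelfAttention is a placeholder.

```python
import tensorflow as tf

# Illustrative sketch only -- not maximal's implementation.
class NaiveSelfAttention(tf.keras.layers.Layer):
    """Scaled dot-product attention where Q, K, and V are the same tensor."""

    def call(self, x):
        # x: (batch, seq_len, d_model); here Q = K = V = x.
        d_k = tf.cast(tf.shape(x)[-1], x.dtype)
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
        scores = tf.matmul(x, x, transpose_b=True) / tf.math.sqrt(d_k)  # (batch, seq_len, seq_len)
        weights = tf.nn.softmax(scores, axis=-1)  # each row sums to 1
        return tf.matmul(weights, x)              # (batch, seq_len, d_model)


x = tf.random.normal((2, 10, 64))     # (batch, seq_len, d_model)
print(NaiveSelfAttention()(x).shape)  # (2, 10, 64)
```

In full Transformer blocks, Q, K, and V are usually first derived from the input through separate learned Dense projections before this formula is applied; the division by √d_k keeps the dot products from growing with the feature dimension and saturating the softmax.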

Arguments

__init__ arguments:

call arguments:

Returns