SelfAttention()
Implements scaled dot-product attention as described in the original Transformer paper ("Attention Is All You Need"), where Q, K, and V are all the same input tensor (self-attention).
Inherits from tensorflow.keras.layers.Layer.
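For reference, scaled dot-product attention from the paper computes (here with Q = K = V = x and d_k = depth):

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$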
Arguments
__init__ arguments:
depth: (int) model depth; corresponds to the size of the embedding dimension.
call arguments:
x: (np.ndarray or tf.Tensor) input tensor to apply self-attention to; used as queries, keys, and values.
Returns
attention: (tf.Tensor) the attention output.
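Below is a minimal sketch of a layer matching this description, assuming depth is used only as the scaling factor and that no learned projections are applied (the documented implementation may differ):

```python
import tensorflow as tf

class SelfAttention(tf.keras.layers.Layer):
    """Scaled dot-product self-attention where Q = K = V = x (sketch)."""

    def __init__(self, depth):
        super().__init__()
        self.depth = depth  # embedding size, used to scale the dot products

    def call(self, x):
        # Q, K and V are all the same input tensor x.
        scores = tf.matmul(x, x, transpose_b=True)                    # (batch, seq, seq)
        scores /= tf.math.sqrt(tf.cast(self.depth, scores.dtype))     # scale by sqrt(depth)
        weights = tf.nn.softmax(scores, axis=-1)                      # attention weights
        return tf.matmul(weights, x)                                  # weighted sum of values


# Usage sketch: with an input of shape (batch, seq_len, depth),
# the output of this version has the same shape as the input.
layer = SelfAttention(depth=64)
out = layer(tf.random.normal((2, 10, 64)))
```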