| <— 4_8_Attention_layers.md | Contents | 4_10_Positional_encoding.md —> |
4.9 Token embedding
In many situations, we need to convert discrete tokens into vectors. This can be done with an embedding layer, which consists of a lookup table that directly maps integers to vectors.
Such a layer is defined by two meta-parameters, the number $N$ of possible token values and the dimension $D$ of the output vectors, and by one trainable $N \times D$ weight matrix $M$.
Given as input an integer tensor $X$ of dimension $D_1 \times \dots \times D_K$ with values in $\{0,\dots,N-1\}$, such a layer returns a real-valued tensor $Y$ of dimension $D_1 \times \dots \times D_K \times D$ with

$$\forall d_1,\dots,d_K,\quad Y[d_1,\dots,d_K] = M[X[d_1,\dots,d_K]].$$
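The sketch below illustrates this lookup, assuming PyTorch, whose `torch.nn.Embedding` module implements exactly this mapping; the names `N`, `D`, `X`, and `Y` follow the definitions above, and the concrete shapes are arbitrary.

```python
import torch
import torch.nn as nn

N, D = 10, 4                       # N possible token values, output dimension D

# The layer holds one trainable N x D weight matrix M (embedding.weight).
embedding = nn.Embedding(num_embeddings=N, embedding_dim=D)

# Integer tensor X of shape D1 x ... x DK with values in {0, ..., N-1};
# here D1 = 2 and D2 = 3, chosen only for illustration.
X = torch.randint(0, N, (2, 3))

# Y[d1, ..., dK] = M[X[d1, ..., dK]]
Y = embedding(X)
print(Y.shape)                     # torch.Size([2, 3, 4]), i.e. D1 x D2 x D
```

Since the lookup is just row indexing into $M$, gradients flow only to the rows of the weight matrix that were actually selected by the input tokens.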
