BBC BERTopic Modeling

Unsupervised Topic Modeling on BBC articles.

Dante GPT

An LLM trained on Dante Alighieri's Divina Commedia using PyTorch.

Paper Summary: An Introduction To Transformers - Turner (2023)

Input to the Transformer. The input to the transformer is a sequence $X^{(0)} \in \mathbb{R}^{D \times N}$, where $N$ is the length of the sequence and $D$ is the dimensionality of each item in the sequence. The items, known as tokens, are denoted $x_n^{(0)} \in \mathbb{R}^{D \times 1}$, so that $X^{(0)} = [x_1^{(0)}, \ldots, x_N^{(0)}]$. The tokens are representations of objects of interest. For instance, in language tasks a token is usually a unique vector representation of a word, whereas for an image it would be a vector representation of a patch.
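A minimal sketch in PyTorch of how such an input matrix can be built: an embedding table maps each token id to a $D$-dimensional vector, and stacking the vectors as columns yields $X^{(0)} \in \mathbb{R}^{D \times N}$. The vocabulary size and the values of $D$ and $N$ here are illustrative, not from the paper.

```python
import torch

# Illustrative sizes: vocabulary of 1000 tokens, embedding dim D = 64,
# sequence length N = 10 (none of these come from the paper).
vocab_size, D, N = 1000, 64, 10

# Embedding table: maps each token id to a D-dimensional vector x_n.
embed = torch.nn.Embedding(vocab_size, D)

# A sequence of N token ids (e.g. word indices or image-patch indices).
token_ids = torch.randint(0, vocab_size, (N,))

# embed(token_ids) has shape (N, D); transposing places each token
# vector as a column, giving X^(0) with shape (D, N).
X0 = embed(token_ids).T
print(X0.shape)  # torch.Size([64, 10])
```

Column $n$ of `X0` is then the token representation $x_n^{(0)}$ that the transformer's attention layers operate on.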