Notation
| Symbol | Meaning | Type |
|---|---|---|
| $\mathcal{V}$ | Vocabulary | set |
| $\mathcal{V}^{\text{sorted}}$ | Sorted vocabulary | finite sequence |
| $\mathcal{S}$ | Set of token indices | set |
| $\mathbf{t}$ | Token | string |
| $s$ | Token index | integer |
| $n_{\text{vocab}}$ | Number of tokens in the vocabulary | integer |
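
As a rough illustration of how these symbols might relate to one another, the following minimal Python sketch builds each object from a toy vocabulary. The four example tokens, the lexicographic sort order, and the 0-based indexing are all assumptions made for illustration; the table itself fixes none of them.

```python
# Illustrative sketch only. The toy tokens, the lexicographic ordering,
# and the 0-based indexing are assumptions, not taken from the notation table.

vocab = {"the", "cat", "sat", "on"}      # V: vocabulary (a set of token strings)
vocab_sorted = sorted(vocab)             # V^sorted: finite sequence of tokens
n_vocab = len(vocab_sorted)              # n_vocab: number of tokens in the vocabulary
token_indices = set(range(n_vocab))      # S: set of token indices (assumed 0-based)

s = 2                                    # s: a token index, an element of S
t = vocab_sorted[s]                      # t: the token found at index s

print(vocab_sorted, n_vocab, s, t)
```

Under these assumptions, each index $s \in \mathcal{S}$ picks out exactly one token $\mathbf{t}$ in $\mathcal{V}^{\text{sorted}}$, and $\mathcal{S}$ has $n_{\text{vocab}}$ elements.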