https://jakecyr.medium.com/decoding-tokenizers-the-unsung-heroes-of-large-language-models-d6f223de6801