https://medium.com/@monocosmo77/how-masked-language-models-work-part2-machine-learning-2024-6f4c31577681