JetMoE: Pre-training an 8B LLM Better than Llama 2 7B

5 months ago
Anonymous $6hYC3Wwiad

Mon Apr 15, 1:19pm UTC
https://blog.stackademic.com/jetmoe-pre-training-an-8b-llm-better-than-llama-2-7b-ecbcf5765c6c