2026-06-17
Extending a 4k model to 120k context for low-resource EU languages
A data-generation plan for the long-context mid-training stage: what to source, how to synthesize it (OLMo-style bootstrap vs. prompt-driven), the staged curriculum, and the multilingual bet.
long-context
multilingual
synthetic data
mid-training