QVAC Blog
19 December 2025
21 minutes read
QVAC Genesis II: Expanding the Largest and Highest-Quality Multi-domain Educational Synthetic Dataset for LLM Pre-training
Building upon the success of Genesis I, we introduce QVAC Genesis II, a major expansion that adds new domains and a total of 148 billion tokens.
23 October 2025
18 minutes read
Introducing QVAC Genesis I: the Largest and Highest-Quality Multi-domain Educational Synthetic Dataset for Pre-training
There is a need for publicly available, large-scale synthetic datasets that are rigorously curated. Genesis I is our first effort in this direction.