QVAC Psy: The Foundation of Stable Intelligence
QVAC Psy is Tether's family of state-of-the-art foundational models rooted in the principles of Psychohistory. Designed to provide a stable, objective substrate for decentralized, composable, and infinitely scalable intelligence, QVAC Psy aims to ensure that intelligence remains a constant, unbreakable utility for all, without a central authority.
QVAC MedPsy
QVAC MedPsy is the first specialized evolution of our foundational logic, purpose-built for edge deployment. These QVAC Psy medical models deliver reasoning capabilities previously exclusive to models seven times their size, setting a new benchmark for efficient, local intelligence.
Unprecedented Efficiency: Across 7 medical benchmarks, our 1.7B model outperforms Google’s MedGemma 4B by over 11 points, despite being less than half its size. In rigorous testing on HealthBench Hard, the 1.7B model also outperformed the nearly sixteen-times-larger MedGemma 27B.
Sovereign Power: On those same 7 medical benchmarks, our 4B model outperformed the nearly seven-times-larger MedGemma 27B.
Closing the Parameter Gap: This demonstrates that our smaller models, powered by a superior methodology, can match or outperform larger competing state-of-the-art models, achieving top-tier results on real-world medical and clinical health assessments.
Optimized for the Edge
Beyond raw reasoning, we have achieved massive token efficiency. QVAC MedPsy-4B produces accurate answers using 3.2 times fewer tokens than backbone models. Alongside the full-precision models, we also release a full suite of efficient quantized variants.
Real-Time Performance: Lower token counts translate to lower latency and faster inference on edge devices.
Quantized Models: Our 4‑bit quantization retains accuracy within ~1 point of BF16 while reducing disk footprint by ~69%, enabling practical deployment on resource‑constrained edge devices.
Universal Access: Bringing high-level intelligence to bandwidth-constrained and privacy-sensitive environments.
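To make the disk-footprint claim above concrete, here is a back-of-the-envelope sketch (an illustrative estimate, not official sizing): a 4B-parameter model in BF16 uses roughly 2 bytes per parameter, and applying the ~69% reduction lands close to the Q4_K_M size quoted in the FAQ below.

```python
# Back-of-the-envelope sizing for a 4B-parameter model.
# Assumes 2 bytes/parameter for BF16; real GGUF files also carry
# metadata and mixed-precision tensors, so treat these as estimates.
PARAMS = 4e9
BYTES_PER_BF16 = 2
REDUCTION = 0.69  # ~69% smaller on disk after 4-bit quantization

bf16_gb = PARAMS * BYTES_PER_BF16 / 1e9  # full-precision footprint
q4_gb = bf16_gb * (1 - REDUCTION)        # estimated 4-bit footprint

print(f"BF16 : {bf16_gb:.1f} GB")  # ~8.0 GB
print(f"4-bit: {q4_gb:.1f} GB")    # ~2.5 GB
```

The ~2.5 GB estimate is in line with the ~2.6 GB Q4_K_M figure quoted later for MedPsy-4B.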
Your Infrastructure, Your Sovereignty
The QVAC MedPsy models are fully open source, allowing you to run them on your own infrastructure.
Fully Open Source: Audit, refine, and adapt the weights to your specific needs. All models are released in both full‑precision checkpoints and complete GGUF‑quantized variants.
Local Execution: Achieve near-native speeds on local GPUs using the QVAC Fabric.
Stability by Design: A sovereign engine of intelligence that ensures your data remains private in any environment.
FAQ
What is MedPsy?
MedPsy is a family of compact, text-only medical and healthcare large language models developed by Tether Data’s AI Research for edge and on-device deployment. The family includes MedPsy-1.7B, MedPsy-4B, and GGUF quantized versions for local inference.
Where can I download the models?
All MedPsy models are available in the Hugging Face collection: MedPsy on Hugging Face. The collection includes the full-precision models and GGUF quantized versions for local deployment.
Who are these models for?
They are intended for developers and researchers building healthcare applications involving medical text, especially privacy-sensitive or on-device use cases. They are starting points for downstream applications and should be validated, adapted, and monitored before production use.
Can MedPsy replace a doctor?
No. MedPsy models are not a substitute for professional medical judgment, clinical diagnosis, or treatment. They can make mistakes, hallucinate, or produce incomplete advice, so outputs should always be reviewed by qualified healthcare professionals where medical decisions are involved.
How does MedPsy compare to larger medical models?
MedPsy is designed to deliver strong medical reasoning at much smaller model sizes. MedPsy-4B surpassed MedGemma-27B-text-it in our closed-ended benchmark review while being nearly 7x smaller. MedPsy-1.7B is designed for smartphone-class deployment and outperforms larger baselines on several reported medical and HealthBench evaluations.
Which model should I use?
Use MedPsy-1.7B when memory, latency, or smartphone deployment is the main constraint. Use MedPsy-4B when you want higher quality while still staying within edge-device scale. For most local deployments, the GGUF Q4_K_M variants are the recommended size/quality trade-off based on our testing.
Can the models run fully offline?
Yes. The GGUF releases are built for local inference through llama.cpp and the QVAC SDK. This enables fully on-device workflows where sensitive health queries do not need to leave the user’s device.
Does MedPsy send my data to external servers?
Not if the model is deployed locally through the QVAC SDK or another on-device runtime. The models themselves support local inference, but application developers are responsible for ensuring their full product architecture preserves privacy.
Which quantized variant should I choose?
For best quality, use Q8_0. For most users, use Q4_K_M with imatrix calibration: about 2.6 GB for MedPsy-4B and 1.2 GB for MedPsy-1.7B. For MedPsy-4B, IQ3_M is a strong option at around 2 GB. For MedPsy-1.7B, 3-bit variants are not recommended for medical use.
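The sizes quoted above can be turned into a simple selection rule. The sketch below uses only the file sizes listed here; the picker itself is an illustrative helper, not part of the QVAC SDK, and real memory use also includes KV cache and runtime overhead on top of the file size.

```python
# On-disk sizes (GB) quoted above for selected GGUF variants.
VARIANT_SIZES_GB = {
    ("MedPsy-4B", "Q4_K_M"): 2.6,
    ("MedPsy-4B", "IQ3_M"): 2.0,   # "around 2 GB"
    ("MedPsy-1.7B", "Q4_K_M"): 1.2,
}

def pick_variant(budget_gb):
    """Return the largest listed variant whose file fits the budget, or None."""
    fitting = [(size, key) for key, size in VARIANT_SIZES_GB.items()
               if size <= budget_gb]
    if not fitting:
        return None
    size, (model, quant) = max(fitting)
    return model, quant, size

print(pick_variant(3.0))  # ('MedPsy-4B', 'Q4_K_M', 2.6)
print(pick_variant(2.2))  # ('MedPsy-4B', 'IQ3_M', 2.0)
print(pick_variant(1.0))  # None: nothing listed fits
```

A real deployment would also factor in context length and device thermal limits, but a size table like this is a reasonable first filter.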
Do the models support languages other than English?
No. The models were trained and evaluated in English. Performance in other languages has not been validated.
Can MedPsy interpret images or other non-text inputs?
No. MedPsy is text-only. It cannot interpret X-rays, scans, photos, PDFs as images, or other non-text modalities. It can only process information provided as text.
How were the models trained?
The models are built on Qwen3 backbones and post-trained through a multi-stage medical pipeline: broad supervised fine-tuning, reasoning-focused supervised fine-tuning, and reinforcement learning on medical QA tasks. Long-form reasoning supervision was generated using Baichuan-M3-235B as the teacher model.
How were the models evaluated?
They were evaluated across closed-ended medical benchmarks including MedQA-USMLE, MedMCQA, MMLU Health, MMLU-Pro Health, PubMedQA, AfriMedQA, and MedXpertQA, plus HealthBench and HealthBench Hard for realistic open-ended health scenarios.
Can MedPsy be used in emergencies?
No. MedPsy models should never be used in emergency or life-threatening situations. As noted above, they are not a substitute for professional medical judgment, clinical diagnosis, or treatment; in an emergency, contact local emergency services.
What are the known limitations?
The models may hallucinate, miss rare or complex conditions, reflect biases in training data, lack up-to-date medical knowledge, and perform unpredictably. Quantized versions can introduce additional quality degradation, especially at very low bit counts.
What license are the models released under?
The model cards list the models under the Apache 2.0 license. Developers should still review the model cards and applicable terms carefully.