Fabric LLM

Our edge-first, high-performance AI framework transforms any consumer device into a capable inference and fine-tuning node. No central clouds, no massive data centers, no vendor lock-in.

From Android and Apple smartphones to high-end workstations and even industry-grade mainframes, our unified system brings LoRA fine-tuning directly into the llama.cpp ecosystem, so you can initialize, train, checkpoint, and merge adapters locally for maximum privacy and resilience.

```shell
# For Apple Silicon
curl -L https://github.com/tetherto/qvac-fabric/releases/download/v1.0/qvac-macos-apple-silicon-v1.0.zip -o qvac-macos.zip
unzip qvac-macos.zip
cd qvac-macos-apple-silicon-v1.0

# Download model
mkdir -p models
wget https://huggingface.co/Qwen/Qwen3-1.7B-GGUF/resolve/main/qwen3-1_7b-q8_0.gguf -O models/qwen3-1.7b-q8_0.gguf

# Download dataset
wget https://raw.githubusercontent.com/tetherto/qvac-fabric/main/datasets/train.jsonl

# Quick test with email style transfer
./bin/llama-finetune-lora -m models/qwen3-1.7b-q8_0.gguf -f train.jsonl \
  -c 512 -b 128 -ub 128 -ngl 999 \
  --lora-rank 16 --lora-alpha 32 --num-epochs 3
```
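The layout of train.jsonl is not shown above. A common convention for chat fine-tuning datasets is one JSON object per line with a list of role-tagged messages; the `messages`/`role`/`content` field names below are an assumption based on that convention, not Fabric's documented schema:

```python
import json

# Hypothetical example record: the "messages"/"role"/"content" schema is an
# assumption based on common chat fine-tuning formats, not Fabric's documented one.
records = [
    {
        "messages": [
            {"role": "system", "content": "Rewrite emails in a formal tone."},
            {"role": "user", "content": "hey, can u send the report asap?"},
            {"role": "assistant", "content": "Could you please send the report at your earliest convenience?"},
        ]
    },
]

# Write one JSON object per line (JSONL).
with open("train.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```

Check the repository's bundled datasets/train.jsonl for the exact fields your build expects.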

Cross-platform scalability

Our solution provides universal compatibility across the entire desktop GPU ecosystem, including AMD, Intel, NVIDIA, and Apple architectures. By leveraging Vulkan, we run on any conformant GPU driver, and because training happens entirely on-device, your sensitive datasets never leave your control while you maintain total operational resilience.

Train anywhere

Whether your GPU is Adreno, Mali, or Apple, our novel dynamic tiling algorithm lets you train wherever you are. Fabric is the first framework to bring on-device training to these mobile GPUs, a capability previously unsupported in the ecosystem.
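Fabric's actual tiling algorithm is not public, but the general idea behind dynamic tiling can be sketched: pick the largest tile of the weight matrix whose operands fit in a GPU workgroup's shared-memory budget, shrinking the tile on constrained mobile GPUs. The function below is an illustrative stand-in, not Fabric's code:

```python
# Illustrative sketch only -- Fabric's real tiling algorithm is not public.
# Idea: choose the largest power-of-two tile T such that two TxT operand
# tiles fit in the workgroup's shared-memory budget, so the same kernel
# adapts to both desktop GPUs and constrained mobile GPUs (Adreno, Mali).

def pick_tile(shared_mem_bytes: int, dtype_size: int = 4, max_tile: int = 128) -> int:
    """Largest power-of-two tile T with 2 * T*T * dtype_size <= shared_mem_bytes."""
    t = max_tile
    while t > 1 and 2 * t * t * dtype_size > shared_mem_bytes:
        t //= 2
    return t
```

For example, a 32 KiB shared-memory budget with 4-byte floats yields 64x64 tiles, while a 16 KiB budget on a smaller mobile GPU drops to 32x32.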

Only assistant responses

We implemented masked-loss training, where a mask is applied so that only assistant tokens contribute to the loss. User and system messages still influence the context, but not the loss, and the same tokenization and masking logic is used consistently during both dataset creation and loss computation.
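A minimal sketch of the idea in pure Python (not Fabric's implementation): build a per-token mask during dataset creation, then average the loss over assistant positions only. Tokenization is faked here with `str.split()`; a real pipeline would use the model's tokenizer, but the masking logic is the same:

```python
# Minimal sketch of masked-loss training -- not Fabric's actual code.

def build_mask(messages):
    """Return (tokens, mask): mask[i] is 1 only for assistant tokens."""
    tokens, mask = [], []
    for msg in messages:
        toks = msg["content"].split()  # stand-in for real tokenization
        tokens.extend(toks)
        mask.extend([1 if msg["role"] == "assistant" else 0] * len(toks))
    return tokens, mask

def masked_mean_loss(per_token_loss, mask):
    """Average loss over assistant tokens only; other tokens shape the context but add no loss."""
    kept = [l for l, m in zip(per_token_loss, mask) if m]
    return sum(kept) / len(kept)

msgs = [
    {"role": "user", "content": "hello there"},
    {"role": "assistant", "content": "hi how are you"},
]
tokens, mask = build_mask(msgs)
# mask -> [0, 0, 1, 1, 1, 1]: the two user tokens are excluded from the loss
```

Because `build_mask` is the single source of truth for which positions count, dataset creation and loss computation cannot drift apart.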

FAQ

Do I need an NVIDIA GPU with CUDA?

No. Unlike legacy frameworks that require CUDA, our solution supports virtually all modern consumer hardware. By leveraging Vulkan and Metal backends, you can train on Android (Qualcomm Adreno, ARM Mali), iOS and macOS (Apple Silicon), and standard Windows/Linux setups (AMD, Intel, NVIDIA).