ITEL’s VibeStudio Achieves a Major AI Breakthrough: World-Best LLM Performance on Just One GPU
A small Indian team shows the world how to do more with less: the first to achieve 55% pruning of the MiniMax M2 open-source LLM for reasoning and software coding.
Chennai, India — November 21, 2025
Large Language Models (LLMs) are today the best-known and most widely used AI tools. There are a large number of LLMs, including ChatGPT, DeepSeek, Gemini and many others. Many of them are open source, meaning anyone can copy and use them. They can perform all kinds of tasks, but doing so requires significant computational resources, such as high-end GPUs and large memory banks, and consumes an immense amount of electrical power. Reducing GPU and memory usage while still performing at state-of-the-art (SOTA) levels is a major goal of AI developers today. In a world where most companies chase ever-larger LLMs, VibeStudio shows that small, efficient, high-performance models are often the more practical and impactful route. This shift echoes what India does best: building more with less.
Who are we?
We are VibeStudio, an agentic coding suite startup incubated by Immersive Technology and Entrepreneurship Labs (ITEL), providing an AI-based tool that enables software developers and enterprises to automate coding. We are led by Arjun Reddy, a native of Madurai and a graduate of Satyabhama Engineering College. Prof. Ashok Jhunjhunwala, the founder and ex-President of IITM Research Park, is the Chairman of ITEL.
What have we achieved?
VibeStudio, with the support of ITEL, undertook the ambitious task of reducing the GPU and memory requirements of an open-source LLM for the specific task of software coding (known as vibe-coding). When we took up MiniMax M2, the goal wasn't academic curiosity. We needed a model that could power VibeStudio's Agentic Integrated Development Environment (IDE): one that is fast, disciplined, and capable of real coding and full-repository reasoning, the way developers actually work. But deploying M2 at scale across Indian engineering colleges and enterprise systems would have required a huge number of GPUs (such as the H200) and a great deal of memory, making it costly as well as energy-intensive. What we needed was to make M2 significantly lighter without diluting the intelligence that makes it valuable.
That is why we engineered THRIFT — Targeted Hierarchical Reduction
for Inference and Fine-Tuning. THRIFT isn't a gimmick. It's an
engineering process. It examines the model like a structured audit
— layer by layer, pathway by pathway — identifying redundant
experts, silent activation routes, and dead parameters that add
cost but no intelligence. Instead of one reckless pruning pass,
THRIFT reduces the model in calibrated stages. After each stage,
teacher-guided fine-tuning recalibrates the model so it stays
stable and sharp.
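To make the process concrete, here is a minimal sketch, in PyTorch, of the general pattern described above: prune in small calibrated stages, and after each stage let an unpruned teacher recalibrate the pruned student. Everything in it is an illustrative assumption; the toy network, magnitude-based criterion, stage fractions, and function names stand in for, and are far simpler than, THRIFT's actual audit of experts, activation routes, and parameters in MiniMax M2.

```python
# Illustrative sketch only: staged pruning plus teacher-guided recalibration.
# The model, pruning criterion, and names are stand-ins, not THRIFT's method.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def prune_linear_by_magnitude(layer: nn.Linear, fraction: float) -> None:
    """Zero out the lowest-magnitude weights of a linear layer (unstructured pruning)."""
    with torch.no_grad():
        flat = layer.weight.abs().flatten()
        k = int(fraction * flat.numel())
        if k == 0:
            return
        threshold = flat.kthvalue(k).values
        mask = layer.weight.abs() > threshold
        layer.weight.mul_(mask.to(layer.weight.dtype))

def distill_step(student, teacher, batch, optimizer, temperature=2.0):
    """One teacher-guided recalibration step: match the teacher's soft outputs."""
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(batch)
    s_logits = student(batch)
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Tiny stand-in network; a real LLM would be loaded from HuggingFace instead.
    teacher = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
    student = copy.deepcopy(teacher)
    optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
    calibration_data = [torch.randn(8, 64) for _ in range(10)]

    # Calibrated stages: prune a little more each time, then recalibrate
    # against the unpruned teacher before the next cut.
    for stage_fraction in (0.2, 0.4, 0.55):
        for module in student.modules():
            if isinstance(module, nn.Linear):
                prune_linear_by_magnitude(module, stage_fraction)
        for batch in calibration_data:
            loss = distill_step(student, teacher, batch, optimizer)
        print(f"stage {stage_fraction:.2f}: distillation loss {loss:.4f}")
```

The point the sketch captures is the staging itself: each pruning step is small enough that teacher-guided fine-tuning can pull the model back to stability before the next reduction.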
The outcome is exactly what we wanted: a 55% size reduction that retains about 80% of the original's reasoning strength and coding precision, and in many cases delivers faster responsiveness than the original. Inside VibeStudio, this pruned M2 handles vibe coding beautifully: structured autocompletions, multi-file context, refactoring, analysis, and real-time agent-driven coding workflows. We have released the THRIFT-pruned M2 (55% reduced) on HuggingFace as open source.
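For developers who want to try the released checkpoint, loading it with the HuggingFace transformers library would look roughly like the sketch below. The repository id is a placeholder, not the model's actual name on HuggingFace, and the prompt is just one example of a vibe-coding request.

```python
# Illustrative only: the repository id below is a placeholder, not the actual
# name of the released THRIFT-pruned M2 checkpoint on HuggingFace.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/thrift-pruned-m2"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    device_map="auto",        # spread layers across available devices
    trust_remote_code=True,   # may be needed depending on the model architecture
)

# A small coding task, the kind of prompt VibeStudio's agentic IDE would issue.
prompt = "Write a Python function that reverses a singly linked list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```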
Our work has already resonated with the wider community. Across HuggingFace, our open-source releases have crossed 150,000 downloads, and developers repeatedly choose our models because they are engineered with discipline, not hype.
Behind the scenes, we now maintain two private foundational models
for enterprise deployments:
- An 8B Dense Model, optimised for quantised local use on mainstream hardware (see the sketch after this list).
- A 32B A3B MoE Model, built for secure, high-speed, on-premises reasoning at enterprise scale.
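As a rough illustration of what "quantised local use on mainstream hardware" can mean in practice, the sketch below loads a causal language model in 4-bit precision with the transformers and bitsandbytes libraries. The repository id is a placeholder (the 8B model above is private), and NF4 4-bit quantisation is our assumption of a typical local setup, not a statement about how the model is actually packaged.

```python
# Illustrative 4-bit quantised loading; the repo id is a placeholder because
# the 8B model described above is private. Requires transformers, accelerate,
# and bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo_id = "your-org/dense-8b"  # placeholder repo id

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights to fit consumer GPUs
    bnb_4bit_quant_type="nf4",              # NF4 quantisation, a common default
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed/accuracy
)

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers automatically across available devices
)
```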
THRIFT, combined with these models and VibeStudio's Agentic IDE, gives us the ability to deliver powerful, affordable AI across India and the world, from large companies all the way down to first-year engineering students working on budget laptops. This is exactly the direction we believe the next decade of AI should follow. Our work proves that the frontier of AI is not only about training massive models; it is also about engineering intelligence efficiently and making powerful AI accessible to millions of developers, not just the richest labs.
About VibeStudio
VibeStudio is a startup that offers an enterprise-grade evolution of agentic tools like Cursor and Lovable, built for companies that need real AI agents running securely on-premises without compromising on data privacy, rate limits, or capability.
It gives teams a full-stack agent development environment where coding, reasoning, tool use, and deployment all happen inside one controlled environment with no external dependencies.
For enterprises that want the power of modern AI without sending a single byte outside their firewall, VibeStudio is simply the only practical choice.
About ITEL
Immersive Technology & Entrepreneurship Labs (ITEL) is a Section 8 not-for-profit organization (www.itelfoundation.in) focused on making India a technology leader in selected areas. It also incubates deep-tech startups, providing technology support, mentorship, and strategic guidance to help them scale. It has set up Vikram Sarabhai AI Labs (VSAIL) to accelerate sovereign AI.