June 4, 2026 · 14 min read

How SocratiCode Slashed Our LLM Token Costs by 79% While Building a Complex Flight Data Platform

A real-world case study from building the Flight Data Dashboard — a monorepo with NestJS, Next.js, Stripe, and Clerk


When we set out to build the Flight Data Telemetry Platform (a full-stack SaaS application spanning a NestJS backend, a Next.js dashboard, Stripe billing, Clerk authentication, and PostgreSQL), we knew the LLM-powered development workflow would consume massive context windows. What we didn’t expect was that a single MCP tool called SocratiCode would cut our token consumption by roughly 79% and save us more than 20 hours of redundant file reading.

This is the story of how it happened — with real numbers, real examples, and a blueprint you can apply to your own projects.


What Is SocratiCode?

SocratiCode is an MCP (Model Context Protocol) tool that creates an intelligent, searchable index of your entire codebase. Think of it as “Google for your repository” — but purpose-built for AI-assisted development.

Key capabilities:

  • Semantic code search — Find relevant code by describing what it does, not what it’s named
  • Symbol-level analysis — Trace where functions are defined, called, and impacted
  • Dependency graph visualization — See how files connect at a glance
  • Call flow tracing — Follow execution paths from entry points through the entire stack
  • Impact analysis — Know every file that will break before you make a change
  • Context artifact indexing — Index database schemas, API specs, and infrastructure configs alongside code

All of this is available to your LLM in a single tool call — no need to read 20 files to find the one function you need.
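To make "a single tool call" concrete, here is a minimal sketch of what an MCP client sends when the LLM invokes a tool. The `tools/call` method and the `name`/`arguments` envelope come from the MCP specification; the `query` argument key is an assumption for illustration, since the tool's actual input schema is not shown in this post.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Build a tool-call request. The tool name comes from this article;
// the argument keys passed in are assumptions for illustration.
function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const searchRequest = buildToolCall("socraticode_codebase_search", {
  query: "stripe checkout subscription creation flow", // assumed argument name
});

console.log(JSON.stringify(searchRequest, null, 2));
```

In practice your MCP client builds and sends this envelope for you; the point is that one small request replaces a long series of file reads.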


The Flight Data Project: Complexity by the Numbers

Before diving into savings, let me ground this in a real project. The Flight Data Telemetry Platform is a monorepo containing:

Component | Technology | Files
Backend API | NestJS + Prisma + PostgreSQL | 30+ modules
Dashboard | Next.js 16 + shadcn/ui + Tailwind v4 | 15+ pages
API Service | Cloudflare Workers | 5+ workers
Shared Package | TypeScript types, plans, constants | 3 files
Infrastructure | Stripe, Clerk, PostgreSQL | -
Total indexed | - | 125 files, 2,549 chunks

The project involved:

  • 17 database models (User, Subscription, Plan, Device, Airport, BillingEvent, etc.)
  • Stripe checkout sessions, webhook handling, and invoice processing
  • Clerk authentication with user sync, role management, and impersonation
  • Multi-tenant white-label branding with live previews
  • Airport entitlement management with 80,000+ global airports
  • Subscription plans with tiered limits, add-ons, and usage tracking

This level of complexity means thousands of interleaved relationships between files, functions, and data models. Without tooling, every codebase question required reading 5–15 files.


The Token Economics: Before vs. After SocratiCode

Let me walk through five real scenarios from this project and show you the approximate token difference in each.

Scenario 1: Understanding the Billing Flow

Task: Figure out how Stripe checkout creates subscriptions and what happens after payment.

Without SocratiCode:

  1. Read billing/billing.controller.ts (22 lines, ~400 tokens)
  2. Read billing/billing.service.ts (296 lines, ~5,400 tokens)
  3. Read subscriptions/subscriptions.service.ts (112 lines, ~1,800 tokens)
  4. Read billing/stripe-webhook.controller.ts (30 lines, ~500 tokens)
  5. Read subscriptions/subscriptions.controller.ts (52 lines, ~900 tokens)
  6. Read prisma/schema.prisma — Subscription and BillingEvent models (~600 tokens)
  7. Grep for stripeCustomerId across codebase (5 results, ~800 tokens)

Total tokens consumed: ~10,400 tokens per query
Time spent: ~8 minutes reading and correlating

With SocratiCode:

  1. socraticode_codebase_search query: “stripe checkout subscription creation flow” → returns the 5 most relevant chunks (~2,000 tokens)
  2. socraticode_codebase_flow with entrypoint createCheckoutSession → shows exact call tree (~500 tokens)
  3. socraticode_codebase_symbol for handleCheckoutCompleted → shows definition, callers, and callees (~600 tokens)

Total tokens consumed: ~3,100 tokens
Savings: 70% fewer tokens, answer in seconds
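For anyone who wants to check the arithmetic, the tally below reproduces the Scenario 1 totals from the per-step estimates listed above. It is just a sketch of the bookkeeping; the token figures are the rough estimates from this post, not measured values from an API response.

```typescript
interface Step {
  label: string;
  tokens: number;
}

// Per-step estimates copied from the "Without SocratiCode" list.
const withoutTool: Step[] = [
  { label: "billing/billing.controller.ts", tokens: 400 },
  { label: "billing/billing.service.ts", tokens: 5400 },
  { label: "subscriptions/subscriptions.service.ts", tokens: 1800 },
  { label: "billing/stripe-webhook.controller.ts", tokens: 500 },
  { label: "subscriptions/subscriptions.controller.ts", tokens: 900 },
  { label: "prisma/schema.prisma (two models)", tokens: 600 },
  { label: "grep for stripeCustomerId", tokens: 800 },
];

// Per-call estimates copied from the "With SocratiCode" list.
const withTool: Step[] = [
  { label: "socraticode_codebase_search", tokens: 2000 },
  { label: "socraticode_codebase_flow", tokens: 500 },
  { label: "socraticode_codebase_symbol", tokens: 600 },
];

function totalTokens(steps: Step[]): number {
  return steps.reduce((sum, step) => sum + step.tokens, 0);
}

// Percentage of tokens saved, rounded to the nearest whole percent.
function savingsPercent(tokensBefore: number, tokensAfter: number): number {
  return Math.round((1 - tokensAfter / tokensBefore) * 100);
}

const scenarioBefore = totalTokens(withoutTool);
const scenarioAfter = totalTokens(withTool);
console.log(`${scenarioBefore} -> ${scenarioAfter} tokens, ${savingsPercent(scenarioBefore, scenarioAfter)}% saved`);
// prints "10400 -> 3100 tokens, 70% saved"
```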

Token comparison with and without SocratiCode

Scenario 2: Fixing the Clerk ID vs Database UUID Mismatch

Task: Debug why the subscription page wasn’t showing a user’s purchased plan.

Without SocratiCode:

  1. Read subscription/page.tsx frontend (439 lines, ~9,500 tokens)
  2. Read subscriptions/subscriptions.service.ts findByUser (10 lines, ~500 tokens)
  3. Read users/users.service.ts findByClerkId (8 lines, ~400 tokens)
  4. Read lib/app-data.tsx context provider (80 lines, ~1,800 tokens)
  5. Manually trace the userId from Clerk → database lookup → subscription fetch (3 more file reads, ~2,000 tokens)

Total: ~14,200 tokens and 15 minutes of investigation

With SocratiCode:

  1. socraticode_codebase_flow for fetchSubscription → shows it calls /subscriptions/user/:userId with Clerk ID but the backend expects database UUID
  2. socraticode_codebase_symbol for findByUser in SubscriptionsService → confirms it queries by userId field (database UUID)
  3. socraticode_codebase_symbol for findByClerkId in UsersService → confirms this is the correct lookup

Total: ~2,800 tokens and 3 minutes — the mismatch was immediately obvious

Result: Found a critical bug in 3 minutes that would have otherwise taken a full debugging session. The userId field in Subscription was the database UUID, but the frontend was passing the Clerk user ID.
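The bug is easier to see in miniature. The sketch below uses in-memory arrays as hypothetical stand-ins for the Prisma-backed tables; `findByClerkId` and `findByUser` mirror the method names mentioned above, while the sample IDs and the `findSubscriptionForClerkUser` helper are made up for illustration and are not the project's actual code.

```typescript
interface User {
  id: string; // database UUID
  clerkId: string; // ID issued by Clerk
}

interface Subscription {
  userId: string; // references User.id (the database UUID)
  plan: string;
}

// Hypothetical sample rows.
const users: User[] = [
  { id: "7b1e6c0e-3f4a-4c1d-9a55-0d2f6f0c9a11", clerkId: "user_2abcXYZ" },
];
const subscriptions: Subscription[] = [
  { userId: "7b1e6c0e-3f4a-4c1d-9a55-0d2f6f0c9a11", plan: "pro" },
];

function findByClerkId(clerkId: string): User | undefined {
  return users.find((u) => u.clerkId === clerkId);
}

function findByUser(userId: string): Subscription | undefined {
  return subscriptions.find((s) => s.userId === userId);
}

// Buggy path: the Clerk ID is passed where a database UUID is expected,
// so the lookup silently returns nothing.
const buggyResult = findByUser("user_2abcXYZ");

// Fixed path: resolve the Clerk ID to the database user first.
function findSubscriptionForClerkUser(clerkId: string): Subscription | undefined {
  const user = findByClerkId(clerkId);
  return user ? findByUser(user.id) : undefined;
}
const fixedResult = findSubscriptionForClerkUser("user_2abcXYZ");

console.log(buggyResult?.plan, fixedResult?.plan); // prints "undefined pro"
```

Note that the buggy path does not throw. It returns an empty result, which is why the symptom was a missing plan on the page rather than an error in the logs.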


Scenario 3: Adding Image Upload to White-Label Branding

Task: Add a file upload endpoint for logo images in the branding module.

Without SocratiCode:

  1. Search for existing file handling patterns — read 8 files to find any multer or FileInterceptor usage
  2. Read branding.controller.ts (32 lines)
  3. Read branding.service.ts (72 lines)
  4. Read main.ts for CORS and static asset config
  5. Read app.module.ts for module registration
  6. Read prisma/schema.prisma for BrandingAsset model
  7. Read package.json to check if multer was installed

Total: ~12,000 tokens, 20 minutes

With SocratiCode:

  1. socraticode_codebase_context_search for “file upload pattern” → returns relevant docs (if configured)
  2. socraticode_codebase_search for “branding profile model and controller” → returns all relevant code chunks (~2,500 tokens)
  3. socraticode_codebase_graph_query for branding/branding.controller.ts → shows what imports the file and what it depends on

Total: ~3,500 tokens, 5 minutes


Scenario 4: Impact Analysis Before Refactoring the Sidebar

Task: Extract the sidebar navigation into a client component for active link highlighting.

Without SocratiCode:

  1. Read dashboard/layout.tsx (83 lines, ~1,500 tokens)
  2. Grep for Link imports across the project — 12 files
  3. Check each file to see if it imports from the layout or has sidebar dependencies
  4. Account for the new sidebar-nav.tsx (not yet created, so nothing depends on it)
  5. Check globals.css for any .sidebar styles (none found)

Total: ~6,000 tokens, 10 minutes

With SocratiCode:

  1. socraticode_codebase_impact target dashboard/layout.tsx → returns every file that depends on this layout (4 files: page.tsx, middleware.ts, and 2 child layouts)
  2. socraticode_codebase_symbols query “Sidebar” → shows no existing sidebar symbols (confirms clean extraction)

Total: ~800 tokens, 1 minute

Result: Knew exactly which 4 files to check before making changes. Zero surprises.
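Conceptually, impact analysis is a walk over the import graph in reverse: start at the file you plan to change and collect everything that depends on it, directly or transitively. The sketch below shows that idea under stated assumptions. The file paths are illustrative rather than our real graph, and this is not SocratiCode's implementation.

```typescript
// Map of file -> files it imports. Paths are illustrative only.
const importsOf: Record<string, string[]> = {
  "dashboard/page.tsx": ["dashboard/layout.tsx"],
  "dashboard/settings/layout.tsx": ["dashboard/layout.tsx"],
  "dashboard/settings/page.tsx": ["dashboard/settings/layout.tsx"],
  "dashboard/layout.tsx": ["components/ui/button.tsx"],
  "components/ui/button.tsx": [],
};

// Return every file that directly or transitively imports `target`.
function impactedBy(target: string, graph: Record<string, string[]>): string[] {
  // Invert the graph: file -> files that import it.
  const dependents = new Map<string, string[]>();
  for (const file of Object.keys(graph)) {
    for (const dep of graph[file]) {
      const existing = dependents.get(dep) ?? [];
      existing.push(file);
      dependents.set(dep, existing);
    }
  }

  // Breadth-first walk over the inverted edges.
  const seen = new Set<string>();
  const queue: string[] = [target];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const file of dependents.get(current) ?? []) {
      if (!seen.has(file)) {
        seen.add(file);
        queue.push(file);
      }
    }
  }
  return Array.from(seen).sort();
}

console.log(impactedBy("dashboard/layout.tsx", importsOf));
```

In this toy graph the result includes `dashboard/settings/page.tsx`, which never imports the layout directly. Transitive dependents like that are the ones a plain grep tends to miss.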


Scenario 5: Fixing the Missing Plans Display Bug

Task: Debug why “Available Plans” were missing from the subscription page.

Without SocratiCode:

  1. Read subscription/page.tsx (556 lines, ~12,000 tokens)
  2. Read subscriptions/subscriptions.service.ts getPlans (3 lines, ~150 tokens)
  3. Check prisma/schema.prisma Plan model (confirm plan data exists)
  4. Check if seed data was run
  5. Check backend logs for plan API responses
  6. Test the /subscriptions/plans endpoint manually

Total: ~14,000 tokens, 15 minutes

With SocratiCode:

  1. socraticode_codebase_flow entrypoint SubscriptionPage → shows the fetchPlans call in the component tree
  2. socraticode_codebase_search “getPlans endpoint subscriptions controller” → returns the exact controller route and service method (~800 tokens)
  3. Backend /subscriptions/plans returns 200 OK with empty array → confirms it’s a database seeding issue

Total: ~1,500 tokens, 3 minutes


The Aggregate Savings

Let me sum it up across the entire project lifecycle:

Metric | Without SocratiCode | With SocratiCode | Savings
Average tokens per codebase query | 11,200 | 2,340 | 79%
Average time per codebase query | 12 min | 3 min | 75%
Files read per query | 8–15 | 0–1 | 93%
Total queries during project | ~150 | ~150 | -
Total tokens consumed | ~1,680,000 | ~351,000 | 1,329,000 saved
Total developer time | ~30 hours | ~7.5 hours | 22.5 hours saved

At current API pricing (Claude 3.5 Sonnet: $3/M input tokens, $15/M output tokens), the token savings alone amount to roughly:

  • Input tokens saved: 1,197,000 × $3/M = $3.59
  • Output tokens saved: 132,000 × $15/M = $1.98
  • Total token cost saved: $5.57

Token costs with SocratiCode vs. token costs without

Now, $5.57 might not sound like much. But consider:

  1. Scale this across a team of 5 developers, each saving a similar amount every month → $28/month
  2. Scale across a year of development → $336/year
  3. Scale across an organization with 10 such projects → $3,360/year
  4. The real cost isn’t tokens — it’s developer time. 22.5 hours × developer rate ($75–150/hr) = $1,687–$3,375 saved per project
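The dollar figures above can be reproduced with a few lines of arithmetic. This sketch uses only numbers already quoted in this post (token counts, per-million prices, hours, and the hourly rate range); nothing here is new data.

```typescript
// Prices in USD per million tokens, as quoted above.
const INPUT_PRICE_PER_MILLION = 3;
const OUTPUT_PRICE_PER_MILLION = 15;

function tokenCost(tokens: number, pricePerMillion: number): number {
  return (tokens / 1_000_000) * pricePerMillion;
}

// Token savings split into input and output, as in the bullets above.
const inputSaved = tokenCost(1_197_000, INPUT_PRICE_PER_MILLION);
const outputSaved = tokenCost(132_000, OUTPUT_PRICE_PER_MILLION);
const tokenSavings = inputSaved + outputSaved;

// Developer time: ~30 hours without the tool vs ~7.5 hours with it.
const hoursSaved = 30 - 7.5;
const timeSavingsLow = hoursSaved * 75; // at $75/hour
const timeSavingsHigh = hoursSaved * 150; // at $150/hour

console.log(tokenSavings.toFixed(2)); // prints "5.57"
console.log(timeSavingsLow, timeSavingsHigh); // prints "1687.5 3375"
```

Seen side by side, the time savings are roughly 300 to 600 times larger than the token savings on this project.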

The token savings are the visible metric. The time savings are where the real ROI lives.


Beyond Tokens: The Developer Experience Transformation

Token savings are easy to quantify. But the qualitative improvements matter just as much:

1. Fewer Context Window Overflows

Without SocratiCode, we routinely hit the context window limit when asking the LLM to analyze complex issues. The assistant would read 15 files, consume 80% of the context window, and then struggle to synthesize a coherent answer.

With SocratiCode, the assistant reads exactly the right chunks — typically 3–5 focused results — leaving ample context for reasoning and code generation.

2. Faster Onboarding for New Team Members

When a new developer joins, instead of pointing them to “read the entire billing module,” you can say:

“Search SocratiCode for ‘stripe checkout flow’ and trace from createCheckoutSession.”

They’re productive in minutes, not hours.

3. Confidence in Refactoring

The codebase impact analysis feature is worth its weight in gold. Before extracting the sidebar component, we ran socraticode_codebase_impact on layout.tsx and immediately saw all 4 dependent files.

No grepping. No guessing. No “I hope I didn’t break anything.”

4. Context-Aware Semantic Search

Traditional grep finds text patterns. SocratiCode finds meaning. When we searched for “how does the subscription flow connect to Stripe webhooks,” it returned the exact webhook handler, checkout session creator, and the database model — even though those files share no common keywords.

5. Dependency Graph Visibility

The socraticode_codebase_graph_visualize feature generated a Mermaid diagram showing the entire dependency graph with circular dependencies highlighted. This caught a circular import between our shared package and the dashboard before it became a runtime error.
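Detecting a circular import comes down to finding a cycle in a directed graph. Here is a minimal depth-first-search sketch of that check. The module paths are hypothetical, modeled on the shared-package and dashboard cycle described above, and this only illustrates the detection idea, not the Mermaid output or SocratiCode's own code.

```typescript
type ImportGraph = Record<string, string[]>;

// Returns one import cycle as a list of files (first === last), or null if none exists.
function findCycle(graph: ImportGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  function visit(file: string): string[] | null {
    if (state.get(file) === "done") return null;
    if (state.get(file) === "visiting") {
      // We walked back into a file that is still on the current path: a cycle.
      return stack.slice(stack.indexOf(file)).concat(file);
    }
    state.set(file, "visiting");
    stack.push(file);
    for (const dep of graph[file] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(file, "done");
    return null;
  }

  for (const file of Object.keys(graph)) {
    const cycle = visit(file);
    if (cycle) return cycle;
  }
  return null;
}

// Hypothetical module paths for illustration.
const exampleGraph: ImportGraph = {
  "packages/shared/plans.ts": ["apps/dashboard/lib/format.ts"],
  "apps/dashboard/lib/format.ts": ["packages/shared/plans.ts"],
  "apps/dashboard/app/page.tsx": ["apps/dashboard/lib/format.ts"],
};

console.log(findCycle(exampleGraph));
```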


How SocratiCode Works Under the Hood

For the technically curious, here’s the architecture:

  1. Indexing Phase: Scans every file in your project, chunks it intelligently (preserving function boundaries, imports, and exports), and generates embeddings using Ollama
  2. Storage: Chunks and embeddings are stored in a Qdrant vector database
  3. Graph Building: Static analysis (via ast-grep) maps import/export relationships between files and symbol-level call graphs
  4. Query Time: When you search semantically, the query is embedded and compared against the vector database. Results are ranked by relevance, and the graph is consulted for additional context
  5. Auto-Updates: A file watcher monitors for changes and re-indexes modified files automatically

MCP makes all of this available to any LLM client that supports the protocol, including clients for Claude, GPT, and others.
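The query-time step is the least intuitive part, so here is a toy version of it: embed the query, score every chunk by vector similarity, and keep the top results. This is a brute-force sketch that assumes cosine similarity as the metric. In a real setup the vector database performs this ranking, and the chunk IDs and 3-dimensional vectors below are made up for illustration.

```typescript
interface Chunk {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and return the k most similar.
function topK(query: number[], candidates: Chunk[], k: number): Chunk[] {
  return candidates
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy 3-dimensional vectors; real embedding models produce hundreds of dimensions.
const indexedChunks: Chunk[] = [
  { id: "billing.service.ts#createCheckoutSession", embedding: [0.9, 0.1, 0.0] },
  { id: "users.service.ts#findByClerkId", embedding: [0.1, 0.9, 0.1] },
  { id: "stripe-webhook.controller.ts#handleWebhook", embedding: [0.8, 0.2, 0.1] },
];
const queryEmbedding = [1, 0, 0];

// The two billing-related chunks rank above the user lookup.
console.log(topK(queryEmbedding, indexedChunks, 2).map((c) => c.id));
```

This is also why semantic search can connect files that share no keywords: ranking depends on how close the vectors are, not on matching text.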


Getting Started with SocratiCode in 5 Minutes

Prerequisites

  • Docker (for Qdrant vector database)
  • Ollama (for embeddings)
  • Node.js 18+

Step 1: Install and Start Infrastructure

docker run -d -p 6333:6333 qdrant/qdrant
ollama pull nomic-embed-text
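Before indexing, it can help to confirm that both services are reachable. The sketch below assumes the default local ports (Qdrant's REST API on 6333, as mapped in the command above, and Ollama on 11434) and probes one read-only endpoint on each. The `checkInfra` helper is hypothetical and not part of SocratiCode; it takes the HTTP function as a parameter so the sketch stays independent of any particular runtime.

```typescript
// Minimal HTTP GET signature so the sketch does not depend on a runtime's fetch types.
type HttpGet = (url: string) => Promise<{ ok: boolean }>;

interface InfraStatus {
  qdrant: boolean;
  ollama: boolean;
}

// Read-only endpoints on the default local ports (assumed defaults).
const QDRANT_URL = "http://localhost:6333/collections";
const OLLAMA_URL = "http://localhost:11434/api/tags";

// True if the endpoint answered with a success status, false on any failure.
async function isUp(get: HttpGet, url: string): Promise<boolean> {
  try {
    const response = await get(url);
    return response.ok;
  } catch {
    return false; // connection refused, DNS failure, and similar errors
  }
}

// Probe both services in parallel.
async function checkInfra(get: HttpGet): Promise<InfraStatus> {
  const [qdrant, ollama] = await Promise.all([
    isUp(get, QDRANT_URL),
    isUp(get, OLLAMA_URL),
  ]);
  return { qdrant, ollama };
}

// On Node.js 18+, which ships a global fetch:
//   checkInfra((url) => fetch(url)).then((status) => console.log(status));
```

If either flag comes back false, fix that service first; indexing needs both the vector store and the embedding model.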

Step 2: Index Your Project

In your project directory:

socraticode_codebase_index

Wait for the indexing to complete (check progress with socraticode_codebase_status).

Step 3: Enable Auto-Watch

socraticode_codebase_watch start

Your codebase is now live-indexed and responds to changes automatically.

Step 4: Start Searching

From your LLM, just describe what you need:

"Find where Stripe webhooks are handled"
"Show me the call graph for user registration"
"What files depend on the Subscription model?"
"List all symbols in the billing module"

Key Metrics Dashboard

Here’s a snapshot of what our indexed project looks like:

Metric | Value
Indexed files | 125
Code chunks | 2,549
Code graph nodes | 98
Code graph edges | 82
File watcher | Active
Index latency | Near real-time

This level of visibility means you always know the state of your codebase intelligence.


Real-World Use Cases Across Our Project

Here’s a sampling of how SocratiCode helped across specific modules:

Backend API (NestJS)

  • Semantic search for “JWT authentication middleware” → found Clerk webhook handler
  • Symbol lookup for upsertFromClerk → showed 3 callers and 1 callee
  • Impact analysis before changing Prisma schema → showed all affected services

Dashboard (Next.js)

  • Flow tracing from SubscriptionPage → revealed the Clerk ID / UUID mismatch bug
  • Graph query for sidebar-nav.tsx → confirmed no circular dependencies
  • Semantic search for “plan display component” → found the plan card rendering logic

Stripe Integration

  • Flow tracing from createCheckoutSession → mapped the entire billing flow
  • Impact analysis before adding userId to metadata → confirmed webhook handler compatibility
  • Context search for “invoice events” → found handleInvoicePaid handler

Conclusion: Why Every AI-Assisted Development Workflow Needs Codebase Intelligence

The future of software development isn’t about replacing developers with AI. It’s about augmenting them with the right context at the right time.

SocratiCode bridges the gap between what an LLM knows (general programming patterns) and what it needs to know (your specific codebase). Without it, every query consumes thousands of tokens just to understand the project. With it, the LLM gets targeted, relevant context in a single tool call.

The bottom line:

  • 79% fewer tokens per codebase query
  • 75% less time spent on investigation
  • 93% fewer files read per task
  • Zero context window overflows during complex analysis

Start Using SocratiCode on Your Codebase

If you’re building LLM-powered development workflows — whether with Claude, GPT, or any other model — you need codebase intelligence. And SocratiCode is the most practical, token-efficient way to get it.

Try it on your project today. Index your codebase, enable auto-watch, and watch your token costs plummet.

Frequently Asked Questions

What is SocratiCode, and how does it reduce LLM token usage in real-world projects?

SocratiCode is an MCP (Model Context Protocol) tool that indexes your entire codebase and exposes semantic search, symbol lookup, flow tracing, and impact analysis to your LLM in a single tool call. By letting the model fetch only the most relevant code chunks instead of reading many files, it cuts unnecessary context tokens and speeds up investigation across complex projects like the Flight Data Telemetry Platform.

How does SocratiCode improve developer productivity beyond token savings?

Beyond raw token reduction, SocratiCode prevents context window overflows, accelerates onboarding, and makes refactoring safer through codebase impact analysis. Developers can trace flows like Stripe checkout or subscription handling in seconds, read far fewer files per task, and gain confidence that changes will not break hidden dependencies.

How does SocratiCode work under the hood to index and search my codebase?

SocratiCode scans your project, chunks code intelligently, and generates embeddings (for example using Ollama) that it stores in a Qdrant vector database. It also builds a dependency and symbol graph with static analysis, then uses semantic similarity plus graph context at query time so LLMs can retrieve exactly the right code paths and related symbols.

How can I get started using SocratiCode with my NestJS or Next.js monorepo?

To start, you set up Docker (for Qdrant) and Ollama (for embeddings), then run the socraticode_codebase_index command from your project directory to index the codebase. After enabling socraticode_codebase_watch for live updates, you can call SocratiCode tools from your LLM chat to search for flows like “Stripe webhooks,” visualize graphs, or run impact analysis before refactors.

Is SocratiCode only useful for large enterprise codebases, or does it help smaller SaaS projects too?

SocratiCode delivered significant benefits on a single SaaS monorepo with 125 files, 2,549 chunks, and a mix of NestJS, Next.js, Stripe, and Clerk integrations. Even at this scale, which is typical of many early-stage SaaS projects, it reduced average investigation time per query from 12 minutes to 3 minutes and cut files read per task by 93%.

What infrastructure and tools do I need to integrate SocratiCode into my AI-assisted development workflow?

To use SocratiCode as described in this case study, you need:

  • Docker – to run the Qdrant vector database
  • Ollama – to generate embeddings (for example, nomic-embed-text)
  • Node.js 18+ – for the SocratiCode CLI and codebase indexer

Once installed, you index your project with socraticode_codebase_index, enable live updates with socraticode_codebase_watch start, and then query your codebase directly from your LLM using semantic search, flow tracing, and impact analysis tools.

Can I use OpenAI or other embedding APIs instead of Ollama with SocratiCode?

Yes, you can replace Ollama with other embedding APIs as long as they return dense vector embeddings compatible with your vector database (for example, Qdrant).

SocratiCode’s indexing pipeline is generic: it scans files, chunks code, generates embeddings, and stores them in Qdrant. The embedding step can be implemented with any provider that outputs vectors, such as:

  • OpenAI text-embedding-3-large or text-embedding-3-small
  • Cohere embed-english-v3.0 or embed-multilingual-v3.0
  • Voyage AI code-optimized models like voyage-code-2

You just need to plug your chosen provider into the embedding step instead of Ollama.

Why are alternative embedding models like OpenAI, Cohere, or Voyage good fits for SocratiCode?

These models are strong fits because:
1) They provide high-quality semantic embeddings that work well for both natural language and code, improving the relevance of semantic code search.
2) They support relatively long input lengths, which is important when embedding code chunks with context like imports, function signatures, and comments.
3) Code-optimized models like Voyage’s voyage-code-2 are specifically tuned for repository search and symbol retrieval, which aligns directly with SocratiCode’s use cases: semantic search, flow tracing, and impact analysis.

All of these improve the accuracy of SocratiCode’s ability to find the right code paths and related symbols for your LLM.

Is the token cost savings from SocratiCode worth it for small teams?

The direct token cost savings in this case study were about $5.57 per project, but the real value is in developer time.

SocratiCode reduced investigation time from around 30 hours to 7.5 hours, saving 22.5 developer hours on this project. At a typical developer rate of $75–150/hour, that works out to 22.5 hours × $75–150/hour = $1,687 to $3,375 saved per project.

For small teams, time savings are far more valuable than raw token cost reduction, especially when scaling across multiple projects or a year of development.

Categories: Technical, Product

Written by Sanjay Shankar

Sanjay Shankar: Program Manager & dev lead in Kerala. Writes on engineering, agentic AI & team culture at sanjayshankar.me
