Wednesday, 28 January 2026

Why Your AI Coworker Will Never Understand Your Code

The Uncomfortable Pattern Everyone's Ignoring

When Claude Code first started autocompleting my code with insane accuracy, I felt what every engineer feels: a flash of obsolescence. Here's a system that writes cleaner boilerplate than I do, recalls API signatures I've forgotten, and implements patterns faster than I can type them. Then I asked it to help with a custom threading synchronization algorithm I was designing - something that had never been written before - and watched it confidently generate a total mess.

The pattern repeats everywhere. LLMs write SQL queries brilliantly until you need to optimize for your specific data distribution. They explain React patterns perfectly until you're debugging a novel state management approach. They're simultaneously geniuses and confused grade schoolers, and everyone's pretending this jaggedness is just a scaling problem waiting to be solved.

Andrej Karpathy finally said what we've been avoiding: we're not building animals that learn from reality. We're summoning ghosts - digital entities distilled from humanity's text corpus, optimized for mimicry rather than understanding. And if Yann LeCun is right, the architecture itself guarantees they can never become anything else.

This isn't an argument to stop using LLMs. It's a recognition that the tools transforming software engineering today have a ceiling we need to see clearly, because your career decisions should account for what AI will never do, not just what it's learning to automate.

What Karpathy Actually Said About Ghosts

The metaphor landed because it captured an asymmetry everyone building with LLMs feels but struggles to articulate. Karpathy's framing is black and white: "today's frontier LLM research is not about building animals. It is about summoning ghosts."

The distinction is architectural. Animals learn through dynamic interaction with reality. The "AGI machine" concept envisions systems that form hypotheses, test them against the world, experience consequences, and adapt.

For animals, there's no massive pretraining stage of imitating internet webpages, and no supervised finetuning where actions are teleoperated by other agents.

Animals observe demonstrations but their actions emerge from consequences, not imitation.

Ghosts are different. They're "imperfect replicas, a kind of statistical distillation of humanity's documents with some sprinkle on top." 

The optimization pressure is fundamentally different: human neural nets evolved for tribal survival in jungle environments - optimization against physical reality with life-or-death consequences.

LLM neural nets are optimized for imitating human text, collecting rewards on math puzzles, and getting upvotes on LMSYS Arena. These pressures produce different species of intelligence.

This creates what Karpathy calls "jagged intelligence" - expertise that doesn't follow biological patterns because it wasn't shaped by biological constraints. 

An LLM can explain quantum field theory while failing basic common sense about physical objects. It writes elegant code for standard patterns while generating nonsense for novel architectures.

The jaggedness isn't a bug - it's the signature of learning from corpus statistics rather than reality.

LeCun's Mathematical Doom Argument

While Karpathy's ghost metaphor describes the phenomenology, Yann LeCun argues the architecture itself is mathematically doomed. His position isn't "we need better training" - it's "autoregressive generation can't work for genuine intelligence."

The core argument is this: imagine the space of all possible token sequences as a tree. Every token you generate has options - branches in the tree. 

Within this massive tree exists a much smaller subtree corresponding to "correct" answers. 

Now suppose there's a probability e that any given token takes you outside that correct subtree. Once you leave, you can't return - errors accumulate. The probability that a sequence of length n stays entirely correct is (1-e)^n.



This is exponential decay. Even if you make e small through training, you cannot eliminate it entirely. Over sufficiently long sequences, autoregressive generation inevitably diverges from correctness. You can delay the problem but you cannot solve it architecturally.
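
Under LeCun's simplifying assumption of a constant, independent per-token error rate, the decay is easy to see numerically. A quick sketch in Python:

def p_correct(e: float, n: int) -> float:
    # Probability an n-token sequence never leaves the "correct" subtree,
    # assuming each token independently errs with probability e.
    return (1 - e) ** n

for e in (0.001, 0.01, 0.05):
    for n in (100, 1_000, 10_000):
        print(f"e={e:<6} n={n:<6} P(still correct) = {p_correct(e, n):.6f}")

Even a 0.1% per-token error rate leaves only about 37% of 1,000-token sequences fully intact; at 1% it's roughly 0.004%. Making e smaller delays the collapse but never removes it - which is exactly LeCun's point, and exactly what his critics dispute by attacking the independence assumption.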

LeCun's critics point out the math assumes independent errors, which isn't true - modern LLMs use context to self-correct. They note that LLMs routinely generate coherent thousand-token responses, which seems impossible under exponential decay. Recent research shows errors concentrate at sparse "key tokens" (5-10% of total) representing critical semantic junctions, not uniformly across all tokens.

But LeCun's deeper point stands: the autoregressive constraint means sequential commitment without exploring alternatives before acting.

The Lookback vs Lookahead Distinction

To be precise about what "autoregressive" actually constrains: LLMs have full backward attention - at each token, the model attends to ALL previous tokens in the context. This is fundamental to how transformers work. They're constantly "looking back."

What they don't do is lookahead during generation:

Standard Autoregressive Generation:
Token 1: Generate → COMMIT (can't change later)
Token 2: Generate → COMMIT  
Token 3: Generate → COMMIT
...each decision is final upon generation

Compare this to search-based planning (like AlphaGo):

Consider move A → simulate outcome → score: 0.6
Consider move B → simulate outcome → score: 0.8  
Consider move C → simulate outcome → score: 0.4
Choose B (explored before committing)

Chess analogy: Standard LLM generation is like being forced to move immediately after seeing the board position, without considering "if I move here, opponent does this, then I..." Human planning involves internally simulating multiple futures before committing to action. Autoregressive generation commits token-by-token without exploring alternative continuations.
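
Here's a toy sketch of the difference in Python (illustrative code, not any real model's decoding API): next_tokens and score are stand-ins for the model's proposal and evaluation functions. The greedy version commits to the locally best token immediately; the lookahead version simulates a few tokens ahead before committing.

def greedy_decode(next_tokens, score, prefix, steps):
    # Standard autoregressive flavor: pick the locally best token and COMMIT.
    for _ in range(steps):
        prefix = max((prefix + [t] for t in next_tokens(prefix)), key=score)
    return prefix

def lookahead_decode(next_tokens, score, prefix, steps, depth=3):
    # Search flavor: judge each candidate by the best future it can lead to,
    # then commit - explore before acting, AlphaGo-style.
    def best_future(p, d):
        if d == 0 or not next_tokens(p):
            return score(p)
        return max(best_future(p + [t], d - 1) for t in next_tokens(p))
    for _ in range(steps):
        prefix = max((prefix + [t] for t in next_tokens(prefix)),
                     key=lambda p: best_future(p, depth - 1))
    return prefix

The greedy version mirrors standard generation; the lookahead version mirrors what search adds - evaluate branches internally, then act.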

What Will People Say to This Argument?

Modern techniques add planning on top of base generation. Chain-of-thought generates reasoning tokens first but still commits sequentially. 

Beam search keeps multiple candidates but is exponentially expensive for deep exploration. 

OpenAI's o1 reportedly uses tree search during inference, which IS genuine lookahead - a significant architectural addition beyond pure autoregressive generation.

LeCun's claim isn't that these improvements are impossible. It's that they're band-aids on an architecture that doesn't naturally support the kind of internal world simulation that characterizes animal intelligence.

Four Gaps That Can't Be Trained Away

LeCun identifies four characteristics of intelligent behavior that LLMs fundamentally lack: understanding the physical world, persistent memory, the ability to reason, and the ability to plan. But the deepest issue is what they're optimized for.

Consider what this means for scientific reasoning. Scientists don't generate hypotheses by pattern-matching previous hypotheses - they observe phenomena, form novel explanations, design experiments to falsify them, observe results that surprise them, and refine their models. 

Every step involves interaction with ground truth that can prove you wrong.

LLMs have no mechanism for this. 

  • They can't run an experiment and be surprised.
  • They can't observe results that contradict their predictions and update based on physical evidence.

Every token is inference from prior tokens in a corpus that only contains what was already discovered and written down. You cannot discover novel physics from a corpus that only contains known physics.

This explains why LLMs excel at code but struggle with physical reasoning. Code operates in "a universe that is limited, discrete, deterministic, and fully observable" - the state space is knowable and verification is programmatic. 

Physical reality is continuous, partially observable, probabilistic, and full of phenomena we haven't documented. 

Animals navigate this effortlessly because they learn directly from it. LLMs can only learn from our linguistic shadows of it.

When LeCun states "Auto-Regressive LLMs can't plan (and can't really reason)," he's not being provocative - he's describing an architectural constraint. 

Even chain-of-thought prompting doesn't fix this because it's "converting the planning task into a memory-based (approximate) retrieval." You're not teaching reasoning - you're teaching corpus-level pattern matching about what reasoning looks like. This is exactly why the prompt behaves like a hyperparameter, and why models are so sensitive to it.

The LLM learns to generate text that resembles reasoning steps because that's what appears in the training data, not because it's internally simulating multiple future scenarios and choosing the best path.

Why Code Works But Reality Doesn't

I've noticed an interesting pattern with GitHub Copilot/Claude Code. When implementing a standard REST API or writing React components, the suggestions are good - often exactly what I was about to type. 

When debugging a distributed systems issue or architecting a novel state management approach, the suggestions become actively unhelpful, confidently wrong in ways that would break production.

The difference isn't random. Standard patterns exist extensively in training corpora - GitHub is full of REST APIs and React components.

The LLM has seen thousands of implementations and learned the statistical regularities of how these patterns manifest in code.

It's not understanding the requirements and generating a solution; it's recognizing "this looks like a REST endpoint" and retrieving an approximate match from the distribution of REST endpoints in its training data.

For novel code that deviates from conventions, this breaks down. When I was building that custom thread synchronization, the models repeatedly failed because they kept pattern-matching to standard practices - adding defensive try-catch statements, turning a focused implementation into a bloated production framework.

They couldn't understand my actual intent because they don't understand intent at all. They understand corpus statistics.

This is why code works better than general reasoning for LLMs: code is verifiable, domains are closed, and common patterns dominate the training data. You can build benchmarks with programmatic correct answers. 

You can use Reinforcement Learning from Verifiable Rewards (RLVR) because verification is automatic. But this success doesn't generalize to open-ended domains where ground truth isn't programmatically checkable.
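
That's what "verification is automatic" looks like in practice: the reward comes from executing the output, not from how much it resembles the corpus. A minimal sketch of a verifiable reward (illustrative only, not any specific RLVR framework):

import os
import subprocess
import tempfile

def verifiable_reward(generated_code: str, test_code: str) -> float:
    # Reward is 1.0 if the generated code passes its tests, else 0.0.
    # Ground truth comes from running the program, not from corpus similarity.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "candidate.py")
        with open(path, "w") as f:
            f.write(generated_code + "\n\n" + test_code)
        result = subprocess.run(["python", path], capture_output=True, timeout=30)
        return 1.0 if result.returncode == 0 else 0.0

There is no equivalent function for "is this distributed-systems design sound" or "is this hypothesis about the physical world correct" - nothing to execute, nothing to return a reward.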

The strategic question for engineers: which parts of your work are "standard patterns well-represented in training corpora" versus "novel architecture requiring genuine understanding"?

The first category is being rapidly automated at 100X speed. The second isn't just hard for current LLMs - it may be architecturally impossible for them.

What Animals Have That Ghosts Never Will

LeCun's alternative to LLMs is Joint Embedding Predictive Architecture (JEPA), which inverts the paradigm entirely. Instead of predicting next token in pixel/word space, JEPA learns to predict in latent representation space - building an internal world model that captures structural regularities while ignoring unpredictable details.

Key insight: most of reality's information is noise. When you watch a video of someone throwing a ball, the exact trajectory is predictable from physics but the precise pixel values (lighting, shadows, texture) contain high entropy. 

Generative models waste capacity modeling all this unpredictable detail. JEPA learns representations that "choose to ignore details of the inputs that are not easily predictable" and focus on "low-entropy, structural aspects" - like the parabolic arc, not the exact RGB values.
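
The contrast is easiest to see in the loss functions. A heavily simplified sketch in PyTorch-style Python (the real V-JEPA uses masking and an EMA target encoder, so treat this as the idea, not Meta's implementation):

import torch
import torch.nn.functional as F

def generative_loss(encoder, predictor, decoder, context_frame, target_frame):
    # Generative route: reconstruct every pixel of the next frame, so capacity
    # gets spent on unpredictable detail - lighting, texture, sensor noise.
    predicted_pixels = decoder(predictor(encoder(context_frame)))
    return F.mse_loss(predicted_pixels, target_frame)

def jepa_style_loss(encoder, predictor, context_frame, target_frame):
    # JEPA route: predict the target's latent representation, not its pixels.
    # The encoder is free to discard details that aren't predictable.
    ctx_repr = encoder(context_frame)
    with torch.no_grad():
        tgt_repr = encoder(target_frame)  # real JEPA uses an EMA copy here
    return F.mse_loss(predictor(ctx_repr), tgt_repr)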

This mirrors biological learning. An infant knocking objects off a table learns gravity not by memorizing pixel sequences but by building an abstract model: "objects fall downward." 

The model ignores irrelevant details (color, texture, lighting) and captures the physical law. 

No books required, no 170,000 years of reading - just observation and interaction.

Meta's V-JEPA demonstrates this works. When tested on physics violations (objects floating mid-air, collisions with impossible outcomes), it showed higher surprise/prediction error than state-of-the-art generative models. 

It acquired common-sense physics from raw video by building an actual world model, not by memorizing corpus statistics about how people describe physics.

The architectural difference matters because it determines what's learnable. LLMs learn "what humans wrote about the world" - a tiny, biased, lossy compression.



JEPA-style models can learn "how the world actually works" through observation. The first hits a data ceiling when you've processed all available text. The second has access to reality's infinite bandwidth.



Architecture Is The Constraint

LeCun's prediction is bold: "within three to five years, no one in their right mind would use" autoregressive LLMs. 

His position is that better systems will appear "but they will be based on different principles. They will not be auto-regressive LLMs."

This isn't incrementalism. It's claiming the entire paradigm is a dead end for genuine intelligence. The reason is architectural: 

LLMs and humans play by completely different rules. One is a master of compression, and the other is a master of adaptation.

Simply feeding more data to "this compression beast" will only make it bigger and stronger, but it won't make it evolve into an adaptive hunter.

Consider what this means for the current race toward AGI via scaling. If LeCun is right, we're optimizing along a dimension that can't reach the target. Better compression of human text gets you better mimicry, not understanding. 

Larger context windows let you mimic longer documents, not think longer thoughts. 

RLVR on verifiable domains like code and math creates better pattern-matchers for those domains, not general reasoners.

The counterargument sounds convincing: LLMs keep surprising us with emergent capabilities. GPT-4 does things GPT-3 couldn't, and Claude Sonnet 4 does things GPT-4 struggled with.

Maybe there's no architectural ceiling, just insufficient scale. Maybe chain-of-thought reasoning plus tool use plus larger context windows eventually produces something indistinguishable from genuine intelligence.

LeCun's response: show me the world model. Show me the system that can watch a video, form a novel hypothesis about what happens next, be surprised when it's wrong, and update its model of reality.

 Autoregressive text generation can't do this by construction - it has no mechanism for ground truth interaction, no ability to be surprised by reality rather than corpus statistics.

What This Actually Means For Your Career

The practical implication isn't abandoning LLMs - they're extraordinarily useful for what they actually are. It's recognizing their ceiling so your skill development accounts for what AI will never automate.

Here's the framework: 

LLMs excel at problems where (1) the solution space is well-represented in training corpora, (2) verification is possible through execution or programmatic checking, and (3) the domain is closed and discrete.

They struggle where (1) the problem is genuinely novel, (2) correctness requires understanding beyond pattern-matching, or (3) the domain involves continuous physical reality or open-ended reasoning.

This creates a clear dividing line in engineering work:

Automatable (pattern-matching sufficient): Standard CRUD implementations, boilerplate reduction, API integration following documentation, test generation for known patterns, code explanation and documentation, refactoring for style consistency.

Not automatable (understanding required): Novel algorithm design, distributed systems debugging with emergent behavior, performance optimization for specific workload characteristics, architectural decisions balancing tradeoffs, security reasoning about attack surfaces, integration of fundamentally new technologies.

The difference isn't difficulty - it's whether success requires recognizing patterns in existing code versus forming and testing novel hypotheses about system behavior. 

One is corpus retrieval, the other is scientific method.

For career strategy, this suggests investing in skills that require building world models: understanding how systems actually behave under load, why certain architectural patterns create subtle failure modes, what tradeoffs matter for your specific context. 

These aren't pattern-matching problems. They're "I need to understand this system well enough to predict what happens in scenarios I haven't seen."

The engineers who thrive won't be those who resist AI tools. They'll be those who understand exactly which problems LLMs can solve (letting them automate aggressively) versus which problems require genuine understanding (where they need to think like scientists, not pattern-matchers). 

The tools are getting better at mimicry. But mimicry isn't understanding, and the architecture guarantees it never will be.

If Karpathy's right that we're summoning ghosts, and LeCun's right that ghosts can never become animals, then the question isn't "how do I compete with AI." 

It's "which problems require animal intelligence?" Those problems aren't going away. They might be the only ones that matter.

Friday, 23 January 2026

Intelligence Has Become a Commodity

Why Apple's $1B Intelligence Rental Is Actually Brilliant

Apple recently announced they're paying Google approximately $1 billion annually to power Siri with Gemini. It might look like Apple admitting defeat in AI.

That reading gets it exactly backwards.

Apple just confirmed what the smartest companies already know: If you have distribution and trust, renting intelligence is the power move. And Meta's desperate open-sourcing of Llama proves what happens when you lack the moat that makes renting viable.



The Deal Everyone Is Misreading

Apple looked at building a competitive foundation model in-house. The real cost:

  • $10-20 billion over 5 years
  • Hundreds of ML researchers (competing with Google, OpenAI, Anthropic for talent)
  • Building training infrastructure from scratch
  • 3-5 years to catch up to Google's decade-long DeepMind advantage

Apple's response: "We'll pay you a billion a year to skip all that."

This isn't weakness. It is refusing to fight a multi-billion dollar battle in territory where you have zero advantage.

For $1 billion annually - 0.25% of their revenue, less than they spend on store design - Apple bought:

Optionality without commitment. If Google's models fall behind, switch to Anthropic, OpenAI, or whoever wins next generation. The integration layer (Private Cloud Compute, iOS hooks) works with any sufficiently capable model.

Speed without technical debt. Ship AI features this year instead of 2028. No sunk costs, no research teams to maintain, no infrastructure to depreciate.

Competitive intelligence. By being Google's customer, Apple sees exactly what's possible with current models, what's improving, what's still broken. Learning the constraints without paying for research.

But look at what they did not rent: the relationship with 2 billion iOS device owners. The App Store monopoly developers can't escape. The hardware-software integration that took twenty years to build. The premium pricing power from brand trust.

Apple is renting the commodity and owning the moat.

What Meta's Strategy Actually Teaches

Meta spent billions building Llama, then gave it away for free. Open source. No licensing fees. Available to anyone, including direct competitors.

This isn't generosity. This is a calculated move to prevent moats from forming in the intelligence layer.

Meta absolutely COULD rent models like Apple does - nothing stops them from using Claude, GPT, or Gemini. They have the money, the distribution, 3 billion users. But at their scale, with their existing compute infrastructure and data, building is actually cost-effective.

The strategic choice isn't BUILD vs RENT. It's what they did AFTER building: they gave it away for free.

Here's why: Meta's actual business is advertising on social feeds. The nightmare scenario is Google or OpenAI building such a strong moat in AI that it becomes a competitive bottleneck. Imagine if accessing frontier AI required paying Google, and Google could prioritize their own products or charge Meta premium rates.

By open-sourcing Llama, Meta is trying to: Commoditize intelligence to prevent anyone from building a moat there.

If Llama is free and good enough, then intelligence can't become a competitive advantage. If it's free, Google can't charge Apple $1 billion for preferential access. If it's free, startups can't build AI-native products that threaten Meta's ad business without Meta having access to the same capabilities.

This is a deliberate choice about WHERE moats should form. Meta is saying: "We're fine with moats in distribution, in user relationships, in advertising infrastructure. But intelligence itself? We need that to stay commodity."

Uncomfortable Insight

The Apple-Meta comparison reveals something simple: different companies want moats in different layers.

Apple can rent because their moat is in ecosystem and distribution:

  • 2 billion devices creating lock-in
  • An ecosystem developers can't afford to leave
  • Premium pricing power from brand trust
  • Hardware integration creating switching costs

For Apple, intelligence being commodity is PERFECT. It means they can access the best models without building research teams, and their real advantages (ecosystem control, user trust) remain defensible.

Meta builds and open sources because they want to PREVENT moats from forming in intelligence:

  • They have the scale and infrastructure to build cost-effectively
  • They need intelligence to stay commodity to protect their ad business
  • They can't afford Google or OpenAI controlling access to frontier AI

Apple pays $1 billion to rent what Meta gives away for free. But they're not in the same strategic position - they're pursuing opposite goals.

Apple wants intelligence to be rented infrastructure (keeps it commodity, lets them focus on ecosystem). Meta wants intelligence to be free infrastructure (keeps it commodity, prevents competitors from building moats there).

Both strategies treat intelligence as commodity. The difference is how they achieve that outcome.

Pattern Across Big Tech

Amazon is doing the Apple strategy at scale. AWS Bedrock hosts everyone's models - Claude, Llama, Cohere, their own Titan. They don't care who wins the model race because infrastructure is their moat.

Google is the only one playing both sides profitably. They sell to Apple ($1B/year), power their own products, AND offer Vertex AI to enterprises. But even Google's strategy depends on their search monopoly - the intelligence itself is just one revenue stream. To some extent, they're also thinking like Meta.

Companies that RENT to keep intelligence commodity:

  • Apple (ecosystem lock-in is the moat)
  • Amazon (infrastructure dominance is the moat)
  • Enterprise SaaS with strong moats (Salesforce, Adobe)

These companies WANT intelligence to be rented commodity infrastructure. It protects their actual moats.

Companies that BUILD then OPEN SOURCE to keep intelligence commodity:

  • Meta (prevent competitors from building intelligence moats)
  • Mistral (European AI sovereignty positioning)

These companies spend billions building, then give it away to prevent moats from forming in the intelligence layer.

Companies trying to BUILD moats IN intelligence:

  • OpenAI (model quality as primary differentiation)
  • Anthropic (model quality + safety positioning)


Ecosystem Defense Through Rental

Apple's strategy is more sophisticated than just "rent the AI." They're using rented intelligence to strengthen their ecosystem while avoiding the sunk cost trap that kills tech giants.

The playbook:

  1. Rent frontier models to ship competitive AI features fast
  2. Build the integration layer in-house (Private Cloud Compute, iOS hooks)
  3. Own the user relationship and trust (privacy positioning)
  4. Let model providers compete for their business
  5. Switch providers when someone gets better

Every AI feature makes iOS more valuable. Every AI integration makes it harder to leave Apple's ecosystem. But none of it requires winning the model training arms race.

Compare Meta's position. They spend billions on Llama to:

  1. Prevent Google/OpenAI from building intelligence moats
  2. Protect their ad business from AI disruption
  3. Maybe get PR credit for "open source leadership"

One company strengthens their moat. The other desperately tries to prevent competitors from building one.

The Story for the Model Builders

Apple's and Meta's strategies are optimized for different strategic contexts.

Companies trying to build proprietary moats IN intelligence itself (OpenAI, Anthropic) are betting against both Apple AND Meta's preferred outcome.

If either Apple or Meta succeeds in keeping intelligence commodity - whether through competitive rental markets or open source proliferation - then intelligence itself can't be a sustainable moat.

The smartest strategic question isn't "should I build or rent intelligence?"

It's "where do I want moats to form, and does my intelligence strategy support or undermine that goal?"

Thursday, 15 January 2026

Audacious Plan to Train AI in Space

What happens when humanity's appetite for artificial intelligence outpaces our planet's ability to feed it? Someone decided the answer was "leave the planet."

In December 2025, something delightfully absurd happened 325 kilometers above Earth. A 60-kilogram satellite named Starcloud-1, carrying an NVIDIA H100 GPU, trained an AI model on the complete works of Shakespeare. The model learned to speak in Shakespearean English while orbiting our planet at 7.8 kilometers per second.

"To compute, or not to compute"—apparently, in space, the answer is always "compute."

This wasn't a publicity stunt. It was a proof of concept for what might be the boldest infrastructure play in the history of computing: moving AI training off Earth entirely.

First reaction: "This is insane. I should buy more chip stocks!"

Second reaction, after reading the physics: "Wait, this might actually work."





Dirty Secret Nobody Wants to Talk About at AI Conferences

Here's something the AI industry prefers to whisper about over drinks rather than announce on keynote stages: we're running out of power. Not in some distant climate-apocalypse scenario. Now. Today. While you're reading this.

The numbers read like a horror story for grid operators:

Data centers consumed approximately 415 terawatt-hours of electricity globally in 2024—roughly 1.5% of all electricity generated on Earth. By 2030, that figure is projected to more than double to 945 TWh. That's Japan's entire annual electricity consumption. For computers. Training models to argue about whether a hot dog is a sandwich.

Virginia's data centers alone consume 26% of the state's electricity. Imagine explaining to your neighbors that their brownouts are because someone needed to train a chatbot to write better cover letters.

But here's where it gets truly uncomfortable: to train the next generation of frontier models—think GPT-6 or whatever Claude's grandchildren will be called—we'll need multi-gigawatt clusters.

A 5 GW data center would exceed the capacity of the largest power plant in the United States. These clusters don't exist because they can't exist with current terrestrial infrastructure.

The breakthrough isn't coming from terrestrial solar panels, and fusion reactors are 10 to 20+ years away.

It might come from the one place where solar works really, really well.

325 kilometers straight up.

The Physics of Space: Nature's Cheat Codes

Starcloud's white paper makes a case that initially sounds like venture capital science fiction. But then you check the physics, and... huh. It actually works. Let me break down, following the paper, why space is basically running a different game engine than Earth.

Cheat Code #1: Infinite Solar Energy (Seriously)

Solar panels in Earth orbit receive unfiltered sunlight 24/7. No atmosphere absorbing photons. No weather. No pesky night cycle if you pick the right orbit. A dawn-dusk sun-synchronous orbit keeps a spacecraft perpetually riding the terminator line between day and night—eternal golden hour, but for electricity.

The capacity factor of space-based solar exceeds 95%, compared to a median of 24% for terrestrial installations in the US. The same solar array generates more than 5X the energy in orbit that it would on your roof - the kind of multiplier we otherwise only get from Claude Code. :-)
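
The 5X figure is just two effects stacking: a higher capacity factor and brighter sunlight. A rough back-of-the-envelope in Python with textbook constants (illustrative, not the white paper's exact model):

ORBIT_IRRADIANCE = 1361    # W/m^2, solar constant above the atmosphere
GROUND_IRRADIANCE = 1000   # W/m^2, the standard rating condition for panels
CF_ORBIT, CF_GROUND = 0.95, 0.24   # capacity factors quoted above

ratio = (ORBIT_IRRADIANCE * CF_ORBIT) / (GROUND_IRRADIANCE * CF_GROUND)
print(f"Energy per square meter of panel, orbit vs rooftop: ~{ratio:.1f}x")  # ~5.4x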


Cheat Code #2: The Universe's Free Air Conditioning

Deep space is cold. Like, really cold. Cosmic microwave background sits at approximately -270°C. A simple black radiator plate held at room temperature will shed heat into that infinite cold at approximately 633 watts per square meter.

The cooling story is very different on Earth versus in space.

Earth Cooling:
Evaporative cooling towers consuming billions of gallons of water. Chillers running 24/7. 
Microsoft literally sinking servers in the ocean like some kind of tech burial at sea.

Space Cooling:

Point a black plate at the void. Wait. Physics does the rest. No water. No chillers.

Just thermodynamics being thermodynamic.
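
"Physics does the rest" here is the Stefan-Boltzmann law: a radiator sheds heat proportional to the fourth power of its temperature, and the ~3 K background sends essentially nothing back. A quick sketch (the exact W/m^2 depends on radiator temperature, emissivity, and whether both faces radiate, which is why quoted figures vary):

SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W / (m^2 * K^4)

def radiated_w_per_m2(t_radiator_k, t_background_k=3.0, emissivity=0.95, faces=1):
    # Net power shed into deep space per square meter of radiator plate.
    return faces * emissivity * SIGMA * (t_radiator_k ** 4 - t_background_k ** 4)

print(radiated_w_per_m2(293))           # ~400 W/m^2 for one face at 20 C
print(radiated_w_per_m2(330, faces=2))  # ~1,300 W/m^2 for a hotter, two-sided radiator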


Cheat Code #3: No Land-Use Law in Orbit

Perhaps the most underrated advantage. On Earth, large-scale energy and infrastructure projects routinely take a decade or more to complete due to environmental reviews, utility negotiations, zoning battles, and that one guy at every town hall meeting who's convinced 5G causes migraines.

In space? You dock another module and keep building. When xAI had to resort to natural gas generators for their Memphis cluster because the grid wasn't ready, they weren't just solving a technical problem—they were demonstrating the bureaucratic fragility of terrestrial infrastructure.

What the Cost Math Looks Like

Starcloud's white paper presents this comparison for a 40 MW data center operated over 10 years:

Terrestrial 10-Year Cost:
  • Energy: ~$140M (@ $0.04/kWh)
  • Land, permits, cooling infrastructure
  • Water, maintenance, grid upgrades
  • Total: ~$167 million+

Space 10-Year Cost:
  • Solar array: ~$2M
  • Launch: ~$5M (next-gen vehicles)
  • Radiation shielding: ~$1.2M
  • Total: ~$8 million

That's a 20x difference, driven almost entirely by energy costs.
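
The energy line item is easy to reproduce from the stated assumptions:

POWER_MW = 40
PRICE_PER_KWH = 0.04             # dollars, as assumed above
HOURS_IN_10_YEARS = 24 * 365 * 10

energy_kwh = POWER_MW * 1_000 * HOURS_IN_10_YEARS        # ~3.5 billion kWh
print(f"10-year energy bill: ~${energy_kwh * PRICE_PER_KWH / 1e6:.0f}M")  # ~$140M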

Now, before you start a space data center SPAC, let's be honest about what this analysis conveniently ignores: the actual compute hardware. 40 MW of GPU capacity costs somewhere in the neighborhood of $12-13 billion. That's... a lot of billions.

But here's the thing: you pay for that hardware whether it's sitting in a concrete bunker in Iowa or floating above the atmosphere. The operational cost delta remains. And as models get larger and training runs stretch from weeks to months, that delta compounds like the most patient venture capitalist in history.

What This Means for Everyone Betting Billions on the Ground Datacenter

If orbital data centers become economically viable at scale, a few things change.

Hyperscaler Dilemma

 Microsoft, Google, Amazon, and Meta have collectively committed over $200 billion in capital expenditure on terrestrial data center infrastructure. These are sunk costs with multi-decade payback periods. Do they pivot to space and write off billions? Do they wait and risk being leapfrogged? The prisoner's dilemma dynamics here are brutal. Someone will defect first.

Sovereign AI Gets Complicated

Countries racing to build domestic AI capabilities have assumed the limiting factor is talent and chips. If it turns out the limiting factor is energy, and the solution is orbital infrastructure, the competitive landscape shifts dramatically. Quick: who controls orbital launch capacity? Who can deploy and maintain space-based infrastructure? These aren't questions most national AI strategies have seriously considered.

Environmental Narrative Flips

Right now, AI's carbon footprint is a vulnerability—a PR problem and increasingly a regulatory target. Orbital data centers, powered entirely by solar energy and requiring no water for cooling, transform AI infrastructure from environmental liability to potential climate solution. That's a narrative shift worth billions in avoided regulatory friction alone.


Design Principles That Actually Matter

What makes Starcloud's approach interesting isn't just "put computers in space"—it's how they're thinking about building something that can survive and scale in a hostile environment. 

There's some genuine distributed systems wisdom here:

Modularity: Everything is designed to be added, replaced, or abandoned independently. No single-point-of-failure architecture. This is microservices thinking applied to hardware, which is either brilliant or terrifying depending on your ops experience.

Incremental Scalability: You don't build a 5 GW space station and pray it works. You launch 40 MW modules, validate they function, scale up. It's the same philosophy that made AWS successful: don't bet everything on one deployment.

Failure Resiliency: In space, you can't send a technician. Components will fail. The system has to route around damage like the internet was originally designed to route around nuclear attacks. Graceful degradation isn't optional—it's existential.

Ease of Maintenance: Or rather, the complete absence of it. Everything has to be either radiation-hardened enough to outlast its usefulness, or cheap enough to abandon. There's no middle ground.


Thursday, 8 January 2026

The Circle Game: Why Everyone's Wrong About AI's Circular Finance

The financial press is having a panic attack about "circular finance" in AI. Nvidia invests in OpenAI, OpenAI buys Nvidia chips. Microsoft funds Anthropic, Anthropic rents Microsoft Azure. Oracle backs AI labs, labs fill Oracle datacenters.

"This is a bubble!" they cry. "Circular financing!" they warn. "Just like the dot-com crash!"

But here's what nobody is telling you: this is literally how banking has worked for centuries.

And banks are rewarded for it.



Let me explain why the "circular finance" criticism reveals more about financial illiteracy than about AI sustainability.

How Banking Actually Works: The Original Circle

Let's start with what everyone accepts as normal, healthy finance:

Traditional Business Loan:

  1. Bank lends you $500,000 to start a restaurant
  2. You use that $500,000 to buy equipment, inventory, lease space
  3. You operate the restaurant and generate revenue
  4. You pay the bank back $600,000 over 5 years (principal + interest)
  5. The bank's balance sheet grows by $100,000

Wait, isn't this circular?

Bank gives you money. You spend that money. You make money from what you bought. You give money back to the bank. Same money just moved in a circle, but the bank's numbers went up.

Nobody calls this a "circular finance scheme" or worries it's unsustainable. Why? Because we understand the mechanism:

  • Bank provided capital
  • Capital bought productive assets (kitchen equipment, inventory, house, education loan, corporate loan)
  • Productive assets generated revenue
  • Revenue exceeded costs
  • Bank captured a portion of that value creation (interest)

Productive finance. Money circulates, but value is created in the process. The bank's growing balance sheet reflects real economic activity, not financial engineering.

How AI Finance Actually Works: The Same Circle

Now let's look at what everyone's panicking about:

AI Lab Financing:

  1. Nvidia invests/lends $50B to an AI lab
  2. AI lab uses $50B to buy Nvidia chips and datacenter capacity
  3. AI lab builds models and generates revenue from AI services
  4. AI lab pays Nvidia back through revenue, equity appreciation, or future purchases
  5. Nvidia's balance sheet grows

This is the exact same structure as banking.

Replace "bank" with "Nvidia" and "restaurant equipment" with "AI chips" and you have the identical circular flow:

  • Nvidia provides capital
  • Capital buys productive assets (GPUs, compute infrastructure)
  • Productive assets generate revenue (AI services, API calls, subscriptions)
  • Revenue exceeds costs (hopefully)
  • Nvidia captures a portion of that value creation (returns, revenue)

Money circulates. Nvidia's numbers go up. But if the AI services generate real revenue from real customers paying real money, this is productive finance—not a house of cards.

Why the Circle Works (When It Works)

Both banking circular finance and AI circular finance succeed under the same conditions:

1. Productive Asset Purchase

Banking: Loan buys equipment that produces valuable goods/services 

AI: Investment buys chips that produce valuable AI capabilities

2. Real Customer Demand

Banking: Customers pay for restaurant meals, manufactured goods, services 

AI: Customers pay for AI capabilities, productivity gains, automation

3. Revenue > Costs

Banking: Business generates enough revenue to cover operations + loan repayment 

AI: AI lab generates enough revenue to cover compute costs + investor returns

4. Risk-Adjusted Returns

Banking: Bank charges interest rate that compensates for default risk 

AI: Nvidia prices investments/chips to compensate for business risk

When these conditions hold, the circle is sustainable. When they don't, it collapses—whether you're lending to restaurants or AI labs.

Real Question Isn't "Is It Circular?" But "Is It Productive?"

The entire circular finance critique misses the fundamental question: are the acquired assets generating real value?

Bad circular finance (dot-com era):

  • Lucent lent money to telecom companies
  • Telecom companies bought Lucent equipment
  • Equipment sat unused because demand was overestimated
  • Telecoms couldn't generate revenue to pay back loans
  • Lucent wrote off billions, 47 carriers went bankrupt
  • The circle broke because the assets weren't productive

Good circular finance (banking forever):

  • Bank lends money to businesses
  • Businesses buy productive equipment
  • Equipment generates goods/services customers want
  • Revenue pays back loans
  • Everyone prospers
  • The circle continues because the assets ARE productive

AI circular finance (TBD):

  • Nvidia/Oracle/Microsoft fund AI labs
  • Labs buy compute infrastructure
  • Infrastructure produces AI capabilities
  • Customers pay for those capabilities (or don't)
  • Revenue justifies infrastructure costs (or doesn't)
  • The circle continues IF the assets are productive

What the Numbers Actually Show

Let's look at what OpenAI's commitments actually mean:

The "Scary" Numbers:

  • OpenAI revenue projection 2025: $13B
  • Infrastructure commitments: $300B (Oracle) + $90B (AMD) + $38B (AWS) = $428B
  • Ratio: 33x annual revenue in commitments

But here's the context nobody mentions:

These are multi-year commitments, not immediate spending. Spread over 10 years, that's roughly $42.8B/year. If OpenAI grows revenue at 50% annually (slower than recent growth), they reach roughly $99B in annual revenue by year 5.
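
The compounding is worth spelling out, because it's the whole argument:

revenue_b = 13.0   # OpenAI's 2025 revenue projection, in $B
for year in range(1, 6):
    revenue_b *= 1.5            # the assumed 50% annual growth
    print(f"Year {year}: ~${revenue_b:.0f}B")
# Year 3 (~$44B) already covers the ~$42.8B/year commitment run-rate;
# year 5 lands near $99B.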

Compare to restaurant financing:

  • Restaurant revenue Year 1: $500k
  • Bank loan: $500k (1x annual revenue)
  • Equipment lifespan: 10 years
  • If restaurant grows 20%/year, loan becomes 0.16x revenue by year 10

The structure is identical. The only question is: Will AI services revenue grow fast enough to justify infrastructure investment?

That's not a question about circular finance. That's a question about market demand and business fundamentals.

Why Nvidia Isn't "Playing Bank" Wrong

Critics say Nvidia is taking excessive risk by being both investor and supplier. But this is actually standard practice in capital-intensive industries:

Equipment Financing Examples:

  • John Deere Financial lends money to farmers to buy John Deere tractors
  • Caterpillar Financial lends money to construction companies to buy Caterpillar equipment
  • Boeing Capital lends money to airlines to buy Boeing planes
  • Tesla provides financing for Tesla solar installations

In every case:

  1. Manufacturer has capital
  2. Manufacturer lends to customers
  3. Customers buy manufacturer's products
  4. Products generate revenue for customers
  5. Customers pay back manufacturer
  6. Manufacturer's business grows

This is called vendor financing, and it's normal.

Nvidia providing capital for customers to buy Nvidia chips is the AI equivalent of John Deere financing farmers to buy tractors. The circle works if tractors generate farming revenue and chips generate AI service revenue.

The Real Risks (Which Have Nothing to Do With Circularity)

I'm not saying AI finance is risk-free. The real risks are:

1. Demand Risk

Will enterprise customers pay enough for AI services to justify infrastructure costs? This is the same risk restaurants face—will customers pay enough for meals to justify kitchen equipment costs?

2. Competition Risk

Will AI service margins get compressed by competition? This is the same risk any business faces when competitors enter the market.

3. Technology Risk

Will current infrastructure become obsolete before generating sufficient returns? This is the same risk farmers face when buying tractors that might be superseded by better equipment.

4. Execution Risk

Will AI labs successfully build profitable businesses? This is the same risk any loan carries—will the borrower execute their business plan?

None of these risks are about circular finance. They're standard business risks that exist in any capital-intensive industry.

Why the Criticism Persists

If AI circular finance is structurally identical to banking and vendor financing, why is everyone panicking?

Three reasons:

1. Scale Shock

The numbers are enormous. $428B in commitments sounds scary. But in context, it's not crazy:

  • Global banking assets: $180 trillion
  • Amazon's capital investments 2019-2023: $240B
  • Meta's planned datacenter spend: Similar scale
  • AI infrastructure is capital-intensive, like all infrastructure

2. Speed Shock

AI scaled incredibly fast. OpenAI went from research lab to $13B revenue in ~5 years. Traditional businesses take decades to reach this scale, so the financing mechanisms feel rushed.

3. Misunderstanding "Circular"

People see money flowing in a circle and assume it's artificial. But ALL productive finance is circular—money flows from capital providers to businesses to customers back to capital providers. That's called an economy.

The Actual Test

Whether AI circular finance succeeds or fails will depend on one question:

Do AI services generate enough customer value to justify the infrastructure investment?

If yes:

  • Labs make revenue from real customers
  • Revenue pays infrastructure providers
  • Infrastructure providers profit
  • Circle continues sustainably
  • Just like banking for centuries

If no:

  • Labs can't generate sufficient revenue
  • Infrastructure providers don't get paid back
  • Investments written off
  • Circle breaks
  • Just like failed business loans

This isn't novel financial engineering. It's not a bubble indicator. It's not concerning because it's circular.

It's just finance. The same finance that's funded every capital-intensive industry from railroads to restaurants to rental cars.

What This Really Reveals

The "circular finance" panic reveals something uncomfortable: Most people don't understand how productive finance works.

They see money flowing in circles and panic because they think money should flow in straight lines. But productive economies have ALWAYS been circular:

  • Banks lend to businesses
  • Businesses buy equipment
  • Equipment generates goods
  • Customers buy goods
  • Revenue pays back banks
  • Banks lend more

The circle is a feature, not a bug. It's called economic growth.

The question isn't "Is AI finance circular?" (Yes, like all finance)

The question is "Is AI infrastructure productive?" (Are customers willing to pay for AI services?)

If you're worried about AI circular finance, ask yourself: Are you also worried about every business loan a bank makes? Because structurally, they're the same thing.

If the answer is no, then your concern isn't about circular finance. It's about whether you believe AI services will generate sufficient customer demand.

That's a valid concern. But let's be honest about what we're actually debating.

The Bottom Line

Banks have done circular finance for centuries and we reward them for it. They lend money, borrowers buy productive assets, assets generate revenue, revenue pays back loans, banks profit.

Nvidia is doing the same thing with AI labs. They provide capital, labs buy chips, chips generate AI services, services generate revenue, revenue flows back to Nvidia.

The structure is identical. The mechanism is identical. The only question is execution: Will AI services generate enough demand?

If yes, this is productive finance that creates value. If no, it's failed investment that destroys value.

But calling it "circular finance" as if that's inherently problematic just reveals you don't understand how banking—or business—has ever worked.

The circle isn't the problem. The circle is how capitalism functions.



Thursday, 13 November 2025

The Agentic Commerce War: A Car on Bumpy Roads

 There's a lawsuit that should make every engineer building AI applications pause and think carefully about the world they're creating. Amazon is suing Perplexity AI, and while the legal complaint talks about "covert access" and "computer fraud," what's really happening is far more interesting: we're watching the first shots fired in a war over who gets to control the future of commerce.

"AI revolution" in shopping is probably going to make markets less competitive, not more. Let me explain why.




The Pattern-Matching Disguised as Innovation



We've seen this movie before. In the 2000s, it was about who controls app distribution. Apple and Google built "open" platforms, welcomed developers, then extracted 30% rent from everyone. In the 2010s, it was about who controls attention. Facebook and Google became the gatekeepers to your customers, then jacked up ad prices once you were dependent on them.

Now we're in the 2020s, and the game is about who controls shopping intent. The technology changed—from apps to ads to AI agents—but the fundamental power dynamics remain depressingly familiar.

Agentic commerce is the fancy term for AI systems that can shop on your behalf. Tell ChatGPT you need running shoes under $100, and it searches stores, compares options, and completes the purchase. No browsing, no clicking through pages, no "adding to cart." The AI does it all.

McKinsey forecasts this could generate $1 trillion in global commerce by 2030. Traffic to U.S. retail sites from GenAI browsers has already jumped several-fold year-over-year in 2025.

Amazon vs. Perplexity: A Case Study in Platform Power

Here's what actually happened, stripped of the legal jargon:

November 2024: Amazon catches Perplexity using AI agents to make purchases through Amazon accounts. They tell Perplexity to stop. Perplexity agrees.

July 2025: Perplexity launches "Comet," their AI browser that can shop for you. Price tag: $200/month.

August 2025: Amazon detects Comet's agents are back, but this time they're disguised as Google Chrome browsers. Amazon implements security measures to block them.

Within 24 hours: Perplexity releases an update that evades Amazon's blocks.

November 2025: Amazon files a federal lawsuit accusing Perplexity of violating the Computer Fraud and Abuse Act. Perplexity publishes a blog post titled "Bullying is Not Innovation."

Now, you might think this is about security or customer protection. And sure, those are real concerns—when AI agents access customer accounts, make purchases, and handle payment data, security matters enormously.

But let's be honest about what's actually happening here: Amazon is defending its moat.




Amazon built a trillion-dollar business by owning the customer relationship. They know what you buy, when you buy it, how much you're willing to pay, and what you'll probably want next. This data advantage is what makes Amazon Rufus (their own shopping agent) dangerous to competitors—it already knows you better than any third-party agent ever could.

If Perplexity's agents can freely roam Amazon's platform, comparison-shop ruthlessly, and complete purchases without Amazon controlling the experience, then Amazon loses three critical things:

  1. The ability to show you ads for products they want you to buy
  2. The ability to promote their own private-label brands
  3. The data about what AI-assisted shopping actually looks like

This is Amazon's "app store moment." And they learned from Apple: if you're going to allow third parties to build on your platform, you need to control who gets access and extract rent from those you approve.

Architecture of Control: How This Actually Works

Let's talk about the technical stack for a moment, because this is where it gets interesting from an engineering perspective.

The Five-Layer Problem

Layer 1: Consumers delegate shopping tasks to AI agents, often paying $20-200/month for the privilege.

Layer 2: AI Agents (ChatGPT Operator, Perplexity Comet, Amazon Rufus) search, compare, and transact on your behalf.

Layer 3: Trust & Payment Infrastructure (Visa, Mastercard, Stripe) verify agent identity and process payments.

Layer 4: Platform Gatekeepers (Amazon, Google, Apple) control access to inventory and customer data.

Layer 5: Merchants & Brands fulfill orders and watch their margins compress.

Where power concentrates: not at the AI layer where everyone's focused, but at Layer 3 (payments) and Layer 4 (platform gatekeepers).




Why Payments Matter More Than You Think

Visa and Mastercard are quietly positioning themselves as the critical trust infrastructure for agentic commerce. They're partnering with Cloudflare to implement Web Bot Auth—a cryptographic authentication protocol that lets merchants verify which AI agents are legitimate.

Think about the implications: if every agentic transaction must flow through payment network authentication, then Visa and Mastercard become the gatekeepers of which agents can transact at all. They've turned themselves into the identity verification layer for AI agents, which means they can collect tolls on the entire ecosystem.

This is a brilliant infrastructure play. While everyone's fighting over the AI layer, the payment networks are becoming the new platform.

The Security Nightmare Nobody Wants to Talk About

Here's the thing that keeps security engineers up at night: traditional fraud detection assumes humans are making purchases. You can look at behavioral patterns, device fingerprinting, velocity checks—all the usual signals that distinguish legitimate users from attackers.

But what happens when the "legitimate" user is an AI agent that behaves like a bot because it is a bot?

The attack surface is enormous:

  • Agent manipulation: fake listings and manipulated reviews crafted specifically to trick AI agents
  • Automated account takeover: AI can run credential stuffing attacks at scale, then use compromised accounts to make "legitimate" agent purchases
  • Synthetic identity fraud: Generate deepfakes and fake identities that pass agent verification
  • Phishing at industrial scale: AI-generated personalized phishing that tricks both humans and other agents

To successfully implement agentic commerce, you need to solve the impossible problem: identify agents, distinguish legitimate from malicious ones, verify consumer intent, and do all of this in real-time at massive scale.

This isn't just a "hard problem"—it requires fundamentally rethinking identity, authentication, and trust in ways our current infrastructure wasn't designed for.

Legal Black Hole

The most fascinating aspect of this entire situation is that nobody knows what the law actually says about AI agents making purchases on your behalf.

Consider this scenario: Your AI agent buys you running shoes. They don't fit. Who's responsible?

  • Is the AI agent your "employee" acting under your authority?
  • Is it a contractor working for the AI company?
  • Did you actually "agree" to the purchase, or did the AI misinterpret your intent?
  • Can you return them under standard return policies, or do different rules apply?

The Uniform Electronic Transactions Act (UETA) and E-SIGN Act validate electronic signatures and contracts, but they were written assuming humans click "I agree." They don't tell us how to handle situations where an AI system makes autonomous decisions based on high-level instructions like "buy me running shoes under $100."

And it gets worse. When things go wrong—the agent buys the wrong product, accesses the wrong account, or exposes payment data—who's liable?

The legal frameworks assume someone clicked a button and agreed to terms. But with agentic AI:

  • The consumer gave high-level intent ("I need shoes")
  • The AI developer built the agent with certain objectives
  • The platform (Amazon) sets rules about what's allowed
  • The payment processor enables the transaction

When something breaks, you've got four parties pointing at each other saying "not my fault."

This isn't edge case stuff—this is the fundamental contract law question that needs answering before any of this scales. And right now? It's a complete void.

 Three Scenarios 

Based on the current trajectory, a few things could happen:

Scenario 1: Platform Dominance (Very High Probability)

Amazon wins the lawsuit. Google, Apple, and other major platforms watch carefully and implement similar policies. The outcome:

  • Platforms allow only "approved" agents
  • Approved agents must share 15-30% revenue with platforms
  • Platforms build superior first-party agents using proprietary data
  • Market concentration increases dramatically

This is the most likely outcome because platforms hold all the leverage. They control access to inventory, customer data, and the ability to transact. If you want your AI agent to work, you play by their rules or you don't play at all.

Winner: Existing platform giants. The "disruption" looks suspiciously like the old oligopoly, just with AI agents instead of apps.

Scenario 2: Payment Network Mediation (Medium probability)

Visa and Mastercard successfully establish themselves as neutral trust brokers. Their authentication standards become mandatory. Multiple agents can compete, but all must register with payment networks and follow their protocols.

This creates a more open ecosystem than Scenario 1, but you've still got gatekeepers—just different ones. Every transaction generates payment network fees. The rails change hands, but someone still controls the rails.

Winner: Payment networks become infrastructure monopolies. Better than platform domination, but not exactly a free market.

Scenario 3: Regulatory Intervention (Very low)

Governments step in, mandate open access standards, require algorithmic transparency, and force interoperability. The EU tries this first with AI Act enforcement.

Winner: Consumers and smaller players benefit from enforced competition.

Reality check: Given current U.S. regulatory momentum and the fact that legal frameworks are years behind AI development, this seems highly unlikely. The platforms are moving too fast, and regulators are too slow.

Why This Probably Makes Markets Less Competitive

Here's the uncomfortable truth: despite all the talk about AI "democratizing" commerce and creating more efficient markets, the likely outcome is increased market concentration.

Why? 

Trust Concentrates Around Scale

When AI agents are making autonomous purchases with your money, you need to trust them completely. That trust is hard to build and easy to destroy. Large, established players like Amazon can credibly say "we've processed billions of transactions, here's our security track record."

A startup building a shopping agent? Much harder sell. The trust moat actually gets deeper, not shallower.

Data Moats Become Too Big a Wall to Jump

The best shopping agent needs to know:

  • Your purchase history
  • Your preferences and budget
  • Your calendar and schedule
  • Your payment methods and addresses
  • Context about why you're shopping

Amazon already has all of this. Google has most of it. A third-party agent has... whatever you manually tell it.

This isn't a gap you can close with "better AI." It's a fundamental data disadvantage that compounds over time.

Network Effects Intensify

Just as traditional commerce requires an ecosystem (platforms, payment processors, logistics, fraud prevention), agentic commerce needs an even more complex interconnected system. The platforms that can bundle these services—authentication, payments, fulfillment, customer service—win by default.

It's the AWS playbook: provide the full stack, make integration seamless, and competitors can't match the convenience.

Power to Block Is Power to Control

This is the key insight from the Amazon-Perplexity fight: if platforms can simply block agents they don't like, then innovation requires permission.

Want to build a revolutionary shopping agent? Great. But if Amazon, Google, and Walmart all block you, your revolutionary agent can't access any inventory. You've built a car with no roads to drive on.

The platforms learned from the app store wars: let a thousand flowers bloom, then harvest the ones that matter.

What This Means for Engineers Building AI Applications

If you're working on AI agents, here's what you need to understand:

Platform Risk Is Your Existential Risk

Don't build on platforms you don't control unless you have explicit agreements in place. The terms of service you're operating under were written before agentic AI existed, and platforms can change the rules whenever they want.

Perplexity is learning this the hard way. They built a business model that required access to Amazon's platform, then discovered Amazon could just say "no."

The Liability Problem Won't Solve Itself

Right now, there's massive ambiguity about who's responsible when AI agents screw up. This ambiguity is a risk for everyone in the stack. You need to:

  • Get explicit terms in writing about agent behavior and limits
  • Build audit trails for every decision your agent makes (a sketch follows this list)
  • Have clear escalation paths when things go wrong
  • Understand you're probably liable for your agent's actions, even if that's not fair
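
On the audit-trail point, here's a minimal sketch of what "record every decision" can look like: an append-only log of structured entries with a simple hash chain so tampering is detectable. The field names and chaining scheme are illustrative assumptions, not any platform's required format.

```python
import hashlib
import json
import time

# Minimal append-only audit trail for agent decisions. Each record embeds
# the hash of the previous one, so rewriting history is detectable.
class AuditTrail:
    def __init__(self) -> None:
        self._records: list[dict] = []

    def record(self, agent_id: str, action: str, inputs: dict, outcome: str) -> dict:
        prev_hash = self._records[-1]["hash"] if self._records else "genesis"
        entry = {
            "ts": time.time(),
            "agent_id": agent_id,
            "action": action,      # e.g. "place_order"
            "inputs": inputs,      # what the agent saw when it decided
            "outcome": outcome,    # e.g. "order_id=123" or "escalated"
            "prev_hash": prev_hash,
        }
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        self._records.append(entry)
        return entry

    def verify(self) -> bool:
        """Re-check the hash chain end to end."""
        prev = "genesis"
        for e in self._records:
            body = {k: v for k, v in e.items() if k != "hash"}
            if e["prev_hash"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True

if __name__ == "__main__":
    trail = AuditTrail()
    trail.record("shopper-01", "place_order", {"sku": "B0X", "price": 42.0}, "order_id=123")
    print("chain intact:", trail.verify())
```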

Security Can't Be an Afterthought

The threat model for agentic commerce is genuinely novel. You can't just apply traditional bot detection because legitimate agents are bots. You need:

  • Cryptographic agent authentication (like Web Bot Auth)
  • Behavioral anomaly detection that works for non-human actors
  • Multi-party verification for high-value transactions
  • Fallback to human-in-the-loop when confidence is low

This is hard, expensive, and essential. The first major security breach involving agent-based shopping will tank consumer trust in the entire category.
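
On the first bullet above, here's a minimal sketch of the idea behind cryptographic agent authentication: the agent signs a canonical form of each request with its private key, and the merchant verifies it against a published public key. This uses Ed25519 via the `cryptography` package purely as a stand-in; the real Web Bot Auth proposal builds on HTTP Message Signatures and key directories, so treat this as the shape of the idea, not the spec.

```python
# pip install cryptography
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

def sign_request(private_key: Ed25519PrivateKey, method: str, path: str, body: bytes) -> bytes:
    """Agent side: sign a canonical form of the request."""
    message = f"{method} {path}\n".encode() + body
    return private_key.sign(message)

def verify_request(public_key: Ed25519PublicKey, method: str, path: str,
                   body: bytes, signature: bytes) -> bool:
    """Merchant side: accept the request only if the signature checks out."""
    message = f"{method} {path}\n".encode() + body
    try:
        public_key.verify(signature, message)
        return True
    except InvalidSignature:
        return False

if __name__ == "__main__":
    agent_key = Ed25519PrivateKey.generate()   # the agent's identity key
    merchant_view = agent_key.public_key()     # what a key directory would publish
    sig = sign_request(agent_key, "POST", "/checkout", b'{"sku": "B0X", "qty": 1}')
    print("verified:", verify_request(merchant_view, "POST", "/checkout",
                                      b'{"sku": "B0X", "qty": 1}', sig))
```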

Think in Systems, Not Just Models

The failure mode for agentic commerce isn't "the AI makes a mistake." It's "the AI makes a reasonable-seeming decision based on incomplete data, which cascades into a mess of returns, chargebacks, and customer service nightmares."

Good agentic systems need:

  • Clear boundaries on what decisions they can make autonomously
  • Confidence thresholds that trigger human review
  • Graceful degradation when uncertain
  • Mechanisms for users to understand and override decisions

This is systems engineering, not just prompt engineering.
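
To make that concrete, here's a minimal sketch of a decision gate: the agent can act autonomously only inside explicit limits, and anything outside them escalates to a human. The spending cap, category allowlist, and confidence threshold are made-up illustrations, not recommendations.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    HUMAN_REVIEW = "human_review"
    REJECT = "reject"

@dataclass
class Purchase:
    amount: float
    category: str
    model_confidence: float  # 0.0 - 1.0, however the agent scores itself

# Illustrative policy: hard spending cap, category allowlist,
# and a confidence threshold below which a human must approve.
MAX_AUTONOMOUS_SPEND = 100.00
ALLOWED_CATEGORIES = {"groceries", "household", "books"}
CONFIDENCE_THRESHOLD = 0.85

def gate(p: Purchase) -> Decision:
    if p.category not in ALLOWED_CATEGORIES:
        return Decision.REJECT
    if p.amount > MAX_AUTONOMOUS_SPEND or p.model_confidence < CONFIDENCE_THRESHOLD:
        return Decision.HUMAN_REVIEW   # graceful degradation, not silent failure
    return Decision.AUTO_APPROVE

if __name__ == "__main__":
    print(gate(Purchase(23.50, "groceries", 0.93)))   # Decision.AUTO_APPROVE
    print(gate(Purchase(240.00, "books", 0.97)))      # Decision.HUMAN_REVIEW
    print(gate(Purchase(15.00, "jewelry", 0.99)))     # Decision.REJECT
```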

So Who Wins?

If you're asking "who wins the agentic commerce war," here's my read:

Tier 1: Platform Oligarchs (Amazon, Google, Apple) - They control access to inventory and customers. They can block competitors and extract rent from those they allow. Amazon's lawsuit against Perplexity is them establishing this reality.

Tier 2: Payment Networks (Visa, Mastercard) - Becoming the critical trust infrastructure. Every transaction flows through them, and they're setting authentication standards for the entire ecosystem.

Tier 3: AI Insurgents (OpenAI, Perplexity, Anthropic) - High risk, high reward. They have the AI capabilities and consumer mindshare, but they need platform access to deliver value. Many will get squeezed or forced into revenue-sharing deals.

The Losers: Traditional retailers and brands that get reduced to "background utilities" in agent-controlled marketplaces. TripAdvisor is already down 30% in traffic; AllRecipes has lost 15%. These are the canaries in the coal mine.

The uncomfortable parallel: this is the app store model all over again. Platforms create "open" ecosystems, welcome innovation, then monetize, control, and eventually squeeze everyone building on top.

In five years, we'll have agentic commerce. But it will likely be dominated by 3-5 massive platforms that control access, set standards, and extract rent. The "revolution" will look suspiciously like the old regime—just with better AI.

The Bottom Line

Agentic commerce is coming whether we're ready for it or not. The technology works, the market opportunity is massive, and the big platforms are already building it.

But let's not fool ourselves about what we're building. This isn't some frictionless future where AI agents deliver perfect market efficiency and infinite consumer choice. It's a new battleground for the same old fight: who gets to control access to customers, and who gets to extract rent from transactions.

Amazon is suing Perplexity because they understand what's at stake. This isn't about "covert access" or "customer security"—those are the legal justifications. The real fight is about whether Amazon gets to control agentic commerce the same way Apple controlled app distribution and Google controlled digital advertising.

And based on history, platform power, and the economics of trust at scale, they probably will.

We've seen this movie before. The technology is new, but the plot is depressingly familiar.


Friday, 24 October 2025

Memorization Machine: Why AI Coding Agents Aren't Really Programming Yet

This post is a further exploration of my experience using coding-agent tools; it builds on some of my older posts.



If you believe the internet, AI has essentially solved programming. Every day brings new viral videos of LLMs building complete applications in minutes, fixing complex bugs instantly, and dramatically boosting developer productivity. The narrative is clear: AI coding agents are revolutionizing software development.

But there's a fundamental truth being obscured by all the hype: current AI coding agents are sophisticated memorization machines, not genuine programmers. And understanding this distinction explains both their impressive capabilities and their critical limitations.

Programming as Crystallized History

Here's an insight that might seem obvious once stated but has profound implications: programming is fundamentally built on accumulated patterns and historical knowledge. Every framework, algorithm, and design pattern represents decades of collective problem-solving. When we write code, we're rarely inventing something truly novel—we're recombining established solutions in contextually appropriate ways.

This makes programming uniquely suited to pattern-matching systems like LLMs. Unlike fields requiring real-time sensory input or physical manipulation, programming creates an enormous corpus of documented solutions, discussions, and examples. Stack Overflow, GitHub, documentation sites, and millions of code repositories form a vast memory bank that LLMs can internalize during training.


The memorization works at multiple levels:

  • Syntactic patterns: How code is structured
  • Semantic patterns: What code means in context
  • Pragmatic patterns: How code is actually used
  • Meta-patterns: Common problem-solving approaches

Why LLMs Excel at Coding (Within Limits)

This memorization-based architecture explains why LLMs punch above their weight in programming tasks:

  • Pattern Density: Code has extraordinarily high pattern density. The same structures appear repeatedly across different contexts, creating clear patterns for memorization.
  • Explicit Structure: Programming languages have formal syntax and semantics, making patterns more distinct and recognizable than natural language.
  • Solution Reusability: Most programming problems are variations of previously solved problems. An LLM that has "memorized" solutions can adapt them to new contexts with surprising effectiveness.
  • Rich Training Data: The internet contains millions of code examples with explanations, making it possible for LLMs to learn not just syntax but usage patterns and common approaches.

This is genuinely impressive! Sophisticated pattern matching can solve a remarkable range of programming tasks. But it's not the same as genuine programming expertise.





The Wall: Where Memorization Breaks Down

Real programming requires capabilities that pattern matching simply cannot provide:

System-Level Thinking

Great programmers understand how code fits into larger systems, considering performance, maintainability, security, and business constraints simultaneously. They think architecturally, not in isolated snippets.

LLMs can generate architecturally sound code from patterns they've memorized, but they can't make real architectural trade-offs based on your specific traffic patterns, team structure, regulatory requirements, or budget limitations.

Long-Term Reasoning

Professional programming means writing code thinking about how it will be maintained, modified, and scaled over months or years. It requires understanding technical debt, anticipating future requirements, and building for evolution.

LLMs have no persistent understanding. Each interaction is essentially fresh—they can't build up knowledge of a codebase over time like human programmers do.

Context Beyond Code

Real programming involves understanding business requirements, user needs, team capabilities, and reading between the lines of incomplete specifications.

LLMs work with the text you give them. They can't interview stakeholders, understand implicit requirements, or navigate organizational politics that shape technical decisions.

Novel Problem Solving

When you encounter a problem that doesn't match memorized patterns—a genuinely novel requirement, an unusual constraint, or an emerging technology—memorization-based systems struggle.

They can combine patterns creatively, but they can't reason from first principles or develop entirely new approaches.





The Real Test: Maintenance Programming

The biggest gap becomes obvious when you move beyond initial implementation to maintenance programming:

A viral demo might show: "I built a complete e-commerce site in 30 minutes!"

What's not shown:

  • No authentication system worth deploying
  • No error handling for edge cases
  • No data validation or security considerations
  • No scalability planning
  • No testing strategy
  • Breaks on inputs the demo didn't consider
  • Requires extensive refactoring for production

This is the difference between code that runs initially and code that serves a business reliably for years.
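
To make that difference tangible, here's a deliberately tiny, hypothetical contrast: the kind of function a demo produces versus the validation and error handling a production system actually needs. Both are sketches, not anyone's real code.

```python
# Demo version: "works" on the happy path the video shows.
def apply_discount_demo(price, discount_percent):
    return price - price * discount_percent / 100

# Production-leaning version: validates inputs, guards edge cases,
# and fails loudly instead of silently corrupting order totals.
def apply_discount(price: float, discount_percent: float) -> float:
    if not isinstance(price, (int, float)) or not isinstance(discount_percent, (int, float)):
        raise TypeError("price and discount_percent must be numbers")
    if price < 0:
        raise ValueError("price cannot be negative")
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount_percent must be between 0 and 100")
    return round(price * (1 - discount_percent / 100), 2)

if __name__ == "__main__":
    print(apply_discount_demo(100, 150))   # -50.0: the demo happily goes negative
    print(apply_discount(100, 15))         # 85.0
```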

Experienced programmers understand why code evolved the way it did, recognize technical debt patterns, assess risks of changes, and make judgment calls about when to refactor versus work around issues. These capabilities come from experiential learning that memorization cannot replicate.

So Why All the Success Stories?

If current AI coding agents are fundamentally limited, why is the internet overflowing with success stories? The disconnect is real and worth understanding:

The Demo Problem

Success stories showcase clean, isolated problems with clear specifications: "Build a todo app," "Implement quicksort," "Create a REST API endpoint."

They don't showcase real programming work: debugging memory leaks in 500K line codebases, integrating with undocumented legacy systems, refactoring critical infrastructure without breaking anything, or making architectural decisions under business pressure.

Cherry-Picking at Scale

Even if LLMs only work impressively 10% of the time, that's still thousands of viral examples from millions of attempts. The millions of failures disappear without a trace.
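
The survivorship math, with made-up but plausible round numbers:

```python
# Made-up round figures purely to illustrate survivorship bias.
attempts = 5_000_000        # people trying AI coding tools on something
hit_rate = 0.10             # fraction that goes impressively well
successes = int(attempts * hit_rate)
failures = attempts - successes
print(f"potential viral wins: {successes:,}")   # 500,000
print(f"invisible failures:   {failures:,}")    # 4,500,000
```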



Economic Incentives

Companies promoting these tools have billions of dollars at stake. OpenAI, Microsoft, Google, and countless startups need to demonstrate transformative results to justify valuations and drive adoption.

"Our AI can replace junior developers" sells better than "Our AI can help with boilerplate code sometimes."

The Experience Gap

Beginners are amazed by any working code generation and often can't assess code quality deeply. Experts notice subtle issues, architectural problems, and maintenance nightmares—but most viral content comes from impressed beginners.

This creates a distortion where surface-level success gets amplified while deeper limitations remain hidden until you actually try to use AI-generated code in production.

Definition Games

What counts as "success"? Code that runs initially or code that's maintainable? Solving toy problems or real business challenges? Individual productivity or team productivity? Short-term output or long-term quality?

The goalposts keep moving. As AI gets better at basic tasks, "success" shifts to whatever AI can currently do.


The Uncanny Valley of AI Programming

This creates an uncanny valley effect. AI coding agents seem very capable on surface-level tasks but fail unpredictably on deeper challenges. They can write code that looks professional but may have subtle issues that only become apparent months later.




What This Means Practically

This isn't an argument against using AI coding tools—I use them regularly and find them valuable. But it's crucial to understand their nature and limitations:

Use AI coding agents as powerful assistants, not autonomous programmers. They excel at accelerating experienced developers but can't replace the deep thinking that programming requires.

Trust but verify. AI-generated code needs careful review by someone who understands the broader context. The code might work in isolation but introduce problems in your specific system.

Focus on the right problems. AI tools shine on well-defined, pattern-matching tasks. They struggle with ambiguous requirements, system design, and novel challenges.

Beware the productivity illusion. Writing code faster doesn't mean programming faster if that code requires extensive debugging and refactoring later.


Looking Forward




For AI to move beyond sophisticated memorization to genuine programming capability, systems would need:

  • Persistent, cumulative understanding of a codebase over time
  • System-level reasoning about performance, maintainability, security, and cost trade-offs
  • Context beyond code: business requirements, team capabilities, and implicit constraints
  • The ability to tackle genuinely novel problems from first principles

We're not there yet. Current AI coding agents are impressive pattern-matching systems that happen to work well in a domain built on historical patterns. That's genuinely useful, but it's not the same as artificial programming intelligence.

The memorization foundation makes them brittle in exactly the situations where you need the most help: the complex, ambiguous, high-stakes decisions that define excellent programming.