Thursday, 6 March 2025

Building a Strongly-Typed API for Large Language Models

In this post, we'll explore a Java interface designed to interact with Large Language Models (LLMs) in a type-safe manner. We'll break down the GenerativeAIService interface and its supporting classes to understand how it provides a structured approach to AI interactions.

The Problem: Unstructured LLM Responses

When working with LLMs, responses typically come as unstructured text. This presents challenges when you need to extract specific data or integrate AI capabilities into enterprise applications that expect structured data.

For example, if you want an LLM to generate JSON data for your application, you'd need to:

  1. Parse the response text
  2. Extract the JSON portion
  3. Deserialize it into your application objects
  4. Handle parsing errors appropriately

This process can be error-prone and verbose when implemented across multiple parts of your application.
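
To see why, here is a minimal sketch of that manual workflow, assuming Gson for deserialization and a hypothetical callLlm method that returns the raw response text (ProductSuggestion is the data class defined later in this post):

```java
import com.google.gson.Gson;
import com.google.gson.JsonSyntaxException;

// 'callLlm' is a hypothetical method returning the raw LLM response,
// which may be wrapped in a markdown code fence
String raw = callLlm("Suggest a product. Reply in JSON format");

// Steps 1-2: find and extract the JSON portion
String json = raw.trim();
if (json.startsWith("```")) {
    json = json.substring(json.indexOf('\n') + 1, json.lastIndexOf("```")).trim();
}

// Steps 3-4: deserialize and handle parsing errors, repeated at every call site
try {
    var suggestion = new Gson().fromJson(json, ProductSuggestion.class);
} catch (JsonSyntaxException e) {
    // application-specific recovery
}
```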

Enter GenerativeAIService

The GenerativeAIService interface provides a clean solution to this problem by offering methods that not only communicate with LLM APIs but also handle the parsing of responses into Java objects.

Let's look at the core interface:

```java
public interface GenerativeAIService {

    ChatMessageReply chat(ChatRequest conversation);

    default ChatRequest prepareRequest(ChatRequest conversation, Map<String, Object> params) {
        return ParamPreparedRequest.prepare(conversation, params);
    }

    default <T> T chat(ChatRequest conversation, Class<T> returnType) {
        return chat(conversation, returnType, (jsonContent, e) -> {
            throw new RuntimeException("Failed to parse JSON: " + jsonContent, e);
        }).get();
    }

    default <T> Optional<T> chat(ChatRequest conversation, Class<T> returnType, BiConsumer<String, Exception> onFailedParsing) {
        var reply = chat(conversation);
        return ChatMessageJsonParser.parse(reply, returnType, onFailedParsing);
    }

    // Other methods
}
```

The interface provides three key capabilities:

  1. Basic Chat Functionality: The chat(ChatRequest) method handles direct communication with the LLM and returns raw responses.
  2. Type-Safe Responses: Overloaded chat() methods accept a Class<T> parameter to specify the expected return type, allowing the service to automatically parse the LLM response into the desired Java class.
  3. Robust Error Handling: Options to provide custom error handling logic when parsing fails.

How It Works

Behind the scenes, the ChatMessageJsonParser class does the heavy lifting:

```java
public static <T> Optional<T> parse(ChatMessageReply reply, Class<T> returnType, BiConsumer<String, Exception> onFailedParsing) {
    var message = reply.message().trim();
    var jsonContent = _extractMessage(message);
    return _cast(returnType, onFailedParsing, jsonContent);
}
```

It:

  1. Extracts JSON content from the LLM's response (which may be wrapped in markdown code blocks)
  2. Uses Gson to deserialize the JSON into the requested type
  3. Handles parsing errors according to the provided error handler
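
The library's _extractMessage implementation isn't shown here; a simplified sketch of the fence-stripping step it performs could look like this:

```java
// Sketch only: strips an optional ```json ... ``` fence around the payload;
// the library's actual _extractMessage may differ.
private static String _extractMessage(String message) {
    if (message.startsWith("```")) {
        int start = message.indexOf('\n') + 1;   // skip the ```json line
        int end = message.lastIndexOf("```");    // drop the closing fence
        return message.substring(start, end).trim();
    }
    return message;
}
```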

Parameterised Prompts

The interface also supports parameterised prompts through the ParamPreparedRequest class:

```java
default ChatRequest prepareRequest(ChatRequest conversation, Map<String, Object> params) {
    return ParamPreparedRequest.prepare(conversation, params);
}
```

This allows you to:

  1. Create template prompts with placeholders like {{parameter_name}}
  2. Fill those placeholders at runtime with a map of parameter values
  3. Validate that all required parameters are provided
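
The library provides its own ParamPreparedRequest; as a rough sketch of the behavior described above, the substitution and validation logic can be pictured like this (not the library's actual code):

```java
import java.util.Map;

// Simplified sketch of {{placeholder}} substitution with validation;
// the library's ParamPreparedRequest may be implemented differently.
static String fill(String template, Map<String, Object> params) {
    var result = template;
    for (var entry : params.entrySet()) {
        result = result.replace("{{" + entry.getKey() + "}}", String.valueOf(entry.getValue()));
    }
    if (result.contains("{{")) {
        throw new IllegalArgumentException("Unfilled parameter in prompt: " + result);
    }
    return result;
}
```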

Code Example: Using the Typed API

Here's how you might use this API in practice:

```java
// Define a data class for the structured response
public static class ProductSuggestion {
    public String productName;
    public String description;
    public double price;
    public List<String> features;
}

// Create a parameterised prompt
String prompt = """
        Suggest a {{product_type}} product with {{feature_count}} features. Reply in JSON format

        <example>
        {
          "productName": "product name",
          "description": "product description",
          "price": 100.0,
          "features": ["feature 1", "feature 2", "feature 3", "feature 4", "feature 5"]
        }
        </example>
        """;

var request = new ChatRequest(
        "gemini-2.0-flash",
        0.7f,
        List.of(new ChatRequest.ChatMessage("user", prompt))
);

// Prepare with parameters
Map<String, Object> params = Map.of(
        "product_type", "smart home",
        "feature_count", 5
);
request = service.prepareRequest(request, params);

// Get typed response
var suggestion = service.chat(request, ProductSuggestion.class);
System.out.println(suggestion);
```

Benefits of a Typed LLM API

  1. Type Safety: Catch type mismatches at compile time rather than runtime.
  2. Clean Integration: Seamlessly incorporate AI capabilities into existing Java applications.
  3. Reduced Boilerplate: Consolidate JSON parsing and error handling logic in one place.
  4. Parameter Validation: Ensure all required prompt parameters are provided before making API calls.
  5. Flexible Error Handling: Customize how parsing errors are handled based on your application's needs.

Implementation Considerations

When implementing this interface for different AI providers, consider:

  • JSON Mode/Structured Output: Many LLMs now support a native JSON or structured-output mode, which is generally more reliable than prompt instructions alone; see the sketch after this list.
  • Response formats: Ensure your parser can handle the specific output formats of each provider.
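
As an illustration of the first point, the Gemini REST API accepts a responseMimeType hint in generationConfig; a raw request body using it might look like the following (shown for illustration; this library's ChatRequest does not necessarily expose this field):

```json
{
  "contents": [{ "parts": [{ "text": "Top 5 countries by GDP" }] }],
  "generationConfig": { "responseMimeType": "application/json" }
}
```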

Conclusion

By creating a strongly-typed interface for LLM interactions, we bridge the gap between the unstructured world of AI and the structured requirements of enterprise applications. This approach enables developers to leverage the power of large language models while maintaining type safety and predictability.

The GenerativeAIService interface provides a foundation that can be extended to work with various AI providers while providing a consistent interface for application code. It represents a step toward making AI capabilities more accessible and manageable in traditional software development workflows.


Code for this post is available @ TypeSafety

LLM Patterns

 Design patterns are reusable solutions to common problems in software design. They represent best practices evolved over time by experienced software developers to solve recurring design challenges.

The concept was popularized by the "Gang of Four" (Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides) in their influential 1994 book "Design Patterns: Elements of Reusable Object-Oriented Software."


In this post, we are going to look at some design patterns for LLMs.


Simple Chat

This is the simplest pattern: text is sent as input to the LLM and text is returned. Everyone starts with this.



Let's look at a code example:

```java
var service = GenerativeAIDriverManager.create(GoogleAIFactory.NAME, "https://generativelanguage.googleapis.com", properties);

var messages = new ChatRequest.ChatMessage("user", "Top 5 Countries by GDP");
var conversation = ChatRequest.create("gemini-2.0-flash", List.of(messages));
var reply = service.chat(conversation);
System.out.println(reply.message());
```

Output
Okay, here are the top 5 countries by GDP (Nominal) according to the latest estimates from the International Monetary Fund (IMF) as of October 2023:

1.  **United States:** $26.95 trillion
2.  **China:** $17.72 trillion
3.  **Germany:** $4.43 trillion
4.  **Japan:** $4.23 trillion
5.  **India:** $3.73 trillion

It's important to note:

*   **Source:**  I'm using the IMF's World Economic Outlook Database, October 2023 edition.  These are estimates and projections, and are subject to change.
*   **Nominal GDP:** This is GDP measured at current market prices, without adjusting for inflation.
*   **Data Availability:** The most current, definitive GDP figures are usually released with a bit of a lag.
*   **GDP (PPP):** It's also worth knowing that if you look at GDP based on Purchasing Power Parity (PPP), the rankings can shift somewhat, with China often being very close to, or even exceeding, the United States.

Simple Chat with Some Structure

You may wonder how to use LLM responses programmatically when looking at the output. This is precisely the problem we'll solve with this pattern. By making a small adjustment to your prompt, you can instruct the LLM to return JSON output. The revised prompt would look like: "List the top 5 countries by GDP. Reply in JSON format."


With just a small change to the prompt, the LLM can return structured output, making it function more like a proper API with a degree of type safety. Types are essential in programming—without them, code becomes difficult to maintain and debug as applications grow in complexity.
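
The code mirrors the earlier example; only the prompt text changes:

```java
var messages = new ChatRequest.ChatMessage("user", "List the top 5 countries by GDP. Reply in JSON format");
var conversation = ChatRequest.create("gemini-2.0-flash", List.of(messages));
var reply = service.chat(conversation);
System.out.println(reply.message());
```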

Let's look at the output of this prompt:

```json
{
  "top_5_countries_by_gdp": [
    {
      "rank": 1,
      "country": "United States",
      "gdp_usd": "Approximately $25+ Trillion (USD)"
    },
    {
      "rank": 2,
      "country": "China",
      "gdp_usd": "Approximately $17+ Trillion (USD)"
    },
    {
      "rank": 3,
      "country": "Japan",
      "gdp_usd": "Approximately $4+ Trillion (USD)"
    },
    {
      "rank": 4,
      "country": "Germany",
      "gdp_usd": "Approximately $4+ Trillion (USD)"
    },
    {
      "rank": 5,
      "country": "India",
      "gdp_usd": "Approximately $3+ Trillion (USD)"
    }
  ],
  "note": "GDP figures are approximate and based on the most recent available data (typically from organizations like the World Bank and the IMF).  These values fluctuate and can vary slightly depending on the source and the date the data was collected."
}
```

Chat with My Custom Structure

Now you know where we are going. JSON output is good, but you need more control and consistency over the structure of the output. Note that without enforcing a specific structure, the LLM is free to return data in any shape, and that will break your API contract. This is also achieved by changing the prompt to:

```
Top 5 Countries by GDP. Reply in JSON format
Example:
{
"countries":[
{"name":"country 1","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
{"name":"country 2","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
{"name":"country 3","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
{"name":"country 4","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
{"name":"country 5","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country}
]
}
```


Code Sample

```java
var prompt = """
        Top 5 Countries by GDP. Reply in JSON format
        Example:
        {
        "countries":[
        {"name":"country 1","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 2","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 3","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 4","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 5","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country}
        ]
        }
        """;
var messages = new ChatRequest.ChatMessage("user", prompt);
var conversation = ChatRequest.create("gemini-2.0-flash", List.of(messages));
var reply = service.chat(conversation);
System.out.println(reply.message());
```

Output 

```json
{
"countries": [
    {
        "name": "United States",
        "gdp": 26.95,
        "unit": "trillion USD",
        "rank": 1
    },
    {
        "name": "China",
        "gdp": 17.73,
        "unit": "trillion USD",
        "rank": 2
    },
    {
        "name": "Japan",
        "gdp": 4.23,
        "unit": "trillion USD",
        "rank": 3
    },
    {
        "name": "Germany",
        "gdp": 4.07,
        "unit": "trillion USD",
        "rank": 4
    },
    {
        "name": "India",
        "gdp": 3.42,
        "unit": "trillion USD",
        "rank": 5
    }
]
}
```
 

Strongly Typesafe Chat

We've established a solid foundation to approach type safety, and now we reach the final step: converting the LLM's string output by passing it through a TypeConverter to create a strongly typed object. This completes our transformation from unstructured text to programmatically usable data.



The changes for type safety are implemented in the llmapi library:

```
<dependency>
    <groupId>org.llm</groupId>
    <artifactId>llmapi</artifactId>
    <version>1.2.1</version>
</dependency>
```

Sample Code

```java
var prompt = """
        Top 5 Countries by GDP. Reply in JSON format
        Example:
        {
        "countries":[
        {"name":"country 1","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 2","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 3","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 4","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 5","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country}
        ]
        }
        """;
var messages = new ChatRequest.ChatMessage("user", prompt);
var conversation = ChatRequest.create("gemini-2.0-flash", List.of(messages));

var reply = service.chat(conversation, CountryGdp.class);
System.out.println(reply);
```


The only change in the code is using the type-safe chat function from llmapi:

```java
public interface GenerativeAIService {

    ChatMessageReply chat(ChatRequest conversation);

    default EmbeddingReply embedding(EmbeddingRequest embedding) {
        throw new UnsupportedOperationException("Not Supported");
    }

    default <T> T chat(ChatRequest conversation, Class<T> returnType) {
        // ...
    }

    default <T> Optional<T> chat(ChatRequest conversation, Class<T> returnType, BiConsumer<String, Exception> onFailedParsing) {
        // ...
    }
}
```

The output is an instance of the CountryGdp object.
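
The CountryGdp class itself isn't shown in the post; a plausible definition matching the prompt's example schema would be (field names are assumptions based on the JSON example):

```java
import java.util.List;

// Hypothetical data class mirroring the JSON schema in the prompt;
// Gson populates the public fields during deserialization.
public class CountryGdp {
    public List<Country> countries;

    public static class Country {
        public String name;
        public double gdp;
        public String unit;
        public int rank;
    }
}
```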

Parameter-Based Chat

As your LLM applications grow in complexity, your prompts will inevitably become more sophisticated. One essential feature for managing this complexity is parameter support, similar to what you find in JDBC. The next pattern addresses this need, demonstrating how prompts can contain parameters that are dynamically replaced at runtime, allowing for more flexible and reusable prompt templates.



Code 

```java
var prompt = """
        Top {{no_of_country}} Countries by GDP. Reply in JSON format
        Example:
        {
        "countries":[
        {"name":"country 1","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 2","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 3","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 4","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country},
        {"name":"country 5","gdp":gdp value, "unit":"trillion or billion etc","rank":rank of country}
        ]
        }
        """;
var messages = new ChatRequest.ChatMessage("user", prompt);
var conversation = ChatRequest.create("gemini-2.0-flash", List.of(messages));

var preparedConversion = service.prepareRequest(conversation, Map.of("no_of_country", "10"));
var reply = service.chat(preparedConversion, CountryGdp.class);

System.out.println(reply);
```



Conclusion

We have only scratched the surface of LLM patterns. In this post, I've covered some basic to intermediate concepts, but my next post will delve into more advanced patterns that build upon these fundamentals.

All the code used in this post is available @ llmpatterns git project



Wednesday, 5 March 2025

Building a Universal Java Client for Large Language Models


In today's rapidly evolving AI landscape, developers often need to work with multiple Large Language Model (LLM) providers to find the best solution for their specific use case. Whether you're exploring OpenAI's GPT models, Anthropic's Claude, or running local models via Ollama, having a unified interface can significantly simplify development and make it easier to switch between providers.

The Java LLM Client project provides exactly this: a clean, consistent API for interacting with various LLM providers through a single library. Let's explore how this library works and how you can use it in your Java applications.

Core Features

The library offers several key features that make working with LLMs easier:

  1. Unified Interface: Interact with different LLM providers through a consistent API
  2. Multiple Provider Support: Currently supports OpenAI, Anthropic, Google, Groq, and Ollama
  3. Chat Completions: Send messages and receive responses from language models
  4. Embeddings: Generate vector representations of text where supported
  5. Factory Pattern: Easily create service instances for different providers

Architecture Overview

The library is built around a few key interfaces and classes:

  • GenerativeAIService: The main interface for interacting with LLMs
  • GenerativeAIFactory: Factory interface for creating service instances
  • GenerativeAIDriverManager: Registry that manages available services
  • Provider-specific implementations in separate packages

This design follows the classic factory pattern, allowing you to:

  1. Register service factories with the GenerativeAIDriverManager
  2. Create service instances through the manager
  3. Use a consistent API to interact with different providers

Getting Started

To use the library, first add it to your Maven project:

```xml
<dependency>
    <groupId>org.llm</groupId>
    <artifactId>llmapi</artifactId>
    <version>1.0.0</version>
</dependency>
```


Basic Usage Example

Here's how to set up and use the library:

```java
// Register service providers
GenerativeAIDriverManager.registerService(OpenAIFactory.NAME, new OpenAIFactory());
GenerativeAIDriverManager.registerService(AnthropicAIFactory.NAME, new AnthropicAIFactory());
// Register more providers as needed

// Create an OpenAI service
Map<String, Object> properties = Map.of("apiKey", System.getenv("gpt_key"));
var service = GenerativeAIDriverManager.create(OpenAIFactory.NAME, "https://api.openai.com/", properties);

// Create and send a chat request
var message = new ChatMessage("user", "Hello, how are you?");
var conversation = new ChatRequest("gpt-4o-mini", List.of(message));
var reply = service.chat(conversation);
System.out.println(reply.message());

// Generate embeddings
var vector = service.embedding(new EmbeddingRequest("text-embedding-3-small", "How are you"));
System.out.println(Arrays.toString(vector.embedding()));
```

Working with Different Providers

OpenAI

```java
Map<String, Object> properties = Map.of("apiKey", System.getenv("gpt_key"));
var service = GenerativeAIDriverManager.create(OpenAIFactory.NAME, "https://api.openai.com/", properties);

// Chat with GPT-4o mini
var conversation = new ChatRequest("gpt-4o-mini", List.of(new ChatMessage("user", "Hello, how are you?")));
var reply = service.chat(conversation);
```

Anthropic

```java
Map<String, Object> properties = Map.of("apiKey", System.getenv("ANTHROPIC_API_KEY"));
var service = GenerativeAIDriverManager.create(AnthropicAIFactory.NAME, "https://api.anthropic.com", properties);

// Chat with Claude
var conversation = new ChatRequest("claude-3-7-sonnet-20250219", List.of(new ChatMessage("user", "Hello, how are you?")));
var reply = service.chat(conversation);
```

Ollama (Local Models)

```java
// No API key needed for local models
Map<String, Object> properties = Map.of();
var service = GenerativeAIDriverManager.create(OllamaFactory.NAME, "http://localhost:11434", properties);

// Chat with locally hosted Llama model
var conversation = new ChatRequest("llama3.2", List.of(new ChatMessage("user", "Hello, how are you?")));
var reply = service.chat(conversation);
```

Under the Hood

The library uses an RPC (Remote Procedure Call) client to handle the HTTP communication with various APIs. Each provider's implementation:

  1. Creates appropriate request objects with the required format
  2. Sends requests to the corresponding API endpoints
  3. Parses responses into a consistent format
  4. Handles errors gracefully

The RpcBuilder creates proxy instances of service interfaces, handling the HTTP communication details so you don't have to.
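
RpcBuilder's source isn't shown here, but the idea can be pictured as a java.lang.reflect.Proxy that turns interface calls into HTTP requests. A rough sketch, with hypothetical toJson/fromJson helpers standing in for the library's real serialization:

```java
import java.lang.reflect.Proxy;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Rough sketch of the dynamic-proxy idea; the real RpcBuilder also handles
// auth headers, provider-specific endpoints, and error mapping.
@SuppressWarnings("unchecked")
static <T> T createService(Class<T> api, String baseUrl) {
    var http = HttpClient.newHttpClient();
    return (T) Proxy.newProxyInstance(
            api.getClassLoader(),
            new Class<?>[]{api},
            (proxy, method, args) -> {
                var request = HttpRequest.newBuilder()
                        .uri(URI.create(baseUrl + "/" + method.getName()))
                        .POST(HttpRequest.BodyPublishers.ofString(toJson(args)))  // toJson: hypothetical
                        .build();
                var response = http.send(request, HttpResponse.BodyHandlers.ofString());
                return fromJson(response.body(), method.getReturnType());        // fromJson: hypothetical
            });
}
```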

Supported Models

The library currently supports several models across different providers:

  • OpenAI: all
  • Anthropic: all
  • Google: gemini-2.0-flash
  • Groq: all
  • Ollama: any model you have available locally

Extending the Library

One of the strengths of this design is how easily it can be extended to support new providers or features:

  1. Create a new implementation of GenerativeAIFactory
  2. Implement GenerativeAIService for the new provider
  3. Create necessary request/response models
  4. Register the new factory with GenerativeAIDriverManager
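
As a sketch, steps 1 and 4 might look like this (the factory method signature is an assumption based on how GenerativeAIDriverManager.create is used above; MyProviderService is a hypothetical implementation of GenerativeAIService):

```java
import java.util.Map;

// Illustrative skeleton for a hypothetical new provider
public class MyProviderFactory implements GenerativeAIFactory {
    public static final String NAME = "myprovider";

    @Override
    public GenerativeAIService create(String baseUrl, Map<String, Object> properties) {
        return new MyProviderService(baseUrl, (String) properties.get("apiKey"));
    }
}

// Register once at startup, then use the same GenerativeAIService API as any other provider
GenerativeAIDriverManager.registerService(MyProviderFactory.NAME, new MyProviderFactory());
```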

Conclusion

The Java LLM Client provides a clean, consistent way to work with multiple LLM providers in Java applications. By abstracting away the differences between APIs, it allows developers to focus on their application logic rather than the details of each provider's implementation.

Whether you're building a chatbot, generating embeddings for semantic search, or experimenting with different LLM providers, this library offers a straightforward way to integrate these capabilities into your Java applications.

The project's use of standard Java patterns like factories and interfaces makes it easy to understand and extend, while its modular design allows you to use only the providers you need. As the LLM ecosystem continues to evolve, this type of abstraction layer will become increasingly valuable for developers looking to build flexible, future-proof applications.


Link to github project - llmapi


Tuesday, 4 March 2025

Measuring Developer Productivity in the Age of GenAI

The GenAI Revolution: Two Years Later

November 30, 2022 marked a pivotal moment when ChatGPT was released, sparking excitement and optimism about increased efficiency across industries. Now, with over two years of GenAI integration, the industry has matured enough to properly evaluate the impact and value of these tools on various aspects of business. In this post, I'll focus specifically on measuring developer productivity.

Measuring Impact: Output vs. Outcome

The impact of any change—whether new tools, processes, or methodologies—can be measured in terms of both output and outcome.

As a product organization, outcomes are ultimately the metrics that deliver revenue or customer growth. However, this same model cannot be directly applied when measuring the impact of GenAI on developer efficiency.

A Framework for Measurement

In this post, I'll share several approaches to measure productivity with GenAI tools, focusing on a progression from:

Output → Outcome → Growth

This framework will help organizations better understand how GenAI affects developer productivity in ways that eventually translate to business value.




Developer productivity can be measured along multiple dimensions:

| | How Fast (Output) | Is Effective (Output) | Impact (Outcome) | Growth (Outcome) |
| --- | --- | --- | --- | --- |
| Primary Metrics | # PRs per Engineer<br># Test Coverage per PR | # Engineering Time Index<br># Non-Engineering Time Index | Failure Rate of Change<br>Usage of Feature | Time spent on new capabilities/products<br>Time spent on R&D |
| Secondary Metrics | Cycle Time for PRs<br>Deployment Frequency<br>Perceived rate of productivity | Time on PRs per sprint<br>Friction in delivery<br>Code Tech Debt Index<br>Code Security Debt Index | Last-minute changes<br>Operational & Security Health | ROI on new features<br>Revenue per Engineer<br>New Products/Segments |

Finding the Right Mix of Developer Productivity Metrics

The table above outlines four key dimensions for measuring developer productivity in the GenAI era. These dimensions incorporate both quantitative and qualitative metrics, collected through various methods:

Balanced Measurement Approach

Each dimension contains metrics that vary in nature:

  • Quantitative metrics provide objective, numerical data that can be tracked over time
  • Qualitative metrics capture subjective experiences and insights that numbers alone cannot reveal


Let's start with the categories of metrics.

How Fast (Output)

This metric provides a straightforward measure of how effectively development teams leverage generative AI tools to produce code and the rate at which they do so. It serves as an excellent starting point for analysis and can be fully automated for continuous monitoring.


Is Effective (Output)

This category assesses the quality of output by analyzing the ratio of time spent on engineering versus non-engineering tasks. It also incorporates lagging indicators such as sprint-level pull request review times, code technical debt indices, and security vulnerability indices. These metrics, largely automated, provide insights into both positive outcomes and potential side effects.

Impact (Outcome)

This category marks the initial phase of measuring the impact of generative AI-assisted work. It focuses on evaluating delivery quality, product usage, and overall product health.

Growth ( Outcome)

This final category focuses on quantifying the tangible value generated by new features, specifically in terms of return on investment (ROI) and revenue. While direct revenue impact may not be immediately apparent in short development cycles, the focus shifts to measuring the time freed up for new capability development and the potential for new product or market segment expansion.

Things to Watch While Measuring Developer Productivity

Measuring productivity can lead to misleading signals. Organizations should be wary of:

  • Spikes in Lines of Code (LOC) that don't mean better output.
  • High Commit/PR counts without real progress.
  • Long hours, which often signal burnout, not efficiency.
  • Burning through story points too fast, which can mean poor planning.
  • Focusing only on individual metrics, not team success.
  • Using gamification that hurts collaboration.
  • Too many unfinished POCs or WIP projects.
  • Thinking Generative AI fixes everything.
  • A pattern of implementing new Generative AI tools at an unsustainable frequency, such as weekly or more.

Conclusion

The metrics shared in this post sit between DORA and SPACE and give a holistic view of a team's productivity gains.

If you are early in the journey, refer to the Implementing-genai-in-engineering-teams post, which discusses how to implement the transformation.