
Newsletter | Sep 25, 2025

Version Control for AI Prompts

The edge is now in the prompt, not in the execution


We have written about version control in the past, but the widespread integration of artificial intelligence (AI) into software development has created a new set of processes that require it. As developers integrate more natural language prompts into their code, they need a way to manage versions of those prompts and to evaluate how effective each version is at achieving its goal. This week, we want to explore what prompt engineering is and why new version control systems are needed to manage this workflow.

What is Prompt Engineering?

Prompt engineering is the process of guiding an AI model to produce a specific output. For example, if you tell an AI model:

“Give me a picture of a building.”

You could get almost anything back, and likely not what you had in mind. However, if you get more specific and say:

“Create a realistic, life-like image of the Empire State Building at night. The top of the building should glow with golden lighting. The background sky should be filled with stars, and there should be a large red moon prominently visible. The perspective should highlight the building against the night sky, with sharp detail and cinematic contrast.”

The result is likely to be a lot more precise. This same phenomenon exists with every interaction with an AI model. Whether you are trying to get it to engage with a code base, summarize an article, or chain together a series of functions, the more specific the user input, the better the output.

Integrating prompts into a software development workflow means they won’t just be one-off inputs; they’ll be reused repeatedly. For example, you might set up a program to run the same instruction every week: “Summarize every newsletter from Konvoy and post it on Twitter.” Over time, you’ll likely refine that instruction to improve clarity, accuracy, or style. Keeping track of how prompts evolve becomes essential, since even small changes can lead to very different outputs.
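
A minimal sketch of what this could look like follows; the class names and structure are illustrative, not any particular product’s API. Each saved edit gets a version number, a timestamp, and a note, so earlier versions can be recalled and diffed:

```python
# Illustrative sketch of prompt versioning - not a real product's API.
# Each commit records a version number, timestamp, and note so old
# versions can be recalled and compared.
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PromptVersion:
    version: int
    text: str
    note: str
    created_at: datetime

@dataclass
class PromptHistory:
    name: str
    versions: list[PromptVersion] = field(default_factory=list)

    def commit(self, text: str, note: str = "") -> PromptVersion:
        """Save a new version of the prompt."""
        pv = PromptVersion(len(self.versions) + 1, text, note,
                           datetime.now(timezone.utc))
        self.versions.append(pv)
        return pv

    def get(self, version: int) -> PromptVersion:
        """Recall an earlier version (1-indexed)."""
        return self.versions[version - 1]

    def diff(self, a: int, b: int) -> str:
        """Show what changed between two versions."""
        return "\n".join(difflib.unified_diff(
            self.get(a).text.splitlines(),
            self.get(b).text.splitlines(),
            f"v{a}", f"v{b}", lineterm=""))

history = PromptHistory("weekly-summary")
history.commit("Summarize every newsletter from Konvoy and post it on Twitter.")
history.commit("Summarize every Konvoy newsletter in under 280 characters, "
               "in a neutral tone, and post it on Twitter.",
               note="Tighten length and tone")
print(history.diff(1, 2))
```

Storing the history is the easy part; as the rest of this piece argues, the harder questions are who edits these versions and how each one is evaluated.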

Prompt vs. Code Version Control

In traditional software development, version control systems help teams track all changes made to code over time, allowing users to recall older versions, restore files, and test new features without affecting the main project. Systems like Git, Perforce, and Diversion (a Konvoy portfolio company) help developers keep track of their changes and often have built-in tools to ensure everything is working properly when things change.

Similarly, version control for prompts can help you track changes, collaborate with other members of your team, and revert to old versions of a prompt. However, the similarities stop there: version control for code and version control for prompts involve distinctly different workflows that require different tools and processes to manage properly.

  • Technical & Non-Technical Users: Because prompts are written in natural language, non-technical team members can create and use them. These users need a familiar space to engage with prompts, as Git workflows can be overwhelming for them.
  • Prompt Management: Prompts show up in code as large blocks of text, which makes a collaborative word-processing document a better place to edit and manage them than traditional code version control.
  • AI Models: New AI models with different capabilities are released constantly. A place where you can easily manage API keys, try different models, and evaluate their performance is not something existing version control tools offer, although third-party tools like Openrouter.ai can help here.
  • Different Measurements: Existing version control tools have no central platform for monitoring usage, messages, or dollars spent on models. You can visit individual model portals, but centralizing this data for testing is critical (a minimal sketch of this kind of tracking follows this list).
  • Evaluation: Evaluating these models requires more than just metrics. You need a way to validate the clarity, accuracy, and usefulness of the outputs they generate.
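
As a rough illustration of the centralized measurement point above, a prompt platform might log every model call against the prompt version that produced it. The model names and per-token prices below are hypothetical placeholders, not real pricing:

```python
# Hypothetical sketch of centralized usage tracking per prompt version.
# Model names and prices are illustrative placeholders.
from collections import defaultdict
from dataclasses import dataclass

# Assumed price table: (input, output) dollars per 1K tokens.
PRICE_PER_1K = {
    "model-a": (0.0005, 0.0015),
    "model-b": (0.003, 0.015),
}

@dataclass
class CallRecord:
    prompt_name: str
    prompt_version: int
    model: str
    input_tokens: int
    output_tokens: int

    @property
    def cost(self) -> float:
        p_in, p_out = PRICE_PER_1K[self.model]
        return (self.input_tokens * p_in + self.output_tokens * p_out) / 1000

def summarize(calls: list[CallRecord]) -> dict:
    """Aggregate call counts and spend by (prompt, version, model)."""
    totals = defaultdict(lambda: {"calls": 0, "cost": 0.0})
    for c in calls:
        key = (c.prompt_name, c.prompt_version, c.model)
        totals[key]["calls"] += 1
        totals[key]["cost"] += c.cost
    return dict(totals)

calls = [
    CallRecord("weekly-summary", 1, "model-a", 1200, 300),
    CallRecord("weekly-summary", 2, "model-a", 900, 280),
    CallRecord("weekly-summary", 2, "model-b", 900, 280),
]
for (name, version, model), stats in summarize(calls).items():
    print(f"{name} v{version} on {model}: "
          f"{stats['calls']} calls, ${stats['cost']:.4f}")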

Evaluation Data as a Moat

New entrants into this space can differentiate themselves from traditional version control systems by 1) offering a clean user interface, 2) aggregating a unique set of tools and data analytics, and 3) leveraging the unique data captured during the evaluation process.

Evaluating the output of an LLM is different from evaluating the output of code. For example, determining whether an LLM chatbot is providing correct information about your product while striking the right tone is not a binary check. This requires different types of testing. PromptLayer, a company attempting to solve version control for prompts, details a few methodologies that can be used here:

  • Negative Examples: Identifying examples of bad responses and setting up guardrails to make sure they do not appear in the output.
  • LLM as a Judge Rubric: Using another LLM to score the output against a set of parameters (both approaches are sketched after this list).
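
Here is a minimal sketch of both ideas. It assumes a generic `complete(model, prompt)` helper standing in for any chat-completion client; it is not a real library call, and the rubric and patterns are illustrative:

```python
# Illustrative sketch of two evaluation styles. `complete` stands in
# for any chat-completion client; it is an assumed helper, not a real
# library call.
NEGATIVE_EXAMPLES = [
    "I don't know",                # unhelpful deflection
    "As an AI language model",     # boilerplate we want to avoid
]

def passes_guardrails(output: str) -> bool:
    """Negative examples: reject outputs that echo known-bad patterns."""
    return not any(bad.lower() in output.lower() for bad in NEGATIVE_EXAMPLES)

JUDGE_RUBRIC = """You are grading a chatbot answer about our product.
Score 1-5 on each of: accuracy, clarity, tone.
Answer to grade:
{output}
Reply with three integers separated by spaces."""

def judge_score(output: str, complete) -> dict:
    """LLM-as-a-judge: ask a second model to score the output on a rubric."""
    reply = complete("judge-model", JUDGE_RUBRIC.format(output=output))
    accuracy, clarity, tone = (int(x) for x in reply.split())
    return {"accuracy": accuracy, "clarity": clarity, "tone": tone}

# Stub client so the sketch runs end to end; swap in a real API call.
def complete(model: str, prompt: str) -> str:
    return "4 5 4"

output = "The Empire State Building render is attached, with a red moon."
if passes_guardrails(output):
    print(judge_score(output, complete))
```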

To support these new styles of evaluation, new tools are required that can automatically run tests on new prompts, display results side by side, and build custom scorecards tailored to each team’s evaluation needs.
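
Building on the judge sketch above, a side-by-side comparison might run two prompt versions over the same test inputs and average the rubric scores into a simple scorecard. Again, this is purely illustrative, reusing the hypothetical `judge_score` and `complete` helpers:

```python
# Illustrative scorecard comparing two prompt versions on the same test
# inputs, reusing judge_score and complete from the previous sketch.
from statistics import mean

def scorecard(prompt_template: str, inputs: list[str], complete) -> dict:
    """Run a prompt version over test inputs and average rubric scores."""
    scores = [judge_score(complete("model-a",
                                   prompt_template.format(input=i)),
                          complete)
              for i in inputs]
    return {k: round(mean(s[k] for s in scores), 2) for k in scores[0]}

inputs = ["this week's Konvoy newsletter", "last week's Konvoy newsletter"]
v1 = "Summarize {input} and post it on Twitter."
v2 = "Summarize {input} in under 280 characters, in a neutral tone."
print("v1:", scorecard(v1, inputs, complete))  # compare side by side
print("v2:", scorecard(v2, inputs, complete))
```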

Scoring these prompts generates valuable data that can be leveraged to suggest improvements to other users looking to optimize their prompts. Depending on the complexity of the task, these new version control platforms could even be well-positioned to create a marketplace for prompts, helping users streamline their creation process and creating a data flywheel for the business.

Takeaway: As AI becomes a core part of software development, managing prompts is increasingly cumbersome and needs to be supported with purpose-built tools. Just as Git transformed how developers track and collaborate on code, new systems are needed to version, evaluate, and optimize prompts. The real moat will not come from storing prompt history alone, but from building rich evaluation datasets that guide better outputs. Teams that invest early in prompt version control will not only gain increased productivity and performance, but will also have a testing system that aligns with best practices in software development.
