One of the problems with building generative AI into your applications is that there’s no standard way of managing prompts. Too often, each team that builds AI into its code takes a different approach and manages data in different ways. They’re reinventing the wheel again and again, failing to learn from other teams and other projects.

Building a new AI interaction model for each application and having different ways of storing, using, and updating prompts wastes time. AI developer resources are limited, and experienced developers are stretched across multiple projects. It’s not efficient to expect them to remember how each application works and how its prompts need to be structured and tested.

Using different AI models adds complexity. A team may be using a large language model (LLM) like OpenAI’s GPT, Meta’s Llama, Anthropic’s Claude, or a custom tool based on an open source model from Hugging Face. Perhaps they decided to build an application that uses a local small language model, such as Microsoft’s Phi.

Introducing Prompty

What’s needed is a model-agnostic way of working with LLMs that allows us to experiment with them inside our development tools so we can use them without context switching. That’s where the Microsoft-sponsored Prompty project comes in. It’s a Visual Studio Code extension that helps solve many of the issues involved with working with LLMs and other generative AI tools.

You can find Prompty on GitHub, where it’s an active open source project. You can contribute code or make feature requests to the development team. If you prefer to start writing code, Prompty is available in the Visual Studio Code marketplace and integrates with its file system and code editor. Documentation is on the project website, and although a little thin at present, it is more than enough to get you started.

Prompty is a very straightforward tool. Its easy-to-understand format takes its cue from familiar configuration languages like YAML. The approach makes sense, as what we’re doing with prompts is configuring a generative AI. A prompt can be thought of as a way of defining the semantic space that the model searches to deliver its answers.

At the heart of Prompty is a domain-specific language that describes interactions with a generative AI. This is embedded in a Visual Studio Code extension that takes advantage of features like its language server for formatting and linting, highlighting errors and offering code completion. There’s support for both Python and C# output as well, with future versions targeting JavaScript and TypeScript.

If you weren’t drilling down into the Build 2024 session content, you may have missed an interesting session on using Prompty as part of your AI development platform.

Building prompts with Prompty

Working with Prompty in your code is no different than working with any other library. Alongside the Visual Studio Code extension, you’ll need to create an application framework that contains the appropriate packages. Once you have an application skeleton with access to an LLM endpoint, you can use the Prompty extension to add a prompt asset to your code. Inside the Visual Studio Code explorer, right-click on the root folder for your application and create a new Prompty. This will add a .prompty file to the folder, which you can rename as necessary.

Open the .prompty file to start building a prompt asset. It’s a formatted document with two sections. The first is a detailed description of the application you’re building: the model being used, any parameters your application needs, and samples of the information being passed to the model. The second contains the base system prompt, which defines the type of output you’re expecting. That’s followed by the context: information supplied by a user or an application that’s using the LLM to generate natural language output.
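As a rough sketch, a .prompty asset might look something like the following. The field names follow the format described in the Prompty documentation, but the model configuration, parameters, and sample values here are placeholders for illustration rather than a canonical template:

```
---
name: Support Assistant
description: A sample prompt asset for a customer support chat
model:
  api: chat
  configuration:
    type: azure_openai
    azure_endpoint: ${env:AZURE_OPENAI_ENDPOINT}
    azure_deployment: gpt-4o
  parameters:
    max_tokens: 1024
    temperature: 0.2
sample:
  context: Passwords are reset from the account settings page.
  question: How do I reset my password?
---

system:
You are a friendly support assistant. Answer briefly, using only the
information in the context below.

# Context
{{context}}

user:
{{question}}
```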

Usefully, you can use Prompty to test your prompts and display the output in Visual Studio Code’s output pane. This lets you refine the behavior of your LLM output, for example, switching from an informal, chatty tone to one that’s more formal. You will need to provide appropriate environment variables, including any authentication tokens. As always, it is good practice to hold these in a separate file so you don’t inadvertently expose them.
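For example, if your prompt asset points at an Azure OpenAI endpoint, the connection details might sit in a local .env file that stays out of source control. The variable names here are assumptions; use whatever names your .prompty configuration references:

```
# .env -- keep this file out of source control
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=<your-api-key>
```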

Using Prompty with an LLM orchestrator

Once you’ve written and tested your prompts, you can export the prompt asset data and use it with your choice of LLM orchestrator, including both Prompt Flow in Azure AI Studio and Semantic Kernel for building stand-alone AI-powered agents. This approach allows you to use a Prompty prompt as the basis of a retrieval-augmented generation (RAG)-powered application, reducing the risks of incorrect outputs by adding grounding data and using your prompt to produce a natural language interface to external data sources.

The resulting functions use the Prompty prompt description to build the interaction with the LLM, which you can wrap in an asynchronous operation. The result is an AI application with very little code beyond assembling user inputs and displaying LLM outputs. Much of the heavy lifting is handled by tools like Semantic Kernel, and by separating the prompt definition from your application, it’s possible to update LLM interactions outside of an application, using the .prompty asset file.
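As a rough illustration of how little application code remains, here’s a sketch that calls a prompt asset directly from Python using the open source prompty package rather than a full orchestrator; the file name and inputs are assumptions for the example, and an orchestrator-generated wrapper would look broadly similar:

```python
# A minimal sketch, assuming the prompty package is installed with its
# Azure extras (pip install prompty[azure]) and that support.prompty is
# configured for an Azure OpenAI deployment. Names are illustrative.
import prompty
import prompty.azure  # registers the Azure OpenAI invoker


def answer(question: str, context: str) -> str:
    # Render the prompt asset with the caller's inputs and send it to
    # the model endpoint defined in the .prompty front matter.
    return prompty.execute(
        "support.prompty",
        inputs={"question": question, "context": context},
    )


if __name__ == "__main__":
    print(answer(
        "How do I reset my password?",
        "Passwords are reset from the account settings page.",
    ))
```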

Including Prompty assets in your application is as simple as choosing the orchestrator and automatically generating the code snippets to include the prompt in your application. Only a limited number of orchestrators are supported at present, but this is an open source project, so you can submit additional code generators to support alternative application development toolchains.

That last point is particularly important: Prompty is currently focused on building prompts for cloud-hosted LLMs, but we’re in a shift from large models to smaller, more focused tools, such as Microsoft’s Phi Silica, which are designed to run on neural processing units on personal and edge hardware, and even on phones.

If we’re to deliver edge AI applications, tools like Prompty should be part of our toolchains, and they need to work with local endpoints, generating API calls for common SDKs. It will be interesting to see if Microsoft extends Prompty to work with the Phi Silica classes it has promised to deliver in the Windows App SDK as part of the Copilot Runtime. This would give .NET and C++ developers the necessary tools to manage local prompts as well as those that target the cloud.

Growing the AI toolchain

Tools like this are an important part of an AI application development toolchain, as they allow people with different skill sets to collaborate. Here, prompt engineers get a tool to build and manage the prompts needed to deliver coherent AI applications in a way that allows application developers to use them in their code. Visual Studio Code lets us assemble extensions into a coherent toolchain; this approach may well be better than having a single AI development environment.

If you’re tuning models, you can use the Windows AI Toolkit. If you’re building prompts, then Prompty is for you, while developers can use the tools for their choice of orchestrator alongside the Windows App SDK and their choice of C# or C++ tooling. Visual Studio Code lets you pick and choose the extensions you need for a project, and architects can build and manage development environments with the appropriate toolchains, using Microsoft’s Dev Box virtual machines or GitHub Codespaces.

Prompty is a big part of delivering a more mature approach to LLM application development. By documenting your prompts while testing and debugging them outside your code, you’re able to build applications and prompts in parallel, helping prompt engineers and application developers collaborate more effectively, much as design tools like Figma power similar collaborations with designers on the web.