Transforming how users build software from scratch, to code, to application with
Replit Agent  

Introduction

From scratch, to code, to app, in a flash

Building a fully functioning software app is hard work. From coding the application logic to setting up environments and databases, there’s a lot developers must put in place before anyone can interact with the app. The Replit team recently launched Replit Agent, a first-of-its-kind AI agent that helps users create applications from scratch.

While current tools are great for code completion and incremental development, Replit Agent can think ahead and take the right sequence of actions to help you build that e-commerce web app, financial analysis tool, or any newfangled idea you’ve been dreaming up. It pairs with you as a co-pilot and has all the same tools you’d have access to in Replit to help you go from idea to working code, fast.

Problem

Overcoming blank page syndrome

Designing and building an app without a set rulebook can be overwhelming. It’s easy for developers to be hit with “blank page syndrome,” leaving them staring at an empty code editor even when armed with the right tools.

Replit Agent lowers the activation barrier for new users to create software, allowing users to whip up a project with a simple prompt in plain English. Its ability to support multi-step task execution and manage infrastructure also eases the build-experiment-test-deploy process.

Cognitive architecture

Keep reliability high and users in the loop

The Replit team focused on reliability, constraining their AI agent’s environment to the Replit web app and the tools already available to Replit developers. The agent started as a ReAct-style agent that loops iteratively: reasoning about the next step, taking an action, and observing the result.

Over time, Replit Agent adopted a multi-agent architecture. With a single agent managing all the tools, the chance of error increased, so the Replit team limited each agent to the smallest possible task. They assigned distinct roles to their agents, including:

  • A manager agent to oversee the workflow.
  • Editor agents to handle specific coding tasks.
  • A verifier agent to check the code and frequently interact with the user.
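The split above can be sketched as a simple control loop. This is a minimal, hypothetical sketch: the function names, stub logic, and routing are illustrative, not Replit’s actual implementation.

```python
# Hypothetical manager/editor/verifier split: the manager routes minimal tasks,
# editors do the scoped work, and the verifier decides when to loop in the user.
def editor_agent(task: str) -> str:
    """Perform one small, scoped coding task (stubbed here)."""
    return f"edited: {task}"

def verifier_agent(result: str) -> dict:
    """Check the editor's output; flag for user input when unsure."""
    ok = result.startswith("edited:")
    return {"ok": ok, "ask_user": not ok}

def manager_agent(tasks: list[str]) -> list[str]:
    """Route each minimal task to an editor, then through the verifier."""
    log = []
    for task in tasks:
        result = editor_agent(task)
        if verifier_agent(result)["ask_user"]:
            log.append(f"needs user input: {task}")
        else:
            log.append(result)
    return log
```

Keeping each agent’s responsibility this narrow is what lets the verifier act as a natural checkpoint for user feedback.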

Michele Catasta, President of Replit, notes a key difference in their building philosophy:

“We don’t strive for full autonomy. We want the user to stay involved and engaged.”

Their verifier agent, for example, is unique in that it doesn’t just check code and push ahead on its own. It often falls back to talking to the user, enforcing continuous user feedback throughout the development process.

Prompt engineering

Build and organize prompts for relevant insights

Replit employed a range of advanced techniques to enhance the performance of their coding agents, especially for complex tasks like file edits.

Few-shot and long instructions

Replit frequently uses few-shot examples along with long, task-specific instructions to guide the model effectively. For more difficult parts of the development process, such as file edits, Replit initially experimented with fine-tuning, but this didn’t yield any breakthroughs. Instead, significant performance improvements came from leveraging Claude 3.5 Sonnet.
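A few-shot prompt of this kind pairs a long instruction with worked examples before the real task. The sketch below is purely illustrative; the example content and structure are assumptions, not Replit’s actual prompts.

```python
# Hypothetical few-shot prompt assembly for a file-edit task.
FEW_SHOT_EXAMPLES = [
    {"before": "print('helo')", "edit": "fix the typo", "after": "print('hello')"},
]

def few_shot_prompt(examples, instruction, target_code):
    """Assemble a prompt: instruction, worked examples, then the real task."""
    parts = ["You are a careful code editor. Apply the requested edit.", ""]
    for ex in examples:
        parts += [f"Before: {ex['before']}",
                  f"Edit: {ex['edit']}",
                  f"After: {ex['after']}", ""]
    parts += [f"Before: {target_code}", f"Edit: {instruction}", "After:"]
    return "\n".join(parts)
```

Ending the prompt at "After:" leaves the model to complete the edited code directly.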

Dynamic prompt construction & memory 

Replit also developed dynamic prompt construction techniques to handle token limitations, similar to the system used by OpenAI’s popular prompt orchestration libraries. They condense and truncate long memory trajectories to manage the ever-growing context, compressing older memories with LLMs so that only the most relevant information is retained.
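The core idea can be sketched in a few lines: keep recent steps verbatim and collapse older ones into a summary. The summarizer below is a stand-in for an LLM call, and none of the names come from Replit’s codebase.

```python
# Sketch of memory compression for an ever-growing agent trajectory.
def compress_memory(messages, max_recent=4, summarize=None):
    """Keep the most recent messages verbatim; collapse the rest into a summary."""
    if len(messages) <= max_recent:
        return list(messages)
    older, recent = messages[:-max_recent], messages[-max_recent:]
    # Placeholder summarizer; in practice this would be an LLM call.
    summarize = summarize or (lambda msgs: f"[summary of {len(msgs)} earlier steps]")
    return [summarize(older)] + list(recent)
```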

Structured formatting for clarity

To improve model understanding and prompt organization, Replit incorporates structured formatting. In particular, XML tags were helpful in delineating different sections of the prompt, guiding the model’s understanding of each task. For lengthy instructions, Replit relies on Markdown, as it’s often within the model’s training distribution.
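An XML-delineated prompt might be assembled like this. The tag names and sections are hypothetical, chosen only to show the pattern.

```python
# Illustrative prompt builder using XML tags to delineate prompt sections.
def build_prompt(instructions: str, code: str, task: str) -> str:
    """Wrap each section in its own XML tag so the model can tell them apart."""
    return (
        f"<instructions>\n{instructions}\n</instructions>\n"
        f"<code>\n{code}\n</code>\n"
        f"<task>\n{task}\n</task>"
    )
```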

Tool calling

Notably, Replit didn’t do tool calling in a traditional way. Instead of using the function calling offered by OpenAI’s APIs, they chose to have the model generate code that invokes tools, as this approach proved more reliable. With Replit’s library of 30+ tools, each requiring several arguments to function correctly, tool invocation is complex. Replit wrote a restricted Python-based DSL (domain-specific language) to handle these invocations, improving tool execution accuracy.
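One way to restrict such a Python-based DSL is to parse the generated code and accept only whitelisted tool calls. This sketch uses Python’s standard `ast` module; the tool names and validation rules are assumptions, since Replit’s actual DSL is not public.

```python
import ast

# Hypothetical whitelist; Replit's real tool names are not documented here.
ALLOWED_TOOLS = {"create_file", "run_shell", "install_package"}

def validate_tool_call(code: str) -> ast.Call:
    """Parse one generated tool invocation and reject anything outside the DSL."""
    tree = ast.parse(code, mode="eval")
    call = tree.body
    # Only a single bare call to a named tool is allowed -- no attribute
    # access, chaining, or arbitrary expressions.
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a single bare tool call")
    if call.func.id not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {call.func.id}")
    return call
```

Validating before execution means a malformed or hallucinated invocation fails loudly instead of running arbitrary code.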

UX

Bringing the user along in the agent journey

Replit focused on enabling key human-in-the-loop workflows when designing their UX. First, the Replit team implemented a reversion feature for added control. At every major step of the agent’s workflow, Replit automatically commits changes under the hood. This lets users “travel back in time” to any previous point and make corrections.

In a complex, multi-step agent trajectory, the first few steps tend to be most successful, while reliability drops off in later steps. As such, the team decided it was particularly important to empower users to revert to earlier versions when necessary. Beginner users can simply click a button to reverse changes, while power users have added flexibility to dive deeper into the Git pane and manage branches directly. 
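The checkpoint idea can be illustrated with a tiny in-memory store. Replit commits to Git under the hood; this sketch only shows the snapshot-and-revert pattern, with invented names.

```python
# Minimal sketch of step-level checkpoints backed by snapshots.
class CheckpointStore:
    def __init__(self):
        self._snapshots = []

    def commit(self, files: dict) -> int:
        """Record a snapshot after a major agent step; return its index."""
        self._snapshots.append(dict(files))
        return len(self._snapshots) - 1

    def revert(self, index: int) -> dict:
        """Return the file state as it was at an earlier checkpoint."""
        return dict(self._snapshots[index])
```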

Because the Replit team scoped everything into tools, users see clear, concise update messages about agent actions whenever the agent installs a package, executes a shell command, creates a file, and so on.

Instead of focusing on the raw output of the LLM, users can watch their app evolve in real time and decide how hands-on they want to be in the agent’s thought process (e.g. expanding to view every action the agent has taken and the reasoning behind it, or ignoring it entirely).

Unlike other agent tools, Replit also lets you deploy your application in a few clicks. The ability to publish and share applications is integrated smoothly in the agent workflow.

Evaluation

Real-time feedback and trace monitoring

To gain confidence in their agent, Replit relied on a mix of intuition, real-world feedback, and trace visibility into their agent interactions. 

During Replit Agent’s alpha phase, they invited a small group of ~15 AI-first developers and influencers to test the product. To gain actionable insights from the alpha feedback, Replit integrated LangSmith as their observability tool to track and act upon problematic agent interactions in their traces.

The Replit team would search over long-running traces to pinpoint issues. Because Replit Agent allowed human developers to step in and correct agent trajectories as needed, multi-turn conversations were common. They monitored these conversational flows in logical views within LangSmith to identify bottlenecks where users got stuck and might require human intervention.

The easy integration and readability of their LangGraph code in LangSmith traces was a big bonus for using both the agent framework (LangGraph) and observability tool (LangSmith) together.

Conclusion

Empowering creativity for developers

Replit Agent is simplifying software development for novice and veteran developers alike. By prioritizing human-agent collaboration and visibility into agent actions, the Replit team is helping users overcome initial hurdles and unleash their creativity.

While the world of agents has opened up many powerful new use cases, debugging or predicting an agent’s actions is still often uncharted water. Alongside the developer community, Replit looks forward to pushing the boundaries and tackling tricky problems like evaluating AI agent trajectories.

And on the path to building useful and reliable agents, Michele Catasta puts it best:

“We’ll just have to embrace the messiness.” 