Accelerating enterprise-grade AI development with Vellum’s Workflows SDK
Overview
In this episode of DEMO, host Keith Shaw welcomes Noa Flaherty, CTO and co-founder of Vellum, to showcase how enterprises can streamline AI development with Vellum’s new Workflows SDK. Designed to bridge the gap between technical and non-technical teams, Vellum allows organizations to experiment, iterate, deploy, and monitor production-grade AI systems with ease.
Noa walks through a live demo, building an agentic chatbot that pulls grounded, factual answers from policy documents using RAG (retrieval-augmented generation). He demonstrates how Vellum’s platform supports both low-code UI development and Python-based SDK integration—making collaboration seamless across product managers, subject matter experts, and developers.
Watch the demo above and read the full transcript below for a detailed look at how Vellum is helping enterprises go from prototype to production faster than ever.
Transcript
Keith Shaw: Hi everybody, welcome to DEMO, the show where companies come in and show us their latest products and platforms. Today, I'm joined by Noa Flaherty, the CTO and co-founder of Vellum. Welcome to the show, Noa. Noa Flaherty: Thank you so much, Keith.
Happy to be here. Keith: All right, so tell us about Vellum and a little bit about what you're going to be showing us today. Noa: Awesome.
Vellum is an AI development platform that we sell to mid-market and enterprise companies who are trying to build robust, reliable, enterprise-grade AI systems, features, and products — but they need help up-leveling their teams and skills to do so. Our platform helps across the entire development process.
Keith: And who in the enterprise is this geared toward? I would assume developers and people building AI applications, or are there other roles within the company that could benefit? Noa: It’s primarily for product engineering teams.
One thing that’s unique about AI engineering is that, while you need a really strong technical team, there's a lot of domain context required to determine whether AI is doing what it's supposed to.
So we also see a lot of involvement from less technical folks — think product managers and subject matter experts. Vellum acts as a bridge between those two groups. Keith: What's the main problem you're solving? Why should anybody care about this?
Noa: Specifically, we're helping companies build high-quality, production-grade AI features quickly — and bring them to market faster without sacrificing reliability. Keith: And what would companies be doing without this? Would they be using multiple tools? Working more slowly? Noa: Exactly.
The status quo is either building tooling in-house or iterating very slowly. For example, engineering tweaks a model or prompt, runs manual evaluations, hands it off to a product manager who visually QAs everything, sends it back — it’s a slow loop that usually results in lower quality. Keith: Great.
Let’s jump into the demo. Show us some of the cool features. Noa: Awesome. On the screen, you’ll see how we think about AI development.
I won’t spend too much time on this slide, but it’s about defining use cases, running experiments efficiently to find the right models, prompts, and architectures, and evaluating them quantitatively — not just based on vibes.
Once you're happy with the outputs, you deploy, integrate with your app layer, monitor the system in production, close feedback loops, and iterate again. Vellum helps with two things: establishing that feedback loop, and spinning it quickly.
One way we accelerate iteration is by involving the right people at the right time. Less technical folks usually handle the experimentation and evaluation phase, while developers work across all steps but especially deployment and monitoring.
Today I’m going to show you how this works in Vellum itself, with a focus on our new product: the Workflows SDK. Keith: Okay, cool. Noa: This is a low-code editor that lets me drag and drop nodes to define the backend of an AI system.
This use case is an agentic chatbot that answers questions about Vellum’s security and privacy policies. So if you're a CISO or CIO automating internal processes, your team could use this to build that backend.
I take a user input — like a question — and simulate what the AI would do. I’ve connected a number of nodes. When I run it, it outputs a factually correct, grounded answer using retrieval-augmented generation (RAG).
This low-code editor is powered by the Workflows SDK, a Python framework for defining these graphs and agentic workflows declaratively. Here’s the most interesting node: the prompt node.
It pulls context from a vector store — like PDF paragraphs about our security policies — that are semantically similar to the question. I’ve added some prompt engineering to ensure the answer is factual, concise, and grounded in the quotes, and I’ve limited the response length.
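The retrieval step Noa describes can be sketched in plain Python. This is a minimal, library-agnostic illustration of the RAG pattern, not Vellum's actual API: a real deployment would use an embedding model and vector store for semantic similarity, whereas here passages are ranked by simple token overlap so the example is self-contained. All function names and the sample policy text are hypothetical.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase bag-of-words tokenization (stand-in for real embeddings)."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))

def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the question (Jaccard overlap)."""
    q = tokenize(question)
    scored = sorted(
        passages,
        key=lambda p: len(q & tokenize(p)) / max(len(q | tokenize(p)), 1),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(question: str, passages: list[str], max_words: int = 200) -> str:
    """Assemble a prompt that forces the model to answer only from the quotes,
    mirroring the factual/concise/grounded constraints described above."""
    quotes = "\n".join(f'- "{p}"' for p in retrieve(question, passages))
    return (
        "Answer the question using ONLY the quotes below. "
        f"Be factual and concise (under {max_words} words).\n"
        f"Quotes:\n{quotes}\n"
        f"Question: {question}"
    )

# Hypothetical policy snippets standing in for the PDF paragraphs.
policy_docs = [
    "All customer data is encrypted at rest using AES-256.",
    "Employees complete annual security awareness training.",
    "Office plants are watered every Tuesday.",
]
prompt = build_grounded_prompt("How is customer data encrypted at rest?", policy_docs)
```

The key design point is that the model never sees the whole corpus: only the top-ranked passages reach the prompt, which is what keeps answers grounded and factual.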
So, if you’re less technical, you might change the prompt or model directly through the UI.
If you’re a developer, you might use the SDK. I’ll show you that workflow. Here’s a CLI that pulls down the workflow. I’ve already built it in our Vellum Workflow Sandbox, and I’ve set up my local environment in PyCharm.
Just run pip install vellum-ai, export your API key, and copy the command from the UI. Now I’ll run the CLI. It generates a directory with locally runnable code. You can use your IDE or codegen tools like Cursor to iterate faster.
Run the workflow locally and get the same output you saw in the UI. Now let’s push a change back up. Say I want to change the prompt length from 200 to 250 words. I save the file, grab the CLI command, and push it.
After refreshing the page, you'll see the prompt node now shows 250 words. You can edit any aspect of the graph — define your own business logic in custom nodes, or even bring your own runtime using Docker images.
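To make the custom-node idea concrete, here is an illustrative sketch of a declaratively defined node graph with a custom business-logic step. Every class and method name below is hypothetical, chosen for illustration only; it does not reproduce the Vellum SDK's actual API, just the general shape of the pattern being described.

```python
# Shared state passed between nodes as the workflow runs.
State = dict[str, object]

class Node:
    """A single step in the graph; subclasses implement run()."""
    def run(self, state: State) -> State:
        raise NotImplementedError

class RetrieveNode(Node):
    """Fetches candidate passages (hard-coded here for illustration)."""
    def run(self, state: State) -> State:
        state["context"] = [
            "Customer data is encrypted at rest using AES-256.",
            "INTERNAL: rotate keys quarterly.",
        ]
        return state

class RedactNode(Node):
    """Custom business logic: drop passages flagged as internal-only."""
    def run(self, state: State) -> State:
        state["context"] = [p for p in state["context"] if "INTERNAL" not in p]
        return state

class Workflow:
    """Runs nodes in sequence, threading the state dict through each."""
    def __init__(self, *nodes: Node):
        self.nodes = nodes

    def run(self, state: State) -> State:
        for node in self.nodes:
            state = node.run(state)
        return state

result = Workflow(RetrieveNode(), RedactNode()).run({"question": "How is data encrypted?"})
```

Because each node is an ordinary Python class, developers can version, test, and push these building blocks while less technical teammates rearrange them in the UI.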
Developers build the components, push them to Vellum, and non-technical folks assemble and iterate through the UI. Keith: Got any other features to show? Noa: This is the Workflow Sandbox. We also offer a full suite for evaluations, deployments, and monitoring.
But the key takeaway here is that you can define, simulate, and iterate on AI pipelines — quickly. Once you're happy, you evaluate, deploy, and integrate with your app layer. Keith: Sounds like a great feature set. How can customers try this out? Is there a free trial? Noa: Yes!
We recently raised a Series A, and we’re opening up broader access. You can try it for free — just visit vellum.ai and click the “Start for Free” button. Keith: All right.
Noa Flaherty, thanks for joining us and for the demo. Noa: Thank you so much, Keith. Keith: That’s going to do it for this episode of DEMO. Be sure to like the video, subscribe to the channel, and leave your thoughts in the comments.
Join us every week for new episodes. I’m Keith Shaw — thanks for watching!