
AI can build apps these days... right? We thought so too. Over the past few months, we worked on a complex platform with many states, dependencies, and connected logic. Not a simple application, but a system where actions, deadlines, and communication constantly influence one another. So when AI tools became more powerful, we decided to put that assumption to a serious test: could Claude build this platform?
First step: connecting AI directly to the design
We gave Claude access to Figma via MCP (Model Context Protocol), a way to let AI work directly with external tools such as design files. That allowed it to use our full design file, including all pages and the component library.
Although we thought that with such a direct connection Claude would translate the design straight into the front end, that did not work well at first. Because of the number of states and page variations, Claude lost the overview: components were mixed up and screens did not match. At the same time, we saw that the main structure was often usable. Familiar patterns such as filtering, sorting, and assigning people worked almost right away.
The contrast was clear: generic functionality went well, but as soon as context and coherence mattered, things went wrong.
Until you reach the logic…
As soon as we touched the core of the system, that difference became even sharper. This platform does not run on screens, but on logic: statuses that change based on actions, deadlines that expire and affect multiple users, and messages that may only be sent under specific conditions.
What we saw was not a complete failure, but something more subtle. AI generated solutions that were correct in isolation but did not account for the behavior of the system as a whole. A change that made sense locally created inconsistency somewhere else.
AI follows instructions, but it does not safeguard coherence.
The turning point: from explanation to specification
So we had to stop explaining and start specifying exactly.
Instead of loose prompts, we began structuring the full business logic in Excel. For each row, we described one situation, with columns for status, action, actor, trigger, next status, and next steps. In addition, we specified exactly what had to be shown where, including exceptions and dependencies.
In the last column, we combined all these fields into a single prompt. This made the prompts highly consistent, with each prompt describing one situation completely. Excel became not only documentation, but a way to generate prompts systematically.
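The row-to-prompt step can be sketched in a few lines. This is a minimal illustration, not our actual tooling: the column names mirror the ones described above, and the template wording is invented for the example.

```python
import csv
import io

# Hypothetical template; in practice the wording would be tuned per project.
TEMPLATE = (
    "Situation: status '{status}'. When {actor} performs '{action}' "
    "(trigger: {trigger}), move to status '{next_status}'. "
    "Then: {next_steps}."
)

def build_prompt(row: dict) -> str:
    """Combine one spreadsheet row into a single, consistent prompt."""
    return TEMPLATE.format(**row)

# Example sheet, as it would look exported from Excel to CSV.
sheet = io.StringIO(
    "status,action,actor,trigger,next_status,next_steps\n"
    "draft,submit,applicant,form completed,under review,notify reviewer\n"
)
prompts = [build_prompt(row) for row in csv.DictReader(sheet)]
print(prompts[0])
```

Because every prompt comes from the same template, two rows that describe similar situations produce near-identical prompts, which is exactly what keeps the AI's output consistent across the sheet.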
It worked… but was not testable yet
With this approach, we immediately saw improvement. Claude could translate the logic into working functionality for the most part. But testing remained difficult. Flows got stuck, messages never actually arrived, and processes were not completed end to end. You could see individual steps work, but never the whole picture.
Test flows as the missing link
To solve that, we designed our own test flows: routes through the system that exercise all the logic and pass through every status. For each step, we generated separate pages so that we could actively test every transition.
That made behavior traceable. If something was off, we could follow it back to a specific rule in Excel, and therefore to a specific prompt.
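The idea behind such a test flow can be sketched as a walk over a transition table. The statuses and actions below are invented for illustration; the point is that a missing rule fails loudly at the exact step, which is what lets you trace the problem back to one row in the sheet.

```python
# Hypothetical transition table: (current status, action) -> next status.
# In our setup, each entry would correspond to one row in the Excel sheet.
TRANSITIONS = {
    ("draft", "submit"): "under review",
    ("under review", "approve"): "approved",
    ("under review", "reject"): "draft",
    ("approved", "archive"): "archived",
}

def run_flow(start: str, actions: list[str]) -> list[str]:
    """Walk one test route and record every status it passes through.

    Raises KeyError on a missing transition, naming the status and
    action, so the failure points at a specific rule to fix.
    """
    status, visited = start, [start]
    for action in actions:
        key = (status, action)
        if key not in TRANSITIONS:
            raise KeyError(f"no rule for action '{action}' in status '{status}'")
        status = TRANSITIONS[key]
        visited.append(status)
    return visited

route = run_flow("draft", ["submit", "approve", "archive"])
print(" -> ".join(route))
```

A full test suite would then be a set of such routes chosen to cover every transition in the table at least once.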
A new way of giving feedback
To give feedback on the front end developed by AI, we set up a structure similar to the one we used for the logic. In Excel, we collected prompts with screenshots, links to Figma components, and concrete descriptions. With help from Claude and ChatGPT, we further refined that feedback.
By doing this consistently, we saw improvements. Not because AI suddenly fully understood the system, but because our input got better.
What this does to our work
We entered this experiment with the idea that AI would reduce our work. In practice, we actually had to think more, structure more, and be much more explicit.
Everything that was implicit had to be made concrete. And that is exactly what shows where your own logic is still off, where gaps exist, or where assumptions are hidden.
Even in a traditional project, we would have had to pass this logic on to developers. The difference is that this often happens through conversations, interpretation, and iterations. By writing it out this explicitly, you force yourself to fully understand the system—and you surface many ambiguities and potential errors before they end up in the build.
Where we stand now
We are still in full swing. What we do see: in a short time, something is in place that would likely have taken months in a traditional process. At the same time, it is not stable yet and still requires a lot of attention to get it working.
AI does not replace development here, but it does change it. It shifts the work forward: less building on instinct, more making explicit in advance what needs to happen. That requires more thinking, but it also leads to sharper systems.
And that is exactly what makes it interesting.
