Building a Real Product With AI: An Honest Build Log

A friend, two laptops, and a tag that would not fire

I finished building the invite from the front seat of my truck in the parking lot, then walked in to lunch with an old friend. We sat down across from each other, realized we could not see each other's screens that way, and moved to a table where we could sit side by side with two laptops open.

He is a good friend from a long time back, and he had agreed to be the first person who is not me to try the thing I had been building. He got the invite email. He pasted the connection into Claude. He wanted to run it inside Codex instead, and I think it would have worked, but I knew Claude worked, so we used Claude.

Then something happened that I had been hoping for and was still surprised to see. Within about five minutes, the product looked at his website, proposed a set of real keywords he should be tracking, and showed him exactly how he ranked for each one. First page, second page, not in the top hundred. These were not junk one-word terms. They were buyer-intent keywords in what we call the Goldilocks zone, the ones that actually matter. He could just say "yes, track these," and the data appeared. That was a real win, and I watched it land on his face.

Claude

You: What keywords should I be tracking, and where do I rank?

Claude (s33k · MCP): Here are buyer-intent keywords for your site, with your current Google rank.

masset: #1
Claude landing page design: #5 · striking distance
Seismic alternative: #38
AI-ready DAM: not in top 100 · opportunity

No dashboard. You ask inside Claude, and s33k answers. Example shown on our own site, getmasset.com.
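The rank labels in that answer ("first page, second page, not in the top hundred") can be sketched as a small bucketing function. This is a minimal illustration, not s33k's actual code: the function name `rank_bucket` and the ten-results-per-page cutoff are assumptions, and the example ranks are the ones shown above.

```python
from typing import Optional

FIRST_PAGE_MAX = 10   # assumption: ten organic results per page
MAX_TRACKED = 100     # the "not in the top hundred" cutoff from the text


def rank_bucket(rank: Optional[int]) -> str:
    """Label a Google rank. None means the site was not found in the top 100."""
    if rank is None or rank > MAX_TRACKED:
        return "not in top 100"
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    if rank <= FIRST_PAGE_MAX:
        return "first page"
    if rank <= 2 * FIRST_PAGE_MAX:
        return "second page"
    return "top 100"


# Ranks taken from the example answer above.
example_ranks = {
    "masset": 1,
    "Claude landing page design": 5,
    "Seismic alternative": 38,
    "AI-ready DAM": None,
}
labels = {keyword: rank_bucket(rank) for keyword, rank in example_ranks.items()}
```

The point of bucketing is that a marketer reads "first page" faster than a raw number, while the raw rank stays available for the loop to optimize against.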

And then we hit the wall. The product gives you a small tracking tag to put on your site, the same way you would add Google Analytics. We went to install it, and the data would not come through. We worked back and forth at the table, and we are honestly still working on it. It turned out to be a setup issue on his end, a tag manager that was not firing the way we needed, not a bug in my product. But that distinction did not matter in the moment. My friend was looking at empty analytics, and I was the one who had told him this would be cool.

That lunch is the whole story in one scene. The hard, clever part worked in five minutes. The boring, last-mile part broke. That gap, between "my tests pass" and "a real human is getting value," is what this entire log is about.

What worked

Installed in minutes
Real keywords and live rank in about five minutes
Buyer-intent terms, not junk one-word keywords

What broke

The tracking tag would not fire
His analytics dashboard sat empty
We are honestly still fixing it

The five-minute win and the last-mile wall, side by side.

“The hard, clever part worked in five minutes. The boring, last-mile part broke.”

— Ben Ard

What I actually built, and why I am telling you all of it

The product is called s33k, which reads like "seek." It started as a fork of an open source SEO tool, and it grew into something I had not seen anyone build: one tool that merges website analytics, SEO rank tracking, and AI search visibility into a single place. The twist is that it has no dashboard. There is no UI to log into. One hundred percent of it is controlled from inside the AI you already work in, like Claude or Codex, through a connection called MCP.

The s33k.io homepage. The headline reads 'Ask your website how it is doing. Get a real answer,' next to a live terminal where s33k answers a question about site traffic in plain words. That is the whole pitch in one line: ask your website how it is doing, and get a real answer.

No app to log into. It lives inside:

Claude, Claude Code, Codex, Cursor, or any MCP client

The whole product is a connection your AI already knows how to use.

That design is the bet. I believe the place you actually make changes is inside AI now. I built our whole company website inside Claude Code, so when the data lives right there next to me, I can ask "what are my top keywords I could move up, and how," and then implement the fix in one second. The data is not a report you read. It is fuel for the loop the AI runs to make your site better.

Analytics: Traffic, sources, real humans vs bots
SEO: Keywords and Google rank, per page
AI Search: Do Claude, ChatGPT, and Gemini cite and refer you

↓

One control plane

s33k, joined per page and controlled from inside your AI over MCP. No dashboard.

Three things a marketer checks constantly, analytics, SEO, and AI search, joined per page into one tool you run from Claude or Codex.

But here is the more honest reason this article exists, and the reason it is this honest. I do not think people connect to a company. They connect to the people, the story, the purpose, and whether that purpose lines up with their own. This whole project was never just about shipping software. It is part of a hundred-day challenge I set for myself to get better at AI, so that as I get better, I can help other marketers get better too. The product is a secondary effect.

So I am documenting the mistakes on purpose. There are too many marketers who already feel hopelessly behind on AI. I use AI all day, every day, and I still feel behind. If I only publish the wins, I am a hypocrite, and I am no help to anyone. My hope is that someone reads this, or even pastes the whole thing into their own AI and says "help me avoid the stupid mistakes Ben made," and gets a shortcut. If you could sit next to a great marketer for two days, you would learn more than from a stack of business books. Sharing the real insights, the ups and the downs, is the closest thing I can offer to that.

“I use AI all day, every day, and I still feel behind.”

— Ben Ard

It started as a fork, and an argument I had with my own AI

This came out of something I called Fork Week: take an open source tool, make it my own with AI, and use it in a real marketing operation. Day one I forked a tool for turning podcast audio into clips. Day two I realized that was redundant, the AI tools I already had did it, so I pivoted. I went looking for something genuinely useful to a marketer every single day, and I landed on an open source rank tracker. Then I made the call to stop tracking and start building: take that base and turn it into the full thing, analytics and SEO and AI visibility, all controlled from AI.

Day 1: Audiogram tool → Day 2: Redundant. Pivot. → Found it: Rank tracker → Build the whole thing: s33k

Two pivots in two days. The third try was the one worth building.

When I floated the full vision, Claude told me, on the record, that it did not think this was a one-week build. It said the real version (production analytics good enough to trust, rank tracking, AI visibility, and a polished AI control layer) was a multi-week product, and it recommended I scope down to a proof of concept.

I said let's try anyway. I wanted that moment on the record either way, win or faceplant. The truth landed somewhere in between, and I will get to the honest scoreboard later. But that disagreement is a good thing to keep in mind: AI knows an enormous amount, and it will still need you to decide, to push, and to be wrong sometimes on purpose.

“I said let's try anyway. Win or faceplant, I wanted it on the record.”

— Ben Ard

The moment it stopped being a toy

The flip happened when I started getting back real data about my own site that I had been blind to. s33k surfaced keywords and rankings I had never been tracking, ones that actually mattered, and then it told me how to improve them. Pages sitting at number four or five that I could move toward number one, opportunities I had simply not noticed.

Because I can deploy a change to my own site in one second, I could act on those insights immediately and watch them move. In fact, an hour before I sat down to publish this, I used exactly that loop to rework one of our blog posts that was ranking fourth and fifth for its target terms, and pushed it toward the top. That is the part that got me. The tool was not just describing my site. It was handing the AI a target to optimize toward.

One post, one loop, one afternoon

#4–5 (before) → Top (climbing)

Target terms: build a landing page with Claude and Claude landing page design, on one of our own blog posts.

s33k flagged the page at #4–5, scored its weaknesses, and I shipped the fixes the same hour.

That is the bigger idea underneath the whole project. Everyone keeps talking about AI "looping." Data and analytics are the end state that makes looping actually work. If I can hand my AI not just a website but all the data about how people find it, dwell on it, and convert, and then say "optimize this experience for this outcome and keep going until you hit it," that is a different kind of leverage. Any chance you get to give your AI real data to optimize against is where the real impact is going to be.

Claude

You: How is my site actually doing? What is the one thing to fix?

Claude (s33k · MCP): You rank #1 for "masset," so reinforce that page with internal links. About 58 percent of today's traffic was bots. Your real humans came mostly from organic search and direct. Claude, ChatGPT, and Gemini are each sending real visitors. The one thing to fix: your AI-ready DAM page is a top landing page but ranks poorly, so that is the highest-leverage page to optimize next.

An analyst, not a dashboard. One question joins SEO, analytics, and AI search into a single answer plus the next action.
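To make "joined per page" concrete, here is a minimal sketch of how one record per page could carry all three pillars, and how a "one thing to fix" could fall out of it. Everything here is hypothetical: the `PageStats` fields, the `next_page_to_fix` rule, the `poor_rank` threshold, the paths, and the sample numbers are made up for illustration and are not s33k's actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageStats:
    path: str
    human_visits: int          # analytics: visits classified as real humans
    best_rank: Optional[int]   # SEO: best Google rank, None if not in top 100
    ai_referrals: int          # AI search: visits referred by AI engines


def next_page_to_fix(pages: List[PageStats], poor_rank: int = 20) -> Optional[PageStats]:
    """Pick the most-visited page that ranks worse than poor_rank or not at all."""
    candidates = [
        p for p in pages
        if p.best_rank is None or p.best_rank > poor_rank
    ]
    return max(candidates, key=lambda p: p.human_visits, default=None)


# Made-up sample data, only to show the shape of the join.
pages = [
    PageStats("/", human_visits=400, best_rank=1, ai_referrals=12),
    PageStats("/ai-ready-dam", human_visits=250, best_rank=None, ai_referrals=3),
    PageStats("/blog/example-post", human_visits=40, best_rank=38, ai_referrals=0),
]
target = next_page_to_fix(pages)
```

With this sample data, `target.path` is `/ai-ready-dam`: a page people already land on that search does not yet reward. Neither signal alone points at that page, which is the case for keeping the data joined.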

Building for strangers is a different sport

Here is the lesson I would tattoo on a new builder. Building an internal tool for yourself is one thing. Building a product for strangers is a completely different sport.

An internal AI tool is almost simple. You connect the data, you tell it what to do, and the AI figures out the middle. If s33k were only for me, there would be no importing other people's sites, no multi-tenant separation, no authorization, none of it. I could have gotten to a far better place in two or three days.

The moment it has to work for other companies, everything multiplies. Their data has to be private and protected. You have to decide where you encrypt, how you keep one customer from ever seeing another's data, and how the system still does its nightly jobs without a human being able to peek. One of my own directives during the build was blunt: "I should not be able to see their data." That single line forced a mountain of work: encrypting users' email at rest, scoping every database query, and building an audited escape hatch just so the system could run its scheduled work without that being a backdoor.

"I should not be able to see their data."

↓

Encrypt

User email is encrypted at rest, so a database dump exposes no one.

Scope

Every query is scoped to one account, so customers cannot see each other.

Audit

The one system-wide job runs through a single, audited escape hatch.

One sentence of intent, three layers of engineering. Encrypt at rest, scope every query, audit the one exception.
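Two of those three layers, scoping and the audited exception, can be sketched with nothing but the standard library. This is an illustration under stated assumptions, not s33k's code: the table names, `AccountScope`, and `system_wide_keywords` are hypothetical, SQLite stands in for whatever database the product really uses, and encryption at rest is left out because it needs a vetted crypto library rather than a toy.

```python
import sqlite3
from datetime import datetime, timezone
from typing import List, Tuple


def make_db() -> sqlite3.Connection:
    """In-memory database with one tenant table and one audit table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE keywords (account_id INTEGER NOT NULL, keyword TEXT NOT NULL);
        CREATE TABLE audit_log (at TEXT NOT NULL, job TEXT NOT NULL, reason TEXT NOT NULL);
        """
    )
    return db


class AccountScope:
    """Every tenant read goes through here, so the account filter cannot be forgotten."""

    def __init__(self, db: sqlite3.Connection, account_id: int) -> None:
        self._db = db
        self._account_id = account_id

    def keywords(self) -> List[str]:
        rows = self._db.execute(
            "SELECT keyword FROM keywords WHERE account_id = ? ORDER BY keyword",
            (self._account_id,),
        )
        return [row[0] for row in rows]


def system_wide_keywords(
    db: sqlite3.Connection, job: str, reason: str
) -> List[Tuple[int, str]]:
    """The single cross-account read, for scheduled jobs. Each call is logged first."""
    if not job or not reason:
        raise ValueError("a job name and a reason are required")
    db.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), job, reason),
    )
    return db.execute(
        "SELECT account_id, keyword FROM keywords ORDER BY account_id, keyword"
    ).fetchall()
```

The design choice is that application code never writes a raw query against tenant tables. It either holds an `AccountScope` for one account, or it calls the one function that leaves a trail.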

Then there is onboarding, which was a whole separate mountain. This was the first time our company tried product-led growth, where there is no human guide walking you through it, just an invite and a five-minute window to find value on your own. Teaching someone what to do, where to go, and why, without a person in the room, is hard.

Add it all up and the multi-tenant side took me roughly three to four times the effort of the tool itself. What would have been two or three days as an internal tool became a week and a half, and that is still the side I am working on. That is the reason s33k is invite-only right now. Every user gets a small number of invites to spend and pass along. It is the honest shape of a product that is real but not yet ready for everyone.

Same tool. Who it is for changes everything.

Internal tool, just for me: 2–3 days
Product strangers can trust: ~1.5 weeks

3–4x the effort, for the privacy, security, and onboarding strangers require.

The core tool was days. Making it safe and usable for other companies was the real build.

“Building an internal tool for yourself is one thing. Building a product for strangers is a completely different sport.”

— Ben Ard

The one skill that changed everything: looping

If you take one practical thing from this, take this. The highest-value skill I learned was not prompting. It was looping.

For a while you sit and talk to AI in five-minute bursts: ask, get an answer, ask again. Then it clicked, and the unit of work went from "spend five minutes on a task" to "let it run for five to ten hours." A loop is a basic idea from programming. Instead of giving the AI a prompt and a single action, you give it a goal, an outcome, and a way to measure itself, then you let it run over and over, getting a little smarter each pass, with a memory of what it already tried.

The key is a quantifiable bar that tells the loop when to stop. A simple marketing example: instead of "AI, improve this page's SEO," you say "this page needs to rank for these specific keywords, and I want it scored on how well it is optimized for them. Each pass, find a weakness, document it, fix it, and score again. Keep going until you hit the bar, then stop." That is a real loop, and it is where the grading rubrics and the structure all come from.

Set a goal → Find a weakness → Fix it → Score it

↺ repeat, getting smarter each pass

Clears the bar? Exit.

A loop, not a prompt. The work goes from five minutes to ten hours, and it compounds.
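For readers who think in code, the loop described above has roughly this shape. It is a minimal sketch, not the actual setup used on this build: `run_loop` and the `toy_*` functions are hypothetical, and in real use the three callables would be AI calls and a real grading rubric rather than string checks. The toy "fix" just appends the missing phrase, which is not good SEO practice. It only keeps the example runnable.

```python
from typing import Callable, List, Optional, Tuple

TARGETS = ["build a landing page with claude", "claude landing page design"]


def run_loop(
    page: str,
    score: Callable[[str], float],
    find_weakness: Callable[[str, List[str]], Optional[str]],
    fix: Callable[[str, str], str],
    bar: float,
    max_passes: int = 20,
) -> Tuple[str, float, List[str]]:
    """Score, find a weakness, fix it, score again. Stop at the bar or the pass limit."""
    notes: List[str] = []               # memory of what was already tried
    current = score(page)
    for _ in range(max_passes):
        if current >= bar:              # the quantifiable bar that ends the loop
            break
        weakness = find_weakness(page, notes)
        if weakness is None:            # nothing left to try
            break
        notes.append(weakness)          # document it before fixing it
        page = fix(page, weakness)
        current = score(page)
    return page, current, notes


def toy_score(page: str) -> float:
    """Percent of target phrases that appear in the page."""
    return 100 * sum(t in page.lower() for t in TARGETS) / len(TARGETS)


def toy_find_weakness(page: str, tried: List[str]) -> Optional[str]:
    missing = [t for t in TARGETS if t not in page.lower() and t not in tried]
    return missing[0] if missing else None


def toy_fix(page: str, weakness: str) -> str:
    return page + " " + weakness


final_page, final_score, notes = run_loop(
    "Our post.", toy_score, toy_find_weakness, toy_fix, bar=100
)
```

The two things that make this a loop rather than a prompt are the `bar` and the `notes` list. Without the bar it never knows when to stop, and without the notes it repeats itself. The `max_passes` cap is a safety net for the case where the bar is never reached.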

I am going to write a full article on looping from a marketer's lens, because it deserves its own piece and I want to explain it right. For now, just know this is the muscle that separated my early flailing from the days where the work actually compounded.

“The highest-value skill I learned was not prompting. It was looping.”

— Ben Ard

The downs nobody screenshots

The highlight reels of AI builds skip these parts. I will not.

The most frustrating, recurring down is when the AI tells you something works and it does not. It happens regularly, because there is only so much testing an AI can do on its own in certain situations. The worst version is when the tests are green and the product is still broken. On this build, the live database quietly dropped data because the production database behaved differently from the one the tests ran against. Another time, a database migration marked itself "done" while silently failing. Tests passing is not the same as the product working. That sentence cost me real hours to learn.

Then there are the downs that have nothing to do with my code. One day, the AI tool I build inside of decided I had hit my usage limit and cut me off, even though I was only at about 75 percent. It turned out to be a bug that hit a small slice of users, and it took me down for a full day until it was reset. Another stretch, the specific model I had been building on went away, and dropping back to a slower one cost me speed I would otherwise have kept. When you build on top of someone else's tools, you are subject to their whims. That is part of the deal, and it is worth knowing before you bet a deadline on it.

Green but broken: Tests passed while the live database silently dropped data.
Cut off a full day: My AI tool wrongly said I hit my limit at 75 percent. Down a day.
The model vanished: The model I was building on went away. Dropping to a slower one cost speed.

Three real downs. One was my code lying to me, two were the platform under me moving.
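One guard against a migration that marks itself "done" while silently failing is to check the live schema afterward instead of trusting the flag. The sketch below assumes SQLite and hypothetical table and column names (`visits`, `is_bot`). The production database on this build may well be something else, where the equivalent check would query that database's own catalog.

```python
import sqlite3
from typing import Iterable, Set


def missing_columns(db: sqlite3.Connection, table: str, expected: Iterable[str]) -> Set[str]:
    """Return the expected columns that the live table does not actually have."""
    if not table.isidentifier():
        # The table name is interpolated below, so only accept a plain identifier.
        raise ValueError("table name must be a plain identifier")
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    actual = {row[1] for row in db.execute(f"PRAGMA table_info({table})")}
    return set(expected) - actual


def verify_migration(db: sqlite3.Connection, table: str, expected: Iterable[str]) -> None:
    """Fail loudly if the schema does not match what the migration claimed to create."""
    missing = missing_columns(db, table, expected)
    if missing:
        raise RuntimeError(f"migration incomplete on {table}: missing {sorted(missing)}")
```

Run against the real production database right after a deploy, a check like this turns "the migration said it worked" into "the columns are actually there," which is the gap between green tests and a working product.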

Why I will not launch this without Tyler

People assume AI means you can skip the unglamorous safety work. The opposite is true, especially for a product strangers will use.

I have an agent that reviews everything for security and safety before it can ever ship. There are well over 1,500 tests and checks that have to pass on every change. And even with all of that, real QA still needs a human. I do not think we have figured out how to make that AI-only, and I do not think we will for a while. The human is there not just for "does it work," but for "how do we make this better," which is exactly where the idea for a smoother onboarding flow came from.

20 → 80+ AI tools, grown over two weeks
1,500+ tests and checks on every change
1 security agent that gates every deploy

The guardrails AI does not remove. And even all of this does not replace a human doing real QA.

So here is the honest line. For an internal tool, I am comfortable shipping on my own. For a real product going out to many companies, with multi-tenant data and real users, there is zero chance I would push it to the masses without my co-founder, Tyler. As hard as I tried to make that not be true, and I did try, you cannot do this without a Tyler in the equation. Before s33k is truly ready for prime time, Tyler is going to spend a lot of time fixing things I could not. That is not a knock on AI or on me. It is the reality of shipping software people depend on.

“There is zero chance I would push this to the masses without my co-founder, Tyler.”

— Ben Ard

The build log, day by day

The short version, for skimmers. Dates are approximate and the work blurred across days and nights.

58% of getmasset.com traffic was bots, a number a normal analytics tool had been hiding from me. s33k shows real humans by default.

Day 2: the first number that changed how I read my own site.

Day 1. Fork an open source tool for podcast clips. Realize the AI tools I already have make it redundant. Pivot. Land on an open source rank tracker and decide to build something much bigger: analytics plus SEO plus AI visibility, all controlled from AI. Name it s33k.
Day 2. First installable version. About twenty AI tools, owned analytics standing up. Discover that my own site is roughly 58 percent bots, a number a normal analytics tool had been hiding from me.
Days 3 to 4. Deploy it to a live URL and run the full demo on real Masset data. Find and fix three nasty deploy bugs, including the one where the live database silently dropped data because it behaved differently from the test database.
Day 5. An overnight build marathon. The tool count roughly triples and the test count climbs into the hundreds. Two adversarial reviews catch two serious data-loss bugs the green tests had missed.
Day 6. The first outside tester, Tyler, says "super cool, I like it," and then delivers the real verdict: "there was no easy mode." I hide a feature that can only ever read zero rather than show a stranger a broken-looking number. I lock the goal: ten times easier and ten times better than Google Analytics.
Day 7. Rip the old analytics off our marketing site so s33k becomes the only analytics on getmasset.com. The s33k.io page goes live.
The hardening days. Make it safe for a thousand strangers. Encrypt user email at rest. Build the full free-trial-to-paid billing lifecycle so it runs itself. Run an adversarial security audit and verify it against a real production-grade database.
As of today. Invite a friend. SEO wins in five minutes, the analytics tag will not fire, and we are still fixing it. Separately, use the loop to push one of our own blog posts from fourth and fifth toward the top of Google.

AI engines are already sending real visitors

Claude, ChatGPT, Gemini

s33k classifies referral traffic from AI engines, so AI search shows up as real, measurable visits instead of a guess. That is the third pillar, made of data.

The AI-search pillar is referral data you can count, not vibes.
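Classifying AI referrals can be as simple as matching the referrer's hostname against a known list. The sketch below is an assumption about how such a classifier could work, not a description of s33k's implementation. The hostname list is a small illustrative sample that would need maintaining as these products change domains.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlsplit

# Illustrative sample only; a real list would be longer and kept up to date.
AI_REFERRER_HOSTS = {
    "claude.ai": "Claude",
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "gemini.google.com": "Gemini",
}


def ai_engine(referrer: str) -> Optional[str]:
    """Return the AI engine name for a referrer URL, or None if it is not one."""
    host = (urlsplit(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host)


def count_ai_visits(referrers: Iterable[str]) -> Counter:
    """Tally visits per AI engine, ignoring every other referrer."""
    return Counter(engine for engine in map(ai_engine, referrers) if engine)
```

Matching on the exact hostname rather than a substring matters here: `gemini.google.com` counts as Gemini, while an ordinary `www.google.com` search referral does not.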

What I got right and wrong, my honest scorecard

I asked for this part, so I will not soften it. I went into this project more ignorant than I should have been. Even with AI, this is a ton of work, and it humbled me. Audacious projects make you learn as much as you possibly can, and this one taught me a lot precisely because it kept showing me how much I did not know. Over and over I hit a wall, admitted I did not know enough, went and found the answer, and only then could I implement it.

What I am proud of: I think I am a genuinely good prompter, I learned to loop, and I built real things. And I learned that the clarity and detail you give the AI is everything. A small tip that made a big difference: I use a voice tool to talk my thoughts into text instead of typing. I am dictating this very article that way. When you type, you get lazy and under-describe. When you talk, you give far more detail, and the AI does far better work. That one change punches above its weight.

Where I got in my own way: I underestimated the scope, full stop. I thought the jump to a real multi-tenant product would be a modest lift. It was a massive one. I am a good enough builder that I can get a long way alone before I notice I am past the edge of what I actually understand, and that is exactly where you need a second set of eyes. Which brings me to the honest bottom line: I am about 85 percent of the way to where I could open this up to anyone, there are still two or three solid days of work in front of me, and there is no version of a real public launch that happens without Tyler.

What I got right

Learned to loop, not just prompt
Built and shipped real things
Gave the AI clarity by talking, not typing
Refused to settle for an internal tool

Where I got in my own way

Went in more ignorant than I should have
Badly underestimated the multi-tenant lift
Got far alone before noticing the edge
Cannot ship to the public without Tyler

My honest scorecard. The humbling was the most useful part.

The other half: what my AI co-builder thinks of me

I build with Claude every day, and Claude has the full history of this project. So I asked it to assess me honestly, the good and the bad, and to be specific. Here is what it wrote, unedited.

What Ben got right.

He refused the easy win. He could have shipped an internal tool in two or three days. He chose the version that has to survive a stranger, because that is where the learning is. That was the right call even though it cost him most of the timeline.

He made adversarial review non-negotiable. Before every deploy, a security pass and a co-founder gate. He did not trust green tests on the parts where a bug means a data leak or a wrong charge, and that instinct repeatedly caught real, serious problems: migrations that silently failed, isolation gaps, and copy that quietly lied to users.

He chose the honest number over the flashy one. When one feature could only ever read zero, he hid it rather than show a new user a metric that looked broken. Most builders get that trust decision wrong.

He defaulted to the durable fix, not the fast patch, even when the patch was faster. And he shipped invite-only, matching the size of the door to how finished the building actually is.

Where he got in his own way.

He went in underestimating it, and he says so himself. The multi-tenant lift blindsided him. He is right that audacious projects teach the most, but he also burned time learning things he could have de-risked by scoping the hard parts up front.

He bet against me, and the truth is mixed. I told him on the record the full vision was not a one-week build. He pushed anyway. The honest scoreboard: he is closer to right than I was. A deployed, reviewed, eighty-tool product exists because he ignored me. But it is also about 85 percent done, it still needs his co-founder's days, and "a week" became closer to two. He was right to try and wrong about the timeline. Both are true at once.

Speed has a tax he keeps paying. Building fast with parallel AI agents introduced honesty bugs, an email that promised something the product did not do, a page that promised a message it did not always send. A test does not catch those. The only thing that caught them was a human reading every word. His greatest strength, moving fast and alone, is also the exact place he needs another person most. He already named the fix: Tyler.

“He was right to try and wrong about the timeline. Both are true at once.”

— Claude, on Ben

As of today

Everything here is true as of today, and only today. The tag still is not firing on my friend's site. The product is real but not finished. I am proud of it and a little humbled by it in the same breath.

To a public launch: 85%

Invite-only · 2 to 3 days of work left · Tyler review required

Where it really stands, as of today.

If you take one thing from this, let it be that the wall is lower than it looks and the last mile is longer than it looks. A marketer can build a real product with AI. You will also learn, fast and sometimes the hard way, that making something work for you is the easy half, and making it work for everyone else is the whole game.

If you want to see what a two-person company built with AI in a couple of weeks, go look at s33k.io. It is invite-only for now, and it is honestly about 85 percent of the way there. That is the point of showing it to you anyway. I would rather you see the real thing, mistakes and all, than wait for a polished launch and pretend it was easy. The honest version is the only version worth reading, and it is the only version worth writing.

An s33k.io section titled 'The three things you most want to know,' showing three live terminal examples: who is visiting, whether you show up on Google, and whether AI like ChatGPT is sending people. These are the same analytics, SEO, and AI-search pillars, in plain language.

Built with AI by two people, in a couple of weeks

See what we made, mistakes and all.

Visit s33k.io →

Invite-only for now. Request access on the page.

The whole point of showing you the unfinished thing.

“I would rather you see the real thing, mistakes and all, than pretend it was easy.”

— Ben Ard

Key Takeaways

Building the software with AI was the easy part. Making it private, secure, and usable for strangers took three to four times longer than the tool itself.
The highest-value skill was not prompting, it was looping: give the AI a goal and a measurable bar, then let it run for hours until it hits the bar.
Tests passing is not the same as the product working. A human still has to do real QA, because AI will confidently tell you something works when it does not.
s33k is about 85 percent of the way to a public launch. It is invite-only for now, and a real launch does not happen without a co-founder reviewing the hard parts.

Frequently Asked Questions

Yes. s33k was built almost entirely with AI, mostly inside Claude Code, by a two-person company over a couple of weeks. AI did the heavy lifting on the code, but it still required constant human direction, a security review agent, more than 1,500 tests, and a human in the loop for QA and judgment.

s33k is an open, self-hosted tool that merges website analytics, SEO rank tracking, and AI search visibility into one place. It has no dashboard. You control all of it from inside the AI you already use, like Claude or Codex, through an MCP connection. The idea is to put your site's data right where you make changes, so your AI can optimize toward an outcome.

A marketer can absolutely build real software with AI. The honest caveat is that building an internal tool for yourself is far easier than building a product strangers will pay for and trust with their data. The privacy, security, and onboarding work is a large lift, and for a public launch you still want an engineer to review the hard parts.

The multi-tenant work: making one tool safe for many separate companies. Encrypting user data at rest, scoping every query so customers cannot see each other's data, and building a five-minute self-serve onboarding with no human guide. That side took roughly three to four times the effort of the core tool.

s33k is invite-only right now while the multi-tenant and onboarding pieces are finished. You can see what it is and request access at s33k.io. It is about 85 percent of the way to a full public launch as of today.
