What 115 articles a day actually requires

Drafted through my n8n + AI pipeline, edited by me.

By the end of this post you'll know what 'autonomous' actually requires under the hood, and why the impressive part is the boring reliability layer, not the AI.

The mess

'Autonomous' is a word people use loosely. Most 'set and forget' systems quietly need a person every single day: restart this, re-run that, patch the thing that broke overnight. For SourceRated the word has a literal meaning. Nobody touches the pipeline on a normal day, and it still publishes 115+ credible articles daily across four categories.

The wrong way people solve it

They build the happy path and call it done. The scraper works on a good day, the model scores cleanly on tidy input, the publish succeeds when nothing is wrong. Then a source changes its layout, two articles turn out to be the same story under different headlines, the model is unsure, a publish fails halfway, and the 'autonomous' system needs exactly the babysitting it was supposed to remove.

The system view

The loop, end to end, is not exotic. A scheduled scrape triggers it. The pipeline decides what is worth keeping using semantic dedup, an AI credibility score, and a community-voting check. It publishes on schedule. A human reviews only when the model is genuinely unsure, an alert fires the moment a stage fails, and every decision is recorded. The work is not in the steps. It is in the seams between them.

Trigger (scheduled scrape) → Decision (dedup + AI credibility score + community vote) → Action (publish on schedule) → Human review (only when unsure) → Alert (a stage fails) → Record (every decision logged).
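
To make the decision stage concrete, here is a minimal sketch of how one candidate article could be routed. It is an illustration under assumptions, not the SourceRated code: the `Candidate` fields, the `Route` names, and all three thresholds are hypothetical, and the order of the checks is one reasonable choice among several.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    PUBLISH = "publish"
    REVIEW = "human_review"
    DROP = "drop"


@dataclass
class Candidate:
    title: str
    is_duplicate: bool    # output of the dedup stage
    credibility: float    # model score in [0, 1]
    confidence: float     # how sure the model is, in [0, 1]
    community_votes: int  # net votes from the community check


# Hypothetical thresholds, picked only for illustration.
MIN_CREDIBILITY = 0.7
MIN_CONFIDENCE = 0.6
MIN_VOTES = 0

# Stand-in for the decision record; a real pipeline would persist this.
decision_log: list[dict] = []


def route(c: Candidate) -> tuple[Route, str]:
    """Return the route and a reason, so the decision can be recorded."""
    if c.is_duplicate:
        return Route.DROP, "duplicate of an existing story"
    if c.confidence < MIN_CONFIDENCE:
        return Route.REVIEW, "model confidence below threshold"
    if c.credibility < MIN_CREDIBILITY:
        return Route.DROP, "credibility score below threshold"
    if c.community_votes < MIN_VOTES:
        return Route.REVIEW, "community vote disagrees with the model"
    return Route.PUBLISH, "passed dedup, score, and vote"


def decide(c: Candidate) -> Route:
    """Route a candidate and log the outcome, whatever it is."""
    chosen, reason = route(c)
    decision_log.append(
        {"title": c.title, "route": chosen.value, "reason": reason}
    )
    return chosen
```

The point of the sketch is the shape: every branch returns a reason, and `decide` logs before it returns, so there is no path through the stage that leaves no record.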

What I would build

Exactly what makes it survive a bad day. Custom scrapers across multiple sources. SQLite-based semantic dedup, so the same story under a different headline gets stripped before it publishes. An AI credibility score paired with a community-voting layer for a crowdsourced check. A publish step with live auto-refresh, running in Docker. And then the part that earns the word autonomous: retries, circuit breakers, and alerting that tells me the moment something genuinely needs a human.
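
As a rough sketch of that last layer, here is one way retries, a circuit breaker, and an alert hook can fit together around a single scraper call. Everything named here is an assumption for illustration: the `CircuitBreaker` class, its parameters, and the `on_open` callback standing in for whatever alerting channel is actually used.

```python
import time


class CircuitOpen(Exception):
    """Raised when a source has failed too often and is being skipped."""


class CircuitBreaker:
    def __init__(self, max_failures=3, cooldown=300.0,
                 clock=time.monotonic, on_open=None):
        self.max_failures = max_failures
        self.cooldown = cooldown    # seconds to skip a failing source
        self.clock = clock          # injectable so it can be tested
        self.on_open = on_open      # hypothetical alert hook
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, retries=2, delay=0.0):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpen("source is cooling down")
            # Cooldown elapsed: allow a trial call. One more failure
            # reopens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        last_error = None
        for attempt in range(retries + 1):
            try:
                result = fn(*args)
            except Exception as exc:
                last_error = exc
                self.failures += 1
                if self.failures >= self.max_failures:
                    self.opened_at = self.clock()
                    if self.on_open is not None:
                        self.on_open(exc)  # this is the "needs a human" signal
                    break
                if attempt < retries:
                    time.sleep(delay * (2 ** attempt))  # exponential backoff
            else:
                self.failures = 0
                return result
        raise last_error
```

In use, each source would get its own breaker, for example `breaker.call(scrape_source, url, retries=2, delay=1.0)` with a hypothetical `scrape_source` function. A transient failure is retried quietly. A source that keeps failing is skipped for the cooldown and triggers exactly one alert, instead of a retry storm.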

What can break

A source goes down or quietly changes its structure. Two articles are near-duplicates and both slip through. The model scores something at low confidence. A publish dies halfway and leaves a half-written record behind. Bad input data poisons the feed. Every one of these is handled on purpose, because the alternative is finding out from a reader.
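
The half-written record is the easiest of these to show in code. Below is a minimal sketch using Python's standard `sqlite3` module, where the article row and its decision row are written in one transaction. The table layout and the `publish` function are hypothetical, made up for this example rather than taken from the real schema.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with an illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE decisions (url TEXT NOT NULL, category TEXT NOT NULL)"
    )
    return conn


def publish(conn: sqlite3.Connection, article: dict) -> None:
    """Write the article and its decision record together, or not at all."""
    # Used as a context manager, the connection commits if the block
    # finishes and rolls back if anything inside it raises.
    with conn:
        conn.execute(
            "INSERT INTO articles (url, title, body) VALUES (?, ?, ?)",
            (article["url"], article["title"], article["body"]),
        )
        conn.execute(
            "INSERT INTO decisions (url, category) VALUES (?, ?)",
            (article["url"], article["category"]),
        )


conn = make_db()
good = {"url": "u1", "title": "t", "body": "b", "category": "tech"}
bad = {"url": "u2", "title": "t", "body": "b"}  # missing "category"

publish(conn, good)
try:
    publish(conn, bad)  # fails after the first INSERT has already run
except KeyError:
    pass

# Only the complete record survives; the half-written one was rolled back.
rows = conn.execute("SELECT url FROM articles").fetchall()
```

The failure here is deliberately dull: bad input raises partway through, after the first insert. Because both writes share a transaction, the feed never sees an article without its decision record.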

What the business gets

A system that genuinely runs without you, credibility you can defend instead of claim, and your attention back for the work that actually needs judgment. Reliability here is not a feature bolted on at the end. It is the product.

Anyone can ship an automation that works in the demo. The valuable part is making it fail loudly and survive the day you are not watching.

Bring me the workflow you wish ran without you. I'll tell you what I'd make autonomous first, and what still needs a human.
