How Prompt Logs Exposed My Invisible Headline Automation Bug

1. Prompting AI to generate titles that actually sound human

The fastest way to tell when a system is just doing whatever OpenAI told it to do internally? Look at the headlines. If you’re using ChatGPT or Claude to name blog posts, and you just pass in a full article and say “generate a good title,” you’re gonna get stuff like “Unlock the Power of Automation with These Tips.” Great — now it sounds like 40,000 other AI-written articles that no one reads to the bottom.

I started experimenting with this because I kept getting flagged in our CMS review flow for “low-quality titles” even when the body content was solid. Turns out Google now treats title uniformity like a spam signal, especially if you’re batch posting. So I rebuilt the entire title generation logic inside a GPT function call with stricter tone modeling and behavior emulation.

The trick is to use a stripped but opinionated prompt, not just pass in the article body. Mine ended up looking like this:

Write a blog title that sounds like a real person summarizing 8 sections of messy but clever workflow advice about [TOPIC].
- Between 45 and 65 characters.
- Include one high-relevance keyword.
- Avoid any repeated structure or symmetry from other titles.
- Do not use any slashes or emotional phrases.

When that prompt goes out along with 3 or 4 previously approved human-written titles as examples, the model stops trying to be clever and starts sounding… usable. Not perfect, but human enough to avoid spam filters.
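For context, this is roughly how that prompt plus the few-shot examples get sent to the chat completions endpoint. The topic, the example titles, and the model name below are placeholders rather than my production values:

import os
import requests

# Placeholder inputs; in my chain these come from Airtable via the webhook.
topic = "workflow automation"
approved_examples = [
    "What Changed When I Renamed One Field Too Soon",
    "Your Sync Broke Because That Field Got Renamed Mid-Zap",
]

instruction = (
    f"Write a blog title that sounds like a real person summarizing 8 sections "
    f"of messy but clever workflow advice about {topic}.\n"
    "- Between 45 and 65 characters.\n"
    "- Include one high-relevance keyword.\n"
    "- Avoid any repeated structure or symmetry from other titles.\n"
    "- Do not use any slashes or emotional phrases."
)

examples_block = "Here are some previous examples of human-approved titles:\n" + "\n".join(
    f"{i + 1}. {title}" for i, title in enumerate(approved_examples)
)

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4o",  # placeholder model name
        "messages": [{"role": "user", "content": examples_block + "\n\n" + instruction}],
    },
    timeout=30,
)
print(response.json()["choices"][0]["message"]["content"].strip())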

2. Why OpenAI consistently shortens titles unless told not to

This was driving me insane for months: I'd ask GPT-4 to write a ~60 character title, even specifying "between 45 and 65 characters," and it'd keep returning things like "How to Automate Your Daily Tasks." That's 32 characters and completely useless for organic traffic. Turns out there's a default bias toward shorter phrasings baked deep into the model; my best guess is that brevity got tuned in as a friendliness signal.

The fix? Precede your prompt with a single example that lands in the 62–64 character range. Nothing else worked. I tried system messages, extra tokens, even putting the length request in all caps. Nothing. But once the model "sees" an example that reaches the upper boundary, it anchors on that length and starts outputting longer, more natural structures.

This is the line that fixed it inside my chain:

“Here are some previous examples of human-approved titles (include ONLY the title text):\n1. What Changed When I Renamed One Field Too Soon\n2. Your Sync Broke Because That Field Got Renamed Mid-Zap”

After that, it stopped the truncation. Still a bit verbose sometimes, but fixable.
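The example does the heavy lifting, but I also keep a dumb length check in the chain as a backstop (same spirit as the character-count tip further down). A rough sketch, with the 45–65 bounds hard-coded and the helper name made up for illustration:

def title_length_ok(title: str, lo: int = 45, hi: int = 65) -> bool:
    """Flag titles that fall outside the 45-65 character window."""
    stripped = title.strip().strip('"')  # models occasionally wrap titles in quotes
    return lo <= len(stripped) <= hi

# 54 characters: passes
assert title_length_ok("Your Sync Broke Because That Field Got Renamed Mid-Zap")
# 32 characters: gets flagged
assert not title_length_ok("How to Automate Your Daily Tasks")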

3. Title prompt failures that only show up in webhook logs

Here’s the thing nobody tells you: if you’re stringing together a prompt chain via a Zapier webhook or a Make scenario, the actual prompt formatting can silently mutate mid-flight. I had a GPT Prompt Generator flow that received article content via Airtable, chunked it with Text Splitter (in Make), then sent those chunks to OpenAI inside a webhook.

Problem: the title it generated always came back hollow, like “How This Automation Changed Everything.” Bleh. I dove into the Webhook (Custom Request) run logs — and there it was.

"prompt": "Write a human-sounding blog title between 45 and 65 characters...
TOPIC:\nnull"

The word “null” was coming from a mapping field that silently failed when the CMS field name changed from post_topic to article_keyphrase. No error thrown. Mapping just passed the string “null” — and GPT filled all the gaps around it. None of the auto-validation caught this.

Behavior I wasn’t expecting:

If you pass “null” or blank topic data, GPT will default to generic business phrases like “See How AI Is Reshaping This Field.” As if it’s afraid to guess the topic without explicit direction. No hallucination. Just cowardice.
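Since then there's a guard in front of the OpenAI step. This is a sketch of the idea rather than the exact Make module; the field names mirror my schema and may not match yours:

def resolve_topic(record: dict) -> str:
    """Reject the junk values that mapping failures tend to produce."""
    # "article_keyphrase" / "post_topic" are the field names from my schema;
    # adjust for whatever your CMS actually sends.
    raw = record.get("article_keyphrase") or record.get("post_topic") or ""
    cleaned = raw.strip()
    # Mapping failures show up as "", "null", "None", or "undefined" strings.
    if cleaned.lower() in {"", "null", "none", "undefined"}:
        raise ValueError("Topic field is empty or unmapped; refusing to call the model.")
    return cleaned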

4. One renamed field broke every title prompt for a week

I renamed “topic_short” to “primary_term” on a hunch during a schema restructuring in Airtable. Nothing broke at first. The automations kept running. But every headline got dumb — not wrong, just super unhelpful: nothing specific to the content, mostly format copy like “Tips to Get Started with This Tool.”

Zapier didn’t throw errors because I wasn’t referencing the field directly in the Zap UI — it was embedded in a title prompt defined earlier. The embedded field got replaced with an empty string, but the actual execution block didn’t surface it as a problem. Just silently degraded prompt quality.

We caught it a week later while comparing clickthrough data in the CMS; clickthroughs had dropped to near zero. I went into the Zap's "Edit Template" step, opened the advanced text block that held the full prompt, and found this:

… about {{topic_short}}

But Airtable no longer had a field by that name; it had been renamed to primary_term. So the final prompt that hit OpenAI literally said:

Write a blog title that sounds like a real person summarizing 8 sections of messy but clever workflow advice about .

No error, no warning. GPT ran it anyway.
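What finally helped was a last-second sanity check on the fully rendered prompt string, right before the API call. A rough sketch, assuming the prompt arrives as a single string with the topic already merged in:

import re

def validate_rendered_prompt(prompt: str) -> None:
    """Catch merge fields that resolved to nothing before the prompt ships."""
    # Unrendered template placeholders like {{topic_short}} left in the text.
    if re.search(r"\{\{.*?\}\}", prompt):
        raise ValueError("Prompt still contains an unrendered merge field.")
    # A field that rendered as empty leaves tells like "about ." in the prompt.
    if re.search(r"\babout\s*[.,]", prompt):
        raise ValueError("Topic merge field appears to have rendered as empty.")

validate_rendered_prompt(
    "Write a blog title that sounds like a real person summarizing "
    "8 sections of messy but clever workflow advice about workflow automation."
)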

5. Tiny tips that kept broken title chains from looking correct

  • Double-wrap dynamic prompt fields inside fallback logic — e.g. {{primary_term || 'workflow automation'}}
  • Use a logging step right after the OpenAI call to echo the prompt + result into a Notion or Airtable field (see the sketch after this list)
  • Keep prompt logic in dedicated Data Stores or Make variables — don’t bury them in action steps
  • Use GPT’s function-calling if you’re generating multiple options (to surface token clipping issues early)
  • Visually check for length variance across title outputs; uniform lengths usually mean the model has collapsed back into template phrasing
  • Add a character count regex to flag too-short or too-symmetrical outputs
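The logging bullet is the one that paid off most, so here's roughly what that step looks like if you run it as a code step instead of a native module. The base ID, table name, and field names are placeholders; swap in your own:

import os
import requests

def log_prompt_run(prompt: str, result: str) -> None:
    """Echo the exact prompt and the model's output into an Airtable log table."""
    # "appXXXXXXXXXXXXXX" and "Prompt Logs" are placeholders for your base and table.
    url = "https://api.airtable.com/v0/appXXXXXXXXXXXXXX/Prompt%20Logs"
    requests.post(
        url,
        headers={
            "Authorization": f"Bearer {os.environ['AIRTABLE_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={"fields": {"Prompt": prompt, "Result": result}},
        timeout=30,
    )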

Most of this came after repeatedly noticing that “broken” chains weren’t erroring out — they were just producing worse and worse results invisibly. Took forever to confirm it wasn’t hallucination.

6. The JSON fix that finally stabilized the whole generation flow

The messy part is that everyone tries to build these prompts as plain strings. But the moment you start nesting logic or running prompt chains from different tools (like Zapier to Make to GPT to Notion), you’ll get encoding fails. Especially quotation marks and line breaks — any newline inside a field can serialize weirdly and silently ruin your structure.

I rebuilt the whole title generation section to issue GPT function calls through the OpenAI API's function-calling interface. Instead of dumping text, I defined an actual function schema like:

{
  "name": "generate_title",
  "parameters": {
    "type": "object",
    "properties": {
      "title": {
        "type": "string",
        "minLength": 45,
        "maxLength": 65
      }
    },
    "required": ["title"]
  }
}

This forced GPT to stay in the right shape and closed the gap where stray characters were corrupting multi-line structures. Even better: when something fails now, I get a structured error I can actually trace, not just a generic string.
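For completeness, here's a rough sketch of wiring that schema into a chat completions request over plain HTTP, plus a local length check, because the API will hold the JSON shape but won't reliably enforce minLength/maxLength for you. The model name and prompt text here are placeholders:

import json
import os
import requests

GENERATE_TITLE = {
    "name": "generate_title",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "minLength": 45, "maxLength": 65}
        },
        "required": ["title"],
    },
}

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4o",  # placeholder
        "messages": [
            {"role": "user", "content": "Write a blog title about workflow automation."}
        ],
        "tools": [{"type": "function", "function": GENERATE_TITLE}],
        "tool_choice": {"type": "function", "function": {"name": "generate_title"}},
    },
    timeout=30,
)
args = json.loads(
    resp.json()["choices"][0]["message"]["tool_calls"][0]["function"]["arguments"]
)
title = args["title"]
# The schema keeps the shape honest; length still gets verified locally.
assert 45 <= len(title) <= 65, f"Title length out of range: {len(title)}"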

Airtable picks it up cleanly; Notion stores look uniform; nothing gets truncated or distorted on display. And more importantly: these autogen titles finally pass human review again.