How I Used GPT to Analyze Feedback Without Breaking Airtable
1. Logging user feedback from multiple tools into one clean table
This was the easy part, which should’ve tipped me off. I used Zapier to catch new entries from a feedback form in Typeform, then a separate Zap to watch for in-app feedback events pushed into a Slack channel via Intercom. Both routed nicely into Airtable using the same base, writing to a table called “Feedback Raw” with fields like `source`, `timestamp`, `user_email`, and `message`.
For the first 48 hours it ran clean. Then a user submitted a 3000+ character message from a mobile Typeform response. Airtable truncated it (quietly), and GPT later hallucinated a strange sentiment because half the sentence was cut off mid-word. Also, my Slack Zap triggered twice when emojis were edited in the feedback thread — I assume that was a Slack webhook oddity, but I haven’t chased it down. Worth watching.
If you’re setting this up, make sure every tool sends clean UTF-8 strings. Typeform did. Slack did not — quotes and some markdown came in mangled, which made GPT behave inconsistently when summarizing tone.
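If you want to catch that before anything reaches Airtable, the normalization is small enough to live in a Code step. Here’s a minimal Python sketch of what I mean; the function name and the length threshold are mine, not anything Zapier or Airtable give you:

```python
import unicodedata

SUSPICIOUS_LENGTH = 3000  # my own flag threshold, based on the message Airtable truncated

def clean_feedback_text(raw: str) -> str:
    """Normalize incoming feedback so Typeform and Slack land in Airtable the same way."""
    # Fold compatibility characters (full-width punctuation, etc.) into plain forms
    text = unicodedata.normalize("NFKC", raw)
    # Swap the curly quotes Slack kept sending for straight ones
    for bad, good in {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}.items():
        text = text.replace(bad, good)
    # Drop anything that still won't round-trip as UTF-8 instead of letting it mangle downstream
    text = text.encode("utf-8", errors="ignore").decode("utf-8")
    if len(text) > SUSPICIOUS_LENGTH:
        print(f"warning: {len(text)}-char message, check Airtable for silent truncation")
    return text
```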
2. Passing raw text into GPT using well-shaped JSON payloads
The key to getting usable GPT summaries was sending it text that always looked the same, because GPT wildly overreacts to formatting inconsistencies. I wrapped the entire message in clean JSON like:
{
"input_type": "user_feedback",
"source": "Typeform",
"user_email": "user@domain.com",
"timestamp": "2023-11-03T18:27:00Z",
"message": "The dashboard kept logging me out every 5 minutes. It’s unusable like this."
}
That worked fine, until Slack-style feedback started appearing with triple backticks or bullet points. GPT sometimes interpreted those as part of the prompt. Once, it replied in code blocks only. I had to add a pre-cleaning step in Make.com to flatten all rich formatting to plain text. Weirdest issue: feedback containing emojis consistently skewed sentiment positive — even when the message said “this sucks 😤”, it still tagged it as neutral or slightly positive.
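The flattening itself is only a few regexes. A rough Python sketch of the kind of pre-cleaning I ended up doing in Make.com, not the exact module:

```python
import re

def flatten_rich_text(message: str) -> str:
    """Strip the Slack-style formatting that GPT kept treating as prompt structure."""
    text = re.sub(r"`{3}(.*?)`{3}", r"\1", message, flags=re.DOTALL)  # code fences: keep contents, drop markers
    text = re.sub(r"`([^`]+)`", r"\1", text)                          # inline code markers
    text = re.sub(r"^\s*[-*•]\s+", "", text, flags=re.MULTILINE)      # bullet markers
    text = re.sub(r"[*_~]+", "", text)                                # bold / italic / strikethrough
    return re.sub(r"\s+", " ", text).strip()                          # collapse whitespace
```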
Add this to your pre-prompt note: “Ignore formatting like markdown or emojis, and analyze the message text literally.” GPT-4 followed that 80% of the time. 3.5 didn’t care.
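If you call the API directly instead of going through Zapier’s OpenAI action, that note goes in as a system message ahead of the JSON payload. A sketch using the official openai Python SDK, with the payload shape from above; treat it as illustrative rather than the exact call my Zap makes:

```python
import json

from openai import OpenAI  # official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

payload = {
    "input_type": "user_feedback",
    "source": "Slack",
    "user_email": "user@domain.com",
    "timestamp": "2023-11-03T18:27:00Z",
    "message": "this sucks 😤",  # already flattened to plain text by the pre-cleaning step
}

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": "You analyze user feedback. Ignore formatting like markdown "
                       "or emojis, and analyze the message text literally.",
        },
        {"role": "user", "content": json.dumps(payload, ensure_ascii=False)},
    ],
)
print(response.choices[0].message.content)
```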
3. Prompt structure issues that broke sentiment scoring entirely
This ate two hours. I thought the summaries were wrong, or maybe the fine-tuned system I used on GPT-3.5 Turbo was hallucinating. Turned out I had accidentally written:
"Please return a JSON object including the 'sentiment' key with a value of 'Positive', 'Neutral'. or 'Negative'”
…which GPT obeyed exactly, but that rogue period inside the quotes meant it sometimes returned:
{ "sentiment": "Neutral." }
So my Airtable formula field checking for valid tags kept returning blank. Airtable won’t compare `"Neutral"` to `"Neutral."` as equal. No warning. No error. Just blanks. If your GPT sentiment tags seem to vanish inside Airtable, check for stray punctuation in the prompt structure. Also: GPT-4 is more likely to include the punctuation than 3.5. Unpredictably.
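The cheap defense is normalizing the value before it ever reaches Airtable, so a stray period can’t break the comparison. A small sketch; the helper is mine, not something Zapier or Airtable provide:

```python
VALID_SENTIMENTS = {"Positive", "Neutral", "Negative"}

def normalize_sentiment(raw: str) -> str | None:
    """Strip stray punctuation and whitespace so 'Neutral.' still matches 'Neutral'."""
    cleaned = raw.strip().strip("\"'.,!").capitalize()
    return cleaned if cleaned in VALID_SENTIMENTS else None

normalize_sentiment("Neutral.")   # -> "Neutral"
normalize_sentiment("negative")   # -> "Negative"
normalize_sentiment("Mixed")      # -> None, so you can route it to manual review instead
```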
4. Flattening structured GPT output into Airtable fields was not obvious
I originally asked GPT to return multiple things, like:
{
"sentiment": "Negative",
"summary": "User is frustrated by forced logouts.",
"product_area": "dashboard/authentication"
}
Solid idea, except Zapier doesn’t parse JSON well out of the box unless you explicitly use its Formatter step. I had to manually insert a step between OpenAI and Airtable using Formatter → Utilities → “Text” → “Line Itemizer” (yes, really) and tell it to treat commas as delimiters. Mostly worked… until someone included a comma in their feedback like: “…which, frankly, sucks.” Broke the whole thing.
I tried Make.com instead and just used its native JSON parse step to extract fields cleanly. Worked instantly. I’m not abandoning Zapier, but massaging JSON through it always feels like coaxing a raccoon out of a container store display window. If you’re doing multi-field structured output from GPT, Make.com is less brittle for this step.
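Outside of either tool, the parse-and-flatten step is a few lines against the Airtable REST API. This is a sketch only; the base ID and the destination field names are placeholders for whatever your table actually uses:

```python
import json
import os

import requests  # pip install requests

AIRTABLE_URL = "https://api.airtable.com/v0/appXXXXXXXXXXXXXX/Feedback%20Analyzed"  # placeholder base/table
HEADERS = {
    "Authorization": f"Bearer {os.environ['AIRTABLE_TOKEN']}",
    "Content-Type": "application/json",
}

def write_analysis(gpt_reply: str) -> None:
    """Parse GPT's JSON reply and write each key to its own Airtable field."""
    data = json.loads(gpt_reply)  # raises a ValueError if GPT returned anything but valid JSON
    record = {"fields": {          # column names are placeholders for your own schema
        "Sentiment": data["sentiment"],
        "Summary": data["summary"],
        "Product Area": data["product_area"],
    }}
    requests.post(AIRTABLE_URL, headers=HEADERS, json=record, timeout=30).raise_for_status()
```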
5. Unexpected effects of changing field names inside Airtable
I renamed `GPT_Sentiment` to `Sentiment` because I wanted the column title to read cleaner in a shared view. No warning. No prompt. Just… the Zap that was writing to it silently failed. Turns out Zapier uses internal field IDs behind the scenes, but some paths rely on matching field names at time of setup — it’s not consistent. Airtable didn’t complain. Zapier didn’t complain. The data just stopped arriving. Took me three tries to track it down.
What made it trickier: one Zap still wrote to another field in that record successfully, so I assumed the Zap was running fine. It was — it just decided that `Sentiment` didn’t exist anymore on that table structure. I had to go into the Zap Editor, re-select the new field (it showed up as a separate entry entirely), then re-test. Data flowed again immediately.
Checklist if data from GPT stops arriving after a rename:
- Check Airtable field name didn’t change casing or spelling
- Verify Zapier still maps the correct field (check dropdown in Editor)
- See if test records are arriving in any other field by accident
- Re-test the Zap manually with fake data to see where it breaks
- Search Airtable’s recent field changes in the Undo history
- If using Make.com, re-sync field list to refresh field references
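And if any part of the pipeline talks to Airtable’s API directly, write by field ID rather than field name; the ID survives a rename. A sketch with placeholder IDs:

```python
import os

import requests

AIRTABLE_URL = "https://api.airtable.com/v0/appXXXXXXXXXXXXXX/tblXXXXXXXXXXXXXX"  # placeholder base/table IDs
HEADERS = {"Authorization": f"Bearer {os.environ['AIRTABLE_TOKEN']}"}

def write_sentiment(sentiment: str) -> None:
    """Write using the field ID, which stays stable when the column is renamed."""
    record = {"fields": {"fldXXXXXXXXXXXXXX": sentiment}}  # placeholder ID for the Sentiment field
    requests.post(AIRTABLE_URL, headers=HEADERS, json=record, timeout=30).raise_for_status()
```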
6. GPT analysis reliability broke above a certain token size
I fed multiple user messages (batched by email) into GPT in a single call so I could try persona clustering. That was the moment things got weird. Anything over ~1500 words, even in GPT-4, started returning way-too-brief summaries like “User has concerns about product stability.” It ignored nuance, flattened contradictory opinions, or worse, omitted angry feedback completely in favor of praise that showed up last in the input. That’s just GPT trying to summarize a tone curve it doesn’t understand.
What worked better: batching feedback by theme (`dashboard performance`, `billing`, `onboarding clarity`) instead of by user. I used a Notion-linked tag field from earlier manual triage as the grouping logic, then ran GPT on those groupings. Sentiment skewed truer, and the summaries didn’t get weirdly polite.
Also: If a GPT call is returning suspiciously vague summaries, check the full input token length. You’re probably asking it to do too much. It won’t warn you.
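Here’s roughly how I’d gate the batching on token count now, using the tiktoken tokenizer. The ceiling and the record shape are my own assumptions, not limits from OpenAI:

```python
import tiktoken  # OpenAI's tokenizer; pip install tiktoken

MAX_INPUT_TOKENS = 2000  # where my summaries started flattening, not an API limit

def batch_by_theme(feedback: list[dict], model: str = "gpt-4") -> list[list[dict]]:
    """Group messages by their triage theme tag, splitting any theme whose
    combined text creeps past the token ceiling."""
    enc = tiktoken.encoding_for_model(model)

    themes: dict[str, list[dict]] = {}
    for item in feedback:  # assumed shape: {"theme": "...", "message": "..."}
        themes.setdefault(item["theme"], []).append(item)

    batches: list[list[dict]] = []
    for items in themes.values():
        current: list[dict] = []
        count = 0
        for item in items:
            tokens = len(enc.encode(item["message"]))
            if current and count + tokens > MAX_INPUT_TOKENS:
                batches.append(current)
                current, count = [], 0
            current.append(item)
            count += tokens
        if current:
            batches.append(current)
    return batches
```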
7. Adding human approval layers without breaking the automation chain
After the fourth time GPT mis-tagged Internal Beta feedback as External Complaint, I added a manual review step. But I didn’t want to break the automation chain entirely. I set up a new table, `GPT Review Queue`, where GPT writes its JSON summary, and I use a checkbox field, `Approved`. Only when that’s checked does another automation move the record into `Feedback Analyzed`.
The edge case here is that Airtable automations treat `checkbox = true` differently from `checkbox is not empty`. You’ll spend time debugging why approvals don’t trigger — if you’re filtering on “is checked” instead of “is true”, sometimes it doesn’t fire. No idea why.
Best part: because Zapier saw the manual checkbox as a ‘change’ event, the final Zap reliably moves the approved record to the final table without an extra webhook or delay. Just remember to disable automations temporarily if you’re bulk-approving 50 rows, or you’ll blow through your run limit for the day without noticing.
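If you do need to bulk-approve, one workaround is a small script against the Airtable API instead of the automation, so you don’t burn runs at all. A sketch only, not what the Zap does; the base ID, the extra `Moved` bookkeeping checkbox, and the copied column names are all assumptions:

```python
import os

import requests

BASE = "https://api.airtable.com/v0/appXXXXXXXXXXXXXX"  # placeholder base ID
HEADERS = {"Authorization": f"Bearer {os.environ['AIRTABLE_TOKEN']}"}
COPY_FIELDS = ("Sentiment", "Summary", "Product Area")  # placeholder column names

def move_approved() -> None:
    """Copy checked rows from GPT Review Queue into Feedback Analyzed (ignores pagination for brevity)."""
    resp = requests.get(
        f"{BASE}/GPT%20Review%20Queue",
        headers=HEADERS,
        # {Moved} is an assumed extra checkbox so rows aren't copied twice on the next run
        params={"filterByFormula": "AND({Approved}, NOT({Moved}))"},
        timeout=30,
    )
    resp.raise_for_status()
    for rec in resp.json()["records"]:
        fields = {name: rec["fields"].get(name) for name in COPY_FIELDS}
        requests.post(f"{BASE}/Feedback%20Analyzed", headers=HEADERS,
                      json={"fields": fields}, timeout=30).raise_for_status()
        requests.patch(f"{BASE}/GPT%20Review%20Queue/{rec['id']}", headers=HEADERS,
                       json={"fields": {"Moved": True}}, timeout=30).raise_for_status()
```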