If GPTZero, Turnitin, or Originality just flagged your essay as AI-generated — even though you wrote it yourself — you're not alone, and you're not crazy. AI detectors flag patterns, not actual AI. And those patterns show up in plenty of human writing too.
Here's the strange truth about modern AI detection: it doesn't really detect AI. It detects a specific style of writing that AI happens to produce, but that lots of human writers also produce — especially academic writers, ESL students, business professionals, and anyone who writes in a clean, careful, "essayistic" voice.
This article walks through exactly what AI detectors look for, why innocent writers get caught in the net, and what you can do about it whether you're trying to clean up an actual AI draft or vindicate writing you produced yourself.
How AI detectors actually work
Most AI detectors don't have a magic AI-spotter inside them. What they have is a statistical model trained on millions of examples of "known AI" text and "known human" text. The model learns which patterns correlate with each category, and then scores new text based on how closely it matches the patterns of one or the other.
The two main statistical concepts are perplexity and burstiness.
Perplexity measures how predictable the next word in a sentence is. AI models tend to choose the most statistically likely next word more often than humans do, which makes their text more "predictable" to other models. Lower perplexity = more AI-like.
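To make that concrete, here's a minimal sketch of perplexity scoring in Python. It uses GPT-2 through the Hugging Face transformers library purely as a stand-in; commercial detectors run their own proprietary models and thresholds, so treat this as an illustration of the idea, not a reimplementation of any product.

```python
# Sketch: score a passage's perplexity with GPT-2 (a stand-in model).
# Lower perplexity means the text was easier for the model to predict.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing the same ids as labels returns the mean cross-entropy
        # over every next-token prediction in the passage.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

print(perplexity("It is important to note that the results were significant."))
print(perplexity("The fridge hummed like it held a grudge."))
```

The stock first sentence scores low (predictable); the weird second one scores high. That gap is the whole signal.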
Burstiness measures how much sentence-level complexity varies across a passage. Humans write in bursts — a short sentence, a medium sentence, a long winding one, then maybe a fragment. AI tends to write in steady, uniform sentence lengths. Lower burstiness = more AI-like.
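Burstiness is even simpler to approximate: take the spread of sentence lengths across the passage. The sketch below uses a naive regex sentence splitter, which is an assumption for illustration; real detectors use more careful segmentation and richer features, but the intuition is the same.

```python
# Sketch: burstiness as the standard deviation of sentence lengths.
import re
import statistics

def burstiness(text: str) -> float:
    # Naive splitter: break after ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    return statistics.stdev(lengths) if len(lengths) > 1 else 0.0

human = ("Short one. Then a much longer, winding thought that rambles on "
         "for a while before it finally stops. A fragment.")
ai = ("The system processes the data efficiently. The results are then "
      "stored in the database. The output is displayed to the user.")
print(burstiness(human))  # noticeably higher
print(burstiness(ai))     # flat, uniform rhythm scores low
```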
Different detectors weight these signals differently and add their own proprietary tricks, but those two concepts are the core of almost every modern AI detector. Once you understand them, you understand 80% of what's going on under the hood.
The 9 signals detectors look for
Beyond perplexity and burstiness, detectors look for specific surface-level patterns. Here are the nine that matter most.
1. Uniform sentence length
This is the single biggest tell. AI models — especially when generating "professional" prose — produce sentences that cluster around 18-22 words each. Humans don't. Real human writing has wild swings: a four-word punch next to a 35-word winding thought, then a fragment, then a medium sentence.
If every sentence in your draft is between 15 and 22 words, a detector will flag you. Even if you wrote every word yourself.
2. Formal transitional phrases
"Furthermore." "Moreover." "Additionally." "In conclusion." "It is important to note that." These transitional phrases are catnip for AI detectors. Why? Because AI models love them. They were trained on academic and business writing where these phrases were common, and they over-use them as a result.
Two or more of these phrases in a short passage is a strong AI signal. Even one in a casual context can flag you.
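If you want to check a draft for this yourself, a toy version of the signal takes a few lines of Python. The phrase list and the hits-per-100-words framing below are illustrative guesses, not any detector's actual list or threshold.

```python
# Toy signal: how many stock transitional phrases appear per 100 words.
import re

STOCK_PHRASES = [
    "furthermore", "moreover", "additionally",
    "in conclusion", "it is important to note that",
]

def stock_phrase_rate(text: str) -> float:
    lower = text.lower()
    hits = sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", lower))
        for phrase in STOCK_PHRASES
    )
    return hits / max(len(text.split()), 1) * 100

# By the article's rule of thumb, two or more hits in a short
# passage is already a strong signal.
```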
3. Latinate verbs
"Utilize" instead of "use." "Demonstrate" instead of "show." "Facilitate" instead of "help." "Leverage" instead of "use." "Implement" instead of "put in place." "Optimize" instead of "improve."
Real humans reach for the simpler word most of the time. AI models reach for the longer, more formal word because their training data over-rewards "sophisticated" vocabulary. Stack three or four of these in a paragraph and you'll trip detectors.
4. Missing contractions
"It's." "Don't." "They're." "You'll." "Won't." Real humans contract reflexively in almost everything they write, including formal pieces. AI often spells out the full forms ("it is," "do not," "they are") because formal training data taught it that contractions are "less professional."
If your essay has zero contractions across 500+ words, that alone can shift your detector score by 15-20 percentage points.
5. Hollow intensifiers
"Significantly." "Substantially." "Remarkably." "Extensively." "Crucially." Critically." These words add nothing to a sentence — you can usually delete them without changing the meaning — but AI uses them constantly to sound emphatic. Stacking them is a classic AI tell.
6. Hedge-heavy openings
"It's worth noting that..." "One might argue that..." "It could be said that..." Real humans usually just state the thing. AI models are trained to sound balanced and cautious, so they pile on hedge phrases that delay getting to the point.
7. Listy disguised prose
"This enables X, Y, and Z by doing A, B, and C." Three-part parallel constructions stacked together, often in the same sentence. AI loves these because they sound thorough. Humans rarely write this way naturally — we'd just bullet point it or pick the most important item.
8. Safe, generic word choice
Where humans reach for the specific word that fits the moment ("brutal," "exhausting," "obvious," "weird"), AI reaches for the safe defensible word ("difficult," "challenging," "clear," "interesting"). The vocabulary feels washed out, like everything was filtered through a thesaurus designed to avoid offending anyone.
9. Lack of voice
This is the hardest one to articulate but the most damning. AI text rarely has opinions, asides, jokes, sarcasm, frustration, or any other signs that a human had a point of view. Even technically perfect prose without voice reads as AI. Humans almost always leave fingerprints — a phrase that makes you smile, a word choice that's slightly weird, a sentence that breaks the flow because the writer cared about something specific.
Why human writing gets flagged
Now here's the awkward part. Look back at those nine signals. Notice anything?
Most of them describe good academic writing. Or standard business writing. Or the kind of careful, formal English that ESL students are specifically taught to produce in school.
The signals AI detectors use to identify AI weren't designed by linguists. They were learned from training data. And the "AI-generated" half of that training data looks almost identical to certain styles of careful human writing, so the patterns the model learned overlap heavily with those styles.
This is why ESL students, students with formal writing styles, and anyone trained in clean professional prose are statistically more likely to get false-flagged than students who write with natural, casual rhythm. It's not that the detectors are racist or biased on purpose — they're trained on patterns that happen to overlap with how non-native English speakers and academically trained writers produce text.
If you've been flagged for AI when you wrote the piece yourself, this is almost certainly why. Your writing is "too clean" — meaning it has too many of the surface features that AI also has.
How to fix flagged writing
Whether you're cleaning up actual AI output or trying to vindicate your own writing, the fix is the same: introduce burstiness and voice. Specifically:
Vary your sentence lengths aggressively
Mix three-word sentences with thirty-word ones. Use fragments. Use sentences that start with "And," "But," or "So." Make the rhythm uneven on purpose. If every sentence in your essay is roughly the same length, a detector will flag you regardless of who wrote it.
Cut the formal transitions
Search your draft for "Furthermore," "Moreover," "Additionally," and "In conclusion." Delete them. If you need a connective at all, use "Also," "On top of that," "So," or — better — just start the next sentence without one. Real writers use connective words sparingly.
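If you'd rather script this pass than hunt by hand, here's a sketch. The pattern list is just the phrases named above; blind deletion can misfire (on a quoted sentence, say), so proofread the output.

```python
# Sketch: strip sentence-opening stock transitions and re-capitalize
# the word that follows. Proofread the result by hand.
import re

def cut_transitions(text: str) -> str:
    pattern = r"\b(Furthermore|Moreover|Additionally|In conclusion),\s+(\w)"
    return re.sub(pattern, lambda m: m.group(2).upper(), text)

print(cut_transitions("Furthermore, the results were clear."))
# -> "The results were clear."
```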
Swap Latinate verbs for plain English
Find/replace your draft: "utilize" → "use," "demonstrate" → "show," "facilitate" → "help," "leverage" → "use," "implement" → "put in place." This single change can drop your AI score by 20 points on its own.
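As a script, the same pass looks like this. The mapping is the list above; the sketch deliberately skips inflected forms ("utilizes," "leveraged") and sentence-initial capitalization, both of which need a quick manual pass.

```python
# Sketch: swap Latinate verbs for their plain-English equivalents.
import re

PLAIN_ENGLISH = {
    "utilize": "use",
    "demonstrate": "show",
    "facilitate": "help",
    "leverage": "use",
    "implement": "put in place",
}

def deflate(text: str) -> str:
    for fancy, plain in PLAIN_ENGLISH.items():
        # \b keeps "utilize" from matching inside another word; inflected
        # forms like "utilized" are intentionally left for a manual pass.
        text = re.sub(r"\b" + fancy + r"\b", plain, text, flags=re.IGNORECASE)
    return text

print(deflate("We will leverage this tool to facilitate adoption."))
# -> "We will use this tool to help adoption."
```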
Use contractions everywhere
"It is" → "it's." "Do not" → "don't." "Cannot" → "can't." Even formal English uses contractions in 2026 — it's only stiff academic prose that avoids them. Adding contractions makes writing read more natural and trips fewer detector signals.
Introduce a small voice marker
Drop in a phrase that signals a human had a point of view. "Honestly," "to be fair," "the obvious move," "which is wild," "let's be real" — anything that wouldn't appear in neutral AI prose. One per paragraph is enough.
Or just use a humanizer
If manually rewriting feels exhausting, that's exactly what AI humanizers like Forgely are for. A good humanizer applies all of the above transformations automatically while preserving your meaning. Free, takes a few seconds, and the output should score significantly more human than the original.
Try Forgely's free humanizer
Paste up to 1,000 words. Pick a tone. Get a humanized version that reads like a real person wrote it.
Humanize my text →
Which detector is most accurate?
The honest answer: none of them, fully. Every detector produces false positives (flagging human writing as AI) and false negatives (missing actual AI). The question is which one's wrong patterns you need to work around.
GPTZero is one of the most popular and is reasonably calibrated, though it's known to flag academic and ESL writing more often than it should. It explicitly looks for low perplexity and low burstiness.
Turnitin's AI detector is what most universities use. It's stricter than GPTZero in our experience, partly because schools want to err toward catching cheaters. It famously generated false positives shortly after its 2023 launch, leading to widely reported student appeals.
Originality.ai is positioned as a content marketing tool and tends to be aggressive about flagging anything that looks polished. SEO writers complain about it constantly because it flags edited human writing as AI.
Copyleaks is a more recent entrant that markets itself as more nuanced, but in practice produces similar results to the others — anything carefully written triggers it.
Winston AI claims very high accuracy on its homepage, but in independent tests it performs comparably to the others, just with slightly different bias patterns.
The takeaway: don't get attached to one detector's score as ground truth. If your text passes Forgely's detector but fails Turnitin's, that doesn't mean your text is AI — it means you've optimized for a different pattern profile than what Turnitin checks for. The whole space is wobbly.
Where AI detection is heading
Honestly? Probably nowhere good for the detection industry.
The fundamental problem is that AI text and human text are converging. AI is getting better at sounding human (more burstiness, more voice, more contractions), and humans who write a lot are picking up patterns from reading AI output. The clean, detectable line between them is blurring a little more every year.
Watermarking — embedding invisible signals in AI output that tell detectors "this came from us" — was supposed to solve this. So far it has not. Watermarks are easily stripped by paraphrasing, and the major AI providers have been slow to deploy them anyway.
What's likely to happen instead: detection will become probabilistic and contextual. Schools will rely less on "AI score" and more on holistic assessment, edit history, and assignment design. Content marketers will give up on "AI-free" promises and focus on quality. The detector arms race will continue, but most of the public will stop caring about scores.
For now though, scores still matter. If your essay needs to pass a detector, the answer isn't to write worse — it's to write with more rhythm, more voice, and fewer of the AI-pattern surface markers that detectors are tuned to catch. Which is, conveniently, also what makes writing better in general.
Tools like Forgely's free humanizer can do the heavy lifting for you, but the underlying skill — varying rhythm, choosing specific words, leaving voice on the page — will serve you whether or not you ever paste your text into another AI tool.
Check or fix your text in seconds
Forgely's AI Detector tells you exactly which sentences look AI-flagged. Forgely's Humanizer rewrites them. Both free, no signup.
Open Forgely →