Blind Test With Faculty Panel: 5 Human Essays vs 5 ChatGPT Essays — Could We Reliably Tell Before Checking Originality.ai?

We ran a small internal experiment last month.

Ten short argumentative essays.
Five written by actual undergraduates.
Five generated using ChatGPT from the same prompts.

We removed names, formatted everything identically, and asked six faculty members to label each as “Human” or “AI” before running anything through Originality.ai.

Before revealing results, we asked them to explain what they were looking for.

Common answers:

  • Overly balanced paragraph structure
  • Repetitive transition phrases
  • Safe, non-controversial claims
  • Lack of specific lived detail

After the blind vote, we checked the results.

Confidence was high. Accuracy wasn’t.

Two human essays were labeled AI by a majority.
One ChatGPT essay was confidently labeled human by four out of six reviewers.

When we later ran the essays through Originality.ai, one of the human essays also scored an unexpectedly high AI probability.

The takeaway wasn’t that detectors are useless. It was that both human intuition and detection software have real limitations.

The human vs AI writing comparison is getting harder, not easier.

Has anyone else run structured blind tests like this?
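
If anyone wants to try a similar tally, below is a minimal sketch of the scoring we did by hand: per-reviewer accuracy and the panel's majority label for each essay. Every essay ID and vote in it is a placeholder value, not our actual data.

```python
from collections import Counter

# Placeholder data only; our real labels and votes are not reproduced here.
essays = ["E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8", "E9", "E10"]
truth = ["human"] * 5 + ["ai"] * 5  # E1-E5 human, E6-E10 AI (illustrative ordering)

# One list of blind-vote labels per reviewer, aligned with `essays`.
votes = {
    "R1": ["human", "ai", "human", "human", "ai", "ai", "human", "ai", "ai", "ai"],
    "R2": ["ai", "ai", "human", "human", "human", "ai", "human", "ai", "ai", "human"],
    "R3": ["human", "ai", "ai", "human", "human", "ai", "human", "ai", "human", "ai"],
    "R4": ["human", "human", "ai", "human", "ai", "ai", "human", "ai", "ai", "ai"],
    "R5": ["ai", "ai", "human", "ai", "human", "human", "human", "ai", "ai", "ai"],
    "R6": ["human", "ai", "human", "human", "human", "ai", "human", "human", "ai", "ai"],
}

# Per-reviewer accuracy against the placeholder ground truth.
for reviewer, labels in votes.items():
    correct = sum(lab == t for lab, t in zip(labels, truth))
    print(f"{reviewer}: {correct}/{len(essays)} correct")

# Majority label per essay, flagging where the panel as a group gets it wrong.
for i, essay in enumerate(essays):
    tally = Counter(labels[i] for labels in votes.values())
    majority, _ = tally.most_common(1)[0]
    flag = "" if majority == truth[i] else "  <-- panel majority wrong"
    print(f"{essay}: truth={truth[i]}, majority={majority}, votes={dict(tally)}{flag}")
```

Nothing fancy, but putting the votes in a table like this made the gap between stated confidence and actual accuracy much harder to ignore.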

What stands out to me is the confidence gap.

People often believe they can “just tell” based on AI writing patterns. But perception isn’t proof.

As generative tools improve, the stylistic markers of AI text become less obvious, especially when the prompts are detailed.

The risk is overconfidence leading to premature accusations.

There’s also convergence happening.

Students are subconsciously adopting AI-like structure — clean symmetry, clear topic sentences, controlled tone — because that’s what high-scoring writing often looks like.

So when we perform a human vs AI writing comparison, we’re not comparing two distant categories anymore. The distributions overlap.

That makes detector limitations inevitable.

As a teacher, this is exactly why I avoid relying on a single signal.

If both humans and Originality.ai can misclassify texts, then no single metric should trigger disciplinary action.

Process matters. Prior writing samples matter. Context matters.

Blind tests like this are valuable because they expose our assumptions.

From an editorial perspective, I look for friction.

AI-generated essays often read smoothly from start to finish. Very little hesitation. Very few rough edges.

But I’ll admit — some of the strongest junior writers I’ve worked with also produce that kind of controlled prose.

Which makes me cautious about treating smoothness as evidence.