Why does my writing get flagged as AI? I ran a test and the results didn't make sense

okay so i tried to actually understand this and the results are making me more confused, not less.

background: i’m a CS student and i write technical docs, readme files, and the occasional lab report. i’ve been flagged twice now on written assignments and i wanted to figure out what the pattern was.

i ran a small test. took four pieces of writing and ran each through the same detector that flagged me: one README i wrote myself, one lab discussion section i wrote myself, one paragraph i copied directly from our course textbook, and one paragraph i actually generated with a chatbot as a control.

results were not what i expected. the textbook paragraph scored highest (most likely to be AI, according to the tool). my README scored second highest. my actual AI-generated paragraph scored third. my lab discussion scored lowest.

so the tool flagged a textbook paragraph as more AI than an AI-generated paragraph.

i know why this is probably happening technically: these detectors largely score how predictable your text looks to a language model, and the textbook is formal, precise, structured, low perplexity. same reason my README scores high: technical writing is terse by design. but knowing why it happens doesn't make the situation less absurd.
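if anyone wants to poke at this themselves, here's a rough sketch of measuring that predictability locally. to be clear about the assumptions: i have no idea what model the actual detector runs, GPT-2 is just an open stand-in, and the file names are placeholders for my four samples. the point is only that "predictable to a language model" is a measurable number, and formal prose comes out very predictable.

```python
# Sketch: compare perplexity across texts using GPT-2 as an open proxy.
# GPT-2 is NOT whatever model a commercial detector uses -- this only
# demonstrates that formal/technical prose tends to score as predictable.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Mean cross-entropy of the token sequence, exponentiated.
    # Lower = the model found the text more predictable.
    enc = tokenizer(text, return_tensors="pt",
                    truncation=True, max_length=1024)
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()

# Placeholder file names -- substitute your own four samples.
samples = {
    "readme":          open("readme_excerpt.txt").read(),
    "lab_discussion":  open("lab_discussion.txt").read(),
    "textbook":        open("textbook_paragraph.txt").read(),
    "chatbot_control": open("chatbot_paragraph.txt").read(),
}
for name, text in samples.items():
    print(f"{name}: {perplexity(text):.1f}")
```

on my reading of the results, the ordering of these numbers should roughly track the detector's ordering, which is the whole problem.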

my lab reports are getting flagged not because i used AI but because i write technical documentation for fun and that’s apparently the same register. is there any way to work with this that doesn’t involve writing worse on purpose?

y’all the textbook-scoring-highest result is genuinely the most useful piece of evidence i’ve seen for explaining this problem to non-technical people. you can hand that result to an administrator and say “this tool flagged an assigned textbook as AI-generated” and there’s no good response to that.

the technical explanation – low perplexity in formal edited prose – is correct. it’s also something tool providers don’t advertise. they benchmark accuracy on corpora that don’t include professional technical writing because those corpora are inconvenient for their claims.

I come from an SEO background, where we deal with content quality signals a lot, and the pattern you're describing makes sense if you think about what "low perplexity" actually means in practice. Technical and formal writing converges toward specific vocabulary and sentence structures because precision requires it. The model reads that convergence as machine-like because machines also converge toward specific patterns, just for different reasons.
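There's a second signal worth knowing about: some detectors reportedly look not just at average predictability but at its spread across sentences, sometimes called burstiness. A quick way to see it numerically, reusing the perplexity() helper from the sketch upthread (same caveat applies: GPT-2 is a stand-in, not the detector's actual model):

```python
# Sketch only: per-sentence spread of predictability ("burstiness").
# Depends on the perplexity() function defined in the earlier snippet.
# The naive split on "." is a deliberate simplification.
import statistics

def burstiness(text: str) -> float:
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    ppls = [perplexity(s) for s in sentences]
    # Low mean AND low spread is the profile that formal prose and
    # machine output share; informal human writing usually varies more.
    return statistics.stdev(ppls) if len(ppls) > 1 else 0.0
```

Formal technical prose tends to come out low on both numbers, which is exactly the machine-like signature.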

The mismatch between what the tool is measuring and what it claims to be measuring is biggest in exactly the writing types that are most consistently formal by design.

This is more complicated than it looks, and your test result is exactly the kind of empirical evidence that should be in front of every academic integrity committee using these tools. It’s not theoretical. It’s a demonstration that the tool cannot distinguish between an assigned course textbook and AI output.

If I were in your position and faced a formal process, I would lead with this result. Not defensively – factually. “Here is what the tool does. Here is evidence of it doing that.”

to answer your actual question: the most effective approach i’ve seen for technical writers who keep getting flagged is adding more explicit process narration to formal submissions – not dumbing down the technical content but adding sentences that describe your choices, your uncertainty, your reasoning. those sentences add the personal voice and variability markers that formal technical writing strips out.

it shouldn’t be necessary. but it’s more effective than trying to change the writing register in ways that compromise the actual communication.

the process narration suggestion is actually useful and i’m going to try it. adding a sentence or two about why i made a specific technical choice would also probably make the reports better pedagogically, which is maybe the point.

still think the underlying situation is absurd. but “write better reports that show your thinking” is a more actionable takeaway than “fix the broken tool.”