AI art detectors: tested five of them on the same 50 images

Appartement · April 28, 2026, 12:25am

For a class project I put together a test set of 50 images: 25 AI-generated across different models and styles, 25 human-created across photography, illustration, and digital art. I ran all 50 through five different AI art detectors and recorded the results. Here’s what the data showed.

Overall accuracy: the five tools ranged from 61% to 79% correct classification. That’s meaningfully better than chance (50% on this balanced set) but not reliable enough to use as a binary classifier. Even the best-performing tool was wrong 21% of the time.

Agreement between tools: the five tools agreed on 58% of the images. For the remaining 42%, at least one tool diverged from the majority classification. On some images, tools were evenly split.
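For anyone who wants to compute the same two numbers on their own set, here is a minimal sketch. It assumes "agreed" means all tools gave the same label, and it uses toy data with hypothetical tool names, not the results from this test.

```python
def accuracy(preds, labels):
    """Fraction of images a single tool classified correctly."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def unanimous_rate(preds_by_tool):
    """Fraction of images on which every tool gave the same label."""
    per_image = list(zip(*preds_by_tool.values()))
    return sum(len(set(votes)) == 1 for votes in per_image) / len(per_image)


# Toy data for illustration only: 4 images, 3 hypothetical tools.
labels = ["ai", "ai", "human", "human"]
preds_by_tool = {
    "tool_a": ["ai", "ai", "human", "ai"],
    "tool_b": ["ai", "human", "human", "ai"],
    "tool_c": ["ai", "ai", "human", "human"],
}

for name, preds in preds_by_tool.items():
    print(name, accuracy(preds, labels))
print("unanimous:", unanimous_rate(preds_by_tool))
```

Unanimous agreement says nothing about correctness: tools can all agree and all be wrong, so it is worth reporting alongside per-tool accuracy rather than instead of it.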

Style effects: all tools performed substantially better on photorealistic AI images than on stylized or illustrative content. For the photography subset, average accuracy was around 82%. For stylized digital art, it dropped to 67%. Human-created digital art in styles that overlap with common AI aesthetics was the hardest category for every tool.

False positive rate: 18% of human-created images were flagged as AI by at least one tool. 11% were flagged by a majority of tools. For images in styles that AI generators commonly produce, that rate went up significantly.
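A rough sketch of how the two false positive figures can be computed, assuming one list of "ai"/"human" predictions per tool and treating "majority" as strictly more than half of the tools. The data and tool names below are made up for illustration.

```python
def false_positive_rates(preds_by_tool, labels):
    """Return (flagged by at least one tool, flagged by a majority) over human images."""
    tools = list(preds_by_tool.values())
    human_idx = [i for i, y in enumerate(labels) if y == "human"]
    flagged_any = 0
    flagged_majority = 0
    for i in human_idx:
        ai_votes = sum(preds[i] == "ai" for preds in tools)
        if ai_votes >= 1:
            flagged_any += 1
        if ai_votes > len(tools) / 2:
            flagged_majority += 1
    n = len(human_idx)
    return flagged_any / n, flagged_majority / n


# Toy data for illustration only: 4 human images, 1 AI image, 3 hypothetical tools.
labels = ["human", "human", "human", "human", "ai"]
preds_by_tool = {
    "tool_a": ["ai", "human", "human", "human", "ai"],
    "tool_b": ["ai", "ai", "human", "human", "ai"],
    "tool_c": ["human", "human", "human", "human", "human"],
}

any_rate, majority_rate = false_positive_rates(preds_by_tool, labels)
print(f"at least one tool: {any_rate:.0%}, majority: {majority_rate:.0%}")
```

The "at least one tool" rate can only grow as more tools are added, so it is the stricter number to watch if a single flag is enough to trigger a review.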

My takeaway: these tools are useful for generating hypotheses, not conclusions. A classification from any single tool should be treated as ‘worth looking at more carefully,’ not ‘confirmed AI.’ Anyone using them to make consequential decisions about authorship needs to understand that the error rates are high enough to matter.

Happy to share the full methodology if anyone wants to replicate.

tallest.tower.2016 · April 28, 2026, 10:55am

This is methodologically sound and the results are consistent with what I’ve seen in text detection. The false positive rate on human-created work in AI-adjacent styles is the finding that institutions should be paying attention to. Any policy that relies on these tools as determinative has to account for an error rate in the 11-18% range on plausible human work.

CheezyPeezey · April 29, 2026, 4:40pm

What most teams miss here is that the 79% accuracy ceiling on the best tool means you’re looking at roughly one wrong classification in five under optimal conditions. At the scale most content operations run, that’s a meaningful error volume. The tools are useful as one input into a workflow. They’re not useful as a final answer.
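To put a number on the error volume, a back-of-the-envelope sketch. The 10,000 items per day is a hypothetical figure, and the calculation assumes the 79% accuracy from the 50-image test carries over to production content, which it may not.

```python
def expected_errors(n_items, accuracy):
    """Expected number of misclassified items at a given accuracy."""
    return n_items * (1 - accuracy)


# Hypothetical volume; 0.79 is the best tool's accuracy from the original post.
daily_items = 10_000
print(round(expected_errors(daily_items, 0.79)))
```

At that volume the best tool would be expected to misclassify about 2,100 items a day, which is why it only works as one input into a workflow.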

strawtin · April 30, 2026, 9:55am

The ‘generates hypotheses not conclusions’ framing is exactly right and it applies to text detection tools too. The confusion happens when people treat a probability estimate as a verdict. Your data makes the gap between those two things very concrete.

AlfioRo88 · April 30, 2026, 12:10pm

The stylized art result tracks with what I’d expect from a creative standpoint. Certain aesthetics have been colonized by AI generation to the point where human artists working in those styles are going to get flagged systematically. That’s a real harm to human creators in specific communities and it’s not being discussed enough.

penhillcurrant · April 30, 2026, 12:25pm

The 42% disagreement between tools on individual images is striking. That’s not a small margin. In my editorial context, an instrument whose five readings disagreed on 42% of cases would not be considered usable. Worth being explicit about what that number means for anyone using these tools to make decisions.
