The Paradox of the False Positive

Back in February, the plagiarism detection service Turnitin announced that it had developed an AI writing detector, and word on the street is that Turnitin will enable this new tool for campuses on April 4th. The purpose of this new tool is to analyze writing submitted by students and determine whether it's likely the writing was actually written by an AI text generation tool like ChatGPT. For instructors or institutions that have decided that student use of AI text generation tools counts as plagiarism or unauthorized aid or some other violation of academic integrity policies, Turnitin's new tool ostensibly provides a way to detect such violations.

Turnitin says that the new AI writing detector tool "focuses on accuracy--if we say there's AI writing, we're very sure there is." See this video with David Adamson, an AI scientist with Turnitin, for some details on their process. Turnitin reports a 1% false positive rate for the tool, which means that if you submit 100 samples of actual student writing, on average one of those samples will be (falsely) flagged as likely written by AI. That sounds like a pretty accurate tool, but please allow me to do a little math here in the newsletter and introduce you to the paradox of the false positive.

I'm not going to ask ChatGPT to explain the paradox of the false positive to you, although I bet it could do a pretty good job. Instead, let me quote from the young adult novel Little Brother by Cory Doctorow. This is the novel I ask my cryptography students to annotate collaboratively using Perusall, and I'm grateful to Doctorow for releasing the novel under a Creative Commons license so I can easily do so. Here's how he introduces the paradox of the false positive in Little Brother via his very opinionated narrator Marcus:

#####

If you ever decide to do something as stupid as build an automatic terrorism detector, here's a math lesson you need to learn first. It's called "the paradox of the false positive," and it's a doozy.
Say you have a new disease, called Super-AIDS. Only one in a million people gets Super-AIDS. You develop a test for Super-AIDS that's 99 percent accurate. I mean, 99 percent of the time, it gives the correct result -- true if the subject is infected, and false if the subject is healthy. You give the test to a million people.
One in a million people have Super-AIDS. One in a hundred people that you test will generate a "false positive" -- the test will say he has Super-AIDS even though he doesn't. That's what "99 percent accurate" means: one percent wrong.
What's one percent of one million?
1,000,000/100 = 10,000
One in a million people has Super-AIDS. If you test a million random people, you'll probably only find one case of real Super-AIDS. But your test won't identify one person as having Super-AIDS. It will identify 10,000 people as having it. Your 99 percent accurate test will perform with 99.99 percent inaccuracy.
That's the paradox of the false positive. When you try to find something really rare, your test's accuracy has to match the rarity of the thing you're looking for.

#####

Doctorow goes on to apply this notion to the challenge of detecting terrorists. If the vast majority of people in a given population aren't terrorists, then even if your hypothetical terrorist detector is very accurate, you'll end up with lots of false positives. And what might those false positives lead to? You can probably imagine, but I'll share the innocuous end of the spectrum: My friend Dave Stewart used to have a lot of problems getting on airplanes because some other guy named Dave Stewart was on a no-fly list somewhere. Of course, the other end of the spectrum involves the nastier parts of the United States legal system.

What does the paradox of the false positive imply for the use of AI writing detection tools like the new one from Turnitin? Let's suppose for a minute that you have three hundred student assignments you'll be grading this month, and let's further suppose that none of your students use ChatGPT or any other AI writing tool when writing these assignments. That 1% false positive rate from Turnitin's new tool means that about three of those three hundred assignments will be flagged as likely AI-written.
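If you'd like to check that arithmetic yourself, here's a small back-of-the-envelope sketch in Python. The 1% false positive rate is the only number taken from Turnitin's claim; the 300 assignments and the assumption that flags on honest papers are independent are just for illustration.

```python
# Back-of-the-envelope: how many honest papers get flagged?
# Uses Turnitin's stated 1% false positive rate and treats each of the
# 300 honest submissions as an independent chance of a false flag.

false_positive_rate = 0.01
num_assignments = 300

expected_false_flags = false_positive_rate * num_assignments
print(f"Expected false flags: {expected_false_flags:.0f}")  # about 3

# Probability that at least one honest paper gets flagged
p_at_least_one = 1 - (1 - false_positive_rate) ** num_assignments
print(f"Chance of at least one false flag: {p_at_least_one:.0%}")  # roughly 95%
```

In other words, with a class set of that size, it's not just that you'll average about three false flags; it's nearly certain you'll get at least one.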

That's three assignments that you might feel compelled to investigate. How exactly will you go about confirming or denying Turnitin's assessment of the writing in question? Will you call those students into your office and share the Turnitin report with them? Will you ask them about their process for completing the assignment? Will you require them to somehow prove that they wrote the assignment without AI help? This gets messy pretty fast, and it's not likely to be easy for you or the innocent students.

To be fair, I am not someone who thinks Turnitin's tools should be rejected out of hand as unwarranted surveillance of students and their academic work. When teaching my writing seminar, I would regularly use Turnitin's "originality report" with students to help them understand better and worse ways of working with their sources. If the report flags a few sentences in a student's paper as matching some other source, that's a great opportunity to talk with that student about their decisions and methods for quoting or summarizing sources. (I adopted this approach from Vanderbilt chemistry instructor Michelle Sulikowski, whom I interviewed for a podcast episode way way back in 2008.) It's hard for me to imagine how an AI writing detection report from Turnitin could be used in a similarly productive way.

The driver behind the paradox of the false positive is the ratio of students not using AI tools to students using AI tools. If that ratio is very high, that is, if most students aren't using these tools, then any false positive rate, even one as low as 1%, becomes problematic. The higher the ratio, the more problematic the false positive rate becomes. What percent of a given population of students are likely to use AI writing tools when they have been told not to do so? I don't know, but that's the probability that educators should be attending to and working to influence.
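To make that ratio concrete, here's a short sketch of the Bayes-style calculation behind it. The 1% false positive rate comes from Turnitin's claim; the detection rate (how often the tool catches writing that really is AI-generated) is not something Turnitin has published, so the 98% figure below is purely a placeholder assumption for illustration.

```python
# What fraction of flagged assignments were actually AI-written?
# false_positive_rate is Turnitin's stated figure; detection_rate is an
# assumed placeholder, since Turnitin hasn't published one.

false_positive_rate = 0.01
detection_rate = 0.98  # assumption for illustration only

def share_of_flags_that_are_real(ai_use_rate):
    """Probability a flagged paper was actually AI-written (Bayes' rule)."""
    true_flags = ai_use_rate * detection_rate
    false_flags = (1 - ai_use_rate) * false_positive_rate
    return true_flags / (true_flags + false_flags)

for ai_use_rate in (0.001, 0.01, 0.10):
    print(f"If {ai_use_rate:.1%} of students use AI: "
          f"{share_of_flags_that_are_real(ai_use_rate):.0%} of flags are real")
```

Under those hypothetical numbers, if only one student in a hundred is using AI against the rules, roughly half of all flags point at innocent students; the flags only start to look trustworthy when misuse is widespread.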

Around the Web

This is the part of the newsletter where I link to things that I find interesting in the hopes that you do, too.

  • Vanderbilt CFT Teaching Guides - Did you know that the entire collection of teaching guides written by the Vanderbilt Center for Teaching (CFT) has been shared with a Creative Commons attribution-noncommercial license? That means you can copy and paste them into your own website, as long as you provide attribution and don't try to sell them. These guides were written over the years by numerous CFT staff and graduate fellows, and they cover a range of topics, including active learning, accessible learning environments, digital timelines, teaching outside the classroom, and much more.
  • Citing Generative AI in MLA Style - Here in the newsletter, I've raised the question before: How do you go about citing your use of an AI text generation tool like ChatGPT? Well, the Modern Language Association (MLA) has now issued guidance for doing so as part of the MLA style guide. The citation recommendations are a little clunky, but that was probably unavoidable. Notably, they don't recommend listing ChatGPT as an author, but they do recommend including the prompt used with ChatGPT. Also in the MLA guidelines are recommendations for citing AI-generated images, like those created by DALL-E.
  • What Students Want (and Don't) from Their Professors - This Inside Higher Ed piece by Colleen Flaherty reports on a recent student survey conducted by IHE and College Pulse. "More than half of respondents to the recent Inside Higher Ed/College Pulse survey of 3,004 students at 128 four- and two-year institutions say teaching style has made it hard to succeed in a class since starting college." Putting aside the language about learning styles in the report (problematic since 2010!), I'm skeptical that all students know what kinds of teaching they need in order to learn and succeed. I'm thinking about that Harvard study (which I recapped here) showing students felt that traditional lecturing worked better for them when in fact the active learning instruction in the experiment was more effective.

Thanks for reading!

If you found this newsletter useful, please forward it to a colleague who might like it! If you have thoughts on any of the topics above, please reach out via email, Twitter, Mastodon, or LinkedIn.

Consider supporting Intentional Teaching through Patreon. For just $3 US per month, you get access to bonus podcast content, Patreon-only teaching resources (like a guide to facilitating class sessions on Zoom), an archive of past newsletters, and a community of intentional educators. Patreon supporters also get 20% off my book, Intentional Tech: Principles to Guide the Use of Educational Technology in College Teaching, when they order from West Virginia University Press.

Intentional Teaching with Derek Bruff

Welcome to the Intentional Teaching newsletter! I'm Derek Bruff, educator and author. The name of this newsletter is a reminder that we should be intentional in how we teach, but also in how we develop as teachers over time. I hope this newsletter will be a valuable part of your professional development as an educator.
