The Turing Test is upon us as a measure of our reality. For seventy-five years, it has influenced the dreamers and developers of artificial intelligence (AI) (prior AI articles here). The breadth of Hollywood excursions into AI is incredibly diverse, and many of those films focus on AI becoming sentient. Those have been noted here in various posts, but a good summary of the best was published in Resident.
The test, according to Britannica, requires that
"a remote human interrogator, within a fixed time frame, must distinguish between a computer and a human subject based on their replies to various questions posed by the interrogator."
It is "an imitation game," that is played. Britannica acknowledges that "Turing sidestepped the debate about exactly how to define thinking." He dodged the issue of sentience and instead proposed a test of convincing. A machine need not think but must be sufficiently adept at convincing us it has. That is a critical distinction, but perhaps not for mechanics.
According to the Stanford Encyclopedia of Philosophy, the Turing Test:
"refer(s)to a proposal made by Turing (1950) as a way of dealing with the question whether machines can think."
The contentions in that article are challenging and illuminating. The roots of Turing's thought may extend back to Descartes (1637) and the Cartesian Géraud de Cordemoy (1668). Descartes suggested that we question our perceptions, beliefs, and conclusions. However, he was seemingly convinced of one universal truth: cogito ergo sum ("I think, therefore I am"). This is history, largely quoted, and it is perhaps the best or the worst revelation depending on your perspective (some love Descartes, others not so much).
Thus, the Turing Test is philosophical. It is subjective, dependent upon the thoughts and processes of the observer. It is consensus in its basest form, and yet it is championed as science. See Consensus in the Absence of Proof (January 2021); Tootsie Pops Make You Think (August 2021). I have never been a persistent fan of consensus.
In April 2025, the news broke that two AI Large Language Models (LLMs) had passed the Turing Test. See Avatars to Replace Lawyers (April 2025). From this subjective consensus, we are asked to conclude that they are sentient, that they "think" and perhaps "feel," that the AIs are indeed thinking beings. With that will come those who think they are our peers. See Rights for the Toaster (October 2024).
A word of caution, as noted by Futurism, is that the conclusions in this recent study are currently "awaiting peer review." That is perhaps critical. Peer review is itself a subjective consensus process in which various smart people are asked to review the conclusions of other smart people to see whether the second group agrees with the first.
That people are smart does not make the process science, per se. As noted in Consensus in the Absence of Proof (January 2021), consensus did not favor intellects like Copernicus and others.
Generally speaking, science is not consensus. It is hypothesis, test, measure, and replication. If a million people believe a dumb idea, it is still a dumb idea, despite the consensus. Indeed, science may yet enter the picture through replication of the recent tests that have made the news.
Nonetheless, any repetition is likely to have the same arguably fatal flaws of subjectivity and observer bias that are inherent in the Turing Test. Subjectivity and observer bias are ever-present as challenges in science, study, predictions, conclusions, and more.
In announcing the results of the (as yet) unreviewed testing, the Daily Mail proclaimed:
"Robots are now as intelligent as HUMANS"
That is fundamentally flawed because the test is not about intelligence but about sentience. A well-reasoned 2024 Built In article drew the distinction:
"Intelligence is about cognition and the ability to acquire and apply knowledge, while sentience relates to the capacity to feel and have subjective experiences."
In short, intelligence is not synonymous with sentience. Furthermore, sentience, intellect, and humanity come in degrees. Certainly, I am likely as sentient as Turing ever was, but it is unlikely I will ever be as intelligent.
In the recent testing, which "awaits peer review" (or some additional consensus bias), the foundation was observation of the AI LLM responses by
"126 undergraduate students from University of California San Diego and 158 people from online data pool Prolific."
Who are these people? What are their qualifications to observe, interact, engage, and evaluate? They are, undoubtedly, mere humans (as am I). Almost half of them are undergraduate students with potentially only a modicum of life experience. It is possible that they are all of average intellect and ability, but then again aren't we all?
In this admittedly subjective and consensus-driven evaluative process, who will be the judges of sentience? As the Turing Test is administered, will the interactions be evaluated by those with relevant occupational expertise?
In other words, might an AI LLM be more likely to seem indistinguishable from a "real person" if the evaluator lacks foundation? If the inquiries are about baseball and the evaluator has never played or watched a game, might the result depend more on form than on knowledge (how many touchdowns does it take to win a baseball game)?
If the subject matter is medicine, is it logical that the evaluators would be non-physicians?
If the subject matter is law, is it logical that the evaluators would be non-lawyers?
This list could go on for days, inquiring about engineering, accounting, management, acting, and virtually anything else. In any event, does it make sense that the evaluators of such technical topics would be undergraduate students?
Certainly, there are also people in each of these professions who are perhaps not the most knowledgeable, intellectual, or proficient. They may not be the best trained or the most engaged. Nonetheless, it is likely that the lowest-functioning expert in any field is still a bit more qualified there than a randomly selected undergraduate student.
What we gather from Turing's expansion on Descartes is important: exceptionally smart people often build on the work of other exceptionally smart people. It appears probable that Turing did so. AI LLMs will also likely do so.
Humorists have suggested we similarly need such tests for humans (think "Florida Man"). While that is humorous, it is also perhaps a little too close to home. We do not doubt a human's sentience when he or she makes untoward decisions, errors, and worse. Will similar failings cause us to doubt AI LLMs?
Nonetheless, the folks at Stanford suggest it is possible for entities to pass the Turing Test, and to
"use words (and, perhaps, to act) in just the kind of way that human beings do—and yet to be entirely lacking in intelligence, not possessed of a mind, etc."
Thus, the Turing Test is perhaps largely one of mimicry. We might similarly employ a group of observers to listen to birds. Could they distinguish between a mockingbird and the bird it imitates? If I am the observer, the answer is a hard "no." But what if the observer has a deep knowledge of birds, including mockingbirds? Then the answer is more likely, perhaps even probably, yes.
This will all be critical as human society debates how, where, when, and why we integrate AI, LLMs, robots, and more into our world. These are scary, exciting, and intriguing times.