Hype Machines

AI Scams Are the Point

Propaganda and deceit are a feature of AI, not its downfall.

Why does it matter how we talk about artificial intelligence? Some, mainly tech firms and their useful idiots, maintain we are about to immanentize the eschaton (which translates roughly as: dissolve all of society’s problems). Others insist we are summoning a false god in the form of an artificial general intelligence that will destroy civilization. Those feelings of awe and terror aren’t particularly assuaged by the numbers: Tens of billions of dollars are raised each year by startups in this sector—incumbents hope to raise trillions more.

Welcome to the age of AI and its revolutionary potential. AI will make us more productive or unemployable, diversify or homogenize culture, alleviate human suffering with leisure or zero it out via extinction. Hype about predictive and generative AI, hysteria about the existential threat of artificial general intelligence, concern over the deployment of automated systems, the proliferation of Potemkin AI (human-powered products and services rebranded as AI-powered) and Habsburg AI (dysfunctional AI trained on AI-generated data), confusion over what artificial intelligence actually is and how it works—all of it builds as employers and governments and militaries buy it and integrate it without much concern for the welfare of their subjects.

AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference
by Arvind Narayanan and Sayash Kapoor
Princeton University Press, 360 pp., $24.95

The tech industry has made it difficult to discuss these developments critically, by training us to use the term AI to refer to wildly different systems: algorithms that determine how much to underpay Uber drivers while overcharging customers, that claim to predict crime but simply assign police to nonwhite neighborhoods, that falsely accuse migrants of welfare fraud, and that generate sentences one could pass off as human-made.

Arvind Narayanan and Sayash Kapoor open their new book, AI Snake Oil, with a thought experiment: Imagine a world where “vehicle” was the only word we used to refer to every mode of transportation. Debates about environmental impact, safety, cost, and so on would be confusing because we would be conflating bikes with spacecraft or trucks with buses. The field of “vehicle technology” would be rife with scams, propaganda, deceit, and an overwhelming mountain of bullshit. “Now replace the word ‘vehicle’ with ‘artificial intelligence,’ and we have a pretty good description of the world we live in,” Narayanan and Kapoor declare.

Narayanan and Kapoor, both Princeton University computer scientists, argue that if we knew what types of AI do and don’t exist—as well as what they can and can’t do—then we’d be that much better at spotting bullshit and unlocking the transformative potential of genuine innovations. Right now, we are surrounded by “AI snake oil” or “AI that does not and cannot work as advertised,” and it is making it nearly impossible to distinguish among hype, hysteria, ad copy, scams, and market consolidation. “Since AI refers to a vast array of technologies and applications,” Narayanan and Kapoor explain, “most people cannot yet fluently distinguish which types of AI are actually capable of functioning as promised and which types are simply snake oil.”

Narayanan and Kapoor’s efforts are clarifying, as are their attempts to deflate hype. They demystify the technical details behind what we call AI with ease, cutting against the deluge of corporate marketing from this sector. And yet, their goal of separating AI snake oil from AI that they consider promising, even idealistic, means that they don’t engage with some of the greatest problems this technology poses. To understand AI and the ways it might reshape society, we need to understand not just how and when it works, but who controls it and to what ends.

There are broadly three main types of AI in Narayanan and Kapoor’s taxonomy: prediction, generation, and content moderation. Predictive AI is used to inform decision-making by anticipating future events, though the pair convincingly document how it is fundamentally incapable of doing this despite widespread use across society. Generative AI is the object of the most recent wave of AI hype, capable of synthesizing and producing media. Content moderation AI refers not only to algorithms responsible for assessing social media platform policy violations, but also to those that personalize user feeds and experiences.

The book’s genesis lies in a lecture Narayanan gave at M.I.T. in 2019 on how hiring automation was undiluted AI snake oil. The algorithms used in hiring could be gamed through simple changes, such as padding a résumé with keywords, or “wearing a scarf or glasses” in a video interview. Narayanan would go on to co-teach Limits to Prediction, a course focused on, well, the limits of predictive AI systems like automated hiring. This is where he met Kapoor—a former Facebook software engineer who’d just joined Princeton’s computer science Ph.D. program. Kapoor asked every potential adviser, “What would you do if a tech company files a lawsuit against you?” before finding a satisfactory answer with Narayanan (“I would be glad if a company threatened to sue me for my research, because that means my work is having an impact”). So began the next four years of research.

Already, large swaths of society quietly use algorithms to automate decision-making by generating rules from patterns within past data—to make predictions about what will happen in the future. States use automated risk assessment tools to decide on granting bail. Countless welfare agencies use automated systems to make accusations of benefit fraud. Most of these models are pitched as accurate because they use sophisticated statistical techniques, as efficient because they can be trained on already existing data, and as cost-saving since they require no human oversight. In reality, they are none of these—to such a degree that developers will often “include fine print saying that they should always be used with human supervision” while knowing full well this will not happen.

Take automated risk assessment systems, which are often too narrow or too general to make reasonable assessments (trained on local data for nationwide use, or vice versa); the result is that countless people are erroneously marked as “high risk” and kept in jail for reasons that have little to do with evidence. Automated welfare systems can issue unappealable decisions that are often incorrect. Two examples the authors invoke: The prime minister of the Netherlands and his entire cabinet resigned in 2021 after 30,000 parents (predominantly Turkish, Moroccan, and Eastern European immigrants) were falsely accused of fraud; from 2013 to 2015, Michigan’s tax authority incorrectly collected $21 million, thanks to automated decision-making.

Predictive AI runs up against fundamental limits. Automated systems cannot account for the impact of their own decisions; they are trained on preexisting datasets generated by one population but applied to different ones; and, when kept low-cost (human-free), they often preserve or intensify discrimination—encouraging people to game the supposedly fair, objective, and accurate system.

The fixes one might immediately suggest—collecting more data, developing more innovative algorithms, and integrating humans for oversight and accountability—not only reveal that the point of adopting automated systems is moot (conceding that these systems cannot make accurate, cost-saving, and bias-free predictions) but likely won’t yield significant improvements. “Despite the widespread use of computational predictions in social settings, you might be surprised to learn that there’s remarkably little published evidence of their effectiveness,” Narayanan and Kapoor write. When it comes to human behavior, there are too many unknowns: There’s a practical limit to how much data we can collect; there is the possibility that even that much data may not be enough; and there is data we may never think to collect, or never be able to (such as data on the cumulative advantages or disadvantages of granting bail). Since none of this deters companies “from selling AI for making consequential decisions about people by predicting their future,” the authors insist we must resist “the AI snake oil that’s already in wide use today” instead of pining for better predictive AI technology.

Many of the limitations of generative AI are familiar. Generative AI chatbots simply “predict” the next word in a sequence using methods that require vast computational resources, data, and labor. While they cannot “think” or “understand” language, they can produce “internal representations of the world through their training process” that allow them to “learn” a language’s structure without encoding its grammar. Producing a poem, answering factual questions, beating a human at a game—all these performances are about learning patterns and intuiting rules, then remixing whatever is in the dataset to generate an output. When playing chess or generating a poem, this is relatively straightforward. When answering questions that concern factual claims, we quickly encounter “automated bullshit”: Recall for instance Google’s AI-generated advice earlier this year to eat one rock per day, or add a serving of glue to pizza.
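To see how mechanical this is, here is a toy sketch of “predict the next word” (my own illustration, not anything from the book, and a crude stand-in for the neural networks real chatbots use): it tallies which words follow which in a tiny training text, then samples from those tallies to generate new sentences. Nothing in the procedure knows or cares whether the output is true.

```python
# A toy next-word predictor: a bigram table built from a tiny "training corpus."
# Real generative AI replaces this lookup table with a neural network trained on
# vast amounts of text, but generation is still sampling likely continuations.
import random
from collections import defaultdict, Counter

corpus = (
    "the model predicts the next word "
    "the model remixes the training data "
    "the output sounds fluent whether or not it is true"
).split()

# Count which words follow which in the training text.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def generate(start, length=8):
    """Generate text by repeatedly sampling a plausible next word."""
    words = [start]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        nxt = random.choices(list(options), weights=list(options.values()))[0]
        words.append(nxt)
    return " ".join(words)

print(generate("the"))  # e.g., "the model remixes the training data the output sounds"
```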

Narayanan and Kapoor are clear-eyed about some risks generative AI may pose. Automated bullshit is one thing, but they’re also concerned about the swell of AI-generated audio, images, and videos (e.g., deepfakes, the vast majority of which are nonconsensual porn). They spend a chapter arguing that “reframing existing risks as AI risks would be a grave mistake,” because that framing paints this technology as all-powerful, leads critics to “overstate its capabilities and underemphasize its limitations,” empowers firms “who would prefer less scrutiny,” and ultimately obscures our ability to identify and root out AI snake oil.

For the authors, the greatest tangible risk posed by AI is “the labor exploitation that is at the core of the way it is built and deployed today.” AI relies on thousands of human workers to test the technology, which often means reviewing harmful and hateful content at high volume for exceptionally low pay. “The work is so immiserating,” the authors report, “that many data annotation firms have taken to recruiting prisoners, people in refugee camps, and people in collapsing economies.” Doubtful that the market will fix itself, the authors anticipate the need for a new labor movement:

The Industrial Revolution famously led to a decades-long period of horrific working conditions as the demand for labor shifted from farms to accident-prone mines and factories located in overcrowded, disease-ridden cities. The modern labor movement arose as a response to these conditions. Perhaps there are lessons to be learned from that history.

This is not a particularly detailed recommendation (nor a particularly practical one in the case of displaced persons and workers in countries where organizing is illegal). And, in other places, the authors are jarringly naïve. Early on, Narayanan and Kapoor—two Princeton researchers who presumably haven’t had to look for a job in years—remark that if you “object to AI in hiring” you can “simply choose not to apply for jobs that engage in AI to judge résumés.” Sure.

They also suggest there is an underlying idealism among AI researchers. Some of their evidence for this cautious optimism is ImageNet. Created in 2007 but publicly released in 2009, ImageNet was a large image dataset lifted from the web for the purpose of training machine learning models. In 2010, ImageNet launched a competition to see which AI model would best classify images. The winner in 2012 was AlexNet—a “deep learning” neural network that stacked more (hence deeper) layers to increase accuracy. Narayanan and Kapoor credit the ImageNet competition’s emphasis on benchmarking—training everyone’s models on the same data, ranking the results on a common withheld test set, and collectively tweaking the best models—with spawning a culture of openness that spread across the field:

If one company decided not to publish its methods, AI researchers would find it less appealing to work there, because they wanted their discoveries to contribute to human knowledge and not just to a company’s bottom line. This would put such a company at a competitive disadvantage. Today, this culture has changed to some extent as companies prioritize profits. The question of whether AI knowledge should be shared or hidden has become a major flash point in the community.

There is, again, a naïveté in this vision, which doesn’t even begin to offer a coherent explanation of why the major players today—OpenAI, Anthropic, Microsoft, and Google, to name a few—don’t share research except when it advances corporate P.R. or profit margins. Such credulousness generates snake oil of its own—obscuring a material analysis focused on the distribution of resources, the structure of labor and capital markets, power dynamics, political economy, and state policy, and omitting the crucial question: Why does our “artificial intelligence” field look the way that it does?

In AI Snake Oil, you have to squint to find evidence of the military’s intimate involvement in the artificial intelligence field. The first part of Narayanan and Kapoor’s history begins in 1943 with the publication of a simplified mathematical model of a neuron. In the late 1950s, Frank Rosenblatt and his team created the Perceptron, the artificial equivalent of a neuron deployed for pattern-matching. This connectionist approach sought to replicate human cognition via networks of simulated neurons but was sharply constrained by its intense demands on computational resources—neural network research stalled in the 1970s, and interest shifted to a symbolic approach, in which logic rules dictate how computers manipulate symbols. A revival came in the 1980s, when researchers discovered that “deep,” or multilayered, neural networks could be trained (via the technique now known as backpropagation) in ways that circumvented some of those computational demands. Interest waned in the 1990s and shifted to support vector machines—more computationally efficient and amenable to cheaper hardware—until 2012, when ImageNet helped kick off interest in neural networks and deep learning again.

From this, Narayanan and Kapoor declare a clear pattern: Research is driven by hype as the AI community gets excited over a specific approach. This excitement creates “a feedback loop” where “researchers and funders influence each other to propel work in that area forward.” Narayanan and Kapoor also single out peer reviewers as a major influence, insisting their skepticism disincentivizes approaches that have fallen out of favor. The result? As if guided by an invisible hand, “the field often moves in sync toward the approach of the moment, almost entirely abandoning earlier research programs.”

The second part of their history centers on hype, rejecting the “Gartner hype cycle”—which imagines hype as a technology’s initial expectations being inflated, crashing, then ticking back up to a more realistic plateau as mainstream adoption is reached. Picking up from Rosenblatt, Narayanan and Kapoor point to a 1969 book by M.I.T. researchers Marvin Minsky and Seymour Papert highlighting the limitations of the Perceptron. A 1972 report by mathematician James Lighthill, commissioned by the U.K. government, cast much AI research progress as “illusory.” Together, these helped hasten the first “AI winter,” which saw funding dry up. In the 1980s, a thaw came with an “AI spring,” thanks to the rise of “expert systems” that encoded rules and data based on expert decision-making. The excitement was short-lived, and another winter came by the end of the decade. This cycle continued until 2012, thanks again to ImageNet. Here Narayanan and Kapoor observe a second major pattern:

In the short term, hype can attract massive investment and lead to intense growth. But this hype also sets a high bar for real-world impact. AI winters result when the usefulness of AI applications doesn’t live up to the hype.

These principles of hype—AI researchers build hype, funders try to seize upon it, and everyone is disappointed by it—are interesting to consider, but is this the full picture? Does it all boil down to easily excitable researchers and easily disappointed funders? Is it true that the former have been concerned with the public good but frustrated by peer reviewers and corporate funders? Not quite.

While Rosenblatt proposed the Perceptron program in 1957, the first Perceptron machine (MARK I Perceptron) was actually built with funding from the U.S. Navy in 1958. The MARK I Perceptron was then used by the Central Intelligence Agency from 1960 to 1964 to identify military targets, with a series of reports issued in 1965 on the feasibility of automated target recognition systems. There is one oblique mention of all this in the book, found in the second part of Narayanan and Kapoor’s history and presented as a warning about leakage (when AI models are tested on training data): We learn of “an apocryphal story from the early days of computer vision” where “a classifier was trained to discriminate between images of Russian and American tanks with seemingly high accuracy,” but it turns out Soviet tanks had been photographed on a cloudy day, the American tanks on a clear day, and the classifier was detecting brightness.
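For readers who want to see that failure mode rather than take it on faith, here is a small illustrative sketch (my own, with synthetic data and scikit-learn as stand-ins, not code from the book): a classifier trained on “photos” where the lighting happens to track the label looks nearly perfect, then collapses the moment the lighting flips.

```python
# A toy version of the tank story: the classifier latches onto a spurious
# feature (brightness) rather than anything about the objects themselves.
# Assumes numpy and scikit-learn; all data here is made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_photos(n, label, sunny):
    """Fake 'photos': one brightness value plus noise columns carrying no signal."""
    brightness = rng.normal(0.8 if sunny else 0.2, 0.05, size=(n, 1))
    noise = rng.normal(0.0, 1.0, size=(n, 8))  # stands in for actual image content
    return np.hstack([brightness, noise]), np.full(n, label)

# Training mirrors the apocryphal story: American tanks (label 1) shot on clear
# days, Soviet tanks (label 0) on overcast days, so brightness predicts the label.
X_us, y_us = make_photos(500, label=1, sunny=True)
X_ru, y_ru = make_photos(500, label=0, sunny=False)
X_train = np.vstack([X_us, X_ru])
y_train = np.concatenate([y_us, y_ru])
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("accuracy with the same lighting bias:", clf.score(X_train, y_train))  # ~1.0

# Flip the lighting and the "tank detector" falls apart.
X_us2, y_us2 = make_photos(500, label=1, sunny=False)
X_ru2, y_ru2 = make_photos(500, label=0, sunny=True)
X_test = np.vstack([X_us2, X_ru2])
y_test = np.concatenate([y_us2, y_ru2])
print("accuracy when the lighting flips:", clf.score(X_test, y_test))  # ~0.0
```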

The military’s role is thoroughly neglected in this book. AI Snake Oil does not mention the RAND Corporation, an indispensable appendage of the Cold War military-industrial complex that played a central role in hosting, funding, and connecting key pioneers of the AI research field. Early forays into simulation, modeling of human psychology, and problem solving in the field were conducted at RAND by RAND employees, alumni, and affiliated outfits. The Atlantic once called RAND “the paramilitary academy of United States strategic thinking,” due to its penchant for fusing military and civilian personnel, assets, and ideas—in line with Dwight D. Eisenhower’s call to subordinate civilian scientific research to grand strategy. Nor does AI Snake Oil mention the Defense Advanced Research Projects Agency, or DARPA—save for one line about the role it played in funding the early internet. We get no mention of how DARPA funded the creation of major AI research labs at U.S. universities—again, as envisioned by America’s postwar plan to merge civilian and military research. One could go on like this, but the problem is clear: Though they set out to deconstruct contemporary myths about artificial intelligence, Narayanan and Kapoor present their own mythology, one that shows the industry in a generous light and ignores institutional features born of its origins and historical development.

If you ignore the role of the military, the rise and fall and rise of neural networks is a story about hype cycles and stodgy peer reviewers and noble researchers and recently greedy companies. But if you think for a second about the role of our postwar military at home and abroad, it makes sense that our artificial intelligence research keeps looping back to surveillance, data extraction, prediction, pattern matching, and social control. Coming out of the Cold War’s military-industrial complex, is it any surprise that funders (namely venture capitalists and large technology firms) are designing and pursuing automated systems that prioritize such odious ends?

This depoliticized history leaves us with relatively meager proposals from Narayanan and Kapoor. Take their proposal to focus on curtailing demand for, rather than the supply of, AI snake oil. Yes, the dealers include firms eager to sell predictive AI, researchers who want flashy results, and sensationalist journalists (or public figures). Demand for AI snake oil, however, is about “misguided incentives in the failing institutions that adopt” it.

What do hiring managers, media organizations, educational institutions, and teachers all have in common in this telling? They represent “institutions that are underfunded or cannot effectively perform their roles,” write Narayanan and Kapoor. Those of us within these broken institutions should oppose “harmful predictive AI” by organizing within our workplaces or communities, as well as by looking to alternative decision-making systems that embrace randomness—like partial lotteries for hiring, grant funding, and college admissions. Regulation should be supported to advance responsible innovation, while thwarting regulatory capture (defined as when regulators are “either misinformed or lack the resources and funding to function independently of the companies they are regulating”).

But what about institutions where this dysfunction serves a purpose, or where the institution is working as intended?

Take the following scenario. Universities have been defunded over the years as part of a larger project by the conservative movement to consolidate political, social, economic, and cultural power—in the meantime, our schools have become financial institutions with real estate portfolios, hospital complexes, and a small academic outfit joined at the hip. Reversing their demand for AI will likely require a transformative vision beyond lottery admissions and collective bargaining.

Or consider this: Police departments, border authorities, and military forces demand certain AI because these institutions work as intended. Police departments are some of the most well-financed, perpetually expansive, and unaccountable institutions in this country, representing one appendage of a system that plunders for profit at a variety of levels. Predictive AI, which reinforces priors and naturalizes bias and eschews accountability and transmutes bullshit into gold, is a perfect fit. Border authorities and military forces are concerned with force projection, surveillance, deterrence, and social control—why wouldn’t they want artificial intelligence built along those lines, regardless of whether it works or not?

When institutions are using AI as intended, or when institutions have undergone a seismic reformation that makes them more aligned with using AI in a certain way, what good does a demand-side proposal do? This boils down to the mode of tepid analysis that passed for tech criticism a decade ago: If only the consumer knew the truth and executives read the right books, then they could change their beliefs and get back to innovating the future!

This is the height of ambition possible in AI Snake Oil: vague calls for greater antitrust enforcement, a new labor movement, and some new regulation (but not too much!)—so that our consumption and innovation can move along more smoothly.

A decade ago, writing in The Baffler, Evgeny Morozov pointed out the fundamental limits of technology criticism when removed from a real political project. Relatively little has changed:

Thus, technology critics of the romantic and conservative strands can certainly tell us how to design a more humane smart energy meter. But to decide whether smart energy meters are an appropriate response to climate change is not in their remit. Why design them humanely if we shouldn’t design them at all? That question can be answered only by those critics who haven’t yet lost the ability to think in nonmarket and non-statist terms. Technological expertise, in other words, is mostly peripheral to answering this question.

Tech critics offered visions of informed consumers and enlightened executives working together to avoid institutional critiques and political debates—an elaborate dance that impoverished our capacity to imagine alternative arrangements for the funding, development, and deployment of technology. Narayanan and Kapoor offer a vision of informed consumers and enlightened executives working together to field some institutional critiques and entertain some political debates. They cannot entertain more substantive fixes because they don’t seem to be aware of more substantive problems. AI Snake Oil has relatively little to say on what we should do about AI besides make it work in vaguely more sustainable and responsible ways.

If you think, however, that the problem of artificial intelligence—and technology under capitalism more generally—is that it has been beholden to the military-industrial complex from the start and serves to concentrate power, a few questions make a good start: How should we develop artificial intelligence, how should it be funded, how should computational resources be distributed, how should various technologies and the means of producing them be owned? Narayanan and Kapoor seem to believe that the main problem is hype. Whether they realize it or not, that’s a form of snake oil, too.