Podcast

The Great A.I. Hallucination

Is the hype (and doomsaying) around generative A.I. programs like ChatGPT obscuring what the technology actually does—and its genuine limitations and dangers?

Illustration by Dalbert Vilarino

Tech futurists have been saying for decades that artificial intelligence will transform the way we live. In some ways, it already has: Think autocorrect, Siri, facial recognition. But ChatGPT and other generative A.I. models are also prone to getting things wrong—and whether the programs will improve with time is not altogether clear. So what purpose, exactly, does this iteration of A.I. actually serve, how is it likely to be adopted, and who stands to benefit (or suffer) from it? On episode 67 of The Politics of Everything, hosts Laura Marsh and Alex Pareene talk with Washington Post reporter Will Oremus about a troubling tale of A.I. fabulism; with science fiction author Ted Chiang about the ramifications of an A.I.-polluted internet; and with linguist Emily M. Bender about what large language models can and cannot do—and whether we’re asking the right questions about this technology.

ABC News clip: This will be the greatest technology mankind has yet developed.

Alex Pareene: That’s Sam Altman, the CEO of OpenAI, claiming a space for his company’s chatbot at the top of the list of humanity’s great achievements, ahead of the microprocessor, the printing press, and, I guess, the wheel. He’s not exactly an impartial observer, but he’s also not alone in his assessment of the technology’s significance.

Laura Marsh: The release of a new class of generative A.I. programs like ChatGPT set off a cycle of both doomsaying and hype.

Alex: In The New York Times, Kevin Roose described how a conversation with ChatGPT gave him a foreboding feeling that A.I. had crossed a threshold and that the world would never be the same.

Laura: Both the hype and the warning sound straight out of science fiction, which has practically trained us to expect the eventual development of a truly thinking machine. Every chat with Bing is colored by our familiarity with these stories about superintelligent A.I.

Alex: But what about an A.I. that just gets stuff wrong and makes things up? What if instead of hijacking a starship or dominating a planet, a sentient computer just gave bad advice for fixing the warp drive, or invented a fictitious harassment accusation about the captain?

Laura: Today on the show, we are talking about ChatGPT’s strange imagination and what it tells us about the real limits and dangers of so-called artificial intelligence. I’m Laura Marsh.

Alex: And I’m Alex Pareene.

Laura: This is The Politics of Everything.

Laura: The new generation of language bots has an eerie range of abilities. They can generate snippets of conversation, write restaurant reviews and little essays on literature and history, even compose poems with complex rhyme schemes. But these bots also have a tendency to simply make things up. As more stories surface of outlandish falsehoods generated by A.I., what happens when it’s nearly impossible to tell the difference between truth and lies? Will Oremus reports on technology for The Washington Post and recently wrote an article about what happened during a research study on A.I., when a law professor asked ChatGPT to generate a list of lawyers who had sexually harassed someone. Will, thanks for coming on the show.

Will Oremus: Thanks for having me on.

Laura: So can you talk us through what happened next in this research study?

Will: So ChatGPT spits out this list of lawyers, and it not only gives names, but it gives citations, which is really helpful. You can look up the stories about when they were accused of sexually harassing people. And the lawyer who gets the list is looking through it and he recognizes one of the names: Jonathan Turley. He’s a pretty prominent lawyer. The guy who was looking it up, Volokh, says, “Well, that’s odd. I don’t remember that controversy.” And so he follows the citation and it actually cited a Washington Post story about the supposed incident, and it doesn’t exist. It’s just completely fabricated out of thin air. So he emails Turley and says, “Hey, did you know ChatGPT is accusing you of sexually harassing a student on a trip?” It was very detailed, right? A trip to Alaska. It sounded like the thing you wouldn’t make up, but in fact, ChatGPT did make it up.

Laura: We are used to seeing rumors and random smears in little corners of the internet, but the difference between that and this is that this looks credible if you don’t click through to the Washington Post link or dig into the details. “Well, you know, there’s a Washington Post piece about it,” people will say.

Will: It’s almost, if you didn’t know that these things are just language models rearranging language in plausible ways, a little psychopathic. It’s one thing to get the name wrong of somebody who was accused of sexual harassment; it’s another thing to make up a whole fake citation about it to convince people.

Alex: Well, right. It invented a plausible-sounding story with details that sounded like something you would actually have seen reported about someone accused of sexual harassment. It did that because that’s what it’s designed to do. It’s designed to take language that it has seen before and make something that sounds natural. In this case, something that sounded natural was effectively libel.

Will: Yeah. At some very basic level, it’s essentially figuring out which words tend to go together in human language that’s been written on the internet, and it has access to this unimaginably vast set of data. We don’t know exactly what OpenAI includes, because even though it’s called OpenAI, it’s very closed and won’t tell us. But stuff like all of Reddit, or everything that’s ever been written on Tumblr or Wikipedia, every patent that’s ever been filed in the world. I would say, more than half the time when you ask ChatGPT a factual question, the response is true. So it can really easily mislead you into thinking that the A.I. knows what’s true and what’s false, but it really doesn’t.
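
What Will describes, figuring out which words tend to go together and then producing the likeliest continuation, can be sketched with a toy example. The few lines below are an illustration only, nothing like the neural networks OpenAI actually uses, and the tiny corpus and test words are invented. The point is that the program always returns a confident-looking answer, whether or not it has any grounding for it.

```python
# A toy "next word" suggester that always offers a guess, even with no basis.
# This illustrates the general idea, not ChatGPT's actual architecture.
from collections import Counter, defaultdict

corpus = "the lawyer denied the claim and the lawyer won the case".split()

# Count which word tends to follow which (a bigram table).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def suggest(prev_word: str) -> str:
    """Return a next-word guess; it never admits 'no results found.'"""
    if prev_word in following:
        return following[prev_word].most_common(1)[0][0]
    # Unseen context: fall back to the most frequent word overall.
    return Counter(corpus).most_common(1)[0][0]

print(suggest("the"))   # "lawyer" -- plausible, grounded in the toy corpus
print(suggest("warp"))  # "the"    -- a confident guess with no grounding at all
```

A real large language model swaps this lookup table for a neural network trained on billions of words, but it shares the property Will points to: it never tells you that it doesn’t know.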

Alex: Yeah. Generally, I would say the last person you want to invent an accusation of misconduct about would be a high-profile attorney. But in this case, who can be sued? What is actionable here?

Will: Yeah. It raises really interesting and somewhat novel liability questions. So, go back to the days even before AOL, to CompuServe and Prodigy, when there were, all of a sudden, these message boards that anybody in the world could post on at any time and anybody else in the world could see. A question quickly arose: What happens when somebody libels somebody in an online forum? Does the computer service bear some responsibility? Can you sue them, too, for hosting this crap? Congress was like, “Let’s say you can’t sue the interactive computer service for the bad stuff that the person says on it, because otherwise it’s gonna chill the development of this internet, and that’s an exciting thing and we don’t want to do that.” Section 230 says that the computer service can’t be treated as the speaker or publisher of something that a third party posted on its site.

Laura: Though in this case it would be ChatGPT that said it. What’s interesting about the story that you wrote is that ChatGPT came up with this damaging falsehood, and it was a serious enough accusation that it warranted a piece in The Washington Post. Computers are making up really awful lies about people that, if believed, could be life-changing for that individual. Yet it seems like this is far from a one-off for ChatGPT. The same day I read your piece, I was playing around with it, and it told me a series of crazy lies. One of them was that Harlan Crow owned the childhood home of President Harry Truman. And I was like, “Whoa, that’s such a cool fact.” And then my next thought was, “I feel like I would actually know that already.” And then I asked for a source, and it was like, “Oh, I made that up.” It did the thing it does where it’s like, “I’m so sorry, it appears that I wasn’t correct.”

Will: Actually, even when it apologizes it might be wrong. If you tell it to add two plus three and it comes up with five, and you’re like, “No, that’s incorrect,” it’ll be like, “Oh, I’m sorry. I’m so sorry I got that wrong. Let me try again.”

Laura: Right. It has the personality of the worst research assistant ever, because its first instinct is like, “Quick, make something up,” and then its second instinct is like, “Okay, cool, cool, no, just walk it back.” I think what really interested us about this story, too, is if these kinds of falsehoods and errors can proliferate at this rate, how is that going to change the information environment that we live in?

Will: At the risk of looping back into the legal weeds, which you had so deftly pulled us out of, it’s worth closing the circle on the liability question. When you type a prompt into ChatGPT and it tells you something that’s false about Alex, OpenAI is going to argue that no harm has been done; it hasn’t impacted Alex’s life in any way. Their strategy is going to hinge on what happens if Laura takes that information and then publishes it somewhere else: “I just heard that Alex collects Hitler memorabilia,” or whatever.

Laura: But only to remind himself of how awful these crimes were.

Will: But if you then take that false information and you put it on Twitter, OpenAI is going to want to say, “That was you publishing it, not ChatGPT. You were the one who did the harm here by amplifying that information to other people.” So the fact that somebody searched ChatGPT and got something false about Turley hasn’t risen to the level of damage that you would need to prove in a libel suit; it was actually the guy who then went and told the media about it. Then here’s the crazy part: Turley wrote an op-ed about it in USA Today. We wrote the story about him in The Washington Post. A day or two later, I went in and I searched some of these bots again for Turley. And now they were saying that he did it, and they were citing his own USA Today op-ed. They were citing my own Washington Post article. So the debunking stories got sucked back into the language models.

Laura: Oh wow. And his name and these words, these key search terms, are all out there on the internet, wrapped up together.

Will: That’s exactly right, and I think that’s probably how it happened in the first place. Turley has been in the news as a commentator on stories about sexual harassment in the legal profession. His name was in articles that have the words “lawyer” and “sexual harassment.” And that’s probably how it came up with him doing this in the first place.

Laura: I’m thinking, if you, as a journalist, write a story about someone who has been credibly accused of sexual harassment, the way you write that story has to be extremely careful. And I wonder if a large language model is really capable of doing that at scale.

Will: Yeah. The way that OpenAI has been addressing this, which I think is the template that the others are following as well, is to just refuse to answer questions that look like they’re on a sensitive topic. And so anybody who’s used ChatGPT a good amount will have run up against this: They ask it a question and it’ll say, “Sorry, but as an A.I. language model, I can’t discuss that issue,” or “I’m not going to wade into that question.” I regularly lurk on these Reddit forums and Discord channels where people are posting their experiences with these A.I. chatbots, and one of the big complaints has been: This thing used to be awesome; now it’s useless because it refuses to answer all the interesting questions I ask it. And it really is an open question whether you can get a model that isn’t prone to saying wrong stuff in dangerous ways but that is still useful in the ways that we want it to be.

Laura: Thank you so much, Will.

Will Oremus: Thanks to both of you. I enjoyed it.

Alex: Read Will Oremus’s article, “ChatGPT invented a sexual harassment scandal and named a real law prof as the accused,” at washingtonpost.com. 

Alex: Well, that’s one screwup for ChatGPT, but surely it’ll just get better from here, right?

Laura: After the break, we are talking with author Ted Chiang, who is not so sure.

Alex: It’s hard to avoid the language of consciousness when talking about programs like ChatGPT. We chat with them after all; their creators refer to training them. And when they invent facts, we say they’re hallucinating. But what is a large language model like ChatGPT actually doing when it seemingly creates a story? In a recent New Yorker article, science fiction author Ted Chiang compares these artificial intelligences to essentially a bad Xerox of the internet. He asks, “What use is there in having something that rephrases the Web?” Ted, thank you so much for joining us today.

Ted Chiang: Thank you for having me.

Alex: Your piece in the New Yorker says ChatGPT is a blurry JPEG of the internet. Could you explain that analogy for us?

Ted: If you think about all the information on the web, and then compare the size of that to the size of one of these large language models like ChatGPT, the program cannot contain all the text on the web, because it is simply much too small. It is maybe 1 percent of the size of the text that it was trained on. So it’s not memorizing the text. It is a compressed representation of the text on the web.

Alex: Right, just like a JPEG. A JPEG is a compressed image, so it’s not the raw image file with all of the information of the original.

Ted: ChatGPT seems like it has a lot of the information that is on the web. It is accurate sometimes, in the same way that a blurry JPEG does give you a sense of what the original image looked like. But a blurry JPEG is not going to let you see the details accurately.

Alex: And the more you compress something, the more artifacts you introduce, the more things that weren’t originally there end up in it.

Ted: The thing is that when we look at a blurry JPEG, we can see that it’s a blurry JPEG and we don’t mistake it for a high resolution original. But ChatGPT, it’s giving text which is indistinguishable from accurate text. The fact that there are these compression artifacts is not readily visible and the hallucinations that these models are prone to are a compression artifact where the program is giving its best guess. It’s extrapolating because it doesn’t have the actual answer, but it looks like it’s returning actual search results. It’s not. A search engine will return things like no results found but ChatGPT never is at a loss for an answer. A lot of people have said that these models are auto-complete on steroids like when you’re typing in your phone and you type a few words and it suggests the next word.

Alex: It can always think of what a plausible next word would be. At all times it can say, well, after that word, this word might come.

Ted: Yes. It’s always gonna suggest something and that something may be radically incorrect.
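
Before moving on, the compression analogy can be made concrete with a minimal sketch in Python, assuming the Pillow imaging library is installed; the file name is a hypothetical placeholder rather than anything referenced in the episode. Saving an image at a very low JPEG quality discards most of the original information, and the artifacts that appear where detail used to be are the visual counterpart of the hallucinations Ted describes.

```python
# A rough sketch of the "blurry JPEG" analogy, assuming the Pillow library
# (pip install Pillow) is available. "photo.png" is a hypothetical placeholder.
from PIL import Image

original = Image.open("photo.png").convert("RGB")

# Saving at very low quality throws away most of the original data;
# block artifacts and smearing appear where fine detail used to be.
original.save("blurry.jpg", format="JPEG", quality=5)

blurry = Image.open("blurry.jpg")

# The compressed copy still gives a sense of what the original looked like,
# but its details are reconstructions, not the real thing.
print(original.size, blurry.size)
```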

Laura: There’s a really interesting point you make in this recent piece that the errors are actually part of what makes the output of ChatGPT look so impressive on the surface. If you wanted to just find out the answer to a question, you could search for it right in Google, and you’ll probably pull up an accurate result pretty quickly. That’s kind of impressive. If you ask ChatGPT the same question, it won’t just pull up that result. It has to transform it in some way and generate new text to try and give you an answer. And that new text may have errors in it, like it’s a copy of a copy of a copy, which makes it original, but also wrong. Somehow the wrongness is essential to this form, at least, of fake originality.

Ted: If you make an analogy to a student who you are quizzing. If the student recites something from the textbook verbatim, you know that they have a really good memory, but you don’t know whether they’ve really understood the material. And with students, if they are able to rephrase it in their own words, then we think that they may have actually understood the material. In this analogy, a conventional search engine is the student reciting something verbatim, but ChatGPT is rephrasing it. That makes us think that it actually understands the text that it has processed.

Alex: And that’s why it can seem so impressive, because, as you say, rephrasing something you read in your own words is a skill a lot of us have to work very hard at, but this is just a program that is designed to do that.

Ted: And a rephrasing program, that is legitimately impressive. If it didn’t have this tendency to make things up when it didn’t have a good answer, then that would be amazing. That would be pretty cool. But it remains unclear whether you can get one without the other.

Alex: And that, it seems, is the idea, again, for Microsoft and for Google and for these companies: to turn it into a product that makes them money instead of just something that’s fun to play around with. That seems like a question that should be asked a lot more often, because it seems vital to be able to separate the rephrasing from the hallucinating.

Ted: That’s the multimillion dollar question. If you weren’t selling it as a search engine, if you weren’t selling it as a replacement for Google or Bing, that’d be one thing. Something that is designed to generate text, maybe that’s not a good fit for the application that we have for search engines.

Alex: Well, Laura likes to say that what ChatGPT is actually good at is making sestinas. It’s very good at writing poetry.

Laura: Or formally complex, but spiritually blah.

Alex: Yes, it’s not good at writing good poetry, but yes, it’s good at writing difficult-to-compose poetry. I consider you someone who’s skeptical of a lot of the hype around A.I., but you’ve raised a really important question, which is: In future iterations of ChatGPT, are they going to be training them on material produced by ChatGPT? Because if the web is now going to be filled with things created by these programs, whether or not they’re included in the training materials would seem to speak to the quality of what they’re actually creating.

Ted: There probably isn’t vastly more text available to train them on. If successors to ChatGPT are going to be trained on vastly greater sums of text, where is that text going to come from? If it comes from the output of a program like ChatGPT, then we are not going to get good results. In that case, not only will a successor to ChatGPT probably not be any better, we will also have polluted the entire internet and maybe reduced its usefulness as a source of information.

Laura: It’s like the opposite of the singularity.

Alex: I was just going to say, that’s exactly right. It’s literally the opposite of the singularity.

Laura: The more it generates, the worse the technology will get.

Alex: Because that’s a common fear among certain people on the internet, this idea of endlessly self-improving A.I. that becomes super intelligent. You are suggesting in fact, that it could actually just make itself dumber and dumber in a way.

Laura: This isn’t a fanciful scenario. Just today there was a report in Bloomberg where they reviewed 49 websites that were entirely content farms generated by A.I., just flooding the internet with ersatz news stories. One of them is a website called celebritiesdeaths.com that reported that Joe Biden was dead and that Kamala Harris was acting as president, written in the plausible way that such a story on a low-quality news website might be written. So that flood of really low-value, maybe even harmful, or at least incorrect, A.I. content has already started.

Ted: Yes. That is maybe a good use of these large language models, because they’re designed to generate text. They’re at least fit for that purpose.

Laura: Which is basically spam, is what we’re saying. It would be funny if “what time is the Super Bowl” content just got completely thrown into chaos by A.I. spilling out millions of different answers.

Alex: No one will know what time the Super Bowl is.

Laura: You have also written, on a more philosophical level, about why you think A.I. is not going to get smarter. One of the things that boosters for A.I. and tech are saying is, “Well, if you look at the progress we’ve made in the last five years, it’s going to just keep getting better and better.” You’ve written about reasons why you are not convinced by that. Could you just talk us through a couple of them?

Ted: A lot of it has to do with a misconception of what intelligence even is. We have this idea of an I.Q. test where intelligence is reduced to a single number, but there’s a lot of debate about whether there is something that we can meaningfully call intelligence, which can be reduced to a single number. Even if we grant that for the sake of argument, it does not at all follow that someone can increase their own number. If you have an I.Q. of 160, that doesn’t mean that you can increase your I.Q. to 200.

Alex: You would think, if it followed this rule, that the smarter you are, the better you should be at getting even smarter. If you’re so smart, you should make yourself even smarter.

Ted: Again, it also depends on what you mean by smarter. Because you can acquire more information over time, but that’s an entirely different thing than being smarter. There’s this computer scientist named François Chollet, and he wrote a paper called “On the Measure of Intelligence.” In it, he draws this distinction between skills and intelligence. Skill is how well you perform a certain task. And intelligence is how rapidly you gain new skills. This formulation makes a lot of sense when applied to people. Most people can gain new skills. The people who improve their skills the fastest, that’s what we think of as intelligence. By this measure, computers are highly skilled because they perform very well at certain tasks, but they’re not at all intelligent. Have we created any program that can actually acquire new skills? Have we created any program that can learn chess and then learn to make you scrambled eggs? We don’t have a program that can acquire new skills in that way at all. It might be like a fundamental category error to think that a computer could make itself smarter.

Alex: It strikes me that long before we have to worry about superintelligent computers, we will have to worry about how much-less-intelligent computers will be used by people. You’ve written about this quite a lot.

Ted Chiang: Yeah. One of the things that I often say is that our fears of technology are best understood as fears of how capitalism will use technology against us. It’s not that A.I. wants to put us out of work; it’s that capital wants to put us out of work, and capital will use A.I. to do it. In a lot of ways it’s analogous to the way that capital managed to get workers resentful of overseas laborers because of job outsourcing. It’s not anyone in India who wants to take your job. It’s management who wants to replace you with someone cheaper. The same thing is true for A.I.

Laura: I also wonder if this whole narrative about the fear of super intelligent A.I. is really just a way of creating a myth that this stuff is really better than it is. To what extent do you think science fiction collaborates in some of those myths?

Ted: It’s a historical accident that we wound up using the phrase “artificial intelligence” to describe radically different things. We use that phrase to describe this hypothetical idea, often found in science fiction stories, of a machine that thinks, like HAL 9000 in 2001. But we’ve also used that same phrase to describe what are essentially applied statistics programs. If we had come up with different phrases, maybe there wouldn’t be this confusion. This confusion works to the companies’ advantage, because again it makes their product sound cooler.

Laura: It makes this thing sound like a big adventure, rather than something that’s halfway between boring and a basic labor relations problem.

Ted: Yes. The algorithm that determines whether a bank should give you a loan or not, that is not an exciting thing to talk about. But if you can say that we are consulting a superhuman intelligence, it’s like, “Well, alright—”

Alex: It’s much more exciting to imagine, I guess, SkyNet denying you a loan than a badly programmed algorithm.

Ted: Or even a well programmed algorithm.

Alex: A well programmed algorithm, which is used by people for bad ends.

Ted: Yeah.

Alex: All right Ted, thank you so, so much.

Ted Chiang: Thanks very much. I had fun. This was a good time.

Laura: Read Ted Chiang’s article “ChatGPT Is a Blurry JPEG of the Web” at newyorker.com.

Laura: If artificial intelligence isn’t actually intelligent, what are these programs?

Alex: After the break, we’re talking to linguist Emily M. Bender, who says we’re thinking about A.I. in the wrong way.

Laura: Despite the apparent limits of A.I., the hype persists with some journalists claiming A.I. has “mastered language” and boosters promising it will replace lawyers and therapists. But what is it really doing and what isn’t it doing? We’re talking to Emily M. Bender. She’s a professor of linguistics at the University of Washington, and she’s taken a closer look at these claims and broken them down in her writing. Emily, welcome to the show.

Emily M. Bender: Thank you. I’m very happy to be here.

Laura: You responded, in fact, to a piece with the title “A.I. Is Mastering Language.” What are A.I. companies telling us their technology can do?

Emily: It seems like they’re claiming they’ve made everything machines. The reason they’re doing that is that we use language in every domain of our lives. In the massive amounts of text on the internet, there’s language about every topic imaginable. If you have a text synthesis machine that’s trained on that big pile of data, you can get it to spit out plausible looking text on any topic imaginable. Then if you want to imagine that you’ve created or are about to create a robo-lawyer, a robo-therapist, a robo-poet, those claims are easy to make and maybe easy to fall for, but they’re false.

Laura: Earlier in the show, we were talking about a variety of errors in generative A.I. We keep saying, “Why does ChatGPT lie?” I can imagine that you have a problem with that. Why is that the wrong word to use?

Emily: It’s the wrong word because “lie” expresses a relationship to the truth. If a person lies, it’s because there’s something they believe to be true and they’re specifically dissembling or trying to dissuade someone from knowing that thing. What ChatGPT and the other large language models are doing is outputting sequences of words that are highly rated by humans as serving as an answer to what came last, and that are also plausible given their training data. Sometimes that corresponds, once we’ve interpreted it, to a true statement about the world, and sometimes it doesn’t. Rather than talking about it as lies or misinformation, I talk about synthetic text, I talk about noninformation, and both of those things are hugely problematic. They can pollute our information ecosystem, they can mislead people. To say that ChatGPT is lying is to attribute too much of a relationship to the content of what we interpret its output to mean.

Laura: Those two things you just mentioned, noninformation and synthetic text: How would you describe those for someone who hasn’t encountered the terms before?

Emily: Synthetic text is text, right? It’s well-formed, let’s say English, but it’s synthetic because it was produced through an artificial process that is very different to the process that we use when we are writing or when we are speaking. Noninformation is a term that I use to highlight the fact that what’s come out of ChatGPT, as far as the algorithm is concerned, is only a sequence of letters. It only becomes information or misinformation or disinformation when a person interprets it.

Laura: I can imagine someone saying, “Well, isn’t that true of anything that you read? If I go on a website, I read The New York Times, I’m interpreting it and it really only has meaning once I start reading those words and comparing them to what I know about the world.” So how is the output of ChatGPT different from that?

Emily: As far as how we interact with it, not at all. As far as where it came from, entirely. The New York Times, so far as I know, unlike CNET, is not using a text synthesis machine to write its articles; those articles came about because some person had an intention to communicate and chose the words in order to do that communication. Then they sit there as just marks on the page or pixels on the screen until someone else comes and reads them, but in doing that we are performing intersubjectivity. We are imagining the mind of the person who wrote it. There’s a well-grounded reason for doing that inference: The author of this New York Times piece did some investigative reporting. They talked to some people. They gathered some information. They’re presenting it this way. As I read the piece and use that background information to make sense of it, that’s well grounded. In the case of ChatGPT, it’s not well grounded to do that inference.

Laura: You mentioned a moment ago the idea of a chatbot replacing, say, a therapist. And I find that really interesting, because if you’ve been to therapy, you are talking about experiences you’ve had. You could do that without a therapist being there, but that wouldn’t be a therapeutic relationship. It’s not just saying words and getting some words back; there is actually something else going on when two people talk to each other. It seems like a lot of that other stuff that happens around language is being completely excluded from this conversation about large language models.

Emily: Yeah, exactly. Language is about community, it’s about communication, it’s about joining intentions with other people. All of those things produce forms, but the form is not the experience. So in this New York Times magazine piece that Steven Johnson wrote, I think it was last year, he said something about how GPT-3 could write essays better than the average high school student. My reaction to that was, this entirely misses the point of writing essays in high school. It’s not that we have to keep the world’s supply of essays topped up. It’s about the experience of doing the writing. Just like we’re saying, in a therapeutic relationship, it’s about the experience of being heard by someone who’s sympathetic, and also someone who is skilled at figuring out what questions to ask next. The idea that you could just replace that with synthetic text is frightening, especially because many people who are in therapy are there because they are specifically vulnerable and in danger. You absolutely don’t want to throw a giant Magic 8 Ball in as their conversation partner there.

Alex: Right, which brings me to the intention behind this. What is ChatGPT meant to do?

Emily: Well, that’s a great question. Timnit Gebru, in a talk about what she calls the TESCREAL ideology bundle (there’s a bunch of ideologies, like transhumanism and longtermism, that are all connected with AGI), makes the point that AGI, artificial general intelligence, is an unscoped system. It doesn’t have a specific function. If you’re doing serious engineering, as Timnit reminds us in that talk, you start by scoping the problem and asking: What’s this for? What are the dangers? How do I test that it is performing its function well and safely? And ChatGPT isn’t for anything in particular, and that makes it an incredibly shaky foundation for anything you might try to use it for.

Alex: Yeah, no, I can’t think of too many things that it would be useful for, beyond just playing with it.

Emily: Yeah. These might be variants of playing with it, but you could imagine that some people like using it as a conversation partner in a language that they’re learning, you could imagine using it to have the dialogue with nonplayer characters in a video game be more varied and entertaining, you could imagine using it as a brainstorming partner as you’re trying to get started with a writing project of some kind. None of those are risk-free, because one of the things about large language models is that they will suck up and amplify all of the biases in the text. 

Alex: For example, I’ve seen examples of ChatGPT assigning specific genders to specific jobs, which would be a bias it learned from the text it was trained on.

Emily: Yeah, exactly.

Laura: A question that we’ve been throwing around, too, is whether this disconnection from the truth is actually a bug in the technology or something that companies will use as a feature.

Emily: Arguably yes. I would wonder what the FTC would say about that. They’ve been coming out with some fantastic blog posts recently, putting the so-called A.I. companies on notice around truth and advertising, around the safety and reliability of their products.

Laura: I saw a great thread by one of these A.I. booster guys on Twitter who was listing all these potential uses for A.I., and one was that you could generate incredibly realistic photos of the food you’re selling. And yet there are actually truth-in-advertising laws for food: You can’t show the most delicious-looking burger if that’s not the burger you sell. There are actually some regulations that already preclude the use of this technology.

Emily: Yeah. And health, including mental health, is one of them. There was one of these boosters online who posted a video of what was apparently a large-language-model-driven, text-to-speech setup, where this feminine-sounding voice was introducing itself as a cognitive behavioral therapist, saying what CBT is, and then inviting the client to do a CBT session. And someone tagged me, so I saw it on Twitter, and I was like, “Psychotherapists are a regulated workforce; you actually have to be licensed.” So where is this guy? I looked at his Twitter bio, and it was Texas. I did a little Google search, and I found the relevant board in the state of Texas and their link for reporting fraudulent claims. I replied with that. Bam, the tweet was gone.

Laura: So you’ve thought about the kinds of regulations that might be useful in this space. Is there something broader you would recommend beyond just referring to regulations that already exist in certain fields?

Emily: Yeah. Recently the FTC, the EEOC, the Department of Justice, and the Consumer Financial Protection Bureau came out with a joint statement saying, Hey, we regulate the activities of people in corporations, and the fact that those activities are being done via automation now doesn’t change that and doesn’t change our jurisdiction. So that’s great. I’m not a policymaker, so I can tell you the effects I would like to see. I can’t tell you how to go about making them happen.

Laura: What are some of those effects?

Emily: I would like to see transparency. I would like to see it be the case that you can always tell when you have encountered synthetic media, text or image, and that you can always go back and find out how the system was actually trained. Another one is recourse, and I think a lot of the proposed regulations are heading in this direction, certainly in Europe, maybe also in the U.S. Basically, if a decision is made about me through automated means, I should be able to question it, find out why, and talk to a person who would then consider whether it should be changed. The third thing is accountability. How would the world be different if OpenAI and Google and Microsoft were actually accountable for the output of their text synthesis machines? I’m guessing we wouldn’t have so much of them. Who would wanna be accountable for this random output?

Laura: Right. The timing of this kind of regulation feels really important. We’ve seen with social media how the whole thing exploded, regulators didn’t get there in time, and then the social media companies basically said, “But everyone’s on social media now, and you can’t just walk this back.”

Emily: On the other hand, we used to have leaded gasoline and we discovered that that was a problem, so regulations came in, it was phased out. We used to have CFCs creating problems for the ozone layer; we put in regulations and they’re gone. Just because things have been built around it doesn’t mean you can’t roll it back, but I think you’re right that the sooner the better. Another part of that narrative is the tech companies like to say, “Technology is moving so fast, the regulatory agencies can’t possibly keep up. You’re just going to have to deal.”

Laura: Because the technology is just moving on its own. No humans involved, just moving it forward that quickly.

Emily: Yeah. So there’s that angle, but I think it’s valuable not to fall for that and basically to say, “No, regulation is about protecting rights, and those don’t change.” And that’s what the FTC and all are doing right now by saying, “No, we regulate the activities of corporations.” I think they said something like, there’s no A.I. loophole here. This comes around to the terminology about A.I. There’s this wonderful rephrasing from Stefano Quintarelli, an Italian researcher, who suggested that we stop using the term A.I. and start using instead Systematic Approaches to Learning Algorithms and Machine Inferences, because the acronym of that is SALAMI.

Alex: See that is… I could get behind that.

Emily: So then if you ask a question, this is reading from his blog post still, “Will SALAMI have emotions? Can SALAMI acquire a “personality” similar to humans’?” you can see how absurd it is. But when we talk about artificial intelligence or machine cognition or any of these phrases, it makes it sound like it’s something that we could maybe one day assign accountability to. That’s dangerous, right? The accountability sits with the people who are making the things, who are making the decisions, not the automation.

Laura: I think also that a lot of these conversations about regulation have been derailed by people who are saying A.I. ethics is really important, but actually the questions that we are asking are: Will these things surpass humans and wield power over us?

Emily: There’s a whole lot of energy around what often gets called A.I. safety: How do we build these machines so that when we cede autonomy to them, they take good care of us? Those questions to me feel like an enormous distraction from the current, urgent harms that are going on. There are the exploitative labor practices behind today’s so-called A.I. systems. There are all of the biases that get pulled into these systems and reinforced, which cause problems: automatic decision systems in the allocation of social benefits, for example. And there’s the way that recommendation systems can amplify harmful ideas, which is the social media effect that you were referring to earlier. All of this is real, existing, current harm, but grappling with that harm means recognizing, if you are a tech VC or a tech company CEO, etc., that you’ve caused some of that harm.

Alex: I do think, with a lot of this crowd, those sci-fi questions just serve to help ease whatever culpability these people might feel about their part in the problems facing us right now.

Emily: Exactly.

Laura: This idea that “well, this is an artificial mind” allows CEOs to tell a story about laying people off and replacing a skilled workforce with an inferior one that, in many cases, is cheaper, because it sounds like “I’m pushing my company into the future,” when actually, if it’s SALAMI, we’re using a crappy tool that doesn’t do stuff as well as humans. So my final question is: Does this stuff even have to improve that much, if you have the hype around it, to cause harm?

Emily: It’s absolutely causing harm in many ways right now. They don’t have to become sentient singularity machines to cause harm. But maybe it’s best not to talk about ChatGPT causing the harm, but rather about the people deciding to use it in that way; they are the ones doing it.

Alex: Right. If a bad algorithm is responsible for you losing your job, it’s not because a super intelligent computer did it. It’s because someone chose to use a tool for that purpose.

Emily: Exactly. And they maybe got rewarded for it because, as you say, they are making it look like they’re taking their company into the future. I really hope that we get more and more people thinking about what’s the future that we actually want.

Laura: Thank you so much for talking to us.

Emily: My pleasure.

Alex: Emily M. Bender’s post, “On NY Times Magazine on AI: Resist the Urge to Be Impressed,”  is available on Medium and as an audio paper on SoundCloud.

Laura: The Politics of Everything is co-produced by Talkhouse.

Alex: Emily Cooke is our executive producer.

Laura: Lorraine Cademartori produced this episode.

Alex: Myron Kaplan is our audio editor.

Laura: If you enjoy The Politics of Everything and you want to support us, one thing you can do is rate and review the show. Every review helps.

Alex: Thanks for listening.