In 2013 and 2014, I wrote extensively about new revelations regarding NSA surveillance based on the documents provided by Edward Snowden. But I had a more personal involvement as well.

I wrote the essay below in September 2013. The New Yorker agreed to publish it, but the Guardian asked me not to. It was scared of UK law enforcement, and worried that this essay would reflect badly on it. And given that the UK police would raid its offices in July 2014, it had legitimate cause to be worried.

Now, ten years later, I offer this as a time capsule of what those early months of Snowden were like.


It’s a surreal experience, paging through hundreds of top-secret NSA documents. You’re peering into a forbidden world: strange, confusing, and fascinating all at the same time.

I had flown down to Rio de Janeiro in late August at the request of Glenn Greenwald. He had been working on the Edward Snowden archive for a couple of months, and had a pile of more technical documents that he wanted help interpreting. According to Greenwald, Snowden also thought that bringing me down was a good idea.

It made sense. I didn’t know either of them, but I have been writing about cryptography, security, and privacy for decades. I could decipher some of the technical language that Greenwald had difficulty with, and understand the context and importance of various document. And I have long been publicly critical of the NSA’s eavesdropping capabilities. My knowledge and expertise could help figure out which stories needed to be reported.

I thought about it a lot before agreeing. This was before David Miranda, Greenwald’s partner, was detained at Heathrow airport by the UK authorities; but even without that, I knew there was a risk. I fly a lot—a quarter of a million miles per year—and being put on a TSA list, or being detained at the US border and having my electronics confiscated, would be a major problem. So would the FBI breaking into my home and seizing my personal electronics. But in the end, that made me more determined to do it.

I did spend some time on the phone with the attorneys recommended to me by the ACLU and the EFF. And I talked about it with my partner, especially when Miranda was detained three days before my departure. Both Greenwald and his employer, the Guardian, are careful about whom they show the documents to. They publish only those portions essential to getting the story out. It was important to them that I be a co-author, not a source. I didn’t follow the legal reasoning, but the point is that the Guardian doesn’t want to leak the documents to random people. It will, however, write stories in the public interest, and I would be allowed to review the documents as part of that process. So after a Skype conversation with someone at the Guardian, I signed a letter of engagement.

And then I flew to Brazil.

I saw only a tiny slice of the documents, and most of what I saw was surprisingly banal. The concerns of the top-secret world are largely tactical: system upgrades, operational problems owing to weather, delays because of work backlogs, and so on. I paged through weekly reports, presentation slides from status meetings, and general briefings to educate visitors. Management is management, even inside the NSA Reading the documents, I felt as though I were sitting through some of those endless meetings.

The meeting presenters try to spice things up. Presentations regularly include intelligence success stories. There were details—what had been found, and how, and where it helped—and sometimes there were attaboys from “customers” who used the intelligence. I’m sure these are intended to remind NSA employees that they’re doing good. It definitely had an effect on me. Those were all things I want the NSA to be doing.

There were so many code names. Everything has one: every program, every piece of equipment, every piece of software. Sometimes code names had their own code names. The biggest secrets seem to be the underlying real-world information: which particular company MONEYROCKET is; what software vulnerability EGOTISTICALGIRAFFE—really, I am not making that one up—is; how TURBINE works. Those secrets collectively have a code name—ECI, for exceptionally compartmented information—and almost never appear in the documents. Chatting with Snowden on an encrypted IM connection, I joked that the NSA cafeteria menu probably has code names for menu items. His response: “Trust me when I say you have no idea.”

Those code names all come with logos, most of them amateurish and a lot of them dumb. Note to the NSA: take some of that more than ten-billion-dollar annual budget and hire yourself a design firm. Really; it’ll pay off in morale.

Once in a while, though, I would see something that made me stop, stand up, and pace around in circles. It wasn’t that what I read was particularly exciting, or important. It was just that it was startling. It changed—ever so slightly—how I thought about the world.

Greenwald said that that reaction was normal when people started reading through the documents.

Intelligence professionals talk about how disorienting it is living on the inside. You read so much classified information about the world’s geopolitical events that you start seeing the world differently. You become convinced that only the insiders know what’s really going on, because the news media is so often wrong. Your family is ignorant. Your friends are ignorant. The world is ignorant. The only thing keeping you from ignorance is that constant stream of classified knowledge. It’s hard not to feel superior, not to say things like “If you only knew what we know” all the time. I can understand how General Keith Alexander, the director of the NSA, comes across as so supercilious; I only saw a minute fraction of that secret world, and I started feeling it.

It turned out to be a terrible week to visit Greenwald, as he was still dealing with the fallout from Miranda’s detention. Two other journalists, one from the Nation and the other from the Hindu, were also in town working with him. A lot of my week involved Greenwald rushing into my hotel room, giving me a thumb drive of new stuff to look through, and rushing out again.

A technician from the Guardian got a search capability working while I was there, and I spent some time with it. Question: when you’re given the capability to search through a database of NSA secrets, what’s the first thing you look for? Answer: your name.

It wasn’t there. Neither were any of the algorithm names I knew, not even algorithms I knew that the US government used.

I tried to talk to Greenwald about his own operational security. It had been incredibly stupid for Miranda to be traveling with NSA documents on the thumb drive. Transferring files electronically is what encryption is for. I told Greenwald that he and Laura Poitras should be sending large encrypted files of dummy documents back and forth every day.

Once, at Greenwald’s home, I walked into the backyard and looked for TEMPEST receivers hiding in the trees. I didn’t find any, but that doesn’t mean they weren’t there. Greenwald has a lot of dogs, but I don’t think that would hinder professionals. I’m sure that a bunch of major governments have a complete copy of everything Greenwald has. Maybe the black bag teams bumped into each other in those early weeks.

I started doubting my own security procedures. Reading about the NSA’s hacking abilities will do that to you. Can it break the encryption on my hard drive? Probably not. Has the company that makes my encryption software deliberately weakened the implementation for it? Probably. Are NSA agents listening in on my calls back to the US? Very probably. Could agents take control of my computer over the Internet if they wanted to? Definitely. In the end, I decided to do my best and stop worrying about it. It was the agency’s documents, after all. And what I was working on would become public in a few weeks.

I wasn’t sleeping well, either. A lot of it was the sheer magnitude of what I saw. It’s not that any of it was a real surprise. Those of us in the information security community had long assumed that the NSA was doing things like this. But we never really sat down and figured out the details, and to have the details confirmed made a big difference. Maybe I can make it clearer with an analogy. Everyone knows that death is inevitable; there’s absolutely no surprise about that. Yet it arrives as a surprise, because we spend most of our lives refusing to think about it. The NSA documents were a bit like that. Knowing that it is surely true that the NSA is eavesdropping on the world, and doing it in such a methodical and robust manner, is very different from coming face-to-face with the reality that it is and the details of how it is doing it.

I also found it incredibly difficult to keep the secrets. The Guardian’s process is slow and methodical. I move much faster. I drafted stories based on what I found. Then I wrote essays about those stories, and essays about the essays. Writing was therapy; I would wake up in the wee hours of the morning, and write an essay. But that put me at least three levels beyond what was published.

Now that my involvement is out, and my first essays are out, I feel a lot better. I’m sure it will get worse again when I find another monumental revelation; there are still more documents to go through.

I’ve heard it said that Snowden wants to damage America. I can say with certainty that he does not. So far, everyone involved in this incident has been incredibly careful about what is released to the public. There are many documents that could be immensely harmful to the US, and no one has any intention of releasing them. The documents the reporters release are carefully redacted. Greenwald and I repeatedly debated with Guardian editors the newsworthiness of story ideas, stressing that we would not expose government secrets simply because they’re interesting.

The NSA got incredibly lucky; this could have ended with a massive public dump like Chelsea Manning’s State Department cables. I suppose it still could. Despite that, I can imagine how this feels to the NSA. It’s used to keeping this stuff behind multiple levels of security: gates with alarms, armed guards, safe doors, and military-grade cryptography. It’s not supposed to be on a bunch of thumb drives in Brazil, Germany, the UK, the US, and who knows where else, protected largely by some random people’s opinions about what should or should not remain secret. This is easily the greatest intelligence failure in the history of ever. It’s amazing that one person could have had so much access with so little accountability, and could sneak all of this data out without raising any alarms. The odds are close to zero that Snowden is the first person to do this; he’s just the first person to make public that he did. It’s a testament to General Alexander’s power that he hasn’t been forced to resign.

It’s not that we weren’t being careful about security, it’s that our standards of care are so different. From the NSA’s point of view, we’re all major security risks, myself included. I was taking notes about classified material, crumpling them up, and throwing them into the wastebasket. I was printing documents marked “TOP SECRET/COMINT/NOFORN” in a hotel lobby. And once, I took the wrong thumb drive with me to dinner, accidentally leaving the unencrypted one filled with top-secret documents in my hotel room. It was an honest mistake; they were both blue.

If I were an NSA employee, the policy would be to fire me for that alone.

Many have written about how being under constant surveillance changes a person. When you know you’re being watched, you censor yourself. You become less open, less spontaneous. You look at what you write on your computer and dwell on what you’ve said on the telephone, wonder how it would sound taken out of context, from the perspective of a hypothetical observer. You’re more likely to conform. You suppress your individuality. Even though I have worked in privacy for decades, and already knew a lot about the NSA and what it does, the change was palpable. That feeling hasn’t faded. I am now more careful about what I say and write. I am less trusting of communications technology. I am less trusting of the computer industry.

After much discussion, Greenwald and I agreed to write three stories together to start. All of those are still in progress. In addition, I wrote two commentaries on the Snowden documents that were recently made public. There’s a lot more to come; even Greenwald hasn’t looked through everything.

Since my trip to Brazil [one month before], I’ve flown back to the US once and domestically seven times—all without incident. I’m not on any list yet. At least, none that I know about.


As it happened, I didn’t write much more with Greenwald or the Guardian. Those two had a falling out, and by the time everything settled and both began writing about the documents independently—Greenwald at the newly formed website the Intercept—I got cut out of the process somehow. I remember hearing that Greenwald was annoyed with me, but I never learned the reason. We haven’t spoken since.

Still, I was happy with the one story I was part of: how the NSA hacks Tor. I consider it a personal success that I pushed the Guardian to publish NSA documents detailing QUANTUM. I don’t think that would have gotten out any other way. And I still use those pages today when I teach cybersecurity to policymakers at the Harvard Kennedy School.

Other people wrote about the Snowden files, and wrote a lot. It was a slow trickle at first, and then a more consistent flow. Between Greenwald, Bart Gellman, and the Guardian reporters, there ended up being steady stream of news. (Bart brought in Ashkan Soltani to help him with the technical aspects, which was a great move on his part, even if it cost Ashkan a government job later.) More stories were covered by other publications.

It started getting weird. Both Greenwald and Gellman held documents back so they could publish them in their books. Jake Appelbaum, who had not yet been accused of sexual assault by multiple women, was working with Laura Poitras. He partnered with Spiegel to release an implant catalog from the NSA’s Tailored Access Operations group. To this day, I am convinced that that document was not in the Snowden archives: that Jake got it somehow, and it was released with the implication that it was from Edward Snowden. I thought it was important enough that I started writing about each item in that document in my blog: “NSA Exploit of the Week.” That got my website blocked by the DoD: I keep a framed print of the censor’s message on my wall.

Perhaps the most surreal document disclosures were when artists started writing fiction based on the documents. This was in 2016, when Poitras built a secure room in New York to house the documents. By then, the documents were years out of date. And now they’re over a decade out of date. (They were leaked in 2013, but most of them were from 2012 or before.)

I ended up being something of a public ambassador for the documents. When I got back from Rio, I gave talks at a private conference in Woods Hole, the Berkman Center at Harvard, something called the Congress and Privacy and Surveillance in Geneva, events at both CATO and New America in DC, an event at the University of Pennsylvania, an event at EPIC and a “Stop Watching Us” rally in DC, the RISCS conference in London, the ISF in Paris, and…then…at the IETF meeting in Vancouver in November 2013. (I remember little of this; I am reconstructing it all from my calendar.)

What struck me at the IETF was the indignation in the room, and the calls to action. And there was action, across many fronts. We technologists did a lot to help secure the Internet, for example.

The government didn’t do its part, though. Despite the public outcry, investigations by Congress, pronouncements by President Obama, and federal court rulings, I don’t think much has changed. The NSA canceled a program here and a program there, and it is now more public about defense. But I don’t think it is any less aggressive about either bulk or targeted surveillance. Certainly its government authorities haven’t been restricted in any way. And surveillance capitalism is still the business model of the Internet.

And Edward Snowden? We were in contact for a while on Signal. I visited him once in Moscow, in 2016. And I had him do an guest lecture to my class at Harvard for a few years, remotely by Jitsi. Afterwards, I would hold a session where I promised to answer every question he would evade or not answer, explain every response he did give, and be candid in a way that someone with an outstanding arrest warrant simply cannot. Sometimes I thought I could channel Snowden better than he could.

But now it’s been a decade. Everything he knows is old and out of date. Everything we know is old and out of date. The NSA suffered an even worse leak of its secrets by the Russians, under the guise of the Shadow Brokers, in 2016 and 2017. The NSA has rebuilt. It again has capabilities we can only surmise.

This essay previously appeared in an IETF publication, as part of an Edward Snowden ten-year retrospective.

EDITED TO ADD (6/7): Conversation between Snowden, Greenwald, and Poitras.

In February, Meta released its large language model: LLaMA. Unlike OpenAI and its ChatGPT, Meta didn’t just give the world a chat window to play with. Instead, it released the code into the open-source community, and shortly thereafter the model itself was leaked. Researchers and programmers immediately started modifying it, improving it, and getting it to do things no one else anticipated. And their results have been immediate, innovative, and an indication of how the future of this technology is going to play out. Training speeds have hugely increased, and the size of the models themselves has shrunk to the point that you can create and run them on a laptop. The world of AI research has dramatically changed.

This development hasn’t made the same splash as other corporate announcements, but its effects will be much greater. It will wrest power from the large tech corporations, resulting in both much more innovation and a much more challenging regulatory landscape. The large corporations that had controlled these models warn that this free-for-all will lead to potentially dangerous developments, and problematic uses of the open technology have already been documented. But those who are working on the open models counter that a more democratic research environment is better than having this powerful technology controlled by a small number of corporations.

The power shift comes from simplification. The LLMs built by OpenAI and Google rely on massive data sets, measured in the tens of billions of bytes, computed on by tens of thousands of powerful specialized processors producing models with billions of parameters. The received wisdom is that bigger data, bigger processing, and larger parameter sets were all needed to make a better model. Producing such a model requires the resources of a corporation with the money and computing power of a Google or Microsoft or Meta.

But building on public models like Meta’s LLaMa, the open-source community has innovated in ways that allow results nearly as good as the huge models—but run on home machines with common data sets. What was once the reserve of the resource-rich has become a playground for anyone with curiosity, coding skills, and a good laptop. Bigger may be better, but the open-source community is showing that smaller is often good enough. This opens the door to more efficient, accessible, and resource-friendly LLMs.

More importantly, these smaller and faster LLMs are much more accessible and easier to experiment with. Rather than needing tens of thousands of machines and millions of dollars to train a new model, an existing model can now be customized on a mid-priced laptop in a few hours. This fosters rapid innovation.

It also takes control away from large companies like Google and OpenAI. By providing access to the underlying code and encouraging collaboration, open-source initiatives empower a diverse range of developers, researchers, and organizations to shape the technology. This diversification of control helps prevent undue influence, and ensures that the development and deployment of AI technologies align with a broader set of values and priorities. Much of the modern internet was built on open-source technologies from the LAMP (Linux, Apache, mySQL, and PHP/PERL/Python) stack—a suite of applications often used in web development. This enabled sophisticated websites to be easily constructed, all with open-source tools that were built by enthusiasts, not companies looking for profit. Facebook itself was originally built using open-source PHP.

But being open-source also means that there is no one to hold responsible for misuse of the technology. When vulnerabilities are discovered in obscure bits of open-source technology critical to the functioning of the internet, often there is no entity responsible for fixing the bug. Open-source communities span countries and cultures, making it difficult to ensure that any country’s laws will be respected by the community. And having the technology open-sourced means that those who wish to use it for unintended, illegal, or nefarious purposes have the same access to the technology as anyone else.

This, in turn, has significant implications for those who are looking to regulate this new and powerful technology. Now that the open-source community is remixing LLMs, it’s no longer possible to regulate the technology by dictating what research and development can be done; there are simply too many researchers doing too many different things in too many different countries. The only governance mechanism available to governments now is to regulate usage (and only for those who pay attention to the law), or to offer incentives to those (including startups, individuals, and small companies) who are now the drivers of innovation in the arena. Incentives for these communities could take the form of rewards for the production of particular uses of the technology, or hackathons to develop particularly useful applications. Sticks are hard to use—instead, we need appealing carrots.

It is important to remember that the open-source community is not always motivated by profit. The members of this community are often driven by curiosity, the desire to experiment, or the simple joys of building. While there are companies that profit from supporting software produced by open-source projects like Linux, Python, or the Apache web server, those communities are not profit driven.

And there are many open-source models to choose from. Alpaca, Cerebras-GPT, Dolly, HuggingChat, and StableLM have all been released in the past few months. Most of them are built on top of LLaMA, but some have other pedigrees. More are on their way.

The large tech monopolies that have been developing and fielding LLMs—Google, Microsoft, and Meta—are not ready for this. A few weeks ago, a Google employee leaked a memo in which an engineer tried to explain to his superiors what an open-source LLM means for their own proprietary tech. The memo concluded that the open-source community has lapped the major corporations and has an overwhelming lead on them.

This isn’t the first time companies have ignored the power of the open-source community. Sun never understood Linux. Netscape never understood the Apache web server. Open source isn’t very good at original innovations, but once an innovation is seen and picked up, the community can be a pretty overwhelming thing. The large companies may respond by trying to retrench and pulling their models back from the open-source community.

But it’s too late. We have entered an era of LLM democratization. By showing that smaller models can be highly effective, enabling easy experimentation, diversifying control, and providing incentives that are not profit motivated, open-source initiatives are moving us into a more dynamic and inclusive AI landscape. This doesn’t mean that some of these models won’t be biased, or wrong, or used to generate disinformation or abuse. But it does mean that controlling this technology is going to take an entirely different approach than regulating the large players.

This essay was written with Jim Waldo, and previously appeared on Slate.com.

EDITED TO ADD (6/4): Slashdot thread.

Ted Chiang has an excellent essay in the New Yorker: “Will A.I. Become the New McKinsey?”

The question we should be asking is: as A.I. becomes more powerful and flexible, is there any way to keep it from being another version of McKinsey? The question is worth considering across different meanings of the term “A.I.” If you think of A.I. as a broad set of technologies being marketed to companies to help them cut their costs, the question becomes: how do we keep those technologies from working as “capital’s willing executioners”? Alternatively, if you imagine A.I. as a semi-autonomous software program that solves problems that humans ask it to solve, the question is then: how do we prevent that software from assisting corporations in ways that make people’s lives worse? Suppose you’ve built a semi-autonomous A.I. that’s entirely obedient to humans­—one that repeatedly checks to make sure it hasn’t misinterpreted the instructions it has received. This is the dream of many A.I. researchers. Yet such software could easily still cause as much harm as McKinsey has.

Note that you cannot simply say that you will build A.I. that only offers pro-social solutions to the problems you ask it to solve. That’s the equivalent of saying that you can defuse the threat of McKinsey by starting a consulting firm that only offers such solutions. The reality is that Fortune 100 companies will hire McKinsey instead of your pro-social firm, because McKinsey’s solutions will increase shareholder value more than your firm’s solutions will. It will always be possible to build A.I. that pursues shareholder value above all else, and most companies will prefer to use that A.I. instead of one constrained by your principles.

EDITED TO ADD: Ted Chiang’s previous essay, “ChatGPT Is a Blurry JPEG of the Web” is also worth reading.

We will all soon get into the habit of using AI tools for help with everyday problems and tasks. We should get in the habit of questioning the motives, incentives, and capabilities behind them, too.

Imagine you’re using an AI chatbot to plan a vacation. Did it suggest a particular resort because it knows your preferences, or because the company is getting a kickback from the hotel chain? Later, when you’re using another AI chatbot to learn about a complex economic issue, is the chatbot reflecting your politics or the politics of the company that trained it?

For AI to truly be our assistant, it needs to be trustworthy. For it to be trustworthy, it must be under our control; it can’t be working behind the scenes for some tech monopoly. This means, at a minimum, the technology needs to be transparent. And we all need to understand how it works, at least a little bit.

Amid the myriad warnings about creepy risks to well-being, threats to democracy, and even existential doom that have accompanied stunning recent developments in artificial intelligence (AI)—and large language models (LLMs) like ChatGPT and GPT-4—one optimistic vision is abundantly clear: this technology is useful. It can help you find information, express your thoughts, correct errors in your writing, and much more. If we can navigate the pitfalls, its assistive benefit to humanity could be epoch-defining. But we’re not there yet.

Let’s pause for a moment and imagine the possibilities of a trusted AI assistant. It could write the first draft of anything: emails, reports, essays, even wedding vows. You would have to give it background information and edit its output, of course, but that draft would be written by a model trained on your personal beliefs, knowledge, and style. It could act as your tutor, answering questions interactively on topics you want to learn about—in the manner that suits you best and taking into account what you already know. It could assist you in planning, organizing, and communicating: again, based on your personal preferences. It could advocate on your behalf with third parties: either other humans or other bots. And it could moderate conversations on social media for you, flagging misinformation, removing hate or trolling, translating for speakers of different languages, and keeping discussions on topic; or even mediate conversations in physical spaces, interacting through speech recognition and synthesis capabilities.

Today’s AIs aren’t up for the task. The problem isn’t the technology—that’s advancing faster than even the experts had guessed—it’s who owns it. Today’s AIs are primarily created and run by large technology companies, for their benefit and profit. Sometimes we are permitted to interact with the chatbots, but they’re never truly ours. That’s a conflict of interest, and one that destroys trust.

The transition from awe and eager utilization to suspicion to disillusionment is a well worn one in the technology sector. Twenty years ago, Google’s search engine rapidly rose to monopolistic dominance because of its transformative information retrieval capability. Over time, the company’s dependence on revenue from search advertising led them to degrade that capability. Today, many observers look forward to the death of the search paradigm entirely. Amazon has walked the same path, from honest marketplace to one riddled with lousy products whose vendors have paid to have the company show them to you. We can do better than this. If each of us are going to have an AI assistant helping us with essential activities daily and even advocating on our behalf, we each need to know that it has our interests in mind. Building trustworthy AI will require systemic change.

First, a trustworthy AI system must be controllable by the user. That means that the model should be able to run on a user’s owned electronic devices (perhaps in a simplified form) or within a cloud service that they control. It should show the user how it responds to them, such as when it makes queries to search the web or external services, when it directs other software to do things like sending an email on a user’s behalf, or modifies the user’s prompts to better express what the company that made it thinks the user wants. It should be able to explain its reasoning to users and cite its sources. These requirements are all well within the technical capabilities of AI systems.

Furthermore, users should be in control of the data used to train and fine-tune the AI system. When modern LLMs are built, they are first trained on massive, generic corpora of textual data typically sourced from across the Internet. Many systems go a step further by fine-tuning on more specific datasets purpose built for a narrow application, such as speaking in the language of a medical doctor, or mimicking the manner and style of their individual user. In the near future, corporate AIs will be routinely fed your data, probably without your awareness or your consent. Any trustworthy AI system should transparently allow users to control what data it uses.

Many of us would welcome an AI-assisted writing application fine tuned with knowledge of which edits we have accepted in the past and which we did not. We would be more skeptical of a chatbot knowledgeable about which of their search results led to purchases and which did not.

You should also be informed of what an AI system can do on your behalf. Can it access other apps on your phone, and the data stored with them? Can it retrieve information from external sources, mixing your inputs with details from other places you may or may not trust? Can it send a message in your name (hopefully based on your input)? Weighing these types of risks and benefits will become an inherent part of our daily lives as AI-assistive tools become integrated with everything we do.

Realistically, we should all be preparing for a world where AI is not trustworthy. Because AI tools can be so incredibly useful, they will increasingly pervade our lives, whether we trust them or not. Being a digital citizen of the next quarter of the twenty-first century will require learning the basic ins and outs of LLMs so that you can assess their risks and limitations for a given use case. This will better prepare you to take advantage of AI tools, rather than be taken advantage by them.

In the world’s first few months of widespread use of models like ChatGPT, we’ve learned a lot about how AI creates risks for users. Everyone has heard by now that LLMs “hallucinate,” meaning that they make up “facts” in their outputs, because their predictive text generation systems are not constrained to fact check their own emanations. Many users learned in March that information they submit as prompts to systems like ChatGPT may not be kept private after a bug revealed users’ chats. Your chat histories are stored in systems that may be insecure.

Researchers have found numerous clever ways to trick chatbots into breaking their safety controls; these work largely because many of the “rules” applied to these systems are soft, like instructions given to a person, rather than hard, like coded limitations on a product’s functions. It’s as if we are trying to keep AI safe by asking it nicely to drive carefully, a hopeful instruction, rather than taking away its keys and placing definite constraints on its abilities.

These risks will grow as companies grant chatbot systems more capabilities. OpenAI is providing developers wide access to build tools on top of GPT: tools that give their AI systems access to your email, to your personal account information on websites, and to computer code. While OpenAI is applying safety protocols to these integrations, it’s not hard to imagine those being relaxed in a drive to make the tools more useful. It seems likewise inevitable that other companies will come along with less bashful strategies for securing AI market share.

Just like with any human, building trust with an AI will be hard won through interaction over time. We will need to test these systems in different contexts, observe their behavior, and build a mental model for how they will respond to our actions. Building trust in that way is only possible if these systems are transparent about their capabilities, what inputs they use and when they will share them, and whose interests they are evolving to represent.

This essay was written with Nathan Sanders, and previously appeared on Gizmodo.com.

Earlier this week, the Republican National Committee released a video that it claims was “built entirely with AI imagery.” The content of the ad isn’t especially novel—a dystopian vision of America under a second term with President Joe Biden—but the deliberate emphasis on the technology used to create it stands out: It’s a “Daisy” moment for the 2020s.

We should expect more of this kind of thing. The applications of AI to political advertising have not escaped campaigners, who are already “pressure testing” possible uses for the technology. In the 2024 presidential election campaign, you can bank on the appearance of AI-generated personalized fundraising emails, text messages from chatbots urging you to vote, and maybe even some deepfaked campaign avatars. Future candidates could use chatbots trained on data representing their views and personalities to approximate the act of directly connecting with people. Think of it like a whistle-stop tour with an appearance in every living room. Previous technological revolutions—railroad, radio, television, and the World Wide Web—transformed how candidates connect to their constituents, and we should expect the same from generative AI. This isn’t science fiction: The era of AI chatbots standing in as avatars for real, individual people has already begun, as the journalist Casey Newton made clear in a 2016 feature about a woman who used thousands of text messages to create a chatbot replica of her best friend after he died.

The key is interaction. A candidate could use tools enabled by large language models, or LLMs—the technology behind apps such as ChatGPT and the art-making DALL-E—to do micro-polling or message testing, and to solicit perspectives and testimonies from their political audience individually and at scale. The candidates could potentially reach any voter who possesses a smartphone or computer, not just the ones with the disposable income and free time to attend a campaign rally. At its best, AI could be a tool to increase the accessibility of political engagement and ease polarization. At its worst, it could propagate misinformation and increase the risk of voter manipulation. Whatever the case, we know political operatives are using these tools. To reckon with their potential now isn’t buying into the hype—it’s preparing for whatever may come next.

On the positive end, and most profoundly, LLMs could help people think through, refine, or discover their own political ideologies. Research has shown that many voters come to their policy positions reflexively, out of a sense of partisan affiliation. The very act of reflecting on these views through discourse can change, and even depolarize, those views. It can be hard to have reflective policy conversations with an informed, even-keeled human discussion partner when we all live within a highly charged political environment; this is a role almost custom-designed for LLM. In US politics, it is a truism that the most valuable resource in a campaign is time. People are busy and distracted. Campaigns have a limited window to convince and activate voters. Money allows a candidate to purchase time: TV commercials, labor from staffers, and fundraising events to raise even more money. LLMs could provide campaigns with what is essentially a printing press for time.

If you were a political operative, which would you rather do: play a short video on a voter’s TV while they are folding laundry in the next room, or exchange essay-length thoughts with a voter on your candidate’s key issues? A staffer knocking on doors might need to canvass 50 homes over two hours to find one voter willing to have a conversation. OpenAI charges pennies to process about 800 words with its latest GPT-4 model, and that cost could fall dramatically as competitive AIs become available. People seem to enjoy interacting with chatbots; Open’s product reportedly has the fastest-growing user base in the history of consumer apps.

Optimistically, one possible result might be that we’ll get less annoyed with the deluge of political ads if their messaging is more usefully tailored to our interests by AI tools. Though the evidence for microtargeting’s effectiveness is mixed at best, some studies show that targeting the right issues to the right people can persuade voters. Expecting more sophisticated, AI-assisted approaches to be more consistently effective is reasonable. And anything that can prevent us from seeing the same 30-second campaign spot 20 times a day seems like a win.

AI can also help humans effectuate their political interests. In the 2016 US presidential election, primitive chatbots had a role in donor engagement and voter-registration drives: simple messaging tasks such as helping users pre-fill a voter-registration form or reminding them where their polling place is. If it works, the current generation of much more capable chatbots could supercharge small-dollar solicitations and get-out-the-vote campaigns.

And the interactive capability of chatbots could help voters better understand their choices. An AI chatbot could answer questions from the perspective of a candidate about the details of their policy positions most salient to an individual user, or respond to questions about how a candidate’s stance on a national issue translates to a user’s locale. Political organizations could similarly use them to explain complex policy issues, such as those relating to the climate or health care or…anything, really.

Of course, this could also go badly. In the time-honored tradition of demagogues worldwide, the LLM could inconsistently represent the candidate’s views to appeal to the individual proclivities of each voter.

In fact, the fundamentally obsequious nature of the current generation of large language models results in them acting like demagogues. Current LLMs are known to hallucinate—or go entirely off-script—and produce answers that have no basis in reality. These models do not experience emotion in any way, but some research suggests they have a sophisticated ability to assess the emotion and tone of their human users. Although they weren’t trained for this purpose, ChatGPT and its successor, GPT-4, may already be pretty good at assessing some of their users’ traits—say, the likelihood that the author of a text prompt is depressed. Combined with their persuasive capabilities, that means that they could learn to skillfully manipulate the emotions of their human users.

This is not entirely theoretical. A growing body of evidence demonstrates that interacting with AI has a persuasive effect on human users. A study published in February prompted participants to co-write a statement about the benefits of social-media platforms for society with an AI chatbot configured to have varying views on the subject. When researchers surveyed participants after the co-writing experience, those who interacted with a chatbot that expressed that social media is good or bad were far more likely to express the same view than a control group that didn’t interact with an “opinionated language model.”

For the time being, most Americans say they are resistant to trusting AI in sensitive matters such as health care. The same is probably true of politics. If a neighbor volunteering with a campaign persuades you to vote a particular way on a local ballot initiative, you might feel good about that interaction. If a chatbot does the same thing, would you feel the same way? To help voters chart their own course in a world of persuasive AI, we should demand transparency from our candidates. Campaigns should have to clearly disclose when a text agent interacting with a potential voter—through traditional robotexting or the use of the latest AI chatbots—is human or automated.

Though companies such as Meta (Facebook’s parent company) and Alphabet (Google’s) publish libraries of traditional, static political advertising, they do so poorly. These systems would need to be improved and expanded to accommodate user-level differentiation in ad copy to offer serviceable protection against misuse.

A public, anonymized log of chatbot conversations could help hold candidates’ AI representatives accountable for shifting statements and digital pandering. Candidates who use chatbots to engage voters may not want to make all transcripts of those conversations public, but their users could easily choose to share them. So far, there is no shortage of people eager to share their chat transcripts, and in fact, an online database exists of nearly 200,000 of them. In the recent past, Mozilla has galvanized users to opt into sharing their web data to study online misinformation.

We also need stronger nationwide protections on data privacy, as well as the ability to opt out of targeted advertising, to protect us from the potential excesses of this kind of marketing. No one should be forcibly subjected to political advertising, LLM-generated or not, on the basis of their Internet searches regarding private matters such as medical issues. In February, the European Parliament voted to limit political-ad targeting to only basic information, such as language and general location, within two months of an election. This stands in stark contrast to the US, which has for years failed to enact federal data-privacy regulations. Though the 2018 revelation of the Cambridge Analytica scandal led to billions of dollars in fines and settlements against Facebook, it has so far resulted in no substantial legislative action.

Transparency requirements like these are a first step toward oversight of future AI-assisted campaigns. Although we should aspire to more robust legal controls on campaign uses of AI, it seems implausible that these will be adopted in advance of the fast-approaching 2024 general presidential election.

Credit the RNC, at least, with disclosing that their recent ad was AI-generated—a transparent attempt at publicity still counts as transparency. But what will we do if the next viral AI-generated ad tries to pass as something more conventional?

As we are all being exposed to these rapidly evolving technologies for the first time and trying to understand their potential uses and effects, let’s push for the kind of basic transparency protection that will allow us to know what we’re dealing with.

This essay was written with Nathan Sanders, and previously appeared on the Atlantic.

EDITED TO ADD (5/12): Better article on the “daisy” ad.

There’s good reason to fear that AI systems like ChatGPT and GPT4 will harm democracy. Public debate may be overwhelmed by industrial quantities of autogenerated argument. People might fall down political rabbit holes, taken in by superficially convincing bullshit, or obsessed by folies à deux relationships with machine personalities that don’t really exist.

These risks may be the fallout of a world where businesses deploy poorly tested AI systems in a battle for market share, each hoping to establish a monopoly.

But dystopia isn’t the only possible future. AI could advance the public good, not private profit, and bolster democracy instead of undermining it. That would require an AI not under the control of a large tech monopoly, but rather developed by government and available to all citizens. This public option is within reach if we want it.

An AI built for public benefit could be tailor-made for those use cases where technology can best help democracy. It could plausibly educate citizens, help them deliberate together, summarize what they think, and find possible common ground. Politicians might use large language models, or LLMs, like GPT4 to better understand what their citizens want.

Today, state-of-the-art AI systems are controlled by multibillion-dollar tech companies: Google, Meta, and OpenAI in connection with Microsoft. These companies get to decide how we engage with their AIs and what sort of access we have. They can steer and shape those AIs to conform to their corporate interests. That isn’t the world we want. Instead, we want AI options that are both public goods and directed toward public good.

We know that existing LLMs are trained on material gathered from the internet, which can reflect racist bias and hate. Companies attempt to filter these data sets, fine-tune LLMs, and tweak their outputs to remove bias and toxicity. But leaked emails and conversations suggest that they are rushing half-baked products to market in a race to establish their own monopoly.

These companies make decisions with huge consequences for democracy, but little democratic oversight. We don’t hear about political trade-offs they are making. Do LLM-powered chatbots and search engines favor some viewpoints over others? Do they skirt controversial topics completely? Currently, we have to trust companies to tell us the truth about the trade-offs they face.

A public option LLM would provide a vital independent source of information and a testing ground for technological choices with big democratic consequences. This could work much like public option health care plans, which increase access to health services while also providing more transparency into operations in the sector and putting productive pressure on the pricing and features of private products. It would also allow us to figure out the limits of LLMs and direct their applications with those in mind.

We know that LLMs often “hallucinate,” inferring facts that aren’t real. It isn’t clear whether this is an unavoidable flaw of how they work, or whether it can be corrected for. Democracy could be undermined if citizens trust technologies that just make stuff up at random, and the companies trying to sell these technologies can’t be trusted to admit their flaws.

But a public option AI could do more than check technology companies’ honesty. It could test new applications that could support democracy rather than undermining it.

Most obviously, LLMs could help us formulate and express our perspectives and policy positions, making political arguments more cogent and informed, whether in social media, letters to the editor, or comments to rule-making agencies in response to policy proposals. By this we don’t mean that AI will replace humans in the political debate, only that they can help us express ourselves. If you’ve ever used a Hallmark greeting card or signed a petition, you’ve already demonstrated that you’re OK with accepting help to articulate your personal sentiments or political beliefs. AI will make it easier to generate first drafts, and provide editing help and suggest alternative phrasings. How these AI uses are perceived will change over time, and there is still much room for improvement in LLMs—but their assistive power is real. People are already testing and speculating on their potential for speechwriting, lobbying, and campaign messaging. Highly influential people often rely on professional speechwriters and staff to help develop their thoughts, and AI could serve a similar role for everyday citizens.

If the hallucination problem can be solved, LLMs could also become explainers and educators. Imagine citizens being able to query an LLM that has expert-level knowledge of a policy issue, or that has command of the positions of a particular candidate or party. Instead of having to parse bland and evasive statements calibrated for a mass audience, individual citizens could gain real political understanding through question-and-answer sessions with LLMs that could be unfailingly available and endlessly patient in ways that no human could ever be.

Finally, and most ambitiously, AI could help facilitate radical democracy at scale. As Carnegie Mellon professor of statistics Cosma Shalizi has observed, we delegate decisions to elected politicians in part because we don’t have time to deliberate on every issue. But AI could manage massive political conversations in chat rooms, on social networking sites, and elsewhere: identifying common positions and summarizing them, surfacing unusual arguments that seem compelling to those who have heard them, and keeping attacks and insults to a minimum.

AI chatbots could run national electronic town hall meetings and automatically summarize the perspectives of diverse participants. This type of AI-moderated civic debate could also be a dynamic alternative to opinion polling. Politicians turn to opinion surveys to capture snapshots of popular opinion because they can only hear directly from a small number of voters, but want to understand where voters agree or disagree.

Looking further into the future, these technologies could help groups reach consensus and make decisions. Early experiments by the AI company DeepMind suggest that LLMs can build bridges between people who disagree, helping bring them to consensus. Science fiction writer Ruthanna Emrys, in her remarkable novel A Half-Built Garden, imagines how AI might help people have better conversations and make better decisions—rather than taking advantage of these biases to maximize profits.

This future requires an AI public option. Building one, through a government-directed model development and deployment program, would require a lot of effort—and the greatest challenges in developing public AI systems would be political.

Some technological tools are already publicly available. In fairness, tech giants like Google and Meta have made many of their latest and greatest AI tools freely available for years, in cooperation with the academic community. Although OpenAI has not made the source code and trained features of its latest models public, competitors such as Hugging Face have done so for similar systems.

While state-of-the-art LLMs achieve spectacular results, they do so using techniques that are mostly well known and widely used throughout the industry. OpenAI has only revealed limited details of how it trained its latest model, but its major advance over its earlier ChatGPT model is no secret: a multi-modal training process that accepts both image and textual inputs.

Financially, the largest-scale LLMs being trained today cost hundreds of millions of dollars. That’s beyond ordinary people’s reach, but it’s a pittance compared to U.S. federal military spending—and a great bargain for the potential return. While we may not want to expand the scope of existing agencies to accommodate this task, we have our choice of government labs, like the National Institute of Standards and Technology, the Lawrence Livermore National Laboratory, and other Department of Energy labs, as well as universities and nonprofits, with the AI expertise and capability to oversee this effort.

Instead of releasing half-finished AI systems for the public to test, we need to make sure that they are robust before they’re released—and that they strengthen democracy rather than undermine it. The key advance that made recent AI chatbot models dramatically more useful was feedback from real people. Companies employ teams to interact with early versions of their software to teach them which outputs are useful and which are not. These paid users train the models to align to corporate interests, with applications like web search (integrating commercial advertisements) and business productivity assistive software in mind.

To build assistive AI for democracy, we would need to capture human feedback for specific democratic use cases, such as moderating a polarized policy discussion, explaining the nuance of a legal proposal, or articulating one’s perspective within a larger debate. This gives us a path to “align” LLMs with our democratic values: by having models generate answers to questions, make mistakes, and learn from the responses of human users, without having these mistakes damage users and the public arena.

Capturing that kind of user interaction and feedback within a political environment suspicious of both AI and technology generally will be challenging. It’s easy to imagine the same politicians who rail against the untrustworthiness of companies like Meta getting far more riled up by the idea of government having a role in technology development.

As Karl Popper, the great theorist of the open society, argued, we shouldn’t try to solve complex problems with grand hubristic plans. Instead, we should apply AI through piecemeal democratic engineering, carefully determining what works and what does not. The best way forward is to start small, applying these technologies to local decisions with more constrained stakeholder groups and smaller impacts.

The next generation of AI experimentation should happen in the laboratories of democracy: states and municipalities. Online town halls to discuss local participatory budgeting proposals could be an easy first step. Commercially available and open-source LLMs could bootstrap this process and build momentum toward federal investment in a public AI option.

Even with these approaches, building and fielding a democratic AI option will be messy and hard. But the alternative—shrugging our shoulders as a fight for commercial AI domination undermines democratic politics—will be much messier and much worse.

This essay was written with Henry Farrell and Nathan Sanders, and previously appeared on Slate.com.

EDITED TO ADD: Linux Weekly News discussion.

Here’s an experiment being run by undergraduate computer science students everywhere: Ask ChatGPT to generate phishing emails, and test whether these are better at persuading victims to respond or click on the link than the usual spam. It’s an interesting experiment, and the results are likely to vary wildly based on the details of the experiment.

But while it’s an easy experiment to run, it misses the real risk of large language models (LLMs) writing scam emails. Today’s human-run scams aren’t limited by the number of people who respond to the initial email contact. They’re limited by the labor-intensive process of persuading those people to send the scammer money. LLMs are about to change that. A decade ago, one type of spam email had become a punchline on every late-night show: “I am the son of the late king of Nigeria in need of your assistance….” Nearly everyone had gotten one or a thousand of those emails, to the point that it seemed everyone must have known they were scams.

So why were scammers still sending such obviously dubious emails? In 2012, researcher Cormac Herley offered an answer: It weeded out all but the most gullible. A smart scammer doesn’t want to waste their time with people who reply and then realize it’s a scam when asked to wire money. By using an obvious scam email, the scammer can focus on the most potentially profitable people. It takes time and effort to engage in the back-and-forth communications that nudge marks, step by step, from interlocutor to trusted acquaintance to pauper.

Long-running financial scams are now known as pig butchering, growing the potential mark up until their ultimate and sudden demise. Such scams, which require gaining trust and infiltrating a target’s personal finances, take weeks or even months of personal time and repeated interactions. It’s a high stakes and low probability game that the scammer is playing.

Here is where LLMs will make a difference. Much has been written about the unreliability of OpenAI’s GPT models and those like them: They “hallucinate” frequently, making up things about the world and confidently spouting nonsense. For entertainment, this is fine, but for most practical uses it’s a problem. It is, however, not a bug but a feature when it comes to scams: LLMs’ ability to confidently roll with the punches, no matter what a user throws at them, will prove useful to scammers as they navigate hostile, bemused, and gullible scam targets by the billions. AI chatbot scams can ensnare more people, because the pool of victims who will fall for a more subtle and flexible scammer—one that has been trained on everything ever written online—is much larger than the pool of those who believe the king of Nigeria wants to give them a billion dollars.

Personal computers are powerful enough today that they can run compact LLMs. After Facebook’s new model, LLaMA, was leaked online, developers tuned it to run fast and cheaply on powerful laptops. Numerous other open-source LLMs are under development, with a community of thousands of engineers and scientists.

A single scammer, from their laptop anywhere in the world, can now run hundreds or thousands of scams in parallel, night and day, with marks all over the world, in every language under the sun. The AI chatbots will never sleep and will always be adapting along their path to their objectives. And new mechanisms, from ChatGPT plugins to LangChain, will enable composition of AI with thousands of API-based cloud services and open source tools, allowing LLMs to interact with the internet as humans do. The impersonations in such scams are no longer just princes offering their country’s riches. They are forlorn strangers looking for romance, hot new cryptocurrencies that are soon to skyrocket in value, and seemingly-sound new financial websites offering amazing returns on deposits. And people are already falling in love with LLMs.

This is a change in both scope and scale. LLMs will change the scam pipeline, making them more profitable than ever. We don’t know how to live in a world with a billion, or 10 billion, scammers that never sleep.

There will also be a change in the sophistication of these attacks. This is due not only to AI advances, but to the business model of the internet—surveillance capitalism—which produces troves of data about all of us, available for purchase from data brokers. Targeted attacks against individuals, whether for phishing or data collection or scams, were once only within the reach of nation-states. Combine the digital dossiers that data brokers have on all of us with LLMs, and you have a tool tailor-made for personalized scams.

Companies like OpenAI attempt to prevent their models from doing bad things. But with the release of each new LLM, social media sites buzz with new AI jailbreaks that evade the new restrictions put in place by the AI’s designers. ChatGPT, and then Bing Chat, and then GPT-4 were all jailbroken within minutes of their release, and in dozens of different ways. Most protections against bad uses and harmful output are only skin-deep, easily evaded by determined users. Once a jailbreak is discovered, it usually can be generalized, and the community of users pulls the LLM open through the chinks in its armor. And the technology is advancing too fast for anyone to fully understand how they work, even the designers.

This is all an old story, though: It reminds us that many of the bad uses of AI are a reflection of humanity more than they are a reflection of AI technology itself. Scams are nothing new—simply intent and then action of one person tricking another for personal gain. And the use of others as minions to accomplish scams is sadly nothing new or uncommon: For example, organized crime in Asia currently kidnaps or indentures thousands in scam sweatshops. Is it better that organized crime will no longer see the need to exploit and physically abuse people to run their scam operations, or worse that they and many others will be able to scale up scams to an unprecedented level?

Defense can and will catch up, but before it does, our signal-to-noise ratio is going to drop dramatically.

This essay was written with Barath Raghavan, and previously appeared on Wired.com.

Nearly 90% of the multibillion-dollar federal lobbying apparatus in the United States serves corporate interests. In some cases, the objective of that money is obvious. Google pours millions into lobbying on bills related to antitrust regulation. Big energy companies expect action whenever there is a move to end drilling leases for federal lands, in exchange for the tens of millions they contribute to congressional reelection campaigns.

But lobbying strategies are not always so blunt, and the interests involved are not always so obvious. Consider, for example, a 2013 Massachusetts bill that tried to restrict the commercial use of data collected from K-12 students using services accessed via the internet. The bill appealed to many privacy-conscious education advocates, and appropriately so. But behind the justification of protecting students lay a market-altering policy: the bill was introduced at the behest of Microsoft lobbyists, in an effort to exclude Google Docs from classrooms.

What would happen if such legal-but-sneaky strategies for tilting the rules in favor of one group over another become more widespread and effective? We can see hints of an answer in the remarkable pace at which artificial-intelligence tools for everything from writing to graphic design are being developed and improved. And the unavoidable conclusion is that AI will make lobbying more guileful, and perhaps more successful.

It turns out there is a natural opening for this technology: microlegislation.

“Microlegislation” is a term for small pieces of proposed law that cater—sometimes unexpectedly—to narrow interests. Political scientist Amy McKay coined the term. She studied the 564 amendments to the Affordable Care Act (“Obamacare”) considered by the Senate Finance Committee in 2009, as well as the positions of 866 lobbying groups and their campaign contributions. She documented instances where lobbyist comments—on health-care research, vaccine services, and other provisions—were translated directly into microlegislation in the form of amendments. And she found that those groups’ financial contributions to specific senators on the committee increased the amendments’ chances of passing.

Her finding that lobbying works was no surprise. More important, McKay’s work demonstrated that computer models can predict the likely fate of proposed legislative amendments, as well as the paths by which lobbyists can most effectively secure their desired outcomes. And that turns out to be a critical piece of creating an AI lobbyist.

Lobbying has long been part of the give-and-take among human policymakers and advocates working to balance their competing interests. The danger of microlegislation—a danger greatly exacerbated by AI—is that it can be used in a way that makes it difficult to figure out who the legislation truly benefits.

Another word for a strategy like this is a “hack.” Hacks follow the rules of a system but subvert their intent. Hacking is often associated with computer systems, but the concept is also applicable to social systems like financial markets, tax codes, and legislative processes.

While the idea of monied interests incorporating AI assistive technologies into their lobbying remains hypothetical, specific machine-learning technologies exist today that would enable them to do so. We should expect these techniques to get better and their utilization to grow, just as we’ve seen in so many other domains.

Here’s how it might work.

Crafting an AI microlegislator

To make microlegislation, machine-learning systems must be able to uncover the smallest modification that could be made to a bill or existing law that would make the biggest impact on a narrow interest.

There are three basic challenges involved. First, you must create a policy proposal—small suggested changes to legal text—and anticipate whether or not a human reader would recognize the alteration as substantive. This is important; a change that isn’t detectable is more likely to pass without controversy. Second, you need to do an impact assessment to project the implications of that change for the short- or long-range financial interests of companies. Third, you need a lobbying strategizer to identify what levers of power to pull to get the best proposal into law.

Existing AI tools can tackle all three of these.

The first step, the policy proposal, leverages the core function of generative AI. Large language models, the sort that have been used for general-purpose chatbots such as ChatGPT, can easily be adapted to write like a native in different specialized domains after seeing a relatively small number of examples. This process is called fine-tuning. For example, a model “pre-trained” on a large library of generic text samples from books and the internet can be “fine-tuned” to work effectively on medical literature, computer science papers, and product reviews.

Given this flexibility and capacity for adaptation, a large language model could be fine-tuned to produce draft legislative texts, given a data set of previously offered amendments and the bills they were associated with. Training data is available. At the federal level, it’s provided by the US Government Publishing Office, and there are already tools for downloading and interacting with it. Most other jurisdictions provide similar data feeds, and there are even convenient assemblages of that data.

Meanwhile, large language models like the one underlying ChatGPT are routinely used for summarizing long, complex documents (even laws and computer code) to capture the essential points, and they are optimized to match human expectations. This capability could allow an AI assistant to automatically predict how detectable the true effect of a policy insertion may be to a human reader.

Today, it can take a highly paid team of human lobbyists days or weeks to generate and analyze alternative pieces of microlegislation on behalf of a client. With AI assistance, that could be done instantaneously and cheaply. This opens the door to dramatic increases in the scope of this kind of microlegislating, with a potential to scale across any number of bills in any jurisdiction.

Teaching machines to assess impact

Impact assessment is more complicated. There is a rich series of methods for quantifying the predicted outcome of a decision or policy, and then also optimizing the return under that model. This kind of approach goes by different names in different circles—mathematical programming in management science, utility maximization in economics, and rational design in the life sciences.

To train an AI to do this, we would need to specify some way to calculate the benefit to different parties as a result of a policy choice. That could mean estimating the financial return to different companies under a few different scenarios of taxation or regulation. Economists are skilled at building risk models like this, and companies are already required to formulate and disclose regulatory compliance risk factors to investors. Such a mathematical model could translate directly into a reward function, a grading system that could provide feedback for the model used to create policy proposals and direct the process of training it.

The real challenge in impact assessment for generative AI models would be to parse the textual output of a model like ChatGPT in terms that an economic model could readily use. Automating this would require extracting structured financial information from the draft amendment or any legalese surrounding it. This kind of information extraction, too, is an area where AI has a long history; for example, AI systems have been trained to recognize clinical details in doctors’ notes. Early indications are that large language models are fairly good at recognizing financial information in texts such as investor call transcripts. While it remains an open challenge in the field, they may even be capable of writing out multi-step plans based on descriptions in free text.

Machines as strategists

The last piece of the puzzle is a lobbying strategizer to figure out what actions to take to convince lawmakers to adopt the amendment.

Passing legislation requires a keen understanding of the complex interrelated networks of legislative offices, outside groups, executive agencies, and other stakeholders vying to serve their own interests. Each actor in this network has a baseline perspective and different factors that influence that point of view. For example, a legislator may be moved by seeing an allied stakeholder take a firm position, or by a negative news story, or by a campaign contribution.

It turns out that AI developers are very experienced at modeling these kinds of networks. Machine-learning models for network graphs have been built, refined, improved, and iterated by hundreds of researchers working on incredibly diverse problems: lidar scans used to guide self-driving cars, the chemical functions of molecular structures, the capture of motion in actors’ joints for computer graphics, behaviors in social networks, and more.

In the context of AI-assisted lobbying, political actors like legislators and lobbyists are nodes on a graph, just like users in a social network. Relations between them are graph edges, like social connections. Information can be passed along those edges, like messages sent to a friend or campaign contributions made to a member. AI models can use past examples to learn to estimate how that information changes the network. Calculating the likelihood that a campaign contribution of a given size will flip a legislator’s vote on an amendment is one application.

McKay’s work has already shown us that there are significant, predictable relationships between these actions and the outcomes of legislation, and that the work of discovering those can be automated. Others have shown that graphs of neural network models like those described above can be applied to political systems. The full-scale use of these technologies to guide lobbying strategy is theoretical, but plausible.

Put together, these three components could create an automatic system for generating profitable microlegislation. The policy proposal system would create millions, even billions, of possible amendments. The impact assessor would identify the few that promise to be most profitable to the client. And the lobbying strategy tool would produce a blueprint for getting them passed.

What remains is for human lobbyists to walk the floors of the Capitol or state house, and perhaps supply some cash to grease the wheels. These final two aspects of lobbying—access and financing—cannot be supplied by the AI tools we envision. This suggests that lobbying will continue to primarily benefit those who are already influential and wealthy, and AI assistance will amplify their existing advantages.

The transformative benefit that AI offers to lobbyists and their clients is scale. While individual lobbyists tend to focus on the federal level or a single state, with AI assistance they could more easily infiltrate a large number of state-level (or even local-level) law-making bodies and elections. At that level, where the average cost of a seat is measured in the tens of thousands of dollars instead of millions, a single donor can wield a lot of influence—if automation makes it possible to coordinate lobbying across districts.

How to stop them

When it comes to combating the potentially adverse effects of assistive AI, the first response always seems to be to try to detect whether or not content was AI-generated. We could imagine a defensive AI that detects anomalous lobbyist spending associated with amendments that benefit the contributing group. But by then, the damage might already be done.

In general, methods for detecting the work of AI tend not to keep pace with its ability to generate convincing content. And these strategies won’t be implemented by AIs alone. The lobbyists will still be humans who take the results of an AI microlegislator and further refine the computer’s strategies. These hybrid human-AI systems will not be detectable from their output.

But the good news is: the same strategies that have long been used to combat misbehavior by human lobbyists can still be effective when those lobbyists get an AI assist. We don’t need to reinvent our democracy to stave off the worst risks of AI; we just need to more fully implement long-standing ideals.

First, we should reduce the dependence of legislatures on monolithic, multi-thousand-page omnibus bills voted on under deadline. This style of legislating exploded in the 1980s and 1990s and continues through to the most recent federal budget bill. Notwithstanding their legitimate benefits to the political system, omnibus bills present an obvious and proven vehicle for inserting unnoticed provisions that may later surprise the same legislators who approved them.

The issue is not that individual legislators need more time to read and understand each bill (that isn’t realistic or even necessary). It’s that omnibus bills must pass. There is an imperative to pass a federal budget bill, and so the capacity to push back on individual provisions that may seem deleterious (or just impertinent) to any particular group is small. Bills that are too big to fail are ripe for hacking by microlegislation.

Moreover, the incentive for legislators to introduce microlegislation catering to a narrow interest is greater if the threat of exposure is lower. To strengthen the threat of exposure for misbehaving legislative sponsors, bills should focus more tightly on individual substantive areas and, after the introduction of amendments, allow more time before the committee and floor votes. During this time, we should encourage public review and testimony to provide greater oversight.

Second, we should strengthen disclosure requirements on lobbyists, whether they’re entirely human or AI-assisted. State laws regarding lobbying disclosure are a hodgepodge. North Dakota, for example, only requires lobbying reports to be filed annually, so that by the time a disclosure is made, the policy is likely already decided. A lobbying disclosure scorecard created by Open Secrets, a group researching the influence of money in US politics, tracks nine states that do not even require lobbyists to report their compensation.

Ideally, it would be great for the public to see all communication between lobbyists and legislators, whether it takes the form of a proposed amendment or not. Absent that, let’s give the public the benefit of reviewing what lobbyists are lobbying for—and why. Lobbying is traditionally an activity that happens behind closed doors. Right now, many states reinforce that: they actually exempt testimony delivered publicly to a legislature from being reported as lobbying.

In those jurisdictions, if you reveal your position to the public, you’re no longer lobbying. Let’s do the inverse: require lobbyists to reveal their positions on issues. Some jurisdictions already require a statement of position (a ‘yea’ or ‘nay’) from registered lobbyists. And in most (but not all) states, you could make a public records request regarding meetings held with a state legislator and hope to get something substantive back. But we can expect more—lobbyists could be required to proactively publish, within a few days, a brief summary of what they demanded of policymakers during meetings and why they believe it’s in the general interest.

We can’t rely on corporations to be forthcoming and wholly honest about the reasons behind their lobbying positions. But having them on the record about their intentions would at least provide a baseline for accountability.

Finally, consider the role AI assistive technologies may have on lobbying firms themselves and the labor market for lobbyists. Many observers are rightfully concerned about the possibility of AI replacing or devaluing the human labor it automates. If the automating potential of AI ends up commodifying the work of political strategizing and message development, it may indeed put some professionals on K Street out of work.

But don’t expect that to disrupt the careers of the most astronomically compensated lobbyists: former members Congress and other insiders who have passed through the revolving door. There is no shortage of reform ideas for limiting the ability of government officials turned lobbyists to sell access to their colleagues still in government, and they should be adopted and—equally important—maintained and enforced in successive Congresses and administrations.

None of these solutions are really original, specific to the threats posed by AI, or even predominantly focused on microlegislation—and that’s the point. Good governance should and can be robust to threats from a variety of techniques and actors.

But what makes the risks posed by AI especially pressing now is how fast the field is developing. We expect the scale, strategies, and effectiveness of humans engaged in lobbying to evolve over years and decades. Advancements in AI, meanwhile, seem to be making impressive breakthroughs at a much faster pace—and it’s still accelerating.

The legislative process is a constant struggle between parties trying to control the rules of our society as they are updated, rewritten, and expanded at the federal, state, and local levels. Lobbying is an important tool for balancing various interests through our system. If it’s well-regulated, perhaps lobbying can support policymakers in making equitable decisions on behalf of us all.

This article was co-written with Nathan E. Sanders and originally appeared in MIT Technology Review.

Congress is currently debating bills that would ban TikTok in the United States. We are here as technologists to tell you that this is a terrible idea and the side effects would be intolerable. Details matter. There are several ways Congress might ban TikTok, each with different efficacies and side effects. In the end, all the effective ones would destroy the free Internet as we know it.

There’s no doubt that TikTok and ByteDance, the company that owns it, are shady. They, like most large corporations in China, operate at the pleasure of the Chinese government. They collect extreme levels of information about users. But they’re not alone: Many apps you use do the same, including Facebook and Instagram, along with seemingly innocuous apps that have no need for the data. Your data is bought and sold by data brokers you’ve never heard of who have few scruples about where the data ends up. They have digital dossiers on most people in the United States.

If we want to address the real problem, we need to enact serious privacy laws, not security theater, to stop our data from being collected, analyzed, and sold—by anyone. Such laws would protect us in the long term, and not just from the app of the week. They would also prevent data breaches and ransomware attacks from spilling our data out into the digital underworld, including hacker message boards and chat servers, hostile state actors, and outside hacker groups. And, most importantly, they would be compatible with our bedrock values of free speech and commerce, which Congress’s current strategies are not.

At best, the TikTok ban considered by Congress would be ineffective; at worst, a ban would force us to either adopt China’s censorship technology or create our own equivalent. The simplest approach, advocated by some in Congress, would be to ban the TikTok app from the Apple and Google app stores. This would immediately stop new updates for current users and prevent new users from signing up. To be clear, this would not reach into phones and remove the app. Nor would it prevent Americans from installing TikTok on their phones; they would still be able to get it from sites outside of the United States. Android users have long been able to use alternative app repositories. Apple maintains a tighter control over what apps are allowed on its phones, so users would have to “jailbreak”—or manually remove restrictions from—their devices to install TikTok.

Even if app access were no longer an option, TikTok would still be available more broadly. It is currently, and would still be, accessible from browsers, whether on a phone or a laptop. As long as the TikTok website is hosted on servers outside of the United States, the ban would not affect browser access.

Alternatively, Congress might take a financial approach and ban US companies from doing business with ByteDance. Then-President Donald Trump tried this in 2020, but it was blocked by the courts and rescinded by President Joe Biden a year later. This would shut off access to TikTok in app stores and also cut ByteDance off from the resources it needs to run TikTok. US cloud-computing and content-distribution networks would no longer distribute TikTok videos, collect user data, or run analytics. US advertisers—and this is critical—could no longer fork over dollars to ByteDance in the hopes of getting a few seconds of a user’s attention. TikTok, for all practical purposes, would cease to be a business in the United States.

But Americans would still be able to access TikTok through the loopholes discussed above. And they will: TikTok is one of the most popular apps ever made; about 70% of young people use it. There would be enormous demand for workarounds. ByteDance could choose to move its US-centric services right over the border to Canada, still within reach of American users. Videos would load slightly slower, but for today’s TikTok users, it would probably be acceptable. Without US advertisers ByteDance wouldn’t make much money, but it has operated at a loss for many years, so this wouldn’t be its death knell.

Finally, an even more restrictive approach Congress might take is actually the most dangerous: dangerous to Americans, not to TikTok. Congress might ban the use of TikTok by anyone in the United States. The Trump executive order would likely have had this effect, were it allowed to take effect. It required that US companies not engage in any sort of transaction with TikTok and prohibited circumventing the ban. . If the same restrictions were enacted by Congress instead, such a policy would leave business or technical implementation details to US companies, enforced through a variety of law enforcement agencies.

This would be an enormous change in how the Internet works in the United States. Unlike authoritarian states such as China, the US has a free, uncensored Internet. We have no technical ability to ban sites the government doesn’t like. Ironically, a blanket ban on the use of TikTok would necessitate a national firewall, like the one China currently has, to spy on and censor Americans’ access to the Internet. Or, at the least, authoritarian government powers like India’s, which could force Internet service providers to censor Internet traffic. Worse still, the main vendors of this censorship technology are in those authoritarian states. China, for example, sells its firewall technology to other censorship-loving autocracies such as Iran and Cuba.

All of these proposed solutions raise constitutional issues as well. The First Amendment protects speech and assembly. For example, the recently introduced Buck-Hawley bill, which instructs the president to use emergency powers to ban TikTok, might threaten separation of powers and may be relying on the same mechanisms used by Trump and stopped by the court. (Those specific emergency powers, provided by the International Emergency Economic Powers Act, have a specific exemption for communications services.) And individual states trying to beat Congress to the punch in regulating TikTok or social media generally might violate the Constitution’s Commerce Clause—which restricts individual states from regulating interstate commerce—in doing so.

Right now, there’s nothing to stop Americans’ data from ending up overseas. We’ve seen plenty of instances—from Zoom to Clubhouse to others—where data about Americans collected by US companies ends up in China, not by accident but because of how those companies managed their data. And the Chinese government regularly steals data from US organizations for its own use: Equifax, Marriott Hotels, and the Office of Personnel Management are examples.

If we want to get serious about protecting national security, we have to get serious about data privacy. Today, data surveillance is the business model of the Internet. Our personal lives have turned into data; it’s not possible to block it at our national borders. Our data has no nationality, no cost to copy, and, currently, little legal protection. Like water, it finds every crack and flows to every low place. TikTok won’t be the last app or service from abroad that becomes popular, and it is distressingly ordinary in terms of how much it spies on us. Personal privacy is now a matter of national security. That needs to be part of any debate about banning TikTok.

This essay was written with Barath Raghavan, and previously appeared in Foreign Policy.

When is it time to start worrying about artificial intelligence interfering in our democracy? Maybe when an AI writes a letter to The New York Times opposing the regulation of its own technology.

That happened last month. And because the letter was responding to an essay we wrote, we’re starting to get worried. And while the technology can be regulated, the real solution lies in recognizing that the problem is human actors—and those we can do something about.

Our essay argued that the much heralded launch of the AI chatbot ChatGPT, a system that can generate text realistic enough to appear to be written by a human, poses significant threats to democratic processes. The ability to produce high quality political messaging quickly and at scale, if combined with AI-assisted capabilities to strategically target those messages to policymakers and the public, could become a powerful accelerant of an already sprawling and poorly constrained force in modern democratic life: lobbying.

We speculated that AI-assisted lobbyists could use generative models to write op-eds and regulatory comments supporting a position, identify members of Congress who wield the most influence over pending legislation, use network pattern identification to discover undisclosed or illegal political coordination, or use supervised machine learning to calibrate the optimal contribution needed to sway the vote of a legislative committee member.

These are all examples of what we call AI hacking. Hacks are strategies that follow the rules of a system, but subvert its intent. Currently a human creative process, future AIs could discover, develop, and execute these same strategies.

While some of these activities are the longtime domain of human lobbyists, AI tools applied against the same task would have unfair advantages. They can scale their activity effortlessly across every state in the country—human lobbyists tend to focus on a single state—they may uncover patterns and approaches unintuitive and unrecognizable by human experts, and do so nearly instantaneously with little chance for human decision makers to keep up.

These factors could make AI hacking of the democratic process fundamentally ungovernable. Any policy response to limit the impact of AI hacking on political systems would be critically vulnerable to subversion or control by an AI hacker. If AI hackers achieve unchecked influence over legislative processes, they could dictate the rules of our society: including the rules that govern AI.

We admit that this seemed far fetched when we first wrote about it in 2021. But now that the emanations and policy prescriptions of ChatGPT have been given an audience in the New York Times and innumerable other outlets in recent weeks, it’s getting harder to dismiss.

At least one group of researchers is already testing AI techniques to automatically find and advocate for bills that benefit a particular interest. And one Massachusetts representative used ChatGPT to draft legislation regulating AI.

The AI technology of two years ago seems quaint by the standards of ChatGPT. What will the technology of 2025 seem like if we could glimpse it today? To us there is no question that now is the time to act.

First, let’s dispense with the concepts that won’t work. We cannot solely rely on explicit regulation of AI technology development, distribution, or use. Regulation is essential, but it would be vastly insufficient. The rate of AI technology development, and the speed at which AI hackers might discover damaging strategies, already outpaces policy development, enactment, and enforcement.

Moreover, we cannot rely on detection of AI actors. The latest research suggests that AI models trying to classify text samples as human- or AI-generated have limited precision, and are ill equipped to handle real world scenarios. These reactive, defensive techniques will fail because the rate of advancement of the “offensive” generative AI is so astounding.

Additionally, we risk a dragnet that will exclude masses of human constituents that will use AI to help them express their thoughts, or machine translation tools to help them communicate. If a written opinion or strategy conforms to the intent of a real person, it should not matter if they enlisted the help of an AI (or a human assistant) to write it.

Most importantly, we should avoid the classic trap of societies wrenched by the rapid pace of change: privileging the status quo. Slowing down may seem like the natural response to a threat whose primary attribute is speed. Ideas like increasing requirements for human identity verification, aggressive detection regimes for AI-generated messages, and elongation of the legislative or regulatory process would all play into this fallacy. While each of these solutions may have some value independently, they do nothing to make the already powerful actors less powerful.

Finally, it won’t work to try to starve the beast. Large language models like ChatGPT have a voracious appetite for data. They are trained on past examples of the kinds of content that they will be asked to generate in the future. Similarly, an AI system built to hack political systems will rely on data that documents the workings of those systems, such as messages between constituents and legislators, floor speeches, chamber and committee voting results, contribution records, lobbying relationship disclosures, and drafts of and amendments to legislative text. The steady advancement towards the digitization and publication of this information that many jurisdictions have made is positive. The threat of AI hacking should not dampen or slow progress on transparency in public policymaking.

Okay, so what will help?

First, recognize that the true threats here are malicious human actors. Systems like ChatGPT and our still-hypothetical political-strategy AI are still far from artificial general intelligences. They do not think. They do not have free will. They are just tools directed by people, much like lobbyist for hire. And, like lobbyists, they will be available primarily to the richest individuals, groups, and their interests.

However, we can use the same tools that would be effective in controlling human political influence to curb AI hackers. These tools will be familiar to any follower of the last few decades of U.S. political history.

Campaign finance reforms such as contribution limits, particularly when applied to political action committees of all types as well as to candidate operated campaigns, can reduce the dependence of politicians on contributions from private interests. The unfair advantage of a malicious actor using AI lobbying tools is at least somewhat mitigated if a political target’s entire career is not already focused on cultivating a concentrated set of major donors.

Transparency also helps. We can expand mandatory disclosure of contributions and lobbying relationships, with provisions to prevent the obfuscation of the funding source. Self-interested advocacy should be transparently reported whether or not it was AI-assisted. Meanwhile, we should increase penalties for organizations that benefit from AI-assisted impersonation of constituents in political processes, and set a greater expectation of responsibility to avoid “unknowing” use of these tools on their behalf.

Our most important recommendation is less legal and more cultural. Rather than trying to make it harder for AI to participate in the political process, make it easier for humans to do so.

The best way to fight an AI that can lobby for moneyed interests is to help the little guy lobby for theirs. Promote inclusion and engagement in the political process so that organic constituent communications grow alongside the potential growth of AI-directed communications. Encourage direct contact that generates more-than-digital relationships between constituents and their representatives, which will be an enduring way to privilege human stakeholders. Provide paid leave to allow people to vote as well as to testify before their legislature and participate in local town meetings and other civic functions. Provide childcare and accessible facilities at civic functions so that more community members can participate.

The threat of AI hacking our democracy is legitimate and concerning, but its solutions are consistent with our democratic values. Many of the ideas above are good governance reforms already being pushed and fought over at the federal and state level.

We don’t need to reinvent our democracy to save it from AI. We just need to continue the work of building a just and equitable political system. Hopefully ChatGPT will give us all some impetus to do that work faster.

This essay was written with Nathan Sanders, and appeared on the Belfer Center blog.