The Generative AI Battle Continues: Comedian Sarah Silverman May Have Won the Fight, but Will OpenAI Win the War?

Emma Aistrop

In July 2023, comedian Sarah Silverman and authors Christopher Golden and Richard Kadrey sued OpenAI, the developer of ChatGPT, for copyright infringement.[1] The suit contains a total of six claims: direct copyright infringement, vicarious copyright infringement, violation of the Digital Millennium Copyright Act, unjust enrichment, violation of the California and common law unfair competition laws, and negligence.[2] Silverman filed the lawsuit after learning that OpenAI’s generative AI software had successfully summarized detailed passages of her 2010 autobiography, “The Bedwetter.”[3] On August 28, 2023, OpenAI responded with a motion to dismiss.[4] Silverman subsequently filed an opposition, declaring that OpenAI’s motion indicates “its intent to unilaterally rewrite U.S. copyright law in its favor.”[5]

According to the complaint, OpenAI is liable for copyright infringement because it used copyrighted works as source material when training ChatGPT, a large language model (“LLM”) that functions by ingesting existing text and extracting expressive information from it.[6] An LLM learns to produce simulations of natural written language through progressive adjustment, a method in which the LLM recalibrates its output to more closely reflect the source material.[7]

The lawsuit alleges that ChatGPT was trained in part on illegal “shadow libraries” composed of thousands of pirated books, which allows the LLM to produce accurate synopses of the written works illicitly included in those libraries.[8] According to the lawsuit, OpenAI’s use of shadow libraries is evidenced by two factors. First, OpenAI did not identify the source material used to train its latest model, GPT-4, although it had previously publicized its training datasets.[9] Second, the number of copyright-free internet-based books available is limited, and the previous training sets identified by OpenAI already greatly exceed the number of such works available.[10] For example, Project Gutenberg, one of the most popular digital archives for public domain books, contains just over 60,000 titles.[11] One of OpenAI’s previously publicized training sets contained more than 294,000 titles.[12] Given that the only “internet books corpora” that offer that much material are “shadow library” websites, the plaintiffs assert that OpenAI used shadow library content to source data for its latest model.[13]

Rather than address its source material directly, OpenAI’s motion to dismiss argues that none of the listed causes of action state a viable claim for relief because the legal theories invoked fail to adequately challenge ChatGPT’s processes.[14] OpenAI cites the general intent of the Copyright Act, the doctrine of fair use, and the doctrine of substantial similarity as defenses against the allegations.[15]

According to the motion, the Copyright Act strikes a balance that promotes the “progress of science and useful arts” and accommodates technological advances such as those made possible with AI.[16] In previous cases, this balance has been described as a mechanism for ensuring “authors the right to their original expression [and] encouraging others to build freely upon the ideas and information conveyed by a work.”[17]

Additionally, the motion asserts that the fair use doctrine protects transformative works generated as AI outputs.[18] In Authors Guild v. Google, Google invoked the fair use doctrine to defend against the claim that displaying snippets of authors’ works constituted infringement.[19] Applying the doctrine, the court examined (1) the purpose and character of the use; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion taken; and (4) the effect of the use upon the potential market.[20] The court held that because Google’s purpose in copying was highly transformative, its public display was limited, and its copying was unlikely to affect the market value of the original work, the use was protected against the infringement claim.[21]

Finally, the motion looks to substantial similarity. The substantial similarity doctrine requires that the court compare works and decide whether the “protectable elements, standing alone, are substantially similar.”[22] While OpenAI relies on this doctrine to show that dissimilarities in the works preclude the claim of infringement, Silverman cites the Ninth Circuit’s previous holding that “substantial similarity is not an element of a claim of copyright infringement” but rather a tool for determining whether copying occurred.[23]

A case that may provide guidance for Silverman and OpenAI is Andersen v. Stability AI.[24] This class action lawsuit, filed by cartoonist and illustrator Sarah Andersen on behalf of “at least thousands” of creatives, alleged that Stability AI scraped billions of copyrighted images from public websites and used them without the copyright owners’ consent to train its AI-based image generation product, Stable Diffusion.[25] In a July 2023 hearing on Stability AI’s motion to dismiss, the judge indicated that he would tentatively dismiss Andersen’s claims for two reasons.[26] First, he explained that the images produced by the models are not “substantially similar” to the creatives’ art.[27] Second, the judge stated that it is “implausible” that the creatives’ works were involved in training the models because the models had been trained on “five billion compressed images.”[28] Still, the judge provided Andersen the opportunity to amend her complaint, likely due to the inconclusive nature of the current laws in this area.[29]

Although the battle between creatives and generative AI continues, the outcomes of these pending cases will likely shed light on how copyright infringement claims will be handled in the future.

[1] Compl. at 1, Silverman v. OpenAI, Inc., No. 3:23-CV-03416 (N.D. Cal., Jul. 7, 2023).

[2] Silverman, supra note 1 at 2.

[3] Silverman, supra note 1 at Ex. B.

[4] Mot. to Dismiss at 13, Silverman v. OpenAI, Inc., No. 3:23-CV-03416-AMO (N.D. Cal., Aug. 28, 2023).

[5] Opp’n to Mot. to Dismiss at 12, Silverman v. OpenAI, Inc., No. 3:23-CV-03416-AMO (N.D. Cal., Sept. 27, 2023).

[6] Silverman, supra note 1 at 9.

[7] Id. at 6.

[8] Id. at 7.

[9] Id. at 8.

[10] Id.

[11] Id. at 6.

[12] Id. at 8.

[13] Id.

[14] Silverman, supra note 4.

[15] Id. at 4.

[16] Id. at 12.

[17] Feist Publ’ns, Inc. v. Rural Tel. Serv. Co., 499 U.S. 340, 349 (1991).

[18] Silverman, supra note 4 at 12.

[19] See generally Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).

[20] Id.

[21] Id. at 229.

[22] Silverman, supra note 5 at 13.

[23] Id. at 15.

[24] Gabriel Karger, AI-Generated Images: The First Lawsuit, Science and Technology Law Review (Jan. 25, 2023).

[25] Compl. at 12, Andersen v. Stability AI, Inc., No. 3:23-CV-00201 (N.D. Cal., Jan. 13, 2023).

[26] Christopher J. Valente, Michael J. Stortz, Amy Wong, Peter E. Soskin, and Michael W. Meredith, Recent Trends in Generative Artificial Intelligence Litigation in the United States, K&L Gates (Sept. 5, 2023).

[27] Id.

[28] Id.

[29] Id.