How Will NVIDIA’s Purchase of Pirated Content to Train AI Be Characterized Under Chinese Law?
(By You Yunting) Recently, a lawsuit filed by U.S. copyright holders against NVIDIA for allegedly using pirated materials to train AI models has attracted significant public attention. According to the complaint, in order to quickly obtain more than 500 terabytes of data, NVIDIA proactively contacted the pirate website Anna’s Archive and paid hundreds of thousands of U.S. dollars to download a large volume of pirated content, including copyrighted books and articles.
Anna’s Archive is one of the so-called “shadow libraries,” which are known for their decentralized and anonymous nature and most of which provide access to literature in ways that infringe copyright. If the plaintiffs’ allegations are true, having paid a pirate website for content and then been sued by copyright holders will be a serious blemish on the reputation of NVIDIA, the world’s most valuable company. However, the unauthorized use of training data can be considered the “original sin” of nearly all general AI companies. In both China and the United States, the two global leaders in AI technology, numerous lawsuits concerning AI training data have already emerged. This article discusses whether NVIDIA’s alleged conduct would be considered unlawful under Chinese law.