(By You Yunting) With so much attention on ChatGPT, people are keen to discuss whether AI generated works are copyrightable. When I am asked this question, my answer is basically no. However, when I asked ChatGPT, the best-known AI program, it answered yes. OpenAI, the developer of ChatGPT, claimed that it was entitled to benefits from these works. I would like to discuss this issue with you.
|Ask||Is AI generated content copyrightable?|
|ChatGPT Answer||Yes, it is. In copyright law, original works created by individuals and organizations, including natural people and machines are copyrightable. All kinds of AI generated content that is protectable under copyright law, including words, pictures, audio recordings, videos and other forms of content are protected by copyright law.|
|Ask||Can I get a share of profits from sale of AI (or chatgpt) generated works that are protected by copyright law?|
|ChatGPT Answer||Sale or use of a copyright law protected AI generated work (such as a ChatGPT generated text) for other commercial purposes should be subject to license or authorization granted by the copyright owner, i.e. the original author or licensor. Where ChatGPT is developed by a company or an individual and the copyright in its content is owned by the developer, sale or use of that content for other commercial purposes should be subject to license or authorization granted by the company or individual. In this case, the ChatGPT developer or owner could get a share of the profits in a proportion agreed by the parties or explicitly set out in the authorization or license agreement.|
My argument, stated up front, is that no person should be granted copyright in, or paid a license fee for, AI generated works, because such works are not copyrightable in nature. Users can instead claim their rights under the Anti-Unfair Competition Law when AI generated works are misused.
I. Original sin of copyrighting AI
The fundamental problem in charging royalties for AI is AI itself. Charging royalties for AI is illogical because its training process is itself unlawful, involving extensive infringement.
ChatGPT and similar AI products need large quantities of data and resources for teaching and training. Major AI companies are secretive about their learning resources, and some treat them as trade secrets. The biggest problem is that the most popular AI products are trained on free, open online data and resources crawled by their developers, who feed these to the AI without the consent of the crawled websites. This practice is highly disputed, and an action arising from it has already been brought by media.
When I asked ChatGPT, it answered that its training data are open online data that are not licensed and do not need to be.
|Ask||Are training data collected by chatgpt licensed?|
|ChatGPT Answer||As an AI model, ChatGPT training data come from large quantities of text and language data on the internet that are open and available without a license and with no need for one. As the organization that develops ChatGPT, OpenAI might abide by some data use guidelines or laws in order to use data without violating any law or ethics. Meanwhile, OpenAI will work as hard as possible to protect users’ privacy and intellectual property rights.|
I consider that this data crawling might involve infringement. Is data crawling an illegal activity? Many internet users think it depends on the Robots Agreement (the robots.txt protocol). It is legal for search engines to crawl data on websites that agree to be crawled under their Robots Agreement. Because AI companies and search engines do not crawl data for the same purpose, it is very controversial whether crawling by AI companies is governed by the Robots Agreement at all.
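For readers unfamiliar with the mechanism, the Robots Agreement is in practice a plain-text robots.txt file in which a website states which crawlers may visit which paths. The following is a minimal sketch, not a description of how any real crawler behaves: the crawler names `ExampleSearchBot` and `ExampleAITrainingBot`, the file contents and the URL are all hypothetical, and the helper `is_crawl_allowed` is introduced only for illustration. It uses Python's standard `urllib.robotparser` module to show that a website can welcome a search crawler while refusing a crawler that collects training data.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site admits a search crawler
# but refuses a crawler that gathers AI training data.
ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Allow: /

User-agent: ExampleAITrainingBot
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"  # hypothetical URL
    for bot in ("ExampleSearchBot", "ExampleAITrainingBot"):
        print(bot, is_crawl_allowed(ROBOTS_TXT, bot, page))
```

Note that the file only expresses the website's wishes. Nothing in the protocol technically stops a crawler from ignoring it, which is why the legal question of what law governs such crawling matters.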
Website visits incur server, technical maintenance and management costs. Website owners allow search engines to crawl them because both sides benefit. By crawling data, search engines direct their users to the crawled websites, which brings those websites visits and helps their business.
By contrast, AI companies crawl data only to train their AI programs in their own interests, leaving nothing to the website owners who pay the server, technical maintenance and management costs. This activity should be governed by Copyright Law, not the Robots Agreement.
Under Copyright Law, the process of AI learning online content is reproduction or temporary reproduction. AI companies crawl online or offline content and import it into an AI program. Reproducing content, including words, pictures, audio recordings, videos and programs, requires a license from the relevant rights owner. Unauthorized reproduction might involve infringement.
There is a very small possibility that AI learning of online content is temporary reproduction, meaning the data are deleted once the AI has learned from them. In China, temporary reproduction does not require a license from the copyright owner, but questions might still be raised as to whether the data sources are open and may be crawled, or whether it is legal to turn offline materials (such as books and literature) into electronic copies. The search results generated by ChatGPT in Microsoft New Bing, which came with reference links, showed that ChatGPT was very likely to store website content. This is not temporary reproduction.
Some websites adopt open licenses, under which people can freely copy and re-publish their content provided that they comply with the license terms, for example the GNU Free Documentation License used by Wikipedia. AI companies could crawl content on such websites and republish it with an attribution showing the source. OpenAI published ChatGPT generated content on its website without giving users a link to the source of the data, while Microsoft New Bing published ChatGPT generated content with a link to the source. Even though ChatGPT generated content differs from the Wikipedia text, such selfish conduct will gradually undermine the open source culture. If Wikipedia brought an action in China, I believe it could claim, in addition to relying on Copyright Law, that OpenAI's crawling violates generally accepted business ethics under Article 2 of the Anti-Unfair Competition Law.
II. Why are AI generated works not copyrightable?
The Berne Convention for the Protection of Literary and Artistic Works, concluded on 9 September 1886 in Berne, Switzerland, is a globally recognized copyright convention to which China is a party. Article 1 of the convention explicitly states that the parties should protect the rights of authors in their literary and artistic works. Under Article 3, authors include nationals of countries that are parties to the convention as well as nationals of countries that are not. So authors must be natural persons. ChatGPT and other AI software cannot be authors in the copyright sense. That is the reason why no copyright registration authority in any country has accepted AI as an author.
Under Chinese Copyright Law, legal persons, in addition to natural persons, can also be the owner of works. But works created by ChatGPT are not a legal person's works. A legal person's work is created by natural persons and must satisfy the following elements.
- The work should be conducted by a legal person or other type of organization;
- The work should represent the will of a legal person or other type of organization; and
- A legal person or other type of organization should be responsible for the work.
If I ask ChatGPT to write an article, I conduct the work and ChatGPT creates the work at my will. If the article infringes another person's rights, I should be responsible for the infringement; ChatGPT has nothing to do with it. Therefore, the article written by ChatGPT is not a legal person's work. Nor am I its author: I merely give some conditions for generating the article without creating it, so I should not own copyright in it.
Work creation by AI is like a commission. The commissioning party gives instructions, upon which the AI finishes the work. The AI program is the real creator of the work. Under Copyright Law, copyright in a commissioned work is in general owned by the commissioned party. This is the basis on which staff of an AI company told me they owned the copyright in AI generated works.
Despite what they said, ChatGPT is rational when it comes to drafting its legal instruments. Article 4.2 of the User Agreement provides that intellectual property rights in any content generated by the user using ChatGPT are owned by the user, provided that ChatGPT may use such content without paying the user. I understand the reasoning as follows. If ChatGPT claimed intellectual property rights in the works, users would be dissatisfied. Moreover, as the intellectual property owner, it would be responsible for the content. If the content then infringed another person's rights or became the subject of a dispute, such as copyright or reputation infringement or, even worse, politically improper speech, OpenAI, a startup company, might not be able to deal with it.
III. What if a work created by a person using AI is misused by another person?
As AI generated works are not protected by Copyright Law, if a work created by a person using AI is misused, how could that person protect their rights in the misused work?
First, let us look at a precedent. Dreamwriter, computer software developed by Tencent, could write articles on its own and generated three hundred thousand articles a year. Tencent brought an infringement action against a website that had re-published an AI generated article from the Tencent securities website. The court of Nanshan District held that the article was technically generated by the Dreamwriter software, met the conditions for protection as a written work under Copyright Law and constituted a legal person's work conducted by the plaintiff. It awarded damages of RMB 1,500 in a judgment that is now in force.
I do not completely agree with the court decision. The articles were written by AI. As mentioned above, works created by non-natural persons, whether original or not, should not be protected by copyright. The decision was made over three years ago, when works created by AI were rare, so protecting them did little harm at that time. With the advent of ChatGPT and many other kinds of AI software, there are now vast numbers of articles, works of fine art, videos, music and apps created by AI. If a similar case arose now, I believe the court would not decide in that way.
Readers may ask what can be done if someone misuses articles that are not copyrightable. The Anti-Unfair Competition Law might apply. In the Dreamwriter case, if the defendant had extensively misused thousands of Tencent's AI generated articles, Tencent could have protected its legal rights under Article 2 of the Anti-Unfair Competition Law, which concerns business integrity and ethics.
The argument for bringing such an action is as follows. Tencent developed the AI software at a cost and published the articles, whose contents were lawful. This is a justifiable business model. The defendant's extensive use of the articles without permission, while bearing none of the cost of creating them, is a typical dishonest act that violates generally accepted business ethics and the rules of competition.
The threshold for protection under the Anti-Unfair Competition Law is high, so only extensive reproduction is restricted. If the defendant used a small number of AI generated articles, say fewer than ten, this could not be found to violate the Anti-Unfair Competition Law. In other words, if the defendant crawled and published only a little of the content, Tencent could not claim that its rights were infringed.
AI is developing so quickly that AI generated works are used in all walks of life and will soon become a great industry. Readers might therefore ask whether weak protection of non-copyrightable AI generated works will affect the industry's development and harm the protection of intellectual property rights.
My opinions are as follows. Intellectual property rights, including copyright, are created to encourage innovation. In essence, they mean that everyone in society cedes part of their rights in order to protect other people's works and inventions. Considering the hard work of authors and creators and their contribution to the advancement of civilization, ceding these rights is reasonable.
The advent of ChatGPT and breakthroughs in other AI technology have made it much easier for people to create works. By giving just a few words of instruction, people can obtain articles, music, drawings and videos that used to take a long time and much effort to create. We do not need to cede so many rights to protect works created at such a low cost. This means AI generated works can be used and spread freely so that everyone can benefit from advances in AI technology. With easier access, relaxed legal restrictions and free competition, the industry will thrive. If a limited number of companies or authors hold a monopoly in the content industry, the situation will get worse.
Finally, if telecommunication companies had claimed title to all online data transmitted via their networks from the outset, the internet industry would not be as prosperous as it is today. The same is true of the AI industry. If AI companies own intellectual property rights in AI generated content, the AI industry will not grow big. Only if AI companies charge for use of their services, as telecommunication companies charge for use of their networks, and let the works spread freely, will AI generated works benefit the general public more and help the long-term development of the AI industry.