ChatGPT—user beware

ChatGPT burst onto the artificial intelligence scene in November 2022, creating something of a media frenzy around the opportunities and complications accompanying its arrival. Trend Analyst Isabel de Leon looks in more detail at the chatbot offering and its impact on the market. Julia Dickenson, Of Counsel at Baker McKenzie, and Alison Rees-Blanchard, TMT PSL at LexisNexis, also weigh in on the matters companies should be aware of in relation to intellectual property and risk.

What is ChatGPT?

Released in November 2022, ChatGPT is one of the newest large language models to be made publicly accessible in the form of a chatbot. This type of AI is trained by taking huge volumes of text, removing words from their context, and rewarding the model for correctly predicting the missing word.
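The fill-in-the-blank idea described above can be illustrated with a deliberately simplified sketch (hypothetical code for illustration only – real models use large neural networks trained on vast corpora, not count tables, and the counting here merely stands in for the reward signal):

```python
from collections import Counter, defaultdict

# Toy corpus: each middle word of a three-word window is "removed from
# its context", and the model is "rewarded" by counting which word most
# often fills that gap.
corpus = "the sky is blue and the sea is blue and the grass is green".split()

context_counts = defaultdict(Counter)
for i in range(1, len(corpus) - 1):
    hidden = corpus[i]                      # the word removed from its context
    context = (corpus[i - 1], corpus[i + 1])
    context_counts[context][hidden] += 1    # reward a correct prediction

def predict_missing(before, after):
    """Guess the hidden word given its two neighbours."""
    counts = context_counts.get((before, after))
    return counts.most_common(1)[0][0] if counts else None

print(predict_missing("sky", "blue"))  # prints "is"
```

The same principle, scaled up to billions of parameters and terabytes of text, is what allows a large language model to produce fluent output.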

Developed by the company OpenAI, one of ChatGPT’s distinctive features is its ability to interact with users in ‘a conversational way’, making it possible for the model ‘to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests’. It was optimised for dialogue using Reinforcement Learning from Human Feedback – a method that uses human demonstrations and preference comparisons to guide the model toward desired behaviour. After sufficient training, a bank of information on the syntax, grammar, and other features of the key words is encoded into the model, which in turn produces a high-quality natural language output (see the related Practice Note; a subscription to Lexis®PSL is required).

The impact of ChatGPT

Following its launch, applications of the model have proven to be wide-ranging. Microsoft has launched a preview of ‘an all new, AI-powered Bing search engine and Edge browser’ incorporating a chat function which brings more context to search results, following its multibillion-dollar investment in OpenAI, while one global law firm announced that it has integrated Harvey, ‘the innovative artificial intelligence platform built on a version of OpenAI’s latest models enhanced for legal work’, into its global practice.

The increasingly widespread use of ChatGPT and other similar generative AI models across large companies prompts questions around whether such use exposes companies to risk and, if so, the nature of such risks and the strategies to mitigate them.

Ownership and copyright

There are issues related to the intellectual property rights which may (or may not) attach to the outputs of these models. According to ChatGPT’s published terms, subject to their Content Policy and Terms, the user owns the output they create with ChatGPT, including the right to reprint, sell and merchandise, regardless of whether the output was generated through a free or paid plan. However, users should not take this as providing any comfort in relation to the position on ownership of intellectual property rights arising in the output or in relation to the right to use the output.

As is the case with any AI, users of outputs of generative AI models should keep in mind that the data used to train the model is likely to be subject to copyright, meaning that appropriate licences should have been sought for its use. However, it is unlikely that any reassurances will be given in this regard in the AI bot’s terms of use. It is also possible that the outputs may contain extracts that are the same as or similar to the original training data, which may be subject to the original author’s copyright and may therefore expose use of such outputs to infringement action.

It may also happen that two outputs are produced which contain the same or similar content as a result of the same or similar prompts. On this point, OpenAI’s terms note that a user may provide input to a model such as ‘What colour is the sky?’ and receive output such as ‘The sky is blue.’ In these instances, responses that are requested by and generated for other users are not considered a user’s Content. Market Tracker discussed this dense network of parties, users and potential ownership rights with Julia Dickenson, Of Counsel in the Intellectual Property department of Baker McKenzie. Specialising in brand advisory, litigation and enforcement and copyright advisory and litigation, she comments:

‘There is a distinction between contractual ownership of content, and whether there are in fact any intellectual property rights in that content. So, for example … generative AI will often provide that each user ‘owns’ their output, but this is different from whether the output is protected by intellectual property rights (e.g. copyright), and is also different from whether that output infringes any third party intellectual property rights. So, whilst each user will ‘own’ their output, i.e. it is deemed to be their content and they can in theory use it how they wish, that doesn't mean that the content is protected by intellectual property rights such that they can prevent anyone else from using similar content, or that the content doesn’t infringe anyone else's IP rights.’

She continues by explaining:

‘Where questions are generic or similar, and the resulting output is identical, that wouldn't prevent the user from ‘owning’ their individual response, but it may well mean that there are no intellectual property rights in their response and so they wouldn't be able to prevent others from using / creating similar or identical responses. The same can be said in relation to situations where output is similar to existing works - simply ‘owning’ something (or creating it) does not mean that it won't infringe existing content. It's important to note that for copyright infringement there is a requirement to show copying, which raises complex questions around how generative AI models have been trained and whether content resulting from models that have been trained on infringing data can itself be infringing - some of the questions that are being asked in recent cases on this topic.’

User beware—accuracy and risk

In light of these complex questions, Alison Rees-Blanchard, a TMT specialist at LexisNexis UK, highlights an important reminder: there are no contracts or indemnities associated with the free use of ChatGPT and, as such, users rely on the output at their own risk and must make their own judgment as to its value and legitimacy. For example, it is unlikely that the terms of use for such large language models will include reassurances about the training data, in particular that appropriate permissions have been obtained from rightsholders. As well as issues around intellectual property and the rights to own and use the output, Rees-Blanchard highlights potential issues around accuracy and bias. If the data used to train the model contained errors or biases, these will be reflected in the outputs.

Furthermore, as has been the focus of much recent commentary around ChatGPT, this type of AI model can sometimes produce hallucinations – outputs that are highly convincing but entirely invented by the system. Even where the output is not wholly invented, it is important to remember that, although the model may give the impression of a willing, helpful and truthful chatty assistant, it is simply predicting the next likely word given the previous words and its training. Rees-Blanchard comments that although there is no doubt that there are benefits and appropriate use cases for this efficient and innovative technology, a degree of caution and common sense should be applied when determining the level of reliance to be placed on the outputs. Risk assessments should be carried out on the impact that use of the output might have on any product, process or service it may be used in – considering the introduction of infringement risk, biases or inaccuracies. It is also important, when assessing how such outputs might be used, to maintain ‘a human-in-the-loop’ – some form of human oversight that can sense-check the use of the output.
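Why hallucinations arise becomes clearer with another deliberately simplified sketch of next-word prediction (again hypothetical code, not ChatGPT’s actual mechanism – the ‘model’ here is just a frequency table, so it will fluently continue any prompt it recognises, whether or not the result is true):

```python
from collections import Counter, defaultdict

# Count which word most often follows each word in a toy corpus.
training_text = "the sky is blue the sky is clear the sea is blue".split()

next_word = defaultdict(Counter)
for current, following in zip(training_text, training_text[1:]):
    next_word[current][following] += 1

def complete(prompt, length=3):
    """Extend a prompt by repeatedly picking the most likely next word."""
    words = prompt.split()
    for _ in range(length):
        counts = next_word.get(words[-1])
        if not counts:
            break  # the model has nothing to say about this word
        words.append(counts.most_common(1)[0][0])
    return " ".join(words)

print(complete("the sky", length=2))  # prints "the sky is blue"
```

Note that the completion is driven purely by statistical likelihood: the model has no notion of whether the sky is in fact blue at the moment of asking, which is precisely why human oversight of the output matters.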

The future use of ChatGPT

So what could owners of large language generative AI models do to provide more comfort and reassurance around areas of concern, such as the model’s training? When asked about how companies should draw parameters around the appropriate data to use when training ChatGPT in order to avoid potential IP risks, Dickenson acknowledges that questions around such parameters are ‘particularly complex’ and ‘will depend on a number of factors including the type of data itself, how models are trained (that is, what is done with the data during the training process), the permissions around the data (for example, is it truly open source or are there restrictions on how the data can be used and/or onward licensed), as well as the law in the jurisdictions where training is done.’

Dickenson concludes by stating, ‘There is no easy answer here and companies wishing to train AI models should think very carefully about this, and probably seek specialist advice, before starting the training process.’

Market Tracker will continue to monitor developments in the ChatGPT space.

About the author:

Market Tracker is a unique service for corporate lawyers housed within Lexis®PSL Corporate. It features a powerful transaction data analysis tool for accessing, analysing and comparing the specific features of corporate transactions, with a comprehensive and searchable library of deal documentation across 14 different deal types. The Market Tracker product also includes news and analysis of key corporate deals and activity and in-depth analysis of recent trends in corporate transactions.