Open AI privacy policies and data requests

In light of changing regulations, views, and data storage solutions in recent years, Open AI, the creator of ChatGPT, updated its privacy policies in late 2024 to better inform users about the use of data controllers, what data is collected, and how it is used and retained, among other matters.

Primary Sources

In its privacy policy, Open AI states that three primary sources of information are used to power the company’s foundation models (the language models trained via data sets):

Data that is publicly available on the internet
Third-party partner data
Information provided by human trainers and researchers

Data Collected

Open AI divides the information it collects into several categories:

Account Information: the personal data users provide upon creating an account (e.g. account credentials, birthdays, contact information, etc.)
User Content: user-made prompts and uploaded files (images, audio recordings, etc.)
Communication Information: the contents of communications sent via email or social media comments
Other Information You Provide: data provided during events or in surveys (e.g. age or identity data)
Personal Data Received from Use of Services: technical information gleaned from a user’s activity, including:
- Web browsers being used
- Time zone
- Content viewed
- Operating system
- Location or IP address
- Account cookies
Data Received from Other Sources: third-party data, including security and service threats, or customer marketing guidance.

Data That is Not Collected

To stay on the right side of the law, Open AI avoids certain sources and types of data, including information stored behind paywalls and dark web data.

Open AI policies also state that the company applies specialised filters to avoid content such as hate speech, spam, and adult content. Because its software is intended for users aged 13 and over, any data submitted by persons under the age of 13 will be investigated and deleted.

Personal Data

How It is Used

The data Open AI receives is used in a variety of ways and has several applications, including:

Improving existing services (e.g. training large language models (“LLMs”)) or assisting in developing new products
Analysing and responding to ChatGPT prompts
Preventing misuse of services or fraudulent actions
Complying with various legal obligations such as privacy rights or third-party requirements
Communicating information with users about events or service updates

How It Is Not Used

Despite collecting enormous amounts of data on a regular basis, Open AI has stated that there are certain purposes for which it will not use that data, including building user profiles in order to contact users, advertise to them, or sell them anything, raw information included.

Data Disclosure

A user’s personal data may, in some circumstances, need to be disclosed to several different parties, including:

Contracted or partner vendors
Counterparties assisting in a service transaction (e.g. a bankruptcy or receivership matter)
Government authorities
Open AI account administrators or affiliates
Third-party vendors and users

Data Retention

While Open AI does not provide exact guidance on the length of time for which data will be stored (and even states on one help page that “ChatGPT does not copy or store training information in a database”), it does lay out the factors that determine how long data will be kept, which include:

Legal requirements
Potential acts of harm that could result from unauthorised use or disclosure
The quantity, sensitivity, and nature of the data
Service processing purposes

Deidentified Data

Some data may also be anonymised (“deidentified”) so as to be de-linked from its original record, thus no longer identified with the source user. Open AI may deidentify information but retain the data for the purposes of analysing and improving service offerings or conducting research.

Legal Requests and Denials

Given the numerous legal jurisdictions in which it operates, Open AI strives to ensure it is compliant with all local laws and protects personal information properly. Should it encounter a data request that it considers unlawful, Open AI may deny the issued request.

OAIC Privacy Considerations

In the context of Australia and the Privacy Act 1988, the Office of the Australian Information Commissioner (OAIC) published a report setting out five key privacy considerations for Australian organisations that are considering using commercial AI products:

Conduct due diligence to ensure the AI product is being used for its intended purpose
Establish clear policies and procedures around usage, transparency, and proper privacy governance
AI systems generating or inferring personal information must comply with Australian Privacy Principle 3 (APP 3) – Collection of Solicited Personal Information – demonstrating a clear business need
Any personal information input into an AI system must be used or disclosed “for the primary purpose for which it was collected” in accordance with APP 6 (Use or Disclosure of Personal Information)
As a matter of best practice, the OAIC recommends not entering personal or sensitive information into publicly available generative AI tools

Key Takeaways

As more and more publicly available data (personal, sensitive, or otherwise) is collected and repurposed to train large language model text generation systems (generative AI), it is more important than ever to engage in careful and secure data practices when dealing with publicly accessible text generation tools.

Nyman Gibson Miralis provides expert advice and representation to individuals and companies that are the subject of law enforcement and government requests for data from technology companies.

Contact us if you require assistance.

Author: Dennis Miralis

Dennis Miralis is a leading Australian defence lawyer with over 20 years of experience. Dennis is the Principal of Nyman Gibson Miralis and specialises in international criminal law.