What Is Personal Data? KVKK, Its Types, and Protection in the AI Era
What is personal data? Personal data is any information relating to an identified or identifiable natural person. In other words, if a piece of information points to a specific human being — alone or when combined with other data — that information is personal data.
Most people equate personal data with a name and a national ID number, yet the definition is far broader. An IP address, a location, a cookie ID, a voice recording, and even a shopping history count as personal data when they make a person identifiable. This guide covers what personal data is, why it matters, how it is protected under Türkiye's KVKK, how it differs from special categories of personal data, and what to watch for in the AI era.
- Personal Data
- Any information relating to an identified or identifiable natural person. Beyond direct identifiers such as name, national ID number, and email, indirect information such as IP address, location, and cookie ID — which points to a person alone or when combined with other data — is also personal data. In Türkiye it is protected by the KVKK (Law No. 6698).
- Also known as: Personal information, PII, personally identifiable information, data under the KVKK
Why Does Personal Data Matter?
Personal data is the most valuable raw material of the digital economy; but it is also a door into a person's private life. Where someone lives, what they click, which illness they have, or whom they vote for can, in the wrong hands, turn into discrimination, fraud, and privacy violations. That is why protecting personal data is not merely a technical issue but a fundamental rights issue.
For organizations, personal data is both an opportunity and an obligation. Used correctly, it enables better products and services; managed poorly, it brings administrative fines, reputational loss, and legal liability. In Türkiye, personal data is protected by Law No. 6698 on the Protection of Personal Data (KVKK), and compliance with this law is now mandatory for organizations of every size. The spread of AI systems has raised this importance further, because these systems are built precisely on large-scale data processing.
How Is Personal Data Determined? Identified and Identifiable Person
To decide whether a piece of information is personal data, one question suffices: can this information be linked, directly or indirectly, to a natural person? If the answer is yes, it is personal data. Two concepts are critical here: the "identified person" and the "identifiable person".
An identified person is the one the information points to directly; for example, a name or national ID number identifies the person on its own. An identifiable person is the case where something looks anonymous alone but can reveal an identity when combined with other data. A postal code, date of birth, and gender each look weak separately, yet together they often point to a single person. This is why the definition of personal data is so broad: in the modern data world, much information assumed to be "anonymous" is in fact identifiable.
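The combination effect described above can be illustrated with a small sketch. The records and field names below are hypothetical and invented purely for illustration; the function only measures how many rows become unique once several weak attributes are combined.

```python
from collections import Counter

# Hypothetical records: no names or ID numbers, only three weak attributes.
records = [
    {"postal_code": "34710", "birth_date": "1985-03-12", "gender": "F"},
    {"postal_code": "34710", "birth_date": "1985-03-12", "gender": "M"},
    {"postal_code": "34710", "birth_date": "1990-07-01", "gender": "F"},
    {"postal_code": "06420", "birth_date": "1990-07-01", "gender": "F"},
    {"postal_code": "06420", "birth_date": "1990-07-01", "gender": "F"},
]


def unique_fraction(rows, fields):
    """Share of rows whose combination of `fields` occurs exactly once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(rows)


# One attribute alone singles out few rows; the combination singles out more.
gender_only = unique_fraction(records, ["gender"])
combined = unique_fraction(records, ["postal_code", "birth_date", "gender"])
```

A row that is unique on the combined attributes can be matched to a single person by anyone who holds the same attributes in another source, which is why such fields are treated as indirect identifiers.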
An important point in the identifiability test is the "reasonable effort" criterion: it asks whether the additional data, technical means, and cost needed to reach a person are reasonably accessible. The spread of large datasets and powerful matching techniques continually lowers this threshold: data considered anonymous a decade ago can today become personal data once it is matched against other sources. The assessment is therefore not a one-time decision; it must be repeated as technology and the surrounding data ecosystem evolve.
What Are the Types of Personal Data?
Personal data is not uniform; it falls into different categories by the risk it carries and the rules it is subject to. Seeing this distinction is the foundation of both KVKK compliance and sound data management in AI systems.
| Type | Definition | Example |
|---|---|---|
| Direct identifier | Data that identifies the person alone | Name, national ID number, passport number |
| Indirect identifier | Data that points to a person when combined | IP address, location, cookie ID, device ID |
| Contact data | Data that allows reaching the person | Email, phone number, address |
| Special categories of personal data | Sensitive data whose breach causes graver harm | Health, religion, biometrics, sexual life, criminal record |
| Behavioral data | Data derived from a person's actions | Browsing history, purchases, click logs |
The row demanding the most care in this table is special categories of personal data. A leaked email address is annoying; a leaked health or religion detail is a breach that can lead to discrimination. The KVKK takes this distinction seriously and ties special categories to far stricter conditions.
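One way to put the table to work is to tag each column of a dataset with its type during a data inventory. The sketch below assumes hypothetical column names and a hand-maintained mapping; it is an illustration of the idea, not a complete classification scheme.

```python
from enum import Enum


class DataType(Enum):
    DIRECT = "direct identifier"
    INDIRECT = "indirect identifier"
    CONTACT = "contact data"
    SPECIAL = "special category"
    BEHAVIORAL = "behavioral data"


# Hypothetical column names mapped to the categories in the table above.
FIELD_TYPES = {
    "full_name": DataType.DIRECT,
    "national_id": DataType.DIRECT,
    "ip_address": DataType.INDIRECT,
    "cookie_id": DataType.INDIRECT,
    "email": DataType.CONTACT,
    "phone": DataType.CONTACT,
    "health_note": DataType.SPECIAL,
    "fingerprint_template": DataType.SPECIAL,
    "purchase_history": DataType.BEHAVIORAL,
}


def classify_columns(columns):
    """Map each column to its DataType; None means 'not yet reviewed'."""
    return {column: FIELD_TYPES.get(column) for column in columns}


def special_columns(columns):
    """Columns that fall under special categories and need stricter handling."""
    return [c for c in columns if FIELD_TYPES.get(c) is DataType.SPECIAL]


table_columns = ["email", "health_note", "purchase_history", "notes"]
inventory = classify_columns(table_columns)
flagged = special_columns(table_columns)
```

Returning `None` for unknown columns, rather than a default category, keeps unreviewed fields visible instead of silently treating them as harmless.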
What Are Special Categories of Personal Data?
Special categories of personal data (sensitive data) are data the law places under special protection because their breach can cause disproportionate harm. Under the KVKK, this category covers a person's race, ethnic origin, political opinion, philosophical belief, religion, sect or other beliefs, appearance and dress, membership of associations, foundations, or trade unions, health, sexual life, criminal convictions and security measures, as well as biometric and genetic data.
Processing this data is far harder than ordinary personal data. As a rule, special categories of personal data cannot be processed without the explicit consent of the data subject; and for some categories such as health and sexual life, exceptions beyond explicit consent are quite limited. Additional security measures set by the Personal Data Protection Board must also be applied. For AI, this is a critical warning: when a model is fed special categories of personal data such as health records or biometric data, a much higher legal threshold applies than for an ordinary dataset.
What Is Data Processing? The Disclosure Obligation and Lawful Basis
The most common misunderstanding about personal data is thinking that "processing" means only collecting data. In fact, data processing covers almost every operation on personal data, including collection, recording, organizing, storing, altering, transferring, classifying, and deleting. Even keeping a person's email in a spreadsheet is a data processing activity.
This is where the disclosure obligation comes in. Every data controller that processes personal data must inform the data subject in advance about which data is processed, for what purpose, on which lawful basis, to whom it will be transferred, and the person's rights. The disclosure obligation is separate from and precedes explicit consent: even if consent is not obtained, the person's right to know what is happening is protected. The "privacy policy" and "disclosure notice" on websites are exactly the counterpart of this obligation.
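The elements of a disclosure listed above can be kept as a simple checklist. The sketch below is a minimal illustration under assumptions: the class name, field names, and sample values are hypothetical, and the fields mirror only the five elements named in this section rather than the full legal text.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DisclosureNotice:
    # Field names are illustrative; they follow the elements listed in the text.
    data_categories: list = field(default_factory=list)
    purposes: list = field(default_factory=list)
    lawful_basis: str = ""
    recipients: list = field(default_factory=list)
    data_subject_rights: list = field(default_factory=list)


def missing_elements(notice):
    """Names of elements that are still empty in a draft notice."""
    return [f.name for f in fields(notice) if not getattr(notice, f.name)]


draft = DisclosureNotice(
    data_categories=["email", "order history"],
    purposes=["order fulfilment"],
    lawful_basis="performance of a contract",
    data_subject_rights=["access", "rectification", "erasure"],
)
gaps = missing_elements(draft)
```

Here the draft never states to whom the data is transferred, so the check reports that element as missing before the notice is published.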
Personal Data and AI: LLMs and Data Security
AI has taken the personal data debate to a new level. Large language models (see what is an LLM) are trained on vast piles of text, and these piles may contain personal data. Likewise, when a user types a customer's name, email, or health detail into a chatbot, that too is a data processing activity and falls under the KVKK.
That is why several principles must be designed into AI projects from the start: data minimization (processing only as much data as needed), purpose limitation, anonymization where possible, and strong data security. In particular, it must be clear where prompt content and the text processed at the token level goes, with which provider it is stored, and who can access it. An enterprise AI system, especially one built on an architecture such as RAG that processes corporate documents, can put all of that personal data at risk if it is built without access control and KVKK compliance in mind.
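Data minimization for prompts can be sketched as a redaction step that runs before any text leaves the organization. The patterns below are deliberately simple assumptions: one for email addresses and one for 11-digit numbers, which is the length of a Turkish national ID number (a format check only, with no checksum validation). The function name and placeholders are hypothetical.

```python
import re

# Simple illustrative patterns; real deployments need broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
NATIONAL_ID_RE = re.compile(r"\b\d{11}\b")  # 11 digits, format check only


def redact(text):
    """Replace emails and 11-digit numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = NATIONAL_ID_RE.sub("[NATIONAL_ID]", text)
    return text


prompt = "Customer ayse@example.com, ID 12345678901, asked about her invoice."
safe_prompt = redact(prompt)
```

This kind of masking reduces what is sent to a provider, but it is not anonymization: names, addresses, and free-text health details pass through untouched, so it complements access control and provider agreements rather than replacing them.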
The way to process personal data safely in AI systems is to make compliance part of the architecture rather than a layer added afterwards. To strike this balance, see the what is KVKK guide and the enterprise RAG systems solution.
What Is the Difference Between Personal Data, KVKK, and GDPR?
There is no single law that governs personal data; different frameworks apply depending on where you operate. In Türkiye the core framework is the KVKK; in the European Union the GDPR applies (see what is GDPR). The two share largely the same philosophy but have important differences.
| Dimension | KVKK (Türkiye) | GDPR (EU) |
|---|---|---|
| Scope | Organizations processing data in Türkiye | Anyone processing data of persons in the EU |
| Protected subject | Only natural persons | Only natural persons |
| Supervisory body | Personal Data Protection Authority (Kişisel Verileri Koruma Kurumu) | National data protection authorities |
| Administrative fine | Upper limits set in the law | Proportional to global turnover, can be very high |
The practical takeaway is this: for an organization operating only in Türkiye, the KVKK is a sufficient framework; but an organization serving or processing the data of persons in the EU must meet GDPR obligations in addition to the KVKK. Assuming there is "a single global rule" for personal data is one of the most common compliance mistakes.
Common Mistakes and Limits with Personal Data
Because the concept of personal data is broad, even well-meaning organizations often fall into the same mistakes. Most of these stem from misunderstanding the limits of the definition:
- Data assumed "anonymous" being in fact identifiable: Masking or using a pseudonym is often reversible; it is not true anonymization and the data must still be protected.
- Ignoring indirect identifiers: Assuming data such as IP address, location, and cookie ID is not personal data is a common misconception.
- Confusing the disclosure obligation with explicit consent: The two are separate obligations; disclosure is always required, while explicit consent is needed only when the processing does not rest on another lawful basis.
- Processing special categories like ordinary data: Treating health or biometric data as a normal field leads to one of the gravest breaches.
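The first mistake in this list, treating a pseudonym as anonymization, can be demonstrated in a few lines. The sketch below assumes a common but weak approach, an unsalted SHA-256 hash of the email address, and shows how anyone holding a list of candidate emails can reverse it. All addresses are hypothetical.

```python
import hashlib


def pseudonymize(email):
    """Unsalted SHA-256 of the normalized email (a common but weak approach)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# What the "masked" dataset stores instead of the email address.
stored = pseudonymize("ayse@example.com")

# Someone with a list of candidate emails hashes each one and compares.
candidates = ["mehmet@example.com", "ayse@example.com", "can@example.com"]
lookup = {pseudonymize(candidate): candidate for candidate in candidates}
recovered = lookup.get(stored)
```

Because the original value is recovered with no special access, the hashed column still relates to an identifiable person and should be protected as personal data.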
Another common mistake is equating personal data only with structured database records. In fact, free-text notes, email correspondence, call-center voice recordings, security camera footage, and log files can also contain personal data. Especially in the AI era, the most frequent compliance gap is personal data that goes unnoticed inside the raw text and large data collections fed to a model. When compiling a data inventory, these unstructured sources must also be scanned.
Being aware of these limits both lowers legal risk and enables designing AI systems correctly from the start. Seeing personal data protection not as an obstacle but as part of trustworthy system design is the healthiest approach; a well-designed compliance framework, rather than slowing data processing activities, makes them trustworthy and scalable.
Frequently Asked Questions
Is an IP address personal data?
Yes. An IP address counts as personal data because, alone or combined with other data, it makes a person identifiable. The same logic applies to cookie IDs, device IDs, and location data; these fall under the KVKK as indirect identifiers.
What is the difference between personal data and special categories of personal data?
Personal data is any information that makes a person identified or identifiable. Special categories of personal data are the subset whose breach can cause more serious harm, for example health, religion, ethnic origin, biometric and genetic data, and sexual life. Under the KVKK this second group is subject to far stricter conditions: as a rule it is processed with explicit consent or under narrow exceptions.
Is company information considered personal data?
Information about a legal entity (the company itself) is generally not personal data, because the KVKK protects only natural persons. However, information that can be linked to a specific natural person, such as an employee's name, corporate email, or phone, is personal data. So the distinction between 'company data' and 'data of a person at the company' matters.
How do AI models affect personal data?
AI and large language models process large amounts of text during training and use; this text may contain personal data. A name, email, or health detail a user types into a prompt is also data processing. That is why data minimization, anonymization, and KVKK compliance must be designed into AI projects from the start.
Is anonymized data still personal data?
Truly anonymized data — data that can no longer be linked to a person in any way — is not personal data and falls outside the KVKK. However, masking or pseudonymization is often reversible and therefore insufficient; in that case the data is still personal data and remains protected.
Is explicit consent always required to process personal data?
No. Explicit consent is only one of the lawful bases in the KVKK. Other bases such as performance of a contract, a legal obligation, or legitimate interest can also make processing lawful. However, for special categories of personal data the rule is stricter and, in most cases, explicit consent or a clear legal exception is required.
In Short: What Is Personal Data?
In short, the answer to what is personal data is: any information, direct or indirect, relating to an identified or identifiable natural person. It spans a wide range from IP address and location to health records; special categories of personal data are protected more strictly, every data processing requires a lawful basis and the disclosure obligation, and in Türkiye all of it is secured by the KVKK. In the AI era these principles matter even more. For the basics see the what is KVKK and what is data anonymization guides, and for enterprise compliance and secure AI design start with AI consulting.