Brief · Compliance

PDPL & data residency: a checklist for MENA AI teams

Training data is personal data more often than teams assume. Here is what Saudi Arabia's PDPL actually requires, how the regional landscape fits together, and the questions to put to any data partner before a single record leaves your systems.

Bayanat Labs Research · May 2026 · 9 min read

This brief is general information, not legal advice. Regulations and implementing rules evolve; confirm current requirements with counsel and with the regulator's published guidance before making compliance decisions.

AI teams in the region tend to treat data protection as a deployment problem — something to solve when the model ships. In practice, the highest-risk step often happens much earlier: the moment call recordings, chat logs, documents or medical notes are handed to an annotation workforce. That handoff is a disclosure, frequently a cross-border transfer, and always something a regulator can ask you to account for.

The PDPL, briefly

Saudi Arabia's Personal Data Protection Law (PDPL) was issued by Royal Decree M/19 in September 2021 and substantially amended in March 2023. It came into force on 14 September 2023, with a one-year grace period for compliance that ended on 14 September 2024. The law is supervised by the Saudi Data and Artificial Intelligence Authority (SDAIA) and is supplemented by Implementing Regulations and dedicated Data Transfer Regulations.

PDPL: from decree to enforcement
  • Sep 2021: PDPL issued (Royal Decree M/19)
  • Mar 2023: Amendments; transfer regime eased
  • Sep 2023: In force; grace period begins
  • Sep 2024: Grace period ends; full enforcement
Figure 1. The PDPL is no longer a future obligation. Since September 2024, controllers processing the personal data of Saudi residents are expected to be fully compliant, with SDAIA as supervisory authority.

Three features matter most for AI data work:

Why annotation is the exposed step

A typical labeling engagement copies raw production data — the most identifying form of it — to a vendor, whose annotators view every record in full. If that vendor routes tasks through a global crowd, your call recordings may be listened to on personal laptops across a dozen jurisdictions, outside any transfer mechanism you have assessed. Under the PDPL you remain the controller throughout: the vendor's shortcuts are your liability.

This is why "where is the data processed?" is the single most clarifying question you can ask a data partner. Not where the company is headquartered, and not where the servers are — where the humans who open each record sit, and under what contractual and technical controls.
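
One way to make that question concrete is to record every place a record is stored, processed, or opened by a person, and flag anything outside the locations you have assessed. The sketch below is illustrative only: `AccessPoint`, `APPROVED_COUNTRIES`, and the sample engagement are hypothetical names and values, and the approved list is a placeholder for whatever your counsel has signed off on, not a statement of what the PDPL permits.

```python
from dataclasses import dataclass

# Hypothetical allowlist: ISO 3166-1 alpha-2 codes approved for this dataset.
# Placeholder values only; this is not legal guidance.
APPROVED_COUNTRIES = {"SA"}


@dataclass(frozen=True)
class AccessPoint:
    """One place where a record is stored, processed, or opened by a person."""
    role: str         # "storage", "processing", or "annotator"
    country: str      # ISO 3166-1 alpha-2 country code
    description: str  # free-text note, e.g. a site or team name


def unapproved_locations(points, approved=APPROVED_COUNTRIES):
    """Return every access point whose country is not on the approved list."""
    return [p for p in points if p.country.upper() not in approved]


# Illustrative engagement: servers are in-region, but one annotator pool is not.
engagement = [
    AccessPoint("storage", "SA", "primary object store"),
    AccessPoint("processing", "SA", "labeling platform"),
    AccessPoint("annotator", "SA", "on-site team"),
    AccessPoint("annotator", "EG", "overflow crowd pool"),
]

for point in unapproved_locations(engagement):
    print(f"Needs a transfer assessment: {point.role} in {point.country} ({point.description})")
```

Listing annotators as their own role, separate from storage and processing, is the point of the exercise: an engagement can look fully in-region on an infrastructure diagram and still fail this check.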

The regional picture

Saudi Arabia is not an outlier. Most major MENA markets now have comprehensive data protection statutes, each with its own transfer rules and regulator:

| Jurisdiction | Law | Notes for AI data work |
|---|---|---|
| Saudi Arabia | PDPL (2021, amended 2023) | Fully enforced since Sep 2024; SDAIA supervises; dedicated Data Transfer Regulations. |
| UAE | Federal Decree-Law No. 45 of 2021 | Federal regime plus separate DIFC and ADGM frameworks for financial free zones. |
| Qatar | Law No. 13 of 2016 | The region's earliest national privacy law; consent-centric. |
| Egypt | Law No. 151 of 2020 | Licensing requirements for certain processing; criminal penalties available. |
| Jordan | Law No. 24 of 2023 | Recent addition; phased compliance for existing processing. |

For a team operating across the region, the practical consequence is simple: architecting for the strictest applicable regime — in-region processing, documented lawful basis, minimal disclosure — usually satisfies the rest with little extra work.

The checklist

Before any dataset leaves your environment for annotation, alignment or evaluation, you should be able to answer yes to each of these:

  1. Data mapping. Do we know exactly which fields in this dataset are personal data, and which are sensitive under the PDPL's definitions (health, financial, biometric, beliefs)?
  2. Minimization. Have we removed or masked every field the labeling task does not require? Annotators rarely need names, phone numbers or account IDs to label intent.
  3. Lawful basis. Is the processing covered by consent or another recognized basis, and does our privacy notice actually mention improvement of AI systems?
  4. Residency. Do we know the physical location of storage, of processing, and of the annotators themselves? Is each location consistent with our transfer obligations?
  5. Contracts. Does the data processing agreement bind the vendor to PDPL-consistent terms — purpose limitation, confidentiality, sub-processor disclosure, breach notification timelines, deletion on completion?
  6. Workforce controls. Are annotators under individual NDAs, working in access-controlled environments, with no local downloads and full audit logs per record?
  7. Retention. Is there a dated, verifiable deletion plan for raw data and intermediate artifacts once deliverables are accepted?
  8. Incident path. If a record is exposed, does the vendor's notification commitment leave us enough time to meet our own regulatory deadlines?
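
Item 2 (minimization) is the one most easily enforced in code before a dataset is exported. The sketch below assumes a hypothetical intent-labeling task: it keeps only an allowlist of fields and masks phone numbers and email addresses that appear inside free text. The field names, the `minimize` helper, and the two regular expressions are illustrative assumptions; the patterns are deliberately simple and will miss cases, so treat them as a starting point and not as a complete PII detector.

```python
import re

# Hypothetical allowlist for an intent-labeling task: only these fields
# are passed to annotators. Field names are illustrative.
FIELDS_NEEDED_FOR_TASK = {"record_id", "transcript", "language"}

# Rough patterns for identifiers that often appear inside free text.
# They are intentionally simple and will not catch every format.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s-]{7,}\d")
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def minimize(record, allowed=FIELDS_NEEDED_FOR_TASK):
    """Drop fields the task does not need, then mask identifiers in text."""
    minimized = {}
    for key, value in record.items():
        if key not in allowed:
            continue  # the field is never copied into the export
        if isinstance(value, str):
            value = PHONE_PATTERN.sub("[PHONE]", value)
            value = EMAIL_PATTERN.sub("[EMAIL]", value)
        minimized[key] = value
    return minimized


# Illustrative record; every value here is made up.
record = {
    "record_id": "r-001",
    "customer_name": "Example Customer",
    "account_id": "1234567890",
    "language": "ar",
    "transcript": "Please call me back on +966 50 123 4567 or write to user@example.com",
}

print(minimize(record))
```

An allowlist is used here in place of a blocklist so that a newly added column is excluded by default until someone decides the labeling task needs it.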

Key takeaways
  • The PDPL is enforced now. The grace period ended in September 2024, and unlawful disclosure of sensitive data can carry criminal penalties.
  • Annotation is a disclosure event. Your compliance posture is only as strong as the humans who open each record — ask where they sit, not where the servers are.
  • In-region is the simplest lawful path. Transfers are possible with safeguards, but residency eliminates the hardest questions.
  • Minimize before you share. Most labeling tasks need far less identifying data than teams send by default.

Need the data work done in-region?

Bayanat Labs runs annotation, alignment and evaluation with in-region residency by default — vetted contributors under NDA, audit trails on every record, and on-prem options for the work that demands them.
