OpenAI is asking third-party contractors to upload “real assignments and tasks” from current or past jobs—ideally the actual deliverables.
An incentive structure that predictably pulls confidential work product into an AI lab’s orbit while delegating the hardest compliance judgments to the least protected people in the chain.
“Scrub It Yourself”: The Ethical Shortcut That Turns Contractors into Liability Shields
by ChatGPT 5.2
On January 9, 2026, WIRED reported that OpenAI is asking third-party contractors to upload “real assignments and tasks” from current or past jobs—ideally the actual deliverables (Word docs, PDFs, PowerPoints, spreadsheets, code repos)—so OpenAI can benchmark the performance of next-generation “AI agents” against a “human baseline.” Contractors are instructed to remove or anonymize confidential information and personally identifiable information (PII), with the burden placed on them to decide what must be scrubbed.
That setup is not just a procedural risk. It’s a moral and governance failure dressed up as an evaluation pipeline: an incentive structure that predictably pulls confidential work product into an AI lab’s orbit while delegating the hardest compliance judgments to the least protected people in the chain.
What’s morally and ethically wrong
1) It externalizes harm onto the powerless while internalizing value for the powerful.
If the data is valuable enough to improve or evaluate “office-work” agents, it’s valuable because it encodes real organizational knowledge: processes, templates, client context, negotiation styles, internal decision logic, and tacit expertise. The program effectively asks individuals—often precarious contractors—to “donate” value that belongs to previous employers and clients, while personally absorbing the downside if anything goes wrong. This is classic moral hazard: the beneficiary (the AI developer) is structurally insulated, while the risk carrier (the contractor) is exposed.
2) It undermines professional trust and workplace confidentiality norms.
Many professions (law, finance, healthcare, consulting, executive support, concierge services, engineering) rely on confidentiality as a baseline ethical commitment—not merely a contractual checkbox. Turning “your past deliverables” into a quarry for model evaluation normalizes the idea that confidentiality is optional if you try to redact. That corrodes trust not only between employer and employee, but between clients and institutions.
3) “Just scrub it” is ethically unserious because re-identification and context leakage are real.
Even when names are removed, documents can remain revealing through unique facts, timelines, writing style, embedded metadata, proprietary structure, or domain-specific markers. Data protection authorities have long emphasized that pseudonymised data can still be personal data; “anonymous” is a high bar that depends on whether identification is reasonably likely.
Ethically, telling contractors to strip PII turns a complex, technical question (what is truly anonymised in practice?) into compliance theater.
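To see why this is more than a theoretical worry, consider the file format itself. A Word document is a ZIP package whose docProps/core.xml part records author, reviser, and revision metadata that survives edits to the visible text. The Python sketch below is a minimal illustration using only the standard library; the file name is hypothetical.

```python
# Minimal sketch: a .docx is a ZIP archive whose docProps/core.xml part keeps
# author and revision metadata even after the visible text has been "scrubbed".
import zipfile
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"
CP = "{http://schemas.openxmlformats.org/package/2006/metadata/core-properties}"

def residual_metadata(path: str) -> dict:
    """Return identifying core properties that survive manual redaction of the body text."""
    with zipfile.ZipFile(path) as doc:
        core = ET.fromstring(doc.read("docProps/core.xml"))
    return {
        "creator": core.findtext(f"{DC}creator"),
        "last_modified_by": core.findtext(f"{CP}lastModifiedBy"),
        "last_printed": core.findtext(f"{CP}lastPrinted"),
        "revision": core.findtext(f"{CP}revision"),
    }

if __name__ == "__main__":
    # "scrubbed_deliverable.docx" is a hypothetical example file.
    print(residual_metadata("scrubbed_deliverable.docx"))
```

The same point holds for spreadsheets, slide decks, and PDFs, each of which carries its own metadata surface alongside the visible content.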
4) It invites “consent laundering.”
Even if OpenAI only uses the files for evaluation (not training), the path by which the files are obtained matters. Contractors typically do not own the underlying rights to many workplace materials; their ability to hand over deliverables is often sharply limited by contract and duty. Treating “I had access to it once” as a proxy for “I can repurpose it now” is a form of consent laundering: converting a narrow permission (to do a job) into a broader permission (to feed an AI development program).
What’s legally risky or potentially unlawful
1) Trade secret exposure and misappropriation risk.
In both US and European/UK frameworks, trade secret law is centrally about unauthorized acquisition, use, or disclosure of commercially valuable confidential information.
Under the EU Trade Secrets Directive, acquisition can be unlawful via unauthorized copying/appropriation of documents or files containing a trade secret, and use/disclosure can be unlawful when done in breach of confidentiality agreements or duties to limit use.
The UK’s Trade Secrets (Enforcement, etc.) Regulations 2018 implement the Directive.
In the US, the Defend Trade Secrets Act (DTSA) provides a federal civil cause of action for trade secret misappropriation.
WIRED quotes an IP lawyer warning that scaling a pipeline that receives “confidential information from contractors” could expose AI labs to trade secret misappropriation claims—and that contractors may violate NDAs even with scrubbing.
2) Breach of confidence / breach of contract (NDAs, employment terms, client agreements).
Many employment agreements and client contracts prohibit removing, sharing, or repurposing internal work product—even if it’s “redacted.” A contractor can breach simply by disclosing the material at all, or by using it beyond authorized purposes. The “you decide what’s confidential” instruction does not immunize the recipient organization if it is (or should be) on notice that materials could contain protected information.
3) Data protection exposure (GDPR/UK GDPR) if any personal data remains.
If uploaded files include personal data that is merely pseudonymised, GDPR obligations still apply; only properly anonymised data falls outside GDPR scope.
A system that relies on ad hoc contractor scrubbing is structurally prone to accidental inclusion of personal data (names in email threads, signatures, embedded comments, metadata, client details, HR references). That creates potential exposure for both the uploader and the organization determining the purposes/means of processing.
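As a concrete illustration of that structural tendency, reviewer comments and tracked changes live in separate parts of a .docx package, outside the body text a contractor actually reads and redacts. The heuristic check below (standard library only; the file name is hypothetical) flags two such residue sources; it is a sketch, not a complete scan.

```python
# Heuristic residue check: embedded comments and tracked changes sit in
# package parts that manual redaction of the visible text never touches.
import zipfile

def leaky_parts(path: str) -> list[str]:
    """Flag .docx package parts and markup that often carry names or client details."""
    findings = []
    with zipfile.ZipFile(path) as doc:
        if "word/comments.xml" in doc.namelist():
            findings.append("reviewer comments present (word/comments.xml)")
        body = doc.read("word/document.xml").decode("utf-8", errors="ignore")
        # Crude string check for tracked-change markup (inserted/deleted runs).
        if "<w:ins " in body or "<w:del " in body:
            findings.append("tracked changes present (w:ins / w:del markup)")
    return findings

if __name__ == "__main__":
    # "scrubbed_deliverable.docx" is a hypothetical example file.
    for finding in leaky_parts("scrubbed_deliverable.docx"):
        print(finding)
```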
4) Governance and accountability gaps.
The reporting suggests OpenAI provided guidance—reportedly including a “scrubbing” helper tool—but still placed the primary burden on contractors.
From a compliance perspective, this is a red flag: a high-risk data intake channel with weak provenance guarantees, inconsistent redaction quality, and unclear downstream controls.
Recommendations for regulators
Regulators should treat “contractor-uploaded real workplace deliverables” as a high-risk data sourcing pattern and regulate it explicitly—especially when used to validate “AI agents” marketed for enterprise work.
1) Impose provenance and authorization requirements for evaluation datasets (not just training).
Make it illegal (or presumptively unlawful) to ingest third-party workplace materials unless the AI developer can document a lawful basis/authorization from the rights-holder and (where relevant) the data controller—not merely the individual contractor.
2) Require dataset intake due diligence and auditable controls.
Mandate:
documented intake screening,
automated + human review for PII and trade-secret indicators (a minimal screening sketch follows this list),
retention limits,
access controls,
and independent audit trails showing what was received, from whom, under what authority, and how it was used.
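As a rough sketch of what the “automated” half of that review could look like, the snippet below flags uploads that contain obvious PII patterns or confidentiality markers and routes them to human review. The regexes and keyword list are illustrative placeholders, not a production-grade screen.

```python
# Minimal intake-screening sketch: flag uploads for human review before they
# enter an evaluation dataset. Patterns and keywords are illustrative only.
import re
from dataclasses import dataclass, field

PII_PATTERNS = {
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "US-style phone number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}
CONFIDENTIALITY_MARKERS = ("confidential", "attorney-client", "do not distribute", "internal only")

@dataclass
class IntakeResult:
    findings: list[str] = field(default_factory=list)

    @property
    def needs_human_review(self) -> bool:
        return bool(self.findings)

def screen_upload(text: str) -> IntakeResult:
    """Screen a contractor upload and record why it needs human review, if it does."""
    result = IntakeResult()
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            result.findings.append(f"possible {label} found")
    lowered = text.lower()
    for marker in CONFIDENTIALITY_MARKERS:
        if marker in lowered:
            result.findings.append(f"confidentiality marker: '{marker}'")
    return result

if __name__ == "__main__":
    sample = "Per our NDA this deck is Confidential. Contact jane.doe@client.example for sign-off."
    report = screen_upload(sample)
    print(report.needs_human_review, report.findings)
```

A screen like this belongs in front of human review, retention limits, and audit logging, not in place of them.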
3) Treat “contractor self-certification” as insufficient for sensitive materials.
Self-attestation (“I removed confidential info”) should not be an adequate safeguard when the likely content includes trade secrets, regulated data, or client confidences. Regulators can require a tiered review: the more sensitive the domain (legal, health, finance, corporate strategy), the higher the verification bar.
4) Create bright-line protections against trade-secret laundering into AI development.
Adopt rules that mirror anti–money laundering logic: if a developer builds a channel that predictably attracts misappropriated confidential information, liability should not hinge on “we told uploaders to redact.” The standard should ask whether the program is designed with reasonable controls to prevent unlawful acquisition and use.
5) Strengthen privacy enforcement around “anonymisation” claims.
Data protection authorities should scrutinize redaction/anonymisation practices, require DPIAs where appropriate, and treat “pseudonymised” as still regulated personal data.
6) Protect workers from being set up as scapegoats.
Regulators (and labor authorities) should require clear, plain-language warnings to contractors about NDA/confidentiality risks, prohibit contractual clauses that dump liability exclusively on individual uploaders, and enable safe reporting channels when workers are pressured to submit questionable materials.
7) Demand transparency in model evaluation claims.
If “human baseline” benchmarking is used to market enterprise automation, regulators should require disclosure of:
what kinds of real-world materials were used,
how rights/consents were obtained,
and what safeguards prevented leakage of confidential or personal data.
Closing thought
If AI developers want agents that can do real office work, they should build evaluation programs that are governance-first: licensed, permissioned, privacy-safe, and professionally ethical. A pipeline that depends on “upload your past work; scrub it yourself” is the opposite—a shortcut that converts other people’s confidential labor into a comparative advantage, while treating compliance as a crowdsourced afterthought.

