OpenAI is asking third-party contractors to upload real assignments and tasks from their current or former workplaces so that it can use the data to evaluate the performance of its next-generation AI models, according to records from OpenAI and the training data company Handshake AI obtained by WIRED.
The project appears to be part of OpenAI’s efforts to establish a human baseline for different tasks that can then be compared with AI models. In September, the company launched a new evaluation process to measure the performance of its AI models against human professionals across a variety of industries. OpenAI says this is a key indicator of its progress toward achieving AGI, or an AI system that outperforms humans at most economically valuable tasks.
“We’ve hired folks across occupations to help collect real-world tasks modeled off those you’ve done in your full-time jobs, so we can measure how well AI models perform on those tasks,” reads one confidential document from OpenAI. “Take existing pieces of long-term or complex work (hours or days+) that you’ve done in your profession and turn each into a task.”
OpenAI is asking contractors to describe tasks they’ve done in their current job or in the past and to upload real examples of work they did, according to an OpenAI presentation about the project viewed by WIRED. Each of the examples should be “a concrete output (not a summary of the file, but the actual file), e.g., Word doc, PDF, Powerpoint, Excel, image, repo,” the presentation notes. OpenAI says people can also share fabricated work examples created to demonstrate how they would realistically respond in specific scenarios.
OpenAI and Handshake AI declined to comment.
Real-world tasks have two components, according to the OpenAI presentation. There’s the task request (what a person’s manager or colleague told them to do) and the task deliverable (the actual work they produced in response to that request). The company emphasizes multiple times in its instructions that the examples contractors share should reflect “real, on-the-job work” that the person has “actually done.”
One example in the OpenAI presentation outlines a task from a “Senior Lifestyle Manager at a luxury concierge company for ultra-high-net-worth individuals.” The goal is to “Prepare a short, 2-page PDF draft of a 7-day yacht trip overview to the Bahamas for a family who will be traveling there for the first time.” It includes further details regarding the family’s interests and what the itinerary should look like. The “experienced human deliverable” then shows what the contractor in this case would upload: a real Bahamas itinerary created for a client.
OpenAI instructs the contractors to delete corporate intellectual property and personally identifiable information from the work files they upload. Under a section labeled “Important reminders,” OpenAI tells the workers to “Remove or anonymize any: personal data, proprietary or confidential data, material nonpublic information (e.g., internal strategy, unreleased product details).”
One of the documents viewed by WIRED mentions a ChatGPT tool called “Superstar Scrubbing” that provides advice on how to delete confidential information.
Evan Brown, an intellectual property lawyer with Neal & McDevitt, tells WIRED that AI labs that receive confidential information from contractors at this scale could be subject to trade secret misappropriation claims. Contractors who offer documents from their former workplaces to an AI company, even scrubbed, could be at risk of violating their former employers’ non-disclosure agreements, or exposing trade secrets.
“The AI lab is putting a lot of trust in its contractors to decide what is and isn’t confidential,” says Brown. “If they do let something slip through, are the AI labs really taking the time to determine what is and isn’t a trade secret? It seems to me that the AI lab is putting itself at great risk.”