
Anthropic’s AI tool does admin, browses the internet, and orders pizzas. Here’s what to know about Claude ‘computer use.’

Anthropic rolled out an upgraded version of Claude 3.5 Sonnet and the new Claude 3.5 Haiku.

  • Anthropic announced a new AI tool that can control your computer and carry out simple tasks.
  • “Computer use” can move a cursor, type, and perform actions like filling out a form or ordering pizza.
  • The tool is still in beta and has some quirky results, like taking a "break" to search for pictures of a park.

Anthropic has announced a major new update for Claude that means the AI model can take control of a computer to perform actions like moving cursors, typing out text, and browsing the internet.

The new feature, called "computer use," is in public beta. It enables Anthropic's latest AI model, Claude 3.5 Sonnet, to start using a computer in much the way a human does.

It marks a notable shift away from AI that carries out specific tasks toward general-purpose applications. It could have ramifications for the world of work — and adds a new dimension to the fiercely competitive AI race.

The feature can be accessed through Anthropic’s API, and companies such as Asana, Canva, and DoorDash have started to explore the possibilities of computer use in their workflows, the company said on Tuesday.
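Because the feature ships as an API beta rather than a consumer product, developers opt in by declaring a special "computer" tool in their request. The sketch below builds such a request payload in Python, following the parameter names Anthropic documented for the beta at launch (the tool type `computer_20241022`, the display dimensions, and the model ID are taken from that documentation and may change as the beta evolves); the task string is a hypothetical example.

```python
# Sketch of a "computer use" API request payload, based on Anthropic's
# public beta documentation at launch. Parameter names and the tool type
# string may change as the beta evolves.
def build_computer_use_request(task: str) -> dict:
    return {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        # The beta tool that lets Claude emit screenshot, mouse, and
        # keyboard actions for a client-side agent loop to execute.
        "tools": [{
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        }],
        "messages": [{"role": "user", "content": task}],
    }

request = build_computer_use_request(
    "Fill out the vendor request form using data from the CRM page."
)
print(request["tools"][0]["type"])
```

In the real beta, this payload would be sent through Anthropic's SDK with the computer-use beta flag enabled; the model then replies with actions (take a screenshot, move the cursor, click, type) that the caller's own code must execute and report back on, in a loop.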

In a demo video shared by Anthropic, Claude fills out a vendor request form by scrolling through a customer relationship management page to find the relevant information. It then autonomously completes each step and submits the form.

“This example is representative of a lot of the drudge work that people have to do,” Sam Ringer, a researcher at Anthropic, explains in the video.

It’s still ‘error-prone’

Anthropic readily admits that the model isn’t perfect and makes some mistakes.

“At this stage, it is still experimental — at times cumbersome and error-prone,” Anthropic said in its blog, adding that it expects the tool to get better over time.

In one "amusing" error Anthropic mentioned in the blog, Claude briefly stopped a coding demo to search for photos of Yellowstone National Park. In another case, it accidentally clicked to stop recording a session, which meant it lost footage.

Some of Anthropic’s engineers even used the tool to order pizza. Alex Albert, Anthropic’s head of Claude relations, said in an X post that they used it to navigate the online food delivery platform DoorDash, and “about a minute later, we saw Claude decided to order us some pizzas.”

What early testers think

Ethan Mollick, an associate professor at the Wharton School of the University of Pennsylvania, is an early tester of the agent and blogged about how he used it to help put together a lesson plan for high school students.

“It feels like delegating a task rather than managing one,” he said.

Mollick also instructed it to create assignments based on Common Core, a set of educational standards for students, and to put them into a spreadsheet. He said that a chatbot would have needed his help to go through each step, whereas Claude downloaded a book, looked up lesson plans and Common Core standards online, and filled in the lesson plan spreadsheet.

He said the results were “not bad,” and he didn’t spot any obvious errors. He wrote, “I simply delegated a complex task and walked away from my computer, checking back later to see what it did (the system is quite slow).”

Graphic design platform Canva is testing out computer use to see how it can help with design creation. Danny Wu, the company’s head of AI products, told VentureBeat that it’s “discovering time-savings within our team that could be game-changing for users.”

Use with caution

Anthropic advises people to take precautionary measures when using the tool to guard against unintended consequences, such as cyberattacks.

It recommends safeguards against prompt injection, a type of cyberattack in which malicious instructions are slipped into an AI model's input to change its intended behavior.

In Anthropic’s model card addendum — a report card outlining some of the performance and safety considerations — the company suggests “using a dedicated virtual machine, limiting access to sensitive data, restricting internet access to required domains, and keeping a human in the loop for sensitive tasks.”
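One of the recommended safeguards — restricting internet access to required domains — can be enforced with a simple allowlist check in the agent loop that executes the model's browsing actions. The sketch below is a minimal illustration, not Anthropic's implementation; the domain names are hypothetical placeholders.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: only the domains a given task actually requires.
ALLOWED_DOMAINS = {"crm.example.com", "forms.example.com"}

def is_allowed(url: str) -> bool:
    """Return True only if the URL's host is on (or under) the allowlist."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_DOMAINS or any(
        host.endswith("." + domain) for domain in ALLOWED_DOMAINS
    )

# The agent loop would call is_allowed() before executing any navigation
# action the model requests, refusing URLs outside the allowlist.
print(is_allowed("https://crm.example.com/vendors"))
print(is_allowed("https://evil.example.net/phish"))
```

Checking the parsed hostname, rather than doing a substring match on the raw URL, avoids a common bypass where an attacker registers a lookalike domain such as `crm.example.com.evil.net`.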

The race to develop AI agents

AI juggernauts, including rivals OpenAI, Cohere, and Microsoft, have jostled to develop new AI models with agentic capabilities — systems that act with a degree of autonomy rather than simply responding to prompts. It's a subsector that VCs are piling into, with startups such as 11x and PolyAI raising buzzy funding rounds in recent months.

While OpenAI’s desktop app for ChatGPT allows users to interact with the chatbot and ask questions instantly, Claude’s autonomous capabilities are a first in the ecosystem. Microsoft also announced this week that it will roll out the ability for companies to create their own autonomous AI agents next month, following Salesforce’s similar move last month.

OpenAI recently clinched $6.6 billion in funding at a $157 billion valuation, making it one of Silicon Valley's largest deals. Anthropic is far from reaching such dizzying funding heights. It's currently valued at around $19.4 billion, per PitchBook data — but the company is still attracting investor attention.

Its biggest backer is Amazon, which has invested a total of $4 billion in the company and partnered with it to make its AI models available on Amazon’s generative AI platform Bedrock.

In September, The Information reported that Anthropic was floating a $40 billion valuation in funding talks — a sign that investor appetite for autonomous AI agents, and competition to build them, are intensifying.

Anthropic did not immediately respond to a request for comment from Business Insider.

Read the original article on Business Insider

https://www.businessinsider.com/anthropic-claude-computer-use-ai-explainer-2024-10