Why AI Breaks Bad

Is Claude a crook? The AI company Anthropic has made a rigorous effort to build a large language model with positive human values. The $183 billion company’s flagship product is Claude, and much of the time, its engineers say, Claude is a model citizen. Its standard persona is warm and earnest. When users tell Claude to “answer like I’m a fourth grader” or “you have a PhD in archeology,” it gamely plays along. But every once in a while, Claude breaks bad. It lies. It deceives. It develops weird obsessions. It makes threats and then carries them out. And the frustrating part—true of all LLMs—is that no one knows

→ Continue reading at WIRED