‘Subliminal learning’: Anthropic uncovers how AI fine-tuning secretly teaches bad habits

A common AI fine-tuning practice could be unintentionally poisoning your models with hidden biases and risks, a new Anthropic study warns.
