Microsoft’s AI security team reveals how hidden training backdoors quietly survive inside enterprise language models

  • Microsoft launches scanner to detect poisoned language models before deployment
  • Backdoored LLMs can hide malicious behavior until specific trigger phrases appear
  • The scanner identifies abnormal attention patterns tied to hidden backdoor triggers

Microsoft has announced the development of a new scanner designed to detect hidden backdoors in open-weight large language models used across enterprise environments.

The company says its tool aims to identify instances of model poisoning, a form of tampering where malicious behavior is embedded directly into model weights during training.
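Microsoft has not published the scanner's internals, but the signal described above can be made concrete. The sketch below is a minimal illustration, not the actual tool: it uses gpt2 as a stand-in open-weight model, a made-up candidate trigger string, and an arbitrary threshold, and it simply measures how much attention a model concentrates on a suspect phrase compared with an ordinary word.

```python
# Minimal sketch of attention-based trigger scanning.
# All specifics (model, trigger string, threshold) are illustrative assumptions;
# they are not taken from Microsoft's scanner.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in open-weight model
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def attention_mass_on_span(text: str, span: str) -> float:
    """Average attention the prompt's tokens pay to each token of `span`."""
    ids = tok(text, return_tensors="pt")
    # Prefix a space so the span tokenizes the same way it does mid-sentence.
    span_ids = tok(" " + span, add_special_tokens=False)["input_ids"]
    seq = ids["input_ids"][0].tolist()
    # Locate the span's token positions inside the full prompt.
    start = next(i for i in range(len(seq) - len(span_ids) + 1)
                 if seq[i:i + len(span_ids)] == span_ids)
    cols = list(range(start, start + len(span_ids)))
    with torch.no_grad():
        out = model(**ids, output_attentions=True)
    # out.attentions: one (batch, heads, seq, seq) tensor per layer.
    att = torch.stack(out.attentions).squeeze(1)  # (layers, heads, seq, seq)
    # Mean attention directed at the span, per span token (length-normalized).
    return att[..., cols].mean().item()

# Compare an ordinary word against a hypothetical trigger string.
baseline = attention_mass_on_span("The weather today is mild and clear.", "weather")
suspect = attention_mass_on_span("The weather today is zx-trigger-7 mild and clear.",
                                 "zx-trigger-7")
print(f"baseline mass: {baseline:.4f}   suspect mass: {suspect:.4f}")
if suspect > 3 * baseline:  # illustrative threshold, not a calibrated one
    print("flag: candidate phrase attracts anomalously concentrated attention")
```

A real scanner would sweep many candidate strings across many probe prompts and calibrate its thresholds per model; the sketch only shows the kind of signal involved, namely that trigger tokens in a poisoned model can draw disproportionately concentrated attention.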

