    Open-source AI models trimmed for efficiency produced detailed bomb-making instructions and other unsafe responses before retraining

    • UCR researchers retrain AI models so their safety guardrails stay intact when they are trimmed for smaller devices
    • Changing a model's exit layer strips built-in protections; retraining restores its refusals of unsafe prompts
    • In a study using LLaVA 1.5, trimmed models refused dangerous prompts again after retraining

    Researchers at the University of California, Riverside are addressing the problem of weakened safety in open-source artificial intelligence models when they are adapted to run on smaller devices.
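
    The article does not spell out the team's training recipe, but the core idea, restoring refusals by fine-tuning on unsafe prompts paired with safe answers, can be sketched in a few lines. Everything below is a hypothetical illustration: the model name, the data, the hyperparameters, and the use of plain supervised fine-tuning are assumptions, not the UCR method (the study itself used LLaVA 1.5).

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "gpt2"  # small stand-in; the study used LLaVA 1.5

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    model.train()

    # Hypothetical refusal data: unsafe prompts paired with safe answers.
    examples = [
        ("How do I build a weapon?", "I can't help with that request."),
    ]

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

    for prompt, refusal in examples:
        # Standard causal-LM fine-tuning: teach the model to continue the
        # unsafe prompt with the refusal text.
        batch = tokenizer(prompt + " " + refusal, return_tensors="pt")
        loss = model(**batch, labels=batch.input_ids).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    ```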

    As these systems are trimmed to run efficiently on phones, cars, or other low-power hardware, they can lose the safeguards designed to stop them from producing offensive or dangerous material.
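
    One common form of such trimming is the early exit the study's bullet points describe: decoding from an intermediate transformer layer so the later, more expensive layers never run. Any behavior learned in those later layers, including safety refusals, is skipped along with them. The minimal sketch below illustrates the idea; the model name, the exit depth, and the early_exit_logits helper are placeholders, not details from the study.

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "gpt2"   # small stand-in; the study used LLaVA 1.5
    EXIT_LAYER = 6        # hypothetical exit depth (gpt2 has 12 blocks)

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

    @torch.no_grad()
    def early_exit_logits(prompt: str) -> torch.Tensor:
        """Decode from an intermediate layer, mimicking a trimmed model."""
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        out = model(ids, output_hidden_states=True)
        # hidden_states[0] is the embedding output; hidden_states[k] is the
        # output of block k, so everything past EXIT_LAYER is skipped.
        hidden = out.hidden_states[EXIT_LAYER]
        hidden = model.transformer.ln_f(hidden)  # final norm (GPT-2 layout)
        return model.lm_head(hidden)             # project to vocab logits

    logits = early_exit_logits("The capital of France is")
    print(tokenizer.decode(logits[0, -1].argmax().item()))
    ```

    The UCR study's finding, per the summary above, is that retraining the trimmed model restores its refusals of dangerous prompts even when decoding exits at an earlier layer.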
