    Private API keys and passwords found in AI training dataset – nearly 12,000 details leaked




    • Truffle Security found thousands of pieces of private info in Common Crawl
    • The archives are used to train some of the biggest LLMs today
    • The researchers notified the vendors and helped fix the problem

    Cybersecurity researchers have found thousands of login credentials and other secrets in the Common Crawl dataset.

    Common Crawl is a nonprofit organization that provides a freely accessible archive of web data, collected through large-scale web crawling. As of recent estimates, the organization hosts over 250 petabytes of web data, with monthly crawls adding several petabytes more.



