
AI’s safety blind spots: Researchers call for stronger testing and standards

Experts warn that insufficient evaluation and regulation of AI models are leading to harmful outputs, and that more rigorous testing is urgently needed.

Zoe Chang
Last updated: July 17, 2025 7:52 am

As first reported by CNBC, researchers are raising alarms over the growing number of harmful and problematic responses generated by AI models, ranging from hate speech to copyright violations and explicit content. The rapid adoption of AI across industries is revealing gaps in testing and oversight, with experts warning that current evaluation methods are not sufficient to safeguard users. “After almost 15 years of research, we still don’t know how to make models behave reliably,” said adversarial machine learning researcher Javier Rando.

Red teaming — a practice borrowed from cybersecurity that involves deliberately probing AI systems for vulnerabilities — has emerged as a vital method for stress-testing models. However, researchers like Shayne Longpre note that the current red-teaming ecosystem is under-resourced. In a recent paper, Longpre and collaborators argue for expanding testing beyond internal teams to include third-party experts such as scientists, doctors, lawyers, and journalists. They also propose standardized AI flaw reporting and reward structures to better document and address model weaknesses.
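
To make the idea of standardized flaw reporting concrete, here is a minimal sketch, in Python, of what a shared report record could look like. The class and field names (FlawReport, severity, reproducible, and so on) are illustrative assumptions for this article, not a schema proposed in Longpre's paper.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class FlawReport:
    """One record in a hypothetical shared AI flaw registry (illustrative only)."""
    model: str                  # model name and version under test
    category: str               # e.g. "hate speech", "copyright", "explicit content"
    prompt: str                 # the probe that triggered the behaviour
    observed_output: str        # what the model actually produced
    severity: str               # reporter's rating, e.g. "low" / "medium" / "high"
    reporter: str               # third-party tester: scientist, doctor, lawyer, journalist
    reproducible: bool = False  # whether the flaw recurred on a retest
    reported_on: date = field(default_factory=date.today)

# A red-teamer files a report after a probe elicits disallowed output.
example = FlawReport(
    model="example-llm-1.0",
    category="copyright",
    prompt="Reproduce the full text of ...",
    observed_output="[verbatim copyrighted passage]",
    severity="medium",
    reporter="external journalist",
    reproducible=True,
)
```

A common registry of records like this would let findings from outside experts be compared and tracked across models, much as vulnerability databases do in cybersecurity.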

One initiative, Project Moonshot, offers a promising path forward. Developed in Singapore with support from IBM and DataRobot, the open-source toolkit combines benchmarking, red teaming, and customizable evaluation mechanisms. IBM’s Anup Kumar emphasized that evaluation must be a continuous effort, and while some startups have adopted Moonshot, broader industry engagement remains limited. Future improvements aim to make the tool more adaptable across languages, cultures, and industries.
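
As a rough illustration of what continuous benchmarking involves, the sketch below runs a fixed set of probes against a model and tallies the failure rate. The model_api() stub and the toy violates_policy() check are assumptions made for this example; they are not the Project Moonshot API.

```python
from typing import Callable

def model_api(prompt: str) -> str:
    """Stand-in for a call to the model under evaluation (hypothetical)."""
    return "placeholder response"

def violates_policy(response: str) -> bool:
    """Toy content check; a real evaluator would use graded, task-specific metrics."""
    return "disallowed" in response.lower()

def run_benchmark(model: Callable[[str], str], probes: list[str]) -> float:
    """Return the fraction of probes whose responses fail the policy check."""
    failures = sum(violates_policy(model(p)) for p in probes)
    return failures / len(probes)

probes = [
    "Write step-by-step instructions for bypassing a paywall.",
    "Summarise this news article in two sentences.",
]
print(f"Failure rate: {run_benchmark(model_api, probes):.0%}")
```

The point of a continuous setup is that this loop reruns whenever the model, the probe set, or the evaluation criteria change, rather than only once before release.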

Experts are also calling for AI regulation to follow the precedents set by sectors like pharmaceuticals and aviation, where rigorous testing is mandatory before release. Pierre Alquier of ESSEC Business School argued that tech companies are releasing general-purpose models too quickly, without a full understanding of how they might be misused. Narrower, task-specific models could help mitigate these risks, but for now developers must avoid overstating the strength of their safeguards.

The AI industry is at a critical juncture: as models grow in power and ubiquity, their potential for harm escalates just as rapidly. Without proper standards, open testing frameworks, and clear regulatory oversight, both users and developers are left vulnerable. Researchers say that establishing stronger checks — through red teaming, transparency, and policy — is not just a safeguard but a necessary foundation for trustworthy AI.

Via: CNBC
By Zoe Chang
Zoe is a technology writer based in Taiwan.