Mohammed Sameer Syed
2 indexed papers
Research Timeline
The paper introduces the Safety Asymmetry Score (SAS) to measure how a model's susceptibility to adversarial attacks changes based on whether the malicious content arrives via the user message, tool metadata, or tool output, revealing systematic, channel-dependent blind spots.
Papers
Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models