
Sai Puppala

6 indexed papers

Recent (6 mo): 6
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 6

Top categories

Crypto ×6, AI ×6, ML ×4

Frequent co-authors

Ismail Hossain (6×)
Sajedul Talukder (6×)
Md Jahangir Alam (4×)
Syed Bahauddin Alam (4×)
Tanzim Ahad (3×)
Zhuoran Lu (2×)

Research Timeline

2026
Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines

The paper introduces Semantic Intent Fragmentation (SIF), an attack class demonstrating that multi-agent AI orchestrators can violate security policies through a composition of individually benign subtasks, even when subtask-level safety checks pass.

When Safety Geometry Collapses: Fine-Tuning Vulnerabilities in Agentic Guard Models

The paper demonstrates that fine-tuning safety guard models on benign data can catastrophically collapse their safety alignment, proposing Fisher-Weighted Safety Subspace Regularization (FW-SSR) to actively maintain safety geometry.

The Art of the Jailbreak: Formulating Jailbreak Attacks for LLM Security Beyond Binary Scoring

This paper addresses the lack of systematic infrastructure for evaluating jailbreak attacks by introducing a large-scale dataset, an automated generation method, and a continuous evaluation metric that surpasses traditional binary scoring.

The Misattribution Gap: When Memory Poisoning Looks Like Model Failure in Agentic AI Systems

The paper identifies the Misattribution Gap, showing that memory-layer attacks (Semantic Norm Drift) can mimic model failure in multi-agent AI systems, and proposes novel detection and mitigation techniques.

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and semantic-only detection methods.

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static and semantic-only detection methods.


Papers

cs.CR · cs.AI · Recent · May 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior hidden within open agentic skills, significantly outperforming static and seman…

cs.CR · cs.AI · Recent · May 30, 2026

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Ismail Hossain, Sai Puppala, Zhuoran Lu, Sajedul Talukder +1 more

The paper introduces SkillVetBench, a novel two-stage benchmark that effectively detects and verifies malicious behavior in open agentic skill ecosystems, significantly outperforming existing static a…

cs.CR · cs.AI · cs.LG · Recent · May 12, 2026

The Misattribution Gap: When Memory Poisoning Looks Like Model Failure in Agentic AI Systems

Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala +2 more

The paper identifies the Misattribution Gap, showing that memory-layer attacks (Semantic Norm Drift) can mimic model failure in multi-agent AI systems, and proposes novel detection and mitigation tech…

cs.CR · cs.AI · cs.LG · Recent · May 9, 2026

The Art of the Jailbreak: Formulating Jailbreak Attacks for LLM Security Beyond Binary Scoring

Ismail Hossain, Tanzim Ahad, Md Jahangir Alam, Sai Puppala +2 more

This paper addresses the lack of systematic infrastructure for evaluating jailbreak attacks by introducing a large-scale dataset, an automated generation method, and a continuous evaluation metric tha…

cs.CR · cs.AI · cs.LG · Recent · Apr 8, 2026

Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines

Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala +3 more

The paper introduces Semantic Intent Fragmentation (SIF), an attack class demonstrating that multi-agent AI orchestrators can violate security policies through a composition of individually benign sub…

cs.LG · cs.AI · cs.CR · Recent · Apr 8, 2026

When Safety Geometry Collapses: Fine-Tuning Vulnerabilities in Agentic Guard Models

Ismail Hossain, Sai Puppala, Jannatul Ferdaus, Md Jahangir Alam +3 more

The paper demonstrates that fine-tuning safety guard models on benign data can catastrophically collapse their safety alignment, proposing Fisher-Weighted Safety Subspace Regularization (FW-SSR) to ac…
