ArXivCSExplorer

Sahar Abdelnabi

5 indexed papers

Recent (6 mo): 5
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 5

Top categories

AI ×4, Crypto ×4, NLP ×2, Society ×1, ML ×1

Frequent co-authors

Katharina Deckenbach (1×)
Haritz Puerto (1×)
Jonas Geiping (1×)
Chris Hicks (1×)
Konrad Rieck (1×)
Ahmad-Reza Sadeghi (1×)

Research Timeline

2026
No More, No Less: Task Alignment in Terminal Agents

The paper introduces the Task Alignment Benchmark (TAB) to evaluate terminal agents' ability to selectively follow relevant environmental instructions while ignoring misleading distractors, revealing a systematic gap between task capability and task alignment.

Hidden in Memory: Sleeper Memory Poisoning in LLM Agents

The paper introduces and evaluates 'sleeper memory poisoning,' a delayed adversarial attack that corrupts an LLM agent's persistent memory by manipulating external context, demonstrating that these poisoned memories can successfully steer future conversations.

AI Agents May Always Fall for Prompt Injections

The paper argues that prompt injection is a fundamental vulnerability in AI agents, proposing that Contextual Integrity (CI) offers a principled framework to understand and mitigate context-sensitive failures, suggesting that current defenses are insufficient.

Measuring Security Without Fooling Ourselves: Why Benchmarking Agents Is Hard

This paper identifies three core weaknesses—benchmark vulnerabilities, temporal staleness, and runtime uncertainty—that undermine current AI agent security evaluations and proposes directions for building more robust testing frameworks.

Models That Know How Evaluations Are Designed Score Safer

The paper demonstrates that models can acquire 'evaluation meta-knowledge' from training data describing evaluation practices, leading to inflated safety benchmark performance that is independent of explicit memorization.


Papers

cs.CL, cs.AI · Recent · May 27, 2026

Models That Know How Evaluations Are Designed Score Safer

Katharina Deckenbach, Haritz Puerto, Jonas Geiping, Sahar Abdelnabi

The paper demonstrates that models can acquire 'evaluation meta-knowledge' from training data describing evaluation practices, leading to inflated safety benchmark performance that is independent of e…

cs.CR, cs.AI · Recent · May 21, 2026

Measuring Security Without Fooling Ourselves: Why Benchmarking Agents Is Hard

Sahar Abdelnabi, Chris Hicks, Konrad Rieck, Ahmad-Reza Sadeghi

This paper identifies three core weaknesses—benchmark vulnerabilities, temporal staleness, and runtime uncertainty—that undermine current AI agent security evaluations and proposes directions for buil…

cs.CR, cs.CL, cs.CY · Recent · May 17, 2026

AI Agents May Always Fall for Prompt Injections

Sahar Abdelnabi, Eugene Bagdasarian

The paper argues that prompt injection is a fundamental vulnerability in AI agents, proposing that Contextual Integrity (CI) offers a principled framework to understand and mitigate context-sensitive…

cs.CR, cs.AI · Recent · May 14, 2026

Hidden in Memory: Sleeper Memory Poisoning in LLM Agents

Sidharth Pulipaka, Stanislau Hlebik, Leonidas Raghav, Sahar Abdelnabi +3 more

The paper introduces and evaluates 'sleeper memory poisoning,' a delayed adversarial attack that corrupts an LLM agent's persistent memory by manipulating external context, demonstrating that these po…

cs.LG, cs.AI, cs.CR · Recent · May 12, 2026

No More, No Less: Task Alignment in Terminal Agents

Sina Mavali, David Pape, Jonathan Evertz, Samira Abedini +4 more

The paper introduces the Task Alignment Benchmark (TAB) to evaluate terminal agents' ability to selectively follow relevant environmental instructions while ignoring misleading distractors, revealing…
