Traian Rebedea
4 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper proposes a general-purpose pipeline to train automated red teaming models capable of generating attacks for arbitrary adversarial goals, overcoming the limitations of current methods that are restricted to safety and content moderation.
This paper details the systematic construction and training of a high-performing Romanian Vision-Language Model (VLM), demonstrating that language-specific adaptation significantly boosts performance over general models.
This paper proposes and evaluates guardian-based defenses, both dynamic and static, to mitigate skill injection attacks targeting LLM agents that rely on reusable procedural skills.
This paper introduces and evaluates guardian-based defenses, showing that an intermediary LLM agent can significantly reduce the success rate of skill injection attacks on terminal-based agents, even when attacks are reframed.
Papers
Defenses & Enablers For Skill Injection Attacks on Terminal Based Agents
This paper proposes and evaluates guardian-based defenses, both dynamic and static, to mitigate skill injection attacks targeting LLM agents that rely on reusable procedural skills.