Seongyong Ju

4 indexed papers

Top categories

AI ×4, Crypto ×3

Frequent co-authors

Junyoung Park (3×)

Sunghwan Park (3×)

Jaewoo Lee (3×)

Taein Lim (1×)

Munhyeok Kim (1×)

Hyunjun Kim (1×)

Research Timeline

2026

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.

Beyond Attack Success Rate: Temporal Logit Observability for LLM Safety Failures

The paper introduces Temporal Logit Observability (TLO), a training-free diagnostic that analyzes the decoding process to reveal the temporal patterns of LLM safety failures, showing that failure mechanisms are often distinct even when the final Attack Success Rate is the same.

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

The paper introduces Persona Attack, a novel memory injection jailbreak method. It demonstrates that instructions accumulated in the model's context window can override internal safety alignment, and it reports high attack success rates.

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

The paper introduces Persona Attack, a novel memory injection jailbreak method that demonstrates that accumulating instructions in the model's context window can override internal safety alignments, achieving high attack success rates.

Papers

cs.CR · cs.AI · Recent · May 29, 2026

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

Junyoung Park, Seongyong Ju, Sunghwan Park, Jaewoo Lee

cs.CR · cs.AI · Recent · May 29, 2026