ArXivCSExplorer

Jaewoo Lee

4 indexed papers

Recent (6 mo): 4
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 4

Top categories

AI ×4, Crypto ×3, Info Retrieval ×1

Frequent co-authors

Junyoung Park (3×)
Seongyong Ju (3×)
Sunghwan Park (3×)
Yeseul E. Chang (1×)
Rahul Kailasa (1×)
Simon Shim (1×)

Research Timeline

2026
Retrieval Augmented Classification for Confidential Documents

The paper proposes Retrieval Augmented Classification (RAC) as a robust, low-leakage method for classifying confidential documents, demonstrating that RAC outperforms supervised fine-tuning (FT), particularly under class imbalance and real-world data constraints.

Beyond Attack Success Rate: Temporal Logit Observability for LLM Safety Failures

The paper introduces Temporal Logit Observability (TLO), a training-free diagnostic that analyzes the decoding process to reveal the temporal patterns of LLM safety failures, showing that failure mechanisms are often distinct even when the final Attack Success Rate is the same.

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

The paper introduces Persona Attack, a novel memory injection jailbreak method that demonstrates how accumulating instructions in the model's context window can override internal safety alignments, achieving high success rates.

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

The paper introduces Persona Attack, a novel memory injection jailbreak method that demonstrates that accumulating instructions in the model's context window can override internal safety alignments, achieving high attack success rates.

Papers

cs.CR, cs.AI | Recent | May 29, 2026

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

Junyoung Park, Seongyong Ju, Sunghwan Park, Jaewoo Lee

The paper introduces Persona Attack, a novel memory injection jailbreak method that demonstrates how accumulating instructions in the model's context window can override internal safety alignments, ac…

cs.CR, cs.AI | Recent | May 29, 2026

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

Junyoung Park, Seongyong Ju, Sunghwan Park, Jaewoo Lee

The paper introduces Persona Attack, a novel memory injection jailbreak method that demonstrates that accumulating instructions in the model's context window can override internal safety alignments, a…

cs.AI | Recent | May 28, 2026

Beyond Attack Success Rate: Temporal Logit Observability for LLM Safety Failures

Junyoung Park, Sunghwan Park, Seongyong Ju, Jaewoo Lee

The paper introduces Temporal Logit Observability (TLO), a training-free diagnostic that analyzes the decoding process to reveal the temporal patterns of LLM safety failures, showing that failure mech…

cs.CR, cs.AI, cs.IR | Recent | Apr 9, 2026

Retrieval Augmented Classification for Confidential Documents

Yeseul E. Chang, Rahul Kailasa, Simon Shim, Byunghoon Oh +1 more

The paper proposes Retrieval Augmented Classification (RAC) as a robust, low-leakage method for classifying confidential documents, demonstrating that RAC outperforms supervised fine-tuning (FT) parti…
