Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Yutaka Matsuo

Yutaka Matsuo

2 indexed papers

Recent (6 mo)
2
With code
0
Influential cites
0
Benchmarked
0

Publications per year

2
26

Top categories

ML×2AI×2

Frequent co-authors

Soichiro Nishimori1×
Paavo Parmas1×
Sotetsu Koyamada1×
Tadashi Kozuno1×
Toshinori Kitamura1×
Shin Ishii1×

Research Timeline

2026
Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

This paper investigates how different types of compressed reasoning data (Explicit, Composed, Implicit CoT) affect LLM performance during post-training, finding that the choice of compression and subsequent fine-tuning method significantly impacts generalization and data scaling.

Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying

The paper introduces ReMax, a novel objective function that naturally encourages stochastic exploration in policy gradient reinforcement learning by evaluating expected maximum returns over multiple samples, and proposes RePPO for efficient optimization.

Highlighted terms show continued research focus across papers

Papers

cs.LGcs.AIRecentMay 29, 2026

Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying

Soichiro Nishimori, Paavo Parmas, Sotetsu Koyamada, Tadashi Kozuno +3 more

The paper introduces ReMax, a novel objective function that naturally encourages stochastic exploration in policy gradient reinforcement learning by evaluating expected maximum returns over multiple s…

View →
cs.AIcs.LGRecentMay 27, 2026

Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

Kohsei Matsutani, Gouki Minegishi, Takeshi Kojima, Yusuke Iwasawa +1 more

This paper investigates how different types of compressed reasoning data (Explicit, Composed, Implicit CoT) affect LLM performance during post-training, finding that the choice of compression and subs…

View →