Risto Miikkulainen
2 indexed papers
Research Timeline
This paper introduces TSVD, a framework that pre-trains LLMs efficiently by enforcing both low rank and strict weight orthonormality, achieving performance comparable to full-parameter models at significantly reduced computational cost.
This paper introduces Anchored Weight Decay (AWD), a regularization technique that effectively prevents prior-task forgetting during LLM fine-tuning with Evolution Strategies (ES), positioning ES as a viable method for continual learning.
Papers
Overcoming Forgetting in LLM Fine-Tuning with Evolution Strategies