Feng Wu
6 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper proposes the Expected Safety Impact (ESI) framework to identify safety-critical parameters in LLMs, introducing targeted tuning methods (SET and SPA) to enhance safety and preserve alignment during model adaptation.
The paper proposes Ellipsoid Control, a white-list defense mechanism that uses benign data geometry to constrain model updates, thereby enhancing jailbreak safety while preserving the utility of harmless inputs.
The paper proposes an unsupervised bi-level adversarial training framework to enhance LLM safety steering, achieving strong zero-shot defense against unseen and evolving jailbreak prompts.
AdaptR1 is a novel Reinforcement Learning framework that adaptively manages reasoning effort at every step of multi-hop Question Answering, significantly reducing unnecessary computational cost without sacrificing performance.
The paper introduces Science Earth, a planet-scale scientific runtime that enables diverse, siloed AI capabilities to connect and collaborate dynamically, demonstrating that scientific discovery can become a distributed, self-correcting process.
CORE-MTL proposes a representation-centric framework that uses causal orthogonal representations to disentangle task-relevant structure from nuisance variation in multi-task learning, achieving superior generalization.
Papers
CORE-MTL: Rethinking Gradient Balancing via Causal Orthogonal Representations
CORE-MTL proposes a representation-centric framework that uses causal orthogonal representations to disentangle task-relevant structure from nuisance variation in multi-task learning, achieving superi…