The paper introduces LLUMI, an open-source framework that improves LLM writing assistance for mental health support using community feedback, demonstrating comparable performance to proprietary models while enhancing privacy.
Large language models (LLMs) show promise in generating supportive responses for mental health queries, but improving their usefulness, empathy, and safety often requires substantial compute, expert input, and labeled data. At the same time, deploying proprietary, cloud-based models for mental health-related interactions raises important privacy and data-governance concerns, given the sensitivity of the data involved. To address these challenges, we introduce LLUMI, a setup that can be hosted in-house within protected environments. LLUMI consists of two complementary components: a generation model (GM), which drafts supportive responses to mental health queries, and an improvement model (IM), which revises an initial human-crafted response. We leverage feedback signals from Reddit mental health communities, using community endorsement patterns such as upvotes and downvotes to construct chosen-rejected response pairs for supervised fine-tuning (SFT) and Direct Preference Optimization (DPO). We further align LLUMI using human evaluation across five dimensions: readability, empathy, connection, actionability, and safety. Our results show that, despite relying on smaller open-source models rather than proprietary cloud-based GPT models, LLUMI achieves comparable performance across linguistic analyses and human evaluations. These findings suggest that open-source models, when trained with community-derived preference signals, can provide high-quality mental health support assistance while offering a more privacy-preserving alternative for sensitive support contexts.
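The abstract describes building chosen-rejected pairs from community endorsement but does not give the pairing rule. The following is a minimal sketch of one way such pairs could be constructed, not the authors' procedure. The `Reply` type, the `score` field (upvotes minus downvotes), the `min_gap` threshold, and the all-pairs strategy are illustrative assumptions; the `prompt`/`chosen`/`rejected` keys follow a common layout for preference-tuning datasets.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Reply:
    """One community reply to a post (hypothetical schema)."""
    text: str
    score: int  # assumed: upvotes minus downvotes


def build_preference_pairs(post_text, replies, min_gap=5):
    """Pair replies to the same post into chosen/rejected examples.

    A pair is kept only when the score difference is at least `min_gap`,
    so that near-ties do not become noisy preference labels.
    `min_gap` is an assumed hyperparameter, not a value from the paper.
    """
    pairs = []
    for a, b in combinations(replies, 2):
        if abs(a.score - b.score) < min_gap:
            continue  # endorsement signal too weak to rank these two
        chosen, rejected = (a, b) if a.score > b.score else (b, a)
        pairs.append(
            {
                "prompt": post_text,
                "chosen": chosen.text,
                "rejected": rejected.text,
            }
        )
    return pairs
```

In this sketch, the same records could serve both training stages: the `chosen` texts alone as SFT targets, and the full pairs as DPO input. Whether LLUMI shares data across stages in this way is not stated in the abstract.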
Better with Experience: Self-Evolving LLM Agents for Evidence-Grounded Health Community Notes
The paper introduces EvoNote, a self-evolving agentic framework that significant…
Think Fast, Talk Smart: Partitioning Deterministic and Neural Computation for Structured Health Text…
The paper proposes 'Think Fast, Talk Smart,' a pipeline that separates determini…
MIRA: A Bilingual Benchmark for Medical Information Response Audit
The paper introduces MIRA, a bilingual benchmark that reveals that LLMs tend to…
From paper to benchmark: agentic, framework-based reproduction of under-specified methods in machine…
The paper introduces an agentic, framework-based system to transform under-speci…
Lost in Delusion: Examining LLM Safety Under User Delusions and Distress
The paper finds that while LLMs can detect distress regardless of delusional fra…
The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement
The paper introduces SAVE, a framework that uses on-policy feedback and the valu…
An LLM-Based Assistance System for Intuitive and Flexible Capability-Based Planning
The paper proposes a hybrid LLM-based assistance system that enhances traditiona…
Scientific Machine Learning for Engine Health Management and Remaining Useful Life Prediction
The paper proposes a multi-task scientific machine learning framework that joint…