20 results for “domain-specialized”
CS papers onlyHybrid search: Keyword + semantic, ranked by combined score.ⓘ
Want pure semantic search? Try claim verification →
The paper introduces LearnWeak, an annotation-free framework that automatically specializes small computer-use agents by identifying and targeting their specific weaknesses using a stronger reference…
Tong Ye, Hang Yu, Tengfei Ma, Xuhong Zhang +5 more
The paper introduces DOMINO, a novel inductive framework that synthesizes domain-specific data for LLMs using only reference examples, significantly improving performance on challenging, implicitly de…
This paper introduces a novel malware detection system for macOS by utilizing domain-specific static features, achieving state-of-the-art performance and demonstrating strong generalization capabiliti…
Yunkai Lou, Longbin Lai, Shunyang Li, Zhengping Qian +1 more
SpecDB is a novel system that uses LLMs to synthesize highly customized, purpose-built relational databases, achieving performance comparable to commercial systems while significantly reducing code si…
Zheng Yuan, Chuang Zhou, Linhao Luo, Siyu An +3 more
MoG proposes a novel Mixture of Experts framework for graph-based RAG, which uses hub graphs to guide the sparse activation of domain-specific expert graphs, significantly improving retrieval accuracy…
Yuduo Li, Xiaofeng Shi, Qian Kou, Longbin Yu +1 more
RAFT proposes a two-stage framework combining data refinement and adaptive distillation to improve domain-specific fine-tuning while mitigating the loss of general model capabilities.
Yujie Luo, Xiangyuan Ru, Jingsheng Zheng, Jingjing Wang +9 more
The paper introduces Autonomous Agentic Data Engineering, demonstrating that LLMs can autonomously plan and optimize end-to-end data curation pipelines, leading to substantial performance gains in spe…
Yaxin Luo, Jiacheng Cui, Xiaohan Zhao, Xinyi Shang +4 more
The paper introduces LLMSurgeon, a framework that estimates the domain-level data mixture of a Large Language Model (LLM) using only generated text, thereby providing a post-hoc method to audit the mo…
EvoPool introduces an evolutionary multi-agent framework that efficiently generates high-quality, specialized supervision labels, significantly outperforming LLM annotation baselines across complex, l…
mcp-proto-okn is a Python server that facilitates natural language access to complex scientific knowledge graphs, simplifying cross-domain knowledge analysis for biomedical research.
The paper proposes an aggressive, parameter-efficient method to prune non-essential experts from Mixture-of-Experts (MoE) LLMs, significantly compressing the model while maintaining high machine trans…
The paper introduces ProbMoE, a probabilistic routing framework that tackles the non-differentiability of top-$k$ routing in Mixture-of-Experts (MoE) models, achieving strong performance with improved…
The paper proposes SubFit, a novel compression technique that achieves superior LLM compression by replacing non-contiguous, submodule-level components (Attention and FeedForward) with lightweight res…
The paper introduces 'bundesrecht,' an open-source, end-to-end pipeline for processing complex German statutory references, which parses, normalizes, and resolves raw citation strings into structured,…
This paper introduces the first LLM-generated, domain-independent heuristics for symbolic AI planning, using evolutionary search to surpass the performance of hand-engineered state-of-the-art methods.
Safayat Bin Hakim, Aniqa Afzal, Qi Zhao, Vigna Majmundar +2 more
CyberCane is a neuro-symbolic framework that enhances phishing detection by combining symbolic rule analysis with privacy-preserving RAG and formal ontology reasoning, achieving high recall against AI…
The paper introduces CBCL, a provably safe and extensible agent communication language that constrains all message extensions to the deterministic context-free language (DCFL) class.
The paper introduces formal patterns to enhance and compose security components (lingos and dialects) for network protocols, providing generic, verifiable methods for hardening distributed systems.
The paper introduces a novel LLM-driven evolutionary framework to synthesize admissible, domain-specific pattern generators, enabling optimal classical planning with high performance and interpretabil…
This paper demonstrates that in-domain pretraining of BERT significantly improves the detection of DNS exfiltration, particularly in maintaining a low false positive rate.