"Semantic similarity" | ArxivCSExplorer

20 results for “Semantic similarity”

CS papers only

Hybrid search: Keyword + semantic, ranked by combined score.ⓘ

Want pure semantic search? Try claim verification →

cs.IRcs.AIcs.CLEmpiricalRecentJun 12, 2026

Knowledge Graph Enhanced Memory-Augmented Retrieval for Long Context Modeling

Ghadir Alselwi, Basem Suleiman, Hao Xue, Shoaib Jameel +3 more

This paper introduces KGERMAR, a framework that constructs dynamic, context-specific knowledge graphs during inference for long-context language modeling, achieving lower perplexity and better memory…

View →

cs.CLcs.AIEmpiricalRecentJun 11, 2026

Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning

Zilin Xiao, Qi Ma, Chun-cheng Jason Chen, Xintao Chen +3 more

This paper proposes a post-training framework called Retrieval-Augmented Reinforcement Fine-Tuning (RA-RFT) to teach language models to reason by analogy.

View →

cs.CVcs.AIcs.LGRecentMay 28, 2026

Learning Context-Conditioned Predicate Semantics via Prototype Feedback

NamGyu Jung, Chang Choi

The paper proposes AlignG, a method that learns context-conditioned predicate semantics by using prototype feedback to adapt relation representations based on image-specific evidence, significantly im…

View →

cs.IREmpiricalRecentJun 10, 2026

FAST-MEL: A Fast, Accurate, and Storage Efficient Solution for Multimodal Entity Linking

Derrien Thomas, Laurent Amsaleg, Pascale Sébillot

This paper proposes a lightweight encoder-based MEL solution called FAST-MEL that meets three objectives: high linking accuracy, computational efficiency, and storage efficiency.

View →

cs.LGcs.AIRecentMay 27, 2026

Semantic Optimal Transport for Sparse Autoencoder Feature Matching and Circuit Compression

Tue M. Cao, Nguyen Do, My T. Thai

The paper introduces a distributional framework using Wasserstein distance to unify the semantic comparison of sparse autoencoder features across different layers and to automatically compress large f…

View →

cs.CLRecentMay 31, 2026

Efficient RAG with Intent-Aware Retrieval and Semantics-Preserving Chunking

Fachrina Dewi Puspitasari, Chaoning Zhang, Jiaquan Zhang, Zhicheng Wang +5 more

The paper proposes InSemRAG, an enhanced RAG framework that improves retrieval accuracy and knowledge integrity by incorporating intent-aware retrieval and semantics-preserving chunking, achieving sta…

View →

cs.CLRecentJun 1, 2026

When Meaning Travels: A Granular Lens on Hybrid-MoE's Role in Idiomatic Understanding for Language Models

Sarmistha Das, Vaibhav Vishal, Shreyas Guha, Amaan Ali +2 more

This paper introduces a Hybrid Mixture-of-Experts (HybridMoE) framework and a specialized corpus (Varnika) to significantly improve language models' ability to understand and retain figurative, cultur…

View →

cs.CLRecentMay 29, 2026

EvoGens: A Population-Based Heuristic Search Framework for Scientific Idea Generation

Xu Li, Hanzhe Tu, Xinyi Li, Kuncheng Zhao +2 more

EvoGens is an evolution-inspired framework that treats scientific idea generation as an evolutionary search, significantly boosting the novelty and diversity of generated research ideas compared to ex…

View →

cs.CLcs.AIRecentJun 1, 2026

Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages

Saeed Almheiri, Bilal Elbouardi, Salsabila Zahirah Pranida, Irina Nikishina +15 more

The paper introduces MIDI, a novel multilingual dataset that embeds idioms in realistic sentence and conversational contexts across diverse resource levels, revealing that idiom comprehension is signi…

View →

cs.CLcs.LGRecentMay 29, 2026

Scaling Multi-Hop Training Data via Graph-Constrained Path Selection

Pengyu Chen, Yonggang Zhang, Mingming Chen, Jun Song +2 more

The paper proposes a graph-constrained approach to scale multi-hop training data by decoupling path discovery from path verbalization, significantly expanding the usable corpus size for LLMs.

View →

cs.CLRecentMay 31, 2026

Beyond Topical Similarity: Contrastive Evidence Retrieval with Interpretable Attention Alignment in RAG

Francielle Vargas, João Robiatti, Diego Alves, Lucas Pascotti Valem +5 more

The paper introduces CERA, a novel contrastive retrieval framework that improves RAG factuality and interpretability by using subjectivity-based hard negative selection and an auxiliary attention alig…

View →

cs.CLcs.AIRecentMay 27, 2026

Sense Representations Are Inducible Interfaces

Jan Christian Blaise Cruz, Alham Fikri Aji

The paper introduces ACROS, a method that induces an explicit sense representation pathway into a frozen pretrained decoder LM, enabling sense-based tasks like disambiguation and cross-lingual alignme…

View →

cs.CLcs.AIcs.CRRecentMay 5, 2026

SWAN: Semantic Watermarking with Abstract Meaning Representation

Ziping Ye, Gourab Dey, Christos Christodoulopoulos, Charith Peris +6 more

SWAN introduces a novel, training-free framework that embeds watermarks directly into the semantic structure of a sentence using Abstract Meaning Representation (AMR), achieving superior robustness ag…

View →

cs.HCcs.AIcs.LGRecentMay 28, 2026

Rationalize: Shared Semantic Reasoning for Human-AI Alignment

Aritra Dasgupta, Naga Datha Saikiran Battula, Avina Nakarmi, Sohom Sen +2 more

The paper introduces Rationalize, a role-pair framework that facilitates shared semantic reasoning between humans and AI models to achieve deep alignment of intent and action.

View →

cs.CLRecentMay 29, 2026

Language Models Can Resolve Reference Compositionally, But It's Not Their Native Strength: The Case of the Personal Relation Task

Bart Evelo, Meaghan Fowlie, Denis Paperno

The paper investigates compositional abilities in LLMs and humans using the Personal Relation Task, finding that LLMs excel at the structured (Intensional) task while humans are better at the real-wor…

View →

cs.CLRecentMay 29, 2026

Bundesrecht: An Open Library and Corpus for German Statutory Reference Processing

Harshil Darji, Martin Heckelmann, Christina Kratsch, Gerard de Melo

The paper introduces 'bundesrecht,' an open-source, end-to-end pipeline for processing complex German statutory references, which parses, normalizes, and resolves raw citation strings into structured,…

View →

cs.CLcs.AIRecentMay 28, 2026

Same Patient, Different Words, Different Diagnosis? Evaluating Semantic Stability in Clinical LLMs

Mahdi Alkaeed, Adnan Qayyum, Nabeel Abo Kashreef, Muhammad Bilal +1 more

The paper evaluates the semantic stability of clinical LLMs to linguistic variations, finding that domain specialization does not guarantee consistent robustness improvements.

View →

cs.AIRecentJun 1, 2026

An NLP-Driven Framework for Curriculum-Labor Market Alignment: Schema-Constrained LLM Extraction, ESCO-Anchored Semantic Matching, and Multi-Dimensional Gap Quantification

Sherzod Turaev, Mary John, Mamoun Awad, Nazar Zaki +1 more

The paper introduces a robust four-stage NLP framework that uses schema-constrained LLMs and ESCO vocabulary to accurately extract and align educational competencies with labor market demands, quantif…

View →

cs.IRcs.CLRecentJun 3, 2026

BEATS: Bootstrapping E-commerce Attribute Taxonomies for Search through Iterative Human-AI Collaboration

Yung-Yu Shih, Shang-Yu Su, Tzu-I Ho, Dongzhe Wang +1 more

The paper presents BEATS, a human-in-the-loop LLM framework for bootstrapping product attribute taxonomies from scratch.

View →

cs.CRRecentApr 13, 2026

From Context to Rules: Toward Unified Detection Rule Generation

Cheng Meng, Wenxin Le, Xinyi Li, Qiuyun Wang +3 more

The paper proposes UniRule, a novel agentic RAG framework that unifies the detection rule generation process by mapping context and language to rules, significantly outperforming pure LLM generation.

View →