"vector databases" | ArxivCSExplorer

20 results for “vector databases”

CS papers only

Hybrid search: keyword + semantic, ranked by combined score.

cs.CR, cs.DB | Recent | Apr 7, 2026

Can You Trust the Vectors in Your Vector Database? Black-Hole Attack from Embedding Space Defects

Hanxi Li, Jianan Zhou, Jiale Lao, Yibo Wang +4 more

The paper introduces the Black-Hole Attack, a poisoning vulnerability that exploits geometric defects in high-dimensional embedding spaces to force malicious vectors into the top-k results of vector d…

cs.CR, cs.IR, cs.LG | Recent | May 13, 2026

VectorSmuggle: Steganographic Exfiltration in Embedding Stores and a Cryptographic Provenance Defense

Jascha Wanger

The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…

cs.AR, cs.DB, cs.ET | Recent | Jun 2, 2026

ACRONYM: Accelerated Approximate Nearest Neighbor Search in Memory for Dynamic Vector Databases

Md Mizanur Rahaman Nayan, Tianqi Zhang, Flavio Ponzina, Tajana Rosing +1 more

ACRONYM is a novel algorithm-hardware co-designed platform that enables high-recall, continuous approximate nearest neighbor search in memory for dynamic vector databases, achieving massive throughput…

cs.IR, cs.AI, cs.LG | Recent | May 28, 2026

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng +2 more

The paper introduces Single-stage Sparse Retrieval (SSR), a method that replaces computationally expensive vector clustering with sparse autoencoding to achieve highly efficient multi-vector retrieval…

cs.CR, cs.IR | Recent | Apr 10, 2026

Trans-RAG: Query-Centric Vector Transformation for Secure Cross-Organizational Retrieval

Yu Liu, Kun Peng, Wenxiao Zhang, Fangfang Yuan +3 more

Trans-RAG introduces a novel query-centric vector transformation technique to enable secure, efficient, and accurate cross-organizational retrieval in RAG systems without plaintext decryption.

cs.LG, cs.AI | Recent | May 29, 2026

From Rashomon Theory to PRAXIS: Efficient Decision Tree Rashomon Sets

Zakk Heile, Hayden McTavish, Varun Babbar, Margo Seltzer +1 more

The paper introduces PRAXIS, a novel algorithm that efficiently approximates the computation of 'Rashomon sets' for decision trees, significantly reducing memory and runtime complexity.

cs.AI, cs.DB | Recent | May 27, 2026

A Query Engine for the Agents

Kenny Daniel

The paper introduces Hyperparam, a set of lightweight JavaScript libraries designed to enable direct, model-aware querying of unstructured data (like agent traces) within client-side AI applications.

cs.CR, cs.AI | Recent | May 27, 2026

Domain-Informed Representation for Evolutionary Sieving in Integral and Module Lattices

Ahmad Tashfeen, Qi Cheng

This paper enhances a genetic algorithm approach for solving the Shortest Vector Problem (SVP) in both integral and module lattices by incorporating domain-informed representation and crossover.

cs.AI, cs.DB, cs.IR | Recent | May 29, 2026

Vector Linking via Cross-Model Local Isometric Consistency

Ziying Chen, Yang Cao, He Sun, Beining Yang +1 more

The paper proposes a novel geometric embedding hashing method to recover object correspondences (vector links) between two embedding clouds generated by different black-box encoders using only a small…

cs.IR, cs.CL | Recent | Jun 3, 2026

BEATS: Bootstrapping E-commerce Attribute Taxonomies for Search through Iterative Human-AI Collaboration

Yung-Yu Shih, Shang-Yu Su, Tzu-I Ho, Dongzhe Wang +1 more

The paper presents BEATS, a human-in-the-loop LLM framework for bootstrapping product attribute taxonomies from scratch.

cs.DB, cs.AI | Recent | May 29, 2026

Sophrosyne: Agentic Exploration of Relational Data Systems Needs Moderation

Madhav Jivrajani, Ramnatthan Alagappan, Aishwarya Ganesan

The paper introduces Sophrosyne, a system that moderates LLM agent exploration in relational data systems, significantly reducing over-exploration and boosting SQL generation accuracy by guiding the a…

cs.DB, cs.AI | Recent | May 29, 2026

SpecDB: LLM-Generated Customized Databases via Feature-Oriented Decomposition

Yunkai Lou, Longbin Lai, Shunyang Li, Zhengping Qian +1 more

SpecDB is a novel system that uses LLMs to synthesize highly customized, purpose-built relational databases, achieving performance comparable to commercial systems while significantly reducing code si…

cs.AI | Recent | May 31, 2026

SIRIUS-SQL: Anchoring Multi-Candidate Text-to-SQL in Execution Feedback

Leo Luo, Haining Xie, Siqi Shen, Zhipeng Ma +7 more

SIRIUS-SQL introduces a robust multi-candidate text-to-SQL system that addresses weaknesses in candidate generation, error handling, and selection, achieving state-of-the-art performance on complex be…

cs.AI | Recent | Jun 1, 2026

Spatial Representation Learning Beyond Pixels: Unifying Raster Data and Vector Semantics for Human-Centric Geospatial Foundation Models

Steffen Knoblauch, Hao Li, Gengchen Mai, Konstantin Klemmer +2 more

The paper advocates for a paradigm shift toward joint Spatial Representation Learning (SRL) that unifies raster imagery and structured vector data into a single embedding space for developing more sem…

cs.DS, cs.CC | Theoretical | Recent | Jun 11, 2026

Sketching Intersection Profiles: A Simple Proof and Three Applications

Flavio Chierichetti, Mirko Giacchini, Ravi Kumar, Alessandro Panconesi +2 more

This paper settles the complexity of three sketching problems in graphs and distributions.

cs.CR, cs.DB | Recent | May 20, 2026

Polars inside Intel SGX2 Enclaves: An Empirical Study of Confidential Analytical Query Processing

Wei Wang, Burns Smith, Kenny Leftin

This paper empirically evaluates the performance of the Polars DataFrame engine running within Intel SGX2 enclaves, finding that while the overall security overhead is manageable, the performance is s…

cs.LG, cs.AI | Recent | May 29, 2026

ChurnNet: A Optimized Modern AI for Churn Prediction

Syed Saad Saif, Giulio Maggiore, Paolo Russo, Damiano Distante

This paper compares traditional machine learning models (Random Forests, XGBoost, SVM) against a complex Unified Multi-Task Time Series Model for churn prediction, concluding that conventional methods…

cs.LG, cs.AI | Recent | May 27, 2026

QuITE: Query-Based Irregular Time Series Embedding

JungHoon Lim

The paper introduces QuITE, a plug-and-play embedding module that uses learnable query tokens to effectively embed irregular multivariate time series data into latent representations compatible with e…
