"Data debugging" | ArxivCSExplorer

20 results for “Data debugging”

cs.LG, cs.IR | Empirical | Recent | Jun 10, 2026

DeMix: Debugging Training Data with Mixed Data Error Types by Investigating Influence Vectors

Jiale Deng, Yanyan Shen, Xiaogang Shi, Chai Junjun

This paper proposes DeMix, a novel framework for simultaneously diagnosing erroneous samples and their error types in the training data of machine learning models.

cs.CR, cs.SE | Recent | May 13, 2026

Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers

Joe Hattori, Naoki Kobayashi, Ken Sakayori

The paper introduces DrvHorn, a novel automated tool that detects reference counting bugs in Linux kernel drivers by transforming the verification problem into an assertion checking task, successfully…

cs.CR | Recent | May 30, 2026

NeuroLog: Reasoning You Can Audit -- Neuro-Symbolic Vulnerability Discovery via LLM Facts, Datalog, and SMT

Sanjay Rawat

NeuroLog is a novel, build-free neuro-symbolic pipeline that combines LLM-derived dataflow facts, Datalog, and SMT solving to systematically discover and synthesize exploitable memory safety vulnerabi…

cs.CR | Recent | May 27, 2026

Do you dare to try Test-Driven Forensics? Increasing Trust in Desktop Forensics with ADARE

Michael Külper, Martin Lambertz, Mariia Rybalka

The paper introduces Test-Driven Forensics, an approach that treats forensic expectations as executable tests to detect and measure the degradation of repeatability and confidence in digital forensic…

cs.SE, cs.CR | Recent | Mar 28, 2026

Finding Memory Leaks in C/C++ Programs via Neuro-Symbolic Augmented Static Analysis

Huihui Huang, Jieke Shi, Bo Wang, Zhou Yang +1 more

MemHint is a neuro-symbolic static analysis pipeline that significantly improves memory leak detection in C/C++ by combining LLM semantic understanding with Z3 symbolic reasoning, detecting more leaks…

cs.CL, cs.AI, cs.LG | Recent | May 27, 2026

MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

Xinle Deng, Ruobin Zhong, Hujin Peng, Xiaoben Lu +14 more

The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up t…

cs.AR, cs.AI, cs.CR | Recent | Apr 15, 2026

VeriCWEty: Embedding enabled Line-Level CWE Detection in Verilog

Prithwish Basu Roy, Zeng Wang, Anatolii Chuvashlov, Weihua Xiao +3 more

VeriCWEty proposes an embedding-based framework to detect and classify Common Weakness Enumeration (CWE) weaknesses in Verilog RTL code at both module and line levels, achieving high detection accuracy.

cs.CR, cs.SE | Recent | May 28, 2026

Control Flow Graph Recovery for Dynamically Loaded Code via Symbolic Library Resolution

Oleksandr Mostovyi

The paper proposes a novel symbolic execution technique that combines speculative library preloading and custom software hooks to recover Control Flow Graphs (CFGs) from binaries that use dynamic code…

cs.CR, cs.DB | Recent | Apr 8, 2026

Interpreting the Error of Differentially Private Median Queries through Randomization Intervals

Thomas Humphries, Tim Li, Shufan Zhang, Karl Knopf +1 more

The paper introduces PostRI, a novel method that allows for computing a Randomization Interval (RI) for differentially private median queries after the median has already been estimated, significantly…

cs.CR | Recent | Apr 30, 2026

WOOTdroid: Whole-system Online On-device Tracing for Android

Simon Althaus, Nikolaos Alexopoulos, Max Mühlhäuser, Christian Reuter +1 more

WOOTdroid is a novel, non-invasive system for comprehensive on-device tracing on stock Android that simultaneously addresses syscall data loss and the semantic gap in Binder IPC events.

cs.SE, cs.CR | Recent | May 21, 2026

Automated Repair of TEE Partitioning Issues via DSL-Guided and LLM-Assisted Patching

Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +3 more

The paper introduces TEERepair, a framework that automatically repairs severe security vulnerabilities caused by improper partitioning in Trusted Execution Environments (TEEs) by combining a domain-sp…

cs.CR | Recent | Jun 1, 2026

PeAR: A Static Binary Rewriting Framework for Binary-Only Fuzzing

Alvin Charles, Adrian Herrera, Peter Oslington, Alwen Tiu

The paper introduces PeAR, a static binary rewriting framework that proves static binary instrumentation (SBI) is a practical and effective alternative to dynamic binary instrumentation (DBI) for high…

cs.SE, cs.CR | Recent | May 14, 2026

Veritas: A Semantically Grounded Agentic Framework for Memory Corruption Vulnerability Detection in Binaries

Xinran Zheng, Alfredo Pesoli, Marco Valleri, Suman Jana +1 more

Veritas is a semantically grounded framework that detects memory corruption vulnerabilities in stripped binaries by combining static analysis, LLM-based reasoning, and runtime validation, achieving hi…

cs.CR | Recent | May 13, 2026

Memory Forensics Techniques for Automated Detection and Analysis of Go Malware

Hala Ali, Andrew Case, Irfan Ahmed

The paper introduces a novel memory forensics framework to perform runtime analysis of Go malware, successfully recovering critical execution state and artifacts that are invisible to traditional stat…

cs.CR | Recent | Mar 25, 2026

Bridging Code Property Graphs and Language Models for Program Analysis

Ahmed Lekssays

The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…

cs.CR, cs.SE | Recent | May 5, 2026

Root-Cause-Driven Automated Vulnerability Repair

Hulin Wang, Zion Leonahenahe Basque, Jie Hu, Ati Priya Bajaj +12 more

The paper introduces Kumushi, a root-cause-driven patching agent that significantly improves automated vulnerability repair by focusing LLMs on the true source of bugs, outperforming existing methods…

cs.SE, cs.AI, cs.CR | Recent | May 12, 2026

Decaf: Improving Neural Decompilation with Automatic Feedback and Search

Alexander Shypula, Osbert Bastani, Edward Schwartz

The paper introduces Decaf, a system that uses automatic feedback and search to significantly improve the semantic correctness and accuracy of neural decompilers, boosting the decompilation rate from…

cs.CR, cs.PL | Recent | Apr 21, 2026

Adding Compilation Metadata To Binaries To Make Disassembly Decidable

Daniel Engel, Freek Verbeek, Pranav Kumar, Binoy Ravindran

The paper proposes a new binary format that embeds compiler-generated metadata into executables, making the binary structure more transparent and enabling reliable analysis, instrumentation, and recom…

cs.AI, cs.DB | Recent | May 27, 2026

A Query Engine for the Agents

Kenny Daniel

The paper introduces Hyperparam, a set of lightweight JavaScript libraries designed to enable direct, model-aware querying of unstructured data (like agent traces) within client-side AI applications.

cs.CR | Recent | May 21, 2026

Parser-Free Querying of Security Logs

Evan Luo, Julien Piet, David Wagner

The paper introduces Sieve, a system that uses a large language model (LLM) to generate executable query code from natural language security questions, significantly improving the ability to perform c…
