"machine learning" | ArxivCSExplorer

20 results for “machine learning”

CS papers only

Hybrid search: keyword + semantic, ranked by combined score.


math.ST | cs.LG | math.PR | Empirical | Recent | Jun 4, 2026

How abundant are good interpolators?

This paper establishes a large deviation principle for the generalization error of interpolating classifiers in the overparametrized regime.


cs.CR | Recent | May 6, 2026

Assessing Generalisation Capability of Machine Learning Models for Intrusion Detection

Md Zakir Hossain, Md Ayshik Rahman Khan, Md Rafiqul Islam, Syed Mohammed Shamsul Islam +1 more

The study assesses the generalization capability of supervised machine learning models for intrusion detection using UNSW-NB15 and TON_IoT, finding a significant performance drop when models are teste…


stat.ML | cs.LG | Empirical | Recent | Jun 28, 2026

Gradient boosting with vector-valued leafs

David Cortes

This paper extends gradient boosting to functions of vector inputs using a simple algorithm with histogram-based decision trees.


cs.LG | cs.AI | cs.CV | Recent | May 30, 2026

On the Difficulty of Learning a Meta-network for Training Data Selection

Zilin Du, Junqi Zhao, Boyang Albert Li

This paper analyzes the poor performance of Meta-learning for Training-data Selection (MTS) and proposes that increasing the batch size and incorporating informative features can significantly improve…


cs.LG | cs.AI | Recent | May 29, 2026

From Rashomon Theory to PRAXIS: Efficient Decision Tree Rashomon Sets

Zakk Heile, Hayden McTavish, Varun Babbar, Margo Seltzer +1 more

The paper introduces PRAXIS, a novel algorithm that efficiently approximates the computation of 'Rashomon sets' for decision trees, significantly reducing memory and runtime complexity.


cs.CR | cs.AI | Recent | Apr 25, 2026

Training Machine Learning Models on Encrypted Data: A Privacy-Preserving Framework using Homomorphic Encryption

Alexandre Marques, Beatriz Sá, Rui Botelho, Pedro Pinto

The paper proposes and validates a privacy-preserving framework using Homomorphic Encryption (HE) to train and run Machine Learning models on sensitive data while keeping it encrypted throughout the e…


cs.LG | cs.AI | Recent | May 29, 2026

ChurnNet: A Optimized Modern AI for Churn Prediction

Syed Saad Saif, Giulio Maggiore, Paolo Russo, Damiano Distante

This paper compares traditional machine learning models (Random Forests, XGBoost, SVM) against a complex Unified Multi-Task Time Series Model for churn prediction, concluding that conventional methods…


stat.ML | cs.CC | cs.DS | Theoretical | Recent | Jul 7, 2026

Boosting with List-Decodable Codes

Addison Prairie, Li-Yang Tan

A new boosting algorithm that strong learns concept classes closed under O(log 1/γ)-XOR using O(log 1/ε) calls to a γ-advantage weak learner and additional samples, by connecting boosting with list-de…


stat.ML | cs.AI | cs.LG | Recent | May 29, 2026

Correcting Split Selection in Online Decision Trees via Anytime-Valid Inference

Salim I. Amoukou, Saumitra Mishra, Manuela Veloso

The paper introduces a new anytime-valid inference method to correct split selection in online decision trees, providing robust statistical guarantees for streaming data that existing methods lack.


cs.LG | cs.AI | Recent | May 28, 2026

LLMs Without Deep Neural Networks: New Architecture, Benefits and Case Study

Vincent Granville

The paper introduces a novel, non-deep neural network architecture that achieves the performance of LLMs by finding the global optimum of the loss function in a single, closed-form iteration, eliminat…


cs.CR | Recent | Jun 2, 2026

The Role of Domain-Specific Features in Malware Detection: A macOS Case Study

Biagio Montaruli, Andrea Oliveri, Savino Dambra, Davide Balzarotti

This paper introduces a novel malware detection system for macOS by utilizing domain-specific static features, achieving state-of-the-art performance and demonstrating strong generalization capabiliti…


cs.CR | cs.AI | Recent | Apr 2, 2026

Automated Malware Family Classification using Weighted Hierarchical Ensembles of Large Language Models

Samita Bai, Hamed Jelodar, Tochukwu Emmanuel Nwankwo, Parisa Hamedi +3 more

The paper proposes a zero-label malware family classification framework that uses a weighted hierarchical ensemble of large language models (LLMs) to classify malware without requiring labeled trainin…


cs.CR | cs.AI | cs.LG | Recent | Mar 27, 2026

Machine Learning Transferability for Malware Detection

César Vieira, João Vitorino, Eva Maia, Isabel Praça

This study evaluates various data preprocessing pipelines to improve the transferability and generalization of Machine Learning models for detecting malware in Portable Executable (PE) files across di…


cs.CL | cs.LG | Recent | Jun 1, 2026

Machine Learning for Coding Retail Product Names to Consumer-Price Categories: A Rule-plus-Bag-of-Words Pipeline with Reliability-Weighted Human-in-the-Loop Labeling

Vladimir Beskorovainyi

The paper proposes a robust, multi-stage pipeline combining rule-based classification and machine learning to map noisy retail product names to standardized consumption categories, finding that simple…


cs.NE | cs.AI | cs.DS | Recent | May 28, 2026

Selection Hyper-heuristics Can Automatically Adjust the Learning Period to Optimally Solve Pseudo-Boolean Problems

Benjamin Doerr, Pietro S. Oliveto, John Alasdair Warwicker

This paper introduces a method to automatically determine the optimal learning period ($\tau$) for the Random Gradient hyper-heuristic, enabling it to optimally solve Pseudo-Boolean Problems without ma…


cs.CR | Recent | May 21, 2026

Botnet Detection on CTU-13 Using Lightweight Machine Learning Models

Subhash Gurappa, Yashas Hariprasad, Sundararaj Sitharama Iyengar, Naveen Kumar Chaudhary

This paper compares lightweight machine learning models (such as Random Forest) against computationally intensive deep learning methods for botnet detection on the CTU-13 dataset, showing that these simp…


stat.ME | stat.ML | Theoretical | Recent | Jul 8, 2026

Transfer Learning for Linear Discriminant Analysis with a Shared Classification Signal

Yonghan Zhang, Yimeng Fan, Wenya Luo, Jiang Hu

This paper derives deterministic limits for transfer learning performance of linear discriminant analysis in high-dimensional two-class classification under spiked covariance models.
