Shafiq et al 2026a

Abstract

This study presents a hybrid automated framework that combines machine learning (ML) and natural language processing (NLP) to categorize and extract nonfunctional requirements (NFRs) from free-text software development documents. Using the PROMISE dataset, the framework integrates semantic representation learning, deep feature extraction, and kernel-based classification to improve NFR classification performance. Unlike current CNN-based approaches with end-to-end softmax-based classification, the proposed method decouples feature learning from decision making. It first uses Word2Vec embeddings to capture semantic context and then uses Convolutional Neural Networks (CNNs) as high-level feature extractors. An Improved Support Vector Machine with a Radial Basis Function kernel (ISVM-RBF) performs the final classification, enabling more discriminative decision boundaries in the high-dimensional semantic feature space. The CNN–Word2Vec setup yields a considerable performance improvement, reaching a precision of up to 90% and significantly outperforming standard ML classifiers. The study points to three main findings: (i) CNN-based feature extraction is an efficient approach for finding and classifying NFRs, (ii) the semantic representation provided by word embedding methods is clearly superior to traditional NLP representations, and (iii) NLP preprocessing of text is crucial for enhancing classification accuracy. Finally, ISVM-RBF applies kernel-based classification to CNN-derived features, which makes the model more robust to semantic overlap between NFR categories and alleviates the challenges posed by the potentially large textual datasets required to train such models. This hybrid CNN–ISVM-RBF design constitutes the methodological novelty of the proposed method and distinguishes it from current state-of-the-art methods in the literature.

Open access. Received: 03/11/2025. Accepted: 22/01/2026.
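The decoupled design described in the abstract (embeddings, then convolutional feature extraction, then an RBF-kernel SVM) can be sketched as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: random vectors stand in for trained Word2Vec embeddings, fixed random filters stand in for a trained CNN, scikit-learn's standard `SVC` stands in for the unspecified "improved" SVM, and the four sentences and their labels are invented toy data rather than PROMISE records.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy requirement sentences with hypothetical labels (not PROMISE data).
sentences = [
    "the system shall respond within two seconds",
    "the system shall encrypt all stored passwords",
    "pages shall load within one second",
    "only authorized users shall access records",
]
labels = ["performance", "security", "performance", "security"]

# Assumption: random vectors stand in for trained Word2Vec embeddings.
EMBED_DIM = 16
vocab = sorted({w for s in sentences for w in s.split()})
embedding = {w: rng.normal(size=EMBED_DIM) for w in vocab}

# Assumption: fixed random filters stand in for a trained CNN layer.
N_FILTERS, WINDOW = 8, 3
filters = rng.normal(size=(N_FILTERS, WINDOW, EMBED_DIM))


def extract_features(sentence):
    """1-D convolution over word windows, ReLU, then max-over-time pooling."""
    vectors = [embedding[w] for w in sentence.split() if w in embedding]
    # Zero-pad short sentences so at least one full window exists.
    while len(vectors) < WINDOW:
        vectors.append(np.zeros(EMBED_DIM))
    mat = np.stack(vectors)  # shape: (n_words, EMBED_DIM)
    windows = np.stack(
        [mat[i:i + WINDOW] for i in range(len(mat) - WINDOW + 1)]
    )  # shape: (n_windows, WINDOW, EMBED_DIM)
    # Each filter is dotted with each window: result is (n_windows, N_FILTERS).
    activations = np.maximum(np.einsum("twd,fwd->tf", windows, filters), 0.0)
    return activations.max(axis=0)  # one pooled value per filter


# Feature learning stage: one fixed-length vector per sentence.
X = np.stack([extract_features(s) for s in sentences])

# Decision stage, kept separate from feature extraction: RBF-kernel SVM.
clf = SVC(kernel="rbf", gamma="scale", C=1.0)
clf.fit(X, labels)
pred = clf.predict(X)
```

The point of the sketch is the separation of stages: `extract_features` produces a fixed-length vector regardless of sentence length, and the classifier only ever sees those vectors, so either stage could be swapped out independently.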


Document information

Published on 22/03/26
Accepted on 22/01/26
Submitted on 03/11/25

Volume Online First, 2026
DOI: 10.23967/j.rimni.2026.10.75552
Licence: CC BY-NC-SA license
