Design of advanced intrusion detection systems based on hybrid machine learning techniques in hierarchically wireless sensor networks

Gebrekiros Gebreyesus Gebremariam ; Panda, J. ; et al.

In: Connection Science, Vol. 35 (2023-12-01), Issue 1


Wireless sensor networks (WSNs) are an emerging technology with military and civilian applications. Sensor nodes are deployed hierarchically or randomly in remote, unattended sites. Because they are deployed in open locations and communicate wirelessly, WSNs face distinctive security threats and are vulnerable to routing attacks, including black hole, Sybil, sinkhole and wormhole attacks. In this paper, we propose advanced intrusion detection systems based on hybrid machine learning (AIDS-HML) to identify and classify attacks in WSNs. Hybrid machine learning classifiers are used to identify threats to the network. Benchmark datasets are used to compare the proposed model with baseline models in terms of precision, recall, F1-score, and accuracy. The prediction models are trained and evaluated on these datasets. The hybrid random forest and extreme gradient boosting (RF-XGB) model achieved a detection accuracy of 99.80% on the NSL-KDD benchmark dataset. The hybrid cluster labelling K-Means (CLK-M) model achieved a classification accuracy of 100% on the UNSW_NB15 and CICIDS2017 benchmark datasets for binary classification of attack labels. Different attack detection metrics were compared across benchmark datasets to evaluate the quality of this work. In simulations, the proposed system is efficient for feature extraction, route discovery, and attack detection, achieving an accuracy of 99.46%.


Keywords: Hybrid security technique; intrusion detection system; wireless sensor networks; malicious nodes; detection and classification; hybrid machine learning models

1. Introduction

Wireless sensor networks (WSNs) consist of affordable, low-power sensor nodes (Gao, [28]). Self-organising intelligent sensor networks whose nodes use multi-hop routing fall into this category (Gebreyesus, [29]). WSNs are autonomous, spatially distributed devices that use sensors to monitor physical and environmental conditions. Nodes in a WSN cooperatively collect data on physical or environmental characteristics, including noise, vibration, temperature, pressure, motion, or pollution, and relay that data to a hub node (Alsaedi et al., [9]; Farooqi et al., [24]; Han et al., [33]; Pundir et al., [60]). Raw data is sensed, collected, and sent to the cluster head, then stored and analysed at the base station (Blywis, [13]; Choi et al., [16]), as shown in Figure 1. WSNs are composed of self-configured nodes connected by radio signals; the nodes are battery-powered, low-cost, and distributed hierarchically or randomly (Saravana Kumar et al., [74]; Singh et al., [80]). WSNs are a recent technology and have gained significant research attention (Singh et al., [80]). They comprise low-power, low-cost sensors randomly distributed over the target area. The sensors are deployed to perform specific tasks (Mangrulkar & Negandhi, [47]; Rouissi et al., [67]). The sensors have sensing and signal-processing capabilities and communicate wirelessly (Vinitha et al., [88]). In WSNs, a gateway provides wireless connectivity between the wired network and the distributed nodes. WSNs are a main component of the Internet of Things (IoT), with small, low-power sensors that collect and monitor data from the environment (Saidi et al., [71]; Zhao et al., [95]). Wireless sensor nodes can be deployed in the target area in a distributed or hierarchical fashion, as shown in Figure 1.

Graph: Figure 1. Architecture of distributed (a) and hierarchical (b) WSNs.

The frequency of cyber-attacks targeting international businesses is increasing, leading to the rapid development of intrusion detection systems (IDS) in both industry and academia (Mahbooba et al., [46]). The availability and confidentiality of the data may have been compromised as a result of attempts by attackers to break the network's security through vulnerabilities in the security measures (Abdulganiyu et al., [1]). IDS is a network security solution that collects and analyses network data to detect abnormal behaviour and protect system resources (Zhang et al., [94]). It is crucial in maintaining network security (Wang et al., [89]). Anomalies and misuse are two types of intrusions that can occur in WSNs. Anomaly detection utilises mathematical models and compares estimated feature values with reference values to identify deviations from normal behaviour (Godala & Vaddella, [31]). Misuse detection, on the other hand, relies on previously observed malicious activities and their specific patterns to identify intrusions.

Common causes of cybercrimes include DoS and web attacks. Attacks in WSNs can be active or passive, involving unauthorised eavesdropping, information gathering, and packet manipulation (Elsaid & Albatati, [22]). To ensure protection, WSNs employ two layers of defence: intrusion detection and intrusion prevention technologies. An IDS tracks intruders, their activities, time, location, and network layer, providing valuable information to system administrators (Elsaid & Albatati, [22]). Cybercrimes pose significant threats to businesses as they introduce malicious attacks into networks. Intrusion detection is vital to cybersecurity and involves examining and identifying security breaches in an information system.

Machine learning (ML) techniques offer generic solutions and continuously improve their performance (Praveen Kumar et al., [59]). In WSNs, ML finds applications in various fields, enhancing performance and reducing the need for manual maintenance. It facilitates information extraction from large sensor-generated data, enabling machine-to-machine (M2M) communications, cyber-physical systems, and the Internet of Things (IoT). ML proves valuable in WSNs for optimising sensor nodes, energy management, localisation, node classification, routing, and detecting attacks across multiple layers of sensor networks. Machine learning strategies are employed for selecting the cluster head to enhance energy performance in Wireless Sensor Networks (WSNs) as shown in Figure 2.

Graph: Figure 2. Wireless sensor networks model with clustering Nodes for data aggregation to the sink node.

ML techniques are also helpful in detecting and removing malicious nodes at the cluster head, improving network reliability and operational lifetime (Praveen Kumar et al., [59]). ML algorithms efficiently classify DoS attacks and handle data aggregation and forwarding tasks to the sink node. Data aggregation plays a critical role in WSNs, impacting energy consumption, storage requirements, network load, and processing speed, as shown in Figure 2. Various data aggregation methods, such as cluster-based, tree-based, in-network, or centralised approaches, can improve network reliability and lifespan. ML algorithms offer advantages in collecting and organising data, contributing to the security and efficiency of WSNs.

1.1. Problem statement

The problem addressed in this research is the development of an advanced Intrusion Detection System (IDS) based on hybrid machine learning (AIDS-HML) to protect hierarchically structured Wireless Sensor Networks (WSNs). Current IDSs face difficulties achieving high detection accuracy and minimising false alarms. The hierarchical organisation of sensor nodes adds complexity to the design of advanced IDSs (Paliwal & Kumar, [56]). The objective is to create a robust and efficient IDS that effectively identifies and responds to intrusions while considering the hierarchical structure of the WSN. Specific challenges to be addressed include:

Developing a hierarchical IDS architecture that can adapt to the structure of the WSN and effectively handle data collection and aggregation.

Extracting relevant features from sensor data in a resource-constrained environment and employing techniques to reduce dimensionality and select the most informative features.

Integrating multiple machine learning algorithms in a hybrid approach to leverage their strengths and enhance intrusion detection accuracy.

Addressing the challenge of stolen sensitive information in WSNs, where the absence of physical defence lines makes it crucial to monitor network traffic flow.

Evaluating the proposed IDS in terms of detection accuracy, recall, precision, false alarm rate, resource utilisation, and scalability.

Addressing these challenges will contribute to developing advanced IDSs tailored for hierarchically structured WSNs, providing enhanced security and maintaining the efficiency of the network. The proposed solution will aim to overcome the limitations of existing IDSs and provide insights into the effective integration of hybrid machine learning techniques in the context of WSNs. The NSL-KDD, UNSW_NB15, and CICIDS2017 datasets are utilised for training and testing, serving as benchmarks for evaluating the proposed system. Specifically, the NSL-KDD dataset, which includes four types of attacks (Root to Local attacks, Denial of Service, Probing, and User to Root attacks), is used to test the effectiveness of the proposed approach.
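The evaluation metrics named above (accuracy, precision, recall, F1-score, and false alarm rate) can all be derived from confusion-matrix counts. The following is a minimal sketch, assuming binary labels with 1 for attack and 0 for normal traffic; the names `ConfusionCounts`, `confusion_counts`, and `detection_metrics` are illustrative and are not taken from the proposed system.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # attacks correctly flagged as attacks
    fp: int  # normal traffic wrongly flagged as attacks
    tn: int  # normal traffic correctly passed
    fn: int  # attacks that were missed


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP/FP/TN/FN for a binary attack-vs-normal labelling."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def detection_metrics(c):
    """Derive the usual IDS metrics; a ratio with a zero denominator is reported as 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(c.tp, c.tp + c.fp)
    recall = ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_alarm_rate": ratio(c.fp, c.fp + c.tn),
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(detection_metrics(confusion_counts(y_true, y_pred)))
```

The false alarm rate is computed here as FP / (FP + TN), that is, the share of normal traffic wrongly flagged; this is the common definition, and it is assumed rather than quoted from the paper.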

1.2. Research motivation

Attacks affect wireless sensor networks (WSNs) deployed in many applications, including healthcare systems, industrial applications, traffic light management systems, and intelligent power generation and distribution systems that must meet real-time demand. It is therefore essential to survey and study hybrid security methods that combine two schemes, such as hybrid IDS, hybrid routing, and hybrid anomaly detection techniques. Hybrid security techniques are widely used for detecting attacks in WSNs. They use datasets and hybrid models to evaluate the system's effectiveness for attack classification and detection. The main contributions of this work are to explore hybrid security techniques in wireless sensor networks, as follows.

Design and simulate a secure attack detection scheme against routing attack scenarios in wireless sensor networks.

Explore the various designs of hybrid security techniques using a combined attack-defence strategy.

Explore data processing transformation techniques for evaluating security performance using benchmark datasets.

Exploration of the various hybrid machine learning models for effective DoS attack detection and classification using a public dataset to evaluate the system's effectiveness.

Explore wormhole tunnels and routing techniques to provide the optimal path for data transmission nodes and suitable solutions for secure routing and monitoring mechanisms.

Analyse the location and detection of multiple attacks, such as wormhole and black hole attacks, using hybrid and combined schemes.

Explore the hybrid techniques used for WSNs traffic analysis and detecting complicated attacks with the public dataset with normal and malicious behaviours of the network traffic data.

1.3. Contribution of the paper

This paper's primary contribution is a novel hybrid machine-learning approach for intrusion detection that bolsters the security and performance of WSNs while minimising resource consumption. In the past, many IDS strategies have been implemented with benchmark machine learning approaches to improve the detection accuracy of WSN attacks. However, as the number and interconnectivity of wireless sensor nodes grow, the frequency of routing attacks also rises. The key contributions of this research are:

Design and planning of secure advanced intrusion detection systems for scalable and resource-optimised WSNs.

The research specifically addresses the challenges of hierarchically structured WSNs. By considering the multi-level organisation of sensor nodes, the IDS can adapt its detection mechanisms more effectively, providing improved coverage and response throughout the network.

Explore how advanced hybrid machine learning models can be used in WSNs intrusion detection systems with metrics to examine the most important aspects of intrusion detection systems.

The research provides valuable insights into the effective integration of different machine learning approaches in the context of WSNs. This knowledge can be extended to other domains where hybrid ML solutions may be beneficial.

It helps decrease the number of false alarms and boosts the efficiency of intrusion detection systems, supporting secure data transmission across the network for data privacy and protection in WSNs.

The remainder of this paper is organised as follows. Section 1 provides a high-level overview of wireless sensor networks and sensor clustering, the motivation for this study, and its contributions. Section 2 summarises prior research on sophisticated machine-learning strategies that utilise IDS data. Section 3 explains the network models underpinning the proposed hybrid security system for intrusion detection. Section 4 describes the methodology and dataset processing for the hybrid security methods. Section 5 examines the simulation and its outcomes. Section 6 presents the experimental setup, data collection, and analysis, while the final section discusses the takeaways and suggestions for future work. This organisation is shown in Figure 3.

Graph: Figure 3. The organisation and framework of the proposed work.

2. Related works

This section reviews the literature on hybrid security techniques, including hybrid intrusion detection systems, machine learning techniques, and anomaly detection using benchmark datasets. Hybrid techniques effectively detect WSN attacks such as wormhole, flooding, sinkhole, and Sybil attacks. These include hybrid intrusion detection systems (HIDS), advanced HIDS, and intelligent and artificial immune hybrid intrusion detection systems (AIHIDS) (Singh et al., [80]). Hybrid routing protocols, hybrid misuse detection, hybrid optimisation algorithms, anomaly detection techniques, and hybrid clustering techniques are also discussed in this section.

2.1. Hybrid intrusion detection systems

(Singh et al., [80]) presented an Advanced Hybrid Intrusion Detection System (AHIDS) for Wireless Sensor Networks (WSNs). AHIDS employs a cluster-based architecture with an improved LEACH protocol to reduce power consumption in sensor nodes. Hybrid Artificial Neural Networks (HANNs) are used to detect and classify potential threats like Sybil, wormhole, and Hello flood attacks. The proposed system achieved high detection rates: 99.40% for Sybil attacks, 98.20% for Hello flood attacks, and 99.20% for wormhole attacks. Similarly, (Cepheli et al., [15]) examined a hybrid intrusion detection technique that combines flexible and tunable parameters using parallel detection methods. Their hybrid system, guided by a central node, improves DDoS attack detection accuracy. Figure 4 shows signature-based and anomaly-detection block diagrams for attack detection as normal traffic and DDoS attacks.

Graph: Figure 4. Hybrid intrusion detection model with anomaly detector and rule-based detector.

The detection process analyses network traffic and extracts features to build an activity model. The DARPA 2000 dataset is used to evaluate the intrusion detection system, focusing on DDoS attacks and normal network traffic. The hybrid detection employs anomaly and signature detectors. The anomaly detector identifies normal and abnormal traffic data through feature extraction, while the signature detector matches traffic data features against predefined sets. AHIDS uses anomaly detection blocks to identify normal and abnormal data packets and misuse detection blocks to recognise various attacks, as shown in Figure 4. The detection of malicious nodes in AHIDS, based on fuzzy rules, involves three steps, as follows:

It measures the transmission of data packet history in WSNs through base nodes.

It selects the feature set and looks for the key elements for packet classification.

Anomaly intrusion detection techniques are established based on data packet resolution.

The fuzzy-based intrusion detection technique uses an MPNN, which consists of a BPNN and an FFNN for anomaly and misuse detection, as shown in Figure 5. It uses supervised learning to achieve the highest detection rate. The fuzzy-based AHIDS with FFNN and BPNN achieves greater attack detection accuracy using massive clustered training. The multilayer perceptron is used to estimate the error rate, $e_{i}$, as in Equation (1).

$e_{i} = d_{i} - a_{i}$ (1)

Where $d_{i}$ represents the desired output and $a_{i}$ is the actual output obtained from the MPNN. The MPNN model consists of BPNN and FFNN and is applied to evaluate the detection accuracy of the various attack classes in WSNs.

DIAGRAM: Figure 5. Block diagram for advanced and hybrid intrusion detection system.

The MPNN uses BPNN and FFNN techniques for IHIDS to manage huge datasets and maintain the system's stability. The FFNN detects new types of attacks, and the BPNN clusters unknown attacks for MPNN supervised learning. The fuzziness F(V) of the membership vector is given by Equation (2).

$F(V) = - \frac{1}{n} \sum_{i = 1}^{n} \left[ \mu_{i} \log \mu_{i} + (1 - \mu_{i}) \log(1 - \mu_{i}) \right]$ (2)

Where $V = \{\mu_{1}, \mu_{2}, ..., \mu_{n}\}$ is a fuzzy set. The fuzziness values are categorised into high, mid, and low fuzziness groups with training and testing samples.
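As an illustration of Equation (2), the sketch below computes the fuzziness of a membership vector. It assumes the natural logarithm (the base is not stated in the text) and the usual convention that a term of the form 0 · log 0 contributes zero; the function name `fuzziness` is hypothetical.

```python
import math


def fuzziness(memberships):
    """Fuzziness F(V) of a membership vector V = {mu_1, ..., mu_n}.

    Assumptions: natural logarithm, and 0 * log(0) is treated as 0.
    """
    n = len(memberships)
    if n == 0:
        raise ValueError("membership vector must not be empty")
    total = 0.0
    for mu in memberships:
        if not 0.0 <= mu <= 1.0:
            raise ValueError("membership values must lie in [0, 1]")
        # Add mu*log(mu) and (1-mu)*log(1-mu), skipping zero-probability terms.
        for p in (mu, 1.0 - mu):
            if p > 0.0:
                total += p * math.log(p)
    return -total / n


if __name__ == "__main__":
    print(fuzziness([0.0, 1.0, 1.0]))  # crisp memberships: no fuzziness
    print(fuzziness([0.5, 0.5]))       # maximally ambiguous memberships
```

Crisp memberships (all 0 or 1) give a fuzziness of zero, while memberships of 0.5 give the maximum value, log 2 under the natural-log assumption. The thresholds that separate the high, mid, and low fuzziness groups are not given in the text, so they are left out of the sketch.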

(Gandhimathi & Murugaboopathi, [27]) conducted research on flow-based and cross-layer hybrid intrusion detection. Their approach aims to detect anomalous traffic and identify potential attacks based on narrow features. The process involves two phases: in the first phase, a flow-based IDS is used to classify malicious nodes, and in the second phase, packets are analysed using cross-layer features to verify and detect potential threats. The flow-based IDS monitors network traffic, utilising network information to classify data as either normal or malicious, as shown in Figure 6. Flow-based anomaly detection builds a profile of regular behaviour by monitoring network activity and tracking multiple parameters.

DIAGRAM: Figure 6. Block diagram of an intrusion detection system using the flow-based technique.
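The two-phase idea described above (a flow-based first pass followed by cross-layer verification) can be sketched as follows. This is only an illustration under stated assumptions, not the cited authors' method: phase 1 is approximated by a simple deviation test against a baseline of per-node flow rates, and phase 2 by a check on a hypothetical per-node forwarding ratio. The function names, the threshold `k`, and `min_ratio` are all assumptions.

```python
from statistics import mean, pstdev


def flow_anomaly_nodes(baseline_rates, observed_rates, k=3.0):
    """Phase 1 (flow-based): flag nodes whose observed flow rate deviates
    from the baseline mean by more than k baseline standard deviations."""
    mu = mean(baseline_rates)
    sigma = pstdev(baseline_rates)
    flagged = []
    for node, rate in observed_rates.items():
        if sigma == 0:
            deviates = rate != mu
        else:
            deviates = abs(rate - mu) > k * sigma
        if deviates:
            flagged.append(node)
    return flagged


def verify_with_cross_layer(flagged, forwarding_ratio, min_ratio=0.5):
    """Phase 2 (cross-layer): confirm only those flagged nodes whose
    forwarding ratio (packets forwarded / packets received) is too low.
    Nodes without a recorded ratio are assumed to forward everything."""
    return [n for n in flagged if forwarding_ratio.get(n, 1.0) < min_ratio]


if __name__ == "__main__":
    baseline = [10, 12, 11, 9, 10, 11, 10, 9]           # packets/s under normal load
    observed = {"n1": 10, "n2": 40, "n3": 25, "n4": 11}
    suspects = flow_anomaly_nodes(baseline, observed)
    confirmed = verify_with_cross_layer(suspects, {"n2": 0.1, "n3": 0.95})
    print(suspects, confirmed)
```

Running the second phase only on nodes flagged by the first keeps the more expensive packet-level inspection off the common path, which is one way a hybrid scheme can reduce false positives without inspecting every packet.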

The flow-based intrusion detection scheme identifies network connections, hosts, users, and applications. Anomaly detection analyses recorded network traffic for training and detection. Cross-layer characteristics connect with packet-based IDS, extracting routing information for forwarding nodes. Hybrid IDS combines flow-based and packet-based techniques to reduce false positives and improve detection accuracy in WSNs. Correlating flow-based and cross-layer IDS effectively detects DoS and sinkhole attacks. (Moulad et al., [52]) proposed a hierarchical hybrid IDS technique for WSNs, using support vector machines and anomaly detection-based clustering and specification methods to detect and classify DoS attacks. The specification-based technique adopts a statistical model and behavioural/manual specifications to identify normal and malicious data, detecting anomalies and intrusion attacks in the network, as shown in Figure 7. The figure illustrates the specification, signature, and anomaly detection techniques.

Graph: Figure 7. Hierarchical hybrid intrusion detection technique in wireless sensor networks using three phases of attack detection.

(Swarna Priya et al., [83]) proposed a deep neural network (DNN) model using hybrid principal component analysis and grey wolf optimisation (PCA-GWO) for intrusion detection in wireless sensor networks for medical applications. The model utilises autoencoders in a fully connected DNN with trainable parameters for training and testing. (Kanna & Santhi, [38]) examined hybrid deep learning models for effective intrusion detection systems using MapReduce-based black widow optimised convolutional long short-term memory networks with feature selection techniques.

(Abduvaliyev et al., [2]) proposed a hybrid intrusion detection system for WSNs, utilising clustering and aggregation techniques to improve security and energy performance. Misuse and anomaly detection techniques are used to enhance detection accuracy and rate, with misuse detection targeting known attack patterns and anomaly detection employing automated training for abnormal behaviours as shown in Figure 8.

Graph: Figure 8. Illustration of data aggregation and clustering in WSNs.

Misdetection, a hybrid technique, effectively detects blackhole attacks by utilising K-medoid clustering on a synthetic data set (Ahmad et al., [7]). This approach employs the K-medoid individualised clustering method to identify anomalies caused by diversion and blackhole assaults. Additionally, a novel SDN-based Hybrid Clone Node Detection (HCND) technique has been developed for Wireless Sensor Networks (WSNs) by (Devi & Jaison, [21]). This technique proactively identifies cloned nodes using software-defined networking, ensuring the maintenance and enhancement of Quality of Service (QoS) limitations in the WSN. Hybrid multi-tiered IDS detects cyber-attacks on vehicular networks by reducing energy consumption and malicious anomaly nodes (Yang et al., [91]). Hybrid IDS detects cloning attacks in WSNs (Devi & Jaison, [21]) and classifies IoT-based security attacks for healthcare applications using feature selection and hybrid DT-GA (Saif et al., [72]).

2.2. Hybrid machine learning techniques

(Rabbani et al., [62]) proposed an effective intrusion detection system that combines machine learning with traditional security techniques. This scheme ensures mathematically secure communication among nodes and detects malicious behaviour using data pre-processing and recognition modules. Recognition techniques involve training and prediction using an optimised probabilistic neural network (PNN) with the UNSW-NB15 dataset containing normal and malicious data traffic. The hybrid PSO-PNN scheme builds a self-optimised network using particle swarm optimisation (PSO) to reduce misclassification errors and increase classification accuracy in the PNN system. PSO adapts the PNN structure, making it a self-adaptive network model using swarm behaviour patterns. Figure 9 depicts the architecture of the proposed system.

Graph: Figure 9. Hybrid PSO-PNN technique for attack classification and detection.

The features of the data network traffic are collected from the raw network packets using tools including Netmate, BRO-IDS, and Argus. The noisy features are removed to detect malicious attacks in WSNs effectively. Numeric values and symbolic variables represent the necessary features. The numeric and symbolic representations are normalised and transformed using the statistical characteristics of Equation (3).

$Z_{normalized} = \frac{Z - \min(Z)}{\max(Z) - \min(Z)}$ (3)

Where Z is the feature value, min (Z) is the minimum value, and max (Z) is the maximum value from the feature of the samples in the dataset.
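The min-max scaling in Equation (3) can be sketched in a few lines. The helper `encode_symbolic` is an assumption added for illustration: the text says symbolic features are transformed but does not say how, so a simple integer coding is used here. Mapping a constant feature to 0.0 is likewise an assumption, made to avoid division by zero.

```python
def encode_symbolic(values):
    """Map symbolic values (e.g. protocol names) to integer codes.
    Illustrative assumption: codes follow the sorted order of distinct values."""
    codes = {name: idx for idx, name in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def min_max_normalize(values):
    """Scale one feature column to [0, 1] as in Equation (3)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: Equation (3) is undefined, so map every sample to 0.0.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


if __name__ == "__main__":
    durations = [2, 4, 6, 10]
    protocols = ["tcp", "udp", "icmp", "tcp"]
    print(min_max_normalize(durations))
    print(min_max_normalize(encode_symbolic(protocols)))
```

In practice the minimum and maximum would be taken from the training split only and reused for the test split, so that test data does not leak into the scaling.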

(Kumar, [41]) presented an optimised hybrid deep neural network that uses a feature selection algorithm to improve intrusion detection on the benchmark datasets UNSW-NB15 and NSL-KDD. The selected features are fed into a multi-layer convolutional neural network using distance measurement and the correlation coefficient for input packets. The distance D between two input data points $x = (a_{1}, ..., a_{n})$ and $y = (b_{1}, ..., b_{n})$, used for arranging and selecting the features, is given by Equation (4).

$D(x, y) = \sqrt{\sum_{i = 1}^{n} (a_{i} - b_{i})^{2}}$ (4)
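Equation (4) is the Euclidean distance between two feature vectors. A minimal sketch follows; the function name is illustrative, and the example vectors are made up for demonstration.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length feature vectors, as in Equation (4)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
```

On Python 3.8 and later, `math.dist(x, y)` computes the same quantity. Because the distance sums raw squared differences, features on larger scales dominate unless they are normalised first, for example with the min-max scaling of Equation (3).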

A convolutional neural network is used to compute the selected features. The outputs are incorporated into long-term memory configuration and modified materials to improve classification precision.

(Mahajan et al., [45]) explored hybrid machine learning and deep learning techniques for network traffic analysis and classification in wireless sensor networks, including attack detection using benchmark datasets. (Faysal et al., [25]) proposed machine learning techniques to detect IoT-based WSN attacks using benchmark datasets. Hybrid eXtreme Gradient Boosting and Random Forest (XGB-RF) was effective in detecting botnet attacks with feature selection and classification using various metrics. (Alghamdi, [8]) introduced a novel optimiser technique called PO-CFNN for IoT-based IDS. The PO-CFNN method involves three stages: preprocessing, classification, and parameter optimisation, transforming networking information into a more usable format for intrusion identification.

(Sadikin et al., [69]) presented a research study on a hybrid intrusion detection system for ZigBee-based IoT systems. They combined anomaly and rule-based machine learning techniques to detect attacks. A hybrid Long Short-Term Memory Network (LSTM) and Convolutional Neural Network (CNN) learning approach extracts network traffic features using a hybrid IDS with the CICIDS2017 dataset, achieving 99.50% overall accuracy for attack type detection (Sun et al., [82]). In another study, a hybrid classification strategy combines Kalman filter (KF) and Extreme Learning Machine (ELM) to train a predictive classifier on the sink node. It detects random WSN anomalies with promising results using normal and faulty datasets (Biswas et al., [12]).

For binary classification of attack detection, Hybrid k-means and Support Vector Machine (SVM) reduce training and testing times while maintaining high classification accuracy (Rose et al., [66]). Federated learning techniques are utilised to create a privacy-friendly framework across multiple devices using benchmark datasets (Liu et al., [44]). Hybrid optimisation and deep-learning-centric intrusion detection systems are deployed in IoT-enabled smart cities using the Hybrid Chicken Swarm Genetic Algorithm (HCSGA) method (Gupta et al., [32]). The proposed solution involves pre-processing the dataset, feature selection with HCSGA and K-means, and classification with the Deep Learning-based Hybrid Neural Network (DLHNN) classifier using the NSL-KDD benchmark dataset. Hybrid machine learning techniques utilise sampling methods and feature selection analysis to achieve better detection accuracy (Cao et al., [14]). To address sample imbalance, a hybrid sampling method combining ADASYN and RENN is employed. Hybrid deep learning methods have shown effective identification of malicious attacks when tested on performance benchmark datasets (Ullah et al., [84]).

2.3. Hybrid anomaly detection

(Umarani & Kannan, [85]) proposed hybrid anomaly detection techniques based on artificial immune systems using hybrid tissue growing techniques in wireless sensor networks to detect malicious traffic. (Yin et al., [92]) presented an anomaly detection technique that recognises and separates normal and abnormal behaviours based on patterns of normally labelled behaviours. Data mining techniques like regression analysis, clustering analysis, outlier detection, and classification are used to extract valuable knowledge for identifying patterns of malicious nodes, improving detection accuracy and efficiency. Anomaly detection requires a machine-learning model as well as human effort, and it is error-prone. The remaining related works are summarised in Table 1.

Table 1. Summary of different hybrid techniques from previous works.

Ref.	Security Technique	Research Findings
Vinitha et al. (2019)	Taylor-based Cat Salp swarm Algorithm	The LEACH protocol provides an energy-efficient multi-hop routing for reliable data transfer. Throughput, energy, and delay may all be accurately measured with this method.
Singh et al. (2023)	deep learning architecture based on a fully connected feed-forward Artificial Neural Network	R = 0.78 and RMSE = 41.15 were obtained for a Gaussian sensor distribution. In contrast, R = 0.79 and RMSE = 48.36 were found for a uniform sensor distribution, indicating that the model accurately predicts the number of k-barriers for both distributions.
Singh et al. (2021)	Machine learning approach based on Gaussian Process Regression (GPR) model	An analytical method is used to extract these characteristics. Based on the simulation results, the technique provides the most accurate prediction of the k-barrier probability, with a correlation coefficient (R) of 0.85 and a Root Mean Square Error (RMSE) of 0.095.
Singh et al. (2022a)	Automated Machine Learning using Bayesian optimisation.	Compared to other explainable machine learning models, they discovered that the Gaussian process regression performed exceptionally well, with a correlation coefficient of 1, a root mean square error of 0.007, and a bias of 0.006.
Ahmad et al. (2019)	Misdetection and K-medoid clustering	K-medoid clustering with a synthetic data set proves to be an efficient way of spotting hybrid black hole attacks.
Devi and Jaison (2020)	SDN-based Hybrid clone node detection	The hybrid clone node detection detects the cloned node using proactive and verification processes based on software-defined networking in WSNs, maintaining and enhancing the quality of service.
Singh et al. (2022b)	log transformation and feature scaling on the feature set and trained the tuned Support Vector Regression (SVR)	They discovered that the model accurately predicts the number of barriers with a correlation coefficient (R) equal to 0.98, a root mean square error (RMSE) equal to 6.47, and a bias equal to 12.35.
Davahli et al. (2020)	Genetic Algorithm (GA) and grey wolf optimiser(GWO)	Reduce the dimensionality of wireless network traffic using selective features for the Internet of Things intrusion detection system with the SVM classifier.
Sun et al. (2020)	Convolutional neural network (CNN) and long short-term memory network (LSTM)	Extracts the network data traffic features using the hybrid IDS with the CICIDS2017 dataset for evaluation and achieves an overall accuracy of 99.50% for attack type.
Biswas et al. (2019)	Kalman filter (KF) with extreme learning machine (ELM)	A hybrid classification technique to train the sink node using a predictive classifier. The scheme is evaluated using the detection of random WSN anomalies data with the normal and faulty datasets.
Ren et al. (2019)	hybrid data optimisation	The practical technique uses data sampling and feature selection using a Genetic algorithm and random forest classifiers using the optimal training UNSW-NB15 dataset.
Regan and Leo Manickam (2019)	The optimised hybrid security model	Hybrid optimised technique for detection of malicious attacks using the hybrid secured model.
Moon and Ingole (2015)	IDS-Secured hybrid approach	The scheme provides a unique security technique for preventing and detecting attacks. The scheme realises data integrity, authentication, and energy minimisation.
Deepa and Latha (2019)	Hybrid hierarchical secure routing using clustering	The scheme selects a hybrid hierarchical secure algorithm for detection and packet delivery using a coordinate cluster head.
Sakthivel and Chandrasekaran (2018)	Hybrid security using dummy packets	A secure routing framework for detecting malicious attacks using routing protocols and dummy packet optimisation.
Yang et al. (2022)	Hybrid multi-tiered IDS using machine learning	It is practical for detecting internal and external cyber-attacks targeting vehicular networks using the CICIDS 2017 dataset.
Rose et al. (2020)	Hybrid k-means and support vector machine	Reduced training and testing times with promising classification accuracy of attack detection and classification in the network.
VenkataRao and Ananth (2021)	Hybrid optimisation and secure clustering protocol	Provides better performance using hybrid secure clustering protocol and k-means clustering for attack detection.
Gupta et al. (2022)	Deep learning-based hybrid neural network	Effective detection and classification of attacks in IoT-enabled networks using the hybrid chicken swarm genetic algorithm method.
Cao et al. (2022)	Hybrid sampling method	Provides better accuracy using feature selection analysis techniques.
Das and Namasudra (2022)	Hybrid encryption method	Improves security performance in IoT-based healthcare infrastructures.
Devi and Jaison (2020)	Clone detection technique	Efficient for detection and verification of cloning attacks in WSNs.
Ullah et al. (2022)	Hybrid deep learning	Proficient in the detection of malicious attacks using benchmark datasets.
Reshma et al. (2022)	The hybrid Neighbor discovery protocol	Maximizes energy efficiency and improves security performance for detecting malicious nodes based on hybrid machine learning.
Saif et al. (2022)	Hybrid IDS approach using feature selection	Utilised hybrid DT-GA to detect and classify security attacks on IoT for healthcare applications with reduced cost.

Despite the existing research efforts in developing IDSs based on hybrid machine learning techniques in hierarchical wireless sensor networks, some research gaps still need to be addressed. In particular, the optimal combination and configuration of machine learning algorithms in the hybrid approach needs further exploration. Different algorithms have varying strengths and weaknesses depending on the characteristics of the WSN, so investigating the most effective combinations and configurations can improve detection accuracy and efficiency. Further gaps include the following.

Limited evaluation on real-world WSN deployments. Many studies have focused on simulation-based evaluations or used benchmark datasets. Future research should include more experimentation and evaluation of real-world WSN deployments to validate the effectiveness of the proposed hybrid IDS techniques.

Scalability and resource constraints. Hierarchically structured WSNs often operate under resource-constrained environments. There is a need for IDS solutions that can handle scalability issues and optimise resource usage while maintaining high detection accuracy.

Dynamic adaptation. WSNs are subject to dynamic environmental changes and evolving attack patterns. IDS solutions should be capable of dynamically adapting to these changes and updating their detection models to ensure continuous and effective protection.

The literature shows that comprehensive studies addressing the unique challenges and opportunities presented by the hierarchical structure of WSNs are still lacking. Furthermore, exploring the optimal combination and configuration of machine learning algorithms in the hybrid approach is crucial for achieving improved detection accuracy and efficiency in WSNs. Future research should aim to fill these gaps and provide valuable insights into the design and implementation of advanced IDSs for hierarchically structured WSNs.

3. Network models and clustering techniques

The hierarchically distributed wireless sensor network model comprises the base station (BS), cluster heads (CHs), and sensor nodes (SNs). In this setup, the sensor nodes use a wireless connection to communicate with the cluster head and the sink node (Ghugar et al., [30]). The network model is designed under the following assumptions.

Every mobile sensor node can roam freely within the network area (Pajila et al., [55]).

The sensor nodes are deployed randomly.

All sensor nodes are homogeneous.

The sink can be installed at any location within the network's range.

As a result, the position of unknown nodes in the network can be calculated using beacon and sink nodes, both of which are aware of their own location. The WSN's nodes are unprotected against routing attacks. In most cases, this kind of attack drains the batteries of the sensor nodes and shortens their lifespan. Tunnels created by routing attacks distort the route path and consume routing resources. To protect against denial-of-service and routing attacks, the proposed network model incorporates node-level security measures. The proposed system uses advanced intrusion detection systems based on hybrid machine learning approaches to detect and localise attacks.

The proposed network model includes five types of nodes: sensor nodes, malicious nodes, central nodes, cluster nodes, and sink nodes, as shown in Figure 10. The CH acts as a root node to prevent malicious communication, and the central node and CH serve as the backbone for communication with the BS. The CH uses the isolation table to detect attacks while avoiding energy depletion, and primary and secondary cluster heads are used for intrusion detection (Ismail & Amin, [35]).

Graph: Figure 10. Hierarchical topology and configuration model for secure wireless sensor networks.

The attack models provide a graphical representation of the network topology, along with details about key identities and routing information that can be used to identify and exploit security flaws in the system. As seen in Figure 11, the attackers are presumed to have limited resources and intelligence for interrupting network traffic. Several variants of WSN attacks are simulated to test the effectiveness of the proposed IDS, and the attack model measures how well and how securely the system functions.

Graph: Figure 11. (a) Jamming attacks at the physical layer and (b) sinkhole attacks at the network layer.

The network's threats are defined across the application layer, the transport layer, the network layer, the Media Access Control (MAC) layer, and the physical layer (Zou et al., [96]). A jamming attack continuously sends harmful data, disrupting short-range connections. The transmission of jamming signals blocks legitimate users and services. The attack is modelled mathematically in Equation (5).

$I_{a} = e_{i} + m_{i}$ (5)

where $I_{a}$ is the transmitted information as seen by the IDS, which can be correct or incorrect, $e_{i}$ is the expected information, and $m_{i}$ is the malicious content information (Gebreyesus, [29]). Malicious nodes are detected by auditing the network's transmitted data and energy.
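Equation (5) can be read as a residual check: if the expected information is known, the malicious component is whatever remains after subtracting it from the transmitted information. The following is a minimal sketch of that reading only, not the authors' detector. The numeric readings, the function names, and the `tolerance` threshold are all hypothetical.

```python
def malicious_component(transmitted, expected):
    """Return m_i = I_a - e_i for each reading, rearranging Equation (5)."""
    return [i_a - e_i for i_a, e_i in zip(transmitted, expected)]


def flag_suspicious(transmitted, expected, tolerance=0.5):
    """Flag readings whose residual exceeds an assumed tolerance."""
    residuals = malicious_component(transmitted, expected)
    return [abs(m) > tolerance for m in residuals]


# Illustrative values only: the third reading deviates strongly from expectation.
expected = [20.0, 20.5, 21.0, 21.5]
transmitted = [20.1, 20.4, 27.0, 21.5]
flags = flag_suspicious(transmitted, expected)
print(flags)
```

In practice the expected information would come from a predictive model or from neighbouring nodes, and the tolerance would have to be tuned per deployment.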

The channel priority is the major factor in the medium access control layer. Malicious nodes manipulate and change the back-off time. Attackers that advertise false information in the network corrupt network-layer routing information such as the minimum hop count.

4. Methodology

The proposed system starts with system design and simulation for generating and extracting datasets. Since real WSN datasets are difficult to obtain, we use standard datasets to evaluate the effectiveness of the new advanced intrusion detection system based on hybrid machine learning models. The usefulness and efficiency of the proposed method on various classes of attacks are demonstrated using publicly available datasets (Wu et al., [90]). The KDDCup 99, NSL-KDD, UNSW_NB15, and CICIDS2017 datasets are extensively used in academic research as benchmark datasets for evaluating attack detection. The system's effectiveness is tested using the KDD Cup 99 datasets and the intrusion detection techniques. These datasets aim to measure the IDS using a predictive decision model. The 1999 DARPA dataset is also used in this work. The dataset is evaluated in offline and real-time evaluation modes. The data is handled in several modes to establish the normal functioning of the network.

4.1. Benchmark datasets

Some of the issues with the KDD'99 data set have been discussed in the literature, and a data set called NSL-KDD has been proposed as a solution (Meena & Choudhary, [48]). Although it is a new and standardised version of the KDD data set, it still suffers from some of the problems studied by McHugh and may not be a perfect representation of existing real networks. Still, it is applied effectively as a standard data set to help researchers compare network-based IDSs. In addition, the NSL-KDD training and testing sets contain a manageable number of records, which reduces the cost of running experiments on the entire dataset rather than on a sample. The NSL-KDD data set has several advantages over the original KDD'99 data set.

It avoids training classifiers on duplicate or redundant records by omitting them from the train set.

The proposed test sets contain no duplicate records, so methods with higher detection rates on frequent records do not unfairly boost the learners' performance.

The number of records selected from each difficulty-level group is inversely proportional to the percentage of records in the original KDD data set.

The number of records in the train and test sets is reasonable, so experiments can be run on the complete set without randomly selecting a subset.

The NSL-KDD dataset is also used as a benchmark to test the detection performance of the proposed system using semi-supervised machine learning models (Praveen Kumar et al., [59]). Each record has 42 attributes, including the class label; the remaining 41 attributes are grouped into content, host, traffic, and basic features. The dataset has a total of 148,515 samples divided into 80% training and 20% testing samples, as shown in Table 2, with four different classes of attacks. For training, feature vectors are extracted by splitting the dataset into normal and abnormal clusters. After training, incoming feature vectors are classified into the normal or abnormal cluster.

Table 2. Frequency distribution of various attack classes in the NSL-KDD with training and testing samples for testing performance (Li et al., [43]).

samples	Normal	DoS	Probes	U2R	R2L	Total
Training samples	67,343	45,927	11,656	52	995	125,974
Testing samples	9,711	7,458	2,421	200	2,654	22,544
Total number	77,054	53,385	14,077	252	3,649	148,518

The dataset consists of 23 classes of attack types, which are grouped into four categories: denial of service (DoS), remote to local (R2L), user to root (U2R), and probe. A DoS attack keeps the network service busy so that it becomes inaccessible to authorised users (Elsaid & Albatati, [22]). A U2R attack exploits vulnerabilities in the host system, for example by sniffing the passwords of a legitimate user. An R2L attack exploits vulnerabilities of the network host remotely. A probe attack scans the network to gather information, violating the security rules. The probe and DoS attacks have multiple links, whereas the others have single links (Pande et al., [57]). Table 3 describes the four classes of attacks in the NSL-KDD benchmark dataset.

Table 3. Detailed technical description of the four types of attacks in the dataset.

Attack	Attack description in the dataset
DoS	The attacker makes the network busy and denies the legitimate user access.
R2L	The intruder tries to gain access to the network or machine for a specific version of the FTP.
U2R	The attacker accesses the system's root and makes unauthorised attempts to the network.
Probe	The attacker attempts to gather information about the network by evading the system's security controls.

The UNSW_NB15 dataset is used as a benchmark for evaluating the effectiveness of the proposed system. This dataset combines synthesised attack activities with normal network traffic data (Jatti & Kishor Sontif, [36]). The IXIA traffic generator is configured with three virtual servers to generate the UNSW_NB15 dataset, which contains normal and malicious activities in the network traffic. The servers are connected through routers to public and private network traffic. The routers are configured with a firewall that filters the traffic into normal and malicious activities. The tcpdump tool is installed on the routers to capture the traffic produced by the IXIA tool, which acts as the generator of attack traffic alongside normal network traffic. The frequency distribution of the attack categories is shown in Table 4 with training and testing samples.

Table 4. Frequency distribution of attacks in the dataset.

Attack category	Training samples (weight)	Training samples (percent)	Testing samples (weight)	Testing samples (percent)
Analysis	677	0.8	2000	1.1
Backdoor	583	0.7	1746	1.0
DoS	4089	5.0	12264	7.0
Exploits	11132	13.5	33393	19.0
Fuzzers	6062	7.4	18184	10.4
Generic	18871	22.9	40000	22.8
Normal	37000	44.9	56000	31.9
Reconnaissance	3496	4.2	10491	6.0
Shellcode	378	0.5	1133	0.6
Worms	44	0.1	130	0.1
Total	82333	100.0	175342	100.0

The method's effectiveness in identifying flooding attacks in WSNs is measured against the CICIDS2017 dataset. Table 5 details some of the key attributes of the training and testing dataset, which is available online from the Canadian Institute for Cybersecurity. The dataset includes both benign and malicious network traffic. The benign traffic was generated to provide plausible background activity during data collection, and the behaviour of twenty-five users across a variety of protocols was used to compile the dataset.

Table 5. The dataset's statistical distribution, based on a subset of its attributes.

Features	Number	Mode	Mean	Features	Number	Mode	Mean
DestinationPort	485881	4855.64	4855.64	FwdHeaderLength_A	485881	318.01	318.01
FlowDuration	485881	38856281	38856281	BwdHeaderLength	485881	350.67	350.67
TotalFwdPackets	485881	12.6	12.6	FwdPacketss	485881	15533.1	15533.1
TotalBackwardPackets	485881	14.39	14.39	BwdPacketss	485881	4910.09	4910.09
TotalLengthofFwdPackets	485881	766.63	766.63	MaxPacketLength	485881	2435.6	2435.6
FwdPacketLengthMax	485881	316.83	316.83	AveragePacketSize	485881	415.2116	415.2116
FlowBytess	485771	387017.1	387017.1	AvgFwdSegmentSize	485881	74.81795	74.81795
FlowPacketss	485771	20427.24	20427.24	FwdHeaderLength	485881	318.01	318.01
FlowIATMax	485881	31827784	31827784	SubflowFwdPackets	485881	12.6	12.6
FlowIATMin	485881	49418.34	49418.34	SubflowFwdBytes	485881	766.63	766.63
FwdIATTotal	485881	38499251	38499251	SubflowBwdPackets	485881	14.39	14.39
FwdIATMax	485881	31702017	31702017	Init_Win_bytes_forward	485881	6069.22	6069.22
BwdIATTotal	485881	19764440	19764440	Init_Win_bytes_backward	485881	1388.02	1388.02
BwdIATMax	485881	13265776	13265776	act_data_pkt_fwd	485881	8.51	8.51
BwdIATMin	485881	1321377	1321377	min_seg_size_forward	485881	26.26	26.26

The dataset used to develop the predictive machine learning models contains 485,881 instances and 31 features and is divided into 80% training and 20% testing subsets. There are five distinct varieties of DoS attacks (Bansal & Kaur, [11]): the widely known Slowhttptest, Slowloris, Hulk, Heartbleed, and GoldenEye. The attack samples used for training and testing are shown in Table 6.

Table 6. Structure of the dataset with attack classes.

Attacks	DoS Slowhttptest	Normal	slowloris	DoS GoldenEye	DoS Hulk	Heartbleed
Training set	1276.8	252305.6	1504.8	6307.2	127302.4	8
Testing set	319.2	63076.4	376.2	1576.8	31825.6	2
Overall	1596	315382	1881	7884	159128	10
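The training and testing figures in Table 6 follow from applying the 80/20 proportion to each class total. The sketch below reproduces that arithmetic from the "Overall" row. The function name is hypothetical, and the fractional values arise because the proportion is applied to totals; an actual split would assign whole records to each subset.

```python
# Per-class totals copied from the "Overall" row of Table 6.
overall = {
    "DoS Slowhttptest": 1596,
    "Normal": 315382,
    "slowloris": 1881,
    "DoS GoldenEye": 7884,
    "DoS Hulk": 159128,
    "Heartbleed": 10,
}
TRAIN_FRACTION = 0.8  # 80% training, 20% testing, as stated in the text


def proportional_split(counts, train_fraction):
    """Return {label: (training share, testing share)} for each class total."""
    split = {}
    for label, n in counts.items():
        train = round(n * train_fraction, 1)
        test = round(n - train, 1)
        split[label] = (train, test)
    return split


split = proportional_split(overall, TRAIN_FRACTION)
total = sum(overall.values())

for label, (train, test) in split.items():
    print(f"{label}: train={train}, test={test}")
print("total records:", total)
```

Splitting every class in the same proportion keeps the class distribution of the training and testing subsets identical, which matters here because Heartbleed has only ten records.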

Normalisation, missing value imputation, and aggregation are all part of the data processing required to prepare the data before the training and testing phases begin. Missing values are filled in with the average of the observed values (Anbarasan et al., [10]). Using the minimum and maximum values, the data can then be rescaled into the range between 0 and 1.
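Mean imputation can be sketched as follows. This is a minimal illustration that assumes missing entries are represented as `None`; the column values and the function name are hypothetical.

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


# Illustrative flow-duration column with two missing entries.
flow_duration = [12.0, None, 18.0, 30.0, None]
filled = impute_mean(flow_duration)
print(filled)
```

The mean is computed from the observed values only, so the imputed entries do not change the column average.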

5. Proposed AHIDS framework

Figure 12 depicts how the hybrid machine learning models in the proposed system are used to categorise network data flow as either normal operation or harmful attacks. Hybrid machine learning approaches are combined to build the intrusion detection system for attacks in WSNs (Shi & Li, [75]). The major technical components of the proposed framework include sensor deployment, data collection and information processing, data aggregation and clustering, decision-making, preprocessing, machine learning and optimisation for training and testing, classifier generation, and a detection and classification module for records. Since the proposed method is simple and effective, it has great potential for deployment in real-world hierarchically clustered wireless sensor networks. Conditional control statements are a key component of how the hybrid machine learning approaches decide the outcome of events. The new aspect of this decision-making method is the use of a collaborative process for data analysis, which in turn aids in the automatic construction of predictive models. Decision nodes are used for prediction, while leaf nodes are used for the final classification, as seen in Figure 12. The splitting of the training and testing benchmarks is governed by rules and reasoning generated by the hybrid machine learning models. Target categorisation is performed using the statistical metric.

DIAGRAM: Figure 12. Block diagram of the advanced hybrid intrusion detection system (AHIDS) framework with the attack detection and classification model.

Finally, the proposed system employs hybrid machine learning approaches (Gupta et al., [32]) to detect and localise attacks utilising data from both the attack and non-attack phases. Using the modified dataset, the proposed AIDS-HML learns to identify potential attacks. AIDS-HML is an efficient classification method under the conditional independence assumption, that is, assuming that the features of each sample are independent given the designated class label. AIDS-HML is a more advanced and hybridised form of machine learning.

Designing advanced intrusion detection systems based on hybrid machine learning techniques in hierarchically wireless sensor networks can offer several advantages and disadvantages. Here are some potential benefits and drawbacks of the proposed system to consider.

Advantages.

Improved Detection Accuracy. Hybrid machine learning techniques combine the strengths of multiple algorithms, such as neural networks, decision trees, or support vector machines. This combination can enhance intrusion detection accuracy by leveraging each technique's unique capabilities.

Adaptability to Dynamic Environments. Hierarchically wireless sensor networks often operate in dynamic environments where network topology, traffic patterns, and intrusion characteristics can change over time. Hybrid machine-learning approaches can adapt to these changes and update their detection models accordingly, leading to better performance in dynamic scenarios.

Enhanced Scalability. Wireless sensor networks may consist of many nodes, making scalability a critical factor. Hybrid machine learning techniques can handle large-scale networks more effectively by distributing computational tasks among nodes and optimising resource utilisation.

Reduced False Positives and False Negatives. By combining multiple machine learning techniques, hybrid systems can mitigate individual algorithms' weaknesses, reducing false positives (incorrectly identifying benign activity as intrusion) and false negatives (failing to detect actual intrusions).

Disadvantages.

Increased Complexity. Implementing hybrid machine learning techniques in intrusion detection systems adds complexity to the design and deployment process. Combining different algorithms and managing their interactions requires expertise in both machine learning and wireless sensor network domains.

Higher Computational Demands. Hybrid systems may require more computational resources compared to single-method approaches. The processing and memory requirements can be significant, particularly in resource-constrained wireless sensor networks, which can impact system performance and energy efficiency.

Training and Maintenance Overhead. Hybrid machine learning models typically require more extensive training and maintenance processes. Ensuring accurate model updates, handling concept drift (changes in intrusion patterns over time), and managing retraining procedures can be time-consuming and resource-intensive.

Increased Vulnerability to Attacks. Advanced intrusion detection systems may become targets for attackers seeking to manipulate or evade detection. Hybrid machine learning models can be susceptible to adversarial attacks, where attackers exploit vulnerabilities in the learning algorithms to deceive the system. Robustness against such attacks must be considered during system design.

5.1. Sensor deployment and routing techniques

Sensor nodes are deployed based on various network models for the detection and classification of attacks in WSNs. However, WSNs face significant routing difficulties because of their restricted power supply, low transmission bandwidth, and limited memory and processor capacity (Praveen Kumar et al., [59]). Because of these limitations, an adversary can quickly target individual nodes of a WSN deployed in a hostile area (Rouissi et al., [67]). It is crucial to identify malicious attacks so that the network is not deceived by fabricated data supplied by compromised nodes. Here, we distinguish between internal and external attacks on WSNs. An external attack is carried out by parties outside the network and aims to reduce the effectiveness of the WSN. We therefore elaborate on a proposal that protects data against routing attacks while maintaining its integrity. This paper uses HML methods for WSNs to create secure protocols for extracting features and discovering new routes in moderately complex hybrid and tree network topologies. The following are a few of the many advantages that machine-learning-based routing brings to WSNs.

Without requiring re-programming, machine learning techniques can adapt to new environments and select new CHs for routing in WSNs.

Hybrid machine learning models can be used for various purposes in WSNs, including optimal routing, reducing communication overhead, and delay-awareness.

In this study, we employ the GA-ANN method for detecting wormhole attacks and determining energy-efficient and robust routing for WSNs. The GA-ANN is trained on a wide range of inputs, including residual energy, node distances, routing discovery, path selection, feature extraction, cluster heads (CH), border nodes, and the sink or base station. A large training set is produced, and the ANN is provided via backpropagation with effective threshold values for picking a set of trustworthy CHs. With this technique, data loss in WSNs can be prevented and energy consumption among sensor nodes is balanced.

The engineering optimisation problems considered in this study seek optimal solutions for selecting the cluster head and the shortest routing path in WSNs under special conditions such as design principles, resource limitations, and safety requirements (Agushaka et al., [6]), as shown in Figure 13. Designing and optimising WSNs poses several challenges due to the limited energy, communication bandwidth, and processing capabilities of the sensor nodes (Ovelade & Ezugwu, [54]). Typically, metaheuristic algorithms cannot directly solve constrained optimisation problems. However, equipped with constraint-handling techniques (CHTs), the optimisers can handle the objective function and the corresponding constraints. The purpose of optimisation is to locate the optimal answer to a problem while taking into account all relevant factors. The essence of optimisation methods lies in the gradual improvement of the generated set of solutions using a set of optimisation rules and the evaluation of those solutions using a defined objective function (Abualigah, Diabat, et al., [3]).

DIAGRAM: Figure 13. Flow chart of the various phases of the proposed system.

Unknown search space, discrete or continuous search space, non-derivative objective functions, high dimensions, and non-convexity are only a few of the characteristics of optimisation problems that prevent them from being solved in a reasonable amount of time using only classical methods (Ezugwu et al., [23]). Metaheuristic algorithms, coupled with appropriate fitness functions and problem-specific adaptations, have been used to improve the performance, reliability, and energy efficiency of IoT-WSNs (Abualigah et al., [4]). In each iteration, the algorithm evaluates the fitness of the candidate population using the objective function and constraints, and the next generation of candidates is generated based on the calculated fitness.
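The iteration described above can be sketched as a small genetic algorithm that selects cluster heads and handles an energy constraint through a penalty term, one common constraint-handling technique. This is an illustrative sketch only and not the GA-ANN used in the paper. The node coordinates, energies, `NUM_HEADS`, `MIN_ENERGY`, `PENALTY`, and all function names are assumptions introduced for the example.

```python
import random

# Hypothetical 10-node network: (x, y, residual_energy). Illustrative values only.
NODES = [
    (10, 10, 0.9), (20, 80, 0.8), (80, 20, 0.3), (85, 85, 0.7), (50, 50, 0.6),
    (15, 45, 0.2), (60, 30, 0.9), (40, 90, 0.5), (90, 55, 0.8), (30, 25, 0.4),
]
NUM_HEADS = 3      # assumed number of cluster heads
MIN_ENERGY = 0.5   # assumed constraint: a head needs at least this residual energy
PENALTY = 1000.0   # assumed penalty weight used for constraint handling


def cost(heads):
    """Total node-to-nearest-head distance plus a penalty per low-energy head."""
    total = 0.0
    for x, y, _ in NODES:
        total += min(
            ((x - NODES[h][0]) ** 2 + (y - NODES[h][1]) ** 2) ** 0.5 for h in heads
        )
    violations = sum(1 for h in heads if NODES[h][2] < MIN_ENERGY)
    return total + PENALTY * violations


def crossover(a, b, rng):
    """Draw NUM_HEADS distinct heads from the union of both parents."""
    pool = sorted(set(a) | set(b))
    return tuple(sorted(rng.sample(pool, NUM_HEADS)))


def mutate(heads, rng):
    """Swap one head for a node that is not currently a head."""
    heads = list(heads)
    candidates = [i for i in range(len(NODES)) if i not in heads]
    heads[rng.randrange(len(heads))] = rng.choice(candidates)
    return tuple(sorted(heads))


def evolve(generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    population = [
        tuple(sorted(rng.sample(range(len(NODES)), NUM_HEADS))) for _ in range(pop_size)
    ]
    initial_best = min(cost(p) for p in population)
    for _ in range(generations):
        population.sort(key=cost)                # evaluate fitness of every candidate
        survivors = population[: pop_size // 2]  # elitist selection keeps the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children        # next generation
    best = min(population, key=cost)
    return best, cost(best), initial_best


best_heads, best_cost, initial_cost = evolve()
print("cluster heads:", best_heads, "cost:", round(best_cost, 2))
```

Because the best half of each generation is carried over unchanged, the best cost never gets worse from one generation to the next. The penalty weight is chosen larger than any plausible distance total so that infeasible head sets rank below feasible ones.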

In the context of IoT-WSNs, an optimisation process involves finding the optimal values for specific parameters of the system in order to meet the system design requirements while minimising cost and finding the shortest path (Abualigah, Yousri, et al., [5]). The goal is to achieve an optimal configuration or solution that optimises the system's performance and efficiency. The parameters that are typically optimised in IoT-WSNs can vary depending on the specific application and design objectives. Some common parameters that are often optimised include.

Node Placement. The optimal locations for sensor nodes are determined to achieve desired coverage, connectivity, and energy efficiency. This involves finding the optimal positions or coordinates for deploying the sensor nodes within the network area.

Routing. The optimal routing paths are identified to transmit data from source to destination nodes efficiently. This includes finding the shortest or most energy-efficient paths considering network dynamics, congestion, and quality of services (QoS) requirements.

Energy Management. Energy consumption is optimised by dynamically adjusting parameters such as sleep-wake schedules, duty cycles, or transmission power levels. The objective is to prolong the network lifetime while meeting the application requirements.

Resource Allocation. The allocation of network resources, such as bandwidth and time slots, is optimised to ensure efficient utilisation. This involves determining how resources should be allocated among sensor nodes or applications to maximise overall system performance.

Data Aggregation. The optimal strategies for aggregating data from multiple sensor nodes are identified to minimise redundant transmissions and conserve energy. This involves determining which nodes should aggregate data and how the data should be fused or compressed.

By employing an optimisation process in IoT-WSNs, system designers and engineers can identify the most efficient and cost-effective network configurations, enabling improved performance, energy efficiency, and overall system design.
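As an illustration of the routing parameter listed above, the sketch below finds the cheapest route from a sensor node to the base station with Dijkstra's algorithm. The topology, node names, and link costs (read here as transmission energy per packet) are hypothetical, and the paper does not state that it uses this particular algorithm.

```python
import heapq

# Hypothetical link costs between nodes, e.g. transmission energy per packet.
LINKS = {
    "S1": {"S2": 2.0, "CH1": 5.0},
    "S2": {"S1": 2.0, "CH1": 1.5, "CH2": 6.0},
    "CH1": {"S1": 5.0, "S2": 1.5, "BS": 4.0},
    "CH2": {"S2": 6.0, "BS": 1.0},
    "BS": {"CH1": 4.0, "CH2": 1.0},
}


def cheapest_route(links, source, target):
    """Dijkstra's algorithm; returns (total cost, path) or (inf, []) if unreachable."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in links[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


route_cost, route = cheapest_route(LINKS, "S1", "BS")
print(route, route_cost)
```

Replacing the link weights with hop counts gives the shortest path in hops, while weights that grow as residual energy falls steer traffic away from nearly depleted nodes.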

5.2. Data pre-processing

The performance of a machine learning model is indeed influenced by the quality of the datasets on which it is trained (Singh et al., [78]). The quality of the datasets can significantly impact the model's ability to learn patterns, generalise to new data, and make accurate predictions. Here are some key points regarding the importance of dataset quality in machine learning.

High-quality datasets should be accurate, free from errors, and reflect the true values or labels of the target variable.

Datasets should contain all the features and attributes required for the learning task.

The datasets should represent the real-world problem the model aims to solve.

Balancing the dataset or using appropriate techniques to handle class imbalance is crucial to ensure fair and accurate predictions.

Proper preprocessing steps, such as cleaning, normalisation, and feature engineering, are essential for data preparation before training the model.

The size of the dataset can also influence the model's performance.

Rigorous quality assurance measures should be applied to datasets, including data validation, outlier detection, and error handling.

The data preprocessing module converts raw network traffic into the format the classification model needs in the next stage (Zhao et al., [95]). To provide both training and testing data, this study measures and normalises raw network traffic using a variety of data preparation approaches. A raw dataset has not undergone any preprocessing (Roy & Chowdhury, [68]); it may be incomplete, noisy, and poorly formatted. As a result, machine learning models cannot be built reliably from a raw dataset directly, as shown in Figure 14.

Graph: Figure 14. Data pre-processing, training and testing for model evaluation framework using benchmark datasets.

Preprocessing the raw data by eliminating duplicates and standardising the format improves the efficiency of a machine learning model. This is why preprocessing before the training phase of a machine learning model is so important. The pre-processing techniques are as follows.

Filtering: One way to clean up data is through filtering. A filter estimates a desired signal pattern from a distorted signal pattern. The major goal of this filtering method is to minimise the mean square error between the estimated and intended signal patterns.

Feature Selection: Feature selection approaches are essential for choosing relevant and useful features for model learning, enhancing prediction accuracy and reducing overfitting, training time, and complexity. Common approaches are filter, wrapper, and embedded methods. Selecting the right features is crucial for data mining projects, and Figure 15 illustrates the features selected from the NSL-KDD benchmark dataset (Intelligence et al., [34]). To improve application performance, it is necessary to identify the best feature selection method and implement it in the relevant processes. Doing so reduces the number of attributes in the dataset and exposes the associations between them, streamlining the procedure. However, there is no universal approach to feature selection, and the condition of the dataset should be considered when choosing an appropriate method. The primary challenge is finding the features that best discriminate between classes, which requires different strategies for different datasets.

Graph: Figure 15. Selection of important feature technique using NSL-KDD benchmark dataset.

Many different techniques can be used for feature selection. Here, Spearman's rank correlation coefficient is used in a recursive feature selection process that selects features dynamically, as shown in Equation (6).

$\rho = \frac{\sum_{i} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i} {(x_{i} - \bar{x})}^{2} \sum_{i} {(y_{i} - \bar{y})}^{2}}}$ (6)

where ρ is the correlation coefficient, $x_{i}$ and $y_{i}$ are the values of the two feature variables (their ranks, for Spearman's coefficient), and $\bar{x}$ and $\bar{y}$ are the mean values of x and y.
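Spearman's coefficient is the product-moment correlation applied to the ranks of the two variables. A minimal sketch under that definition is given below. The helper names and the sample values are hypothetical, and tied values receive the average of their rank positions.

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def correlation(x, y):
    """Product-moment correlation of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = (
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    ) ** 0.5
    return numerator / denominator  # undefined when either input is constant


def spearman(x, y):
    """Spearman's rho: the correlation of the rank-transformed inputs."""
    return correlation(ranks(x), ranks(y))


# Illustrative values: a monotonic but non-linear relationship.
duration = [1.0, 4.0, 9.0, 16.0, 25.0]
bytes_sent = [10, 20, 30, 40, 50]
rho_up = spearman(duration, bytes_sent)
rho_down = spearman(duration, [5, 4, 3, 2, 1])
print(rho_up, rho_down)
```

In a feature selection loop, features whose absolute coefficient with the class label falls below a chosen cut-off would be candidates for removal. The cut-off itself is a design choice that the source does not specify.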

Feature engineering: The feature engineering phase provides the cues that allow each pattern to be isolated. When a raw dataset has a large feature set that is considered redundant, feature extraction is used to derive a set of non-redundant and informative features from the original feature set.

Windowing: The traffic stream is segmented into windows of consecutive records so that features can be computed over each window rather than over individual records.

Filter Method: The filter technique employs feature ranking methods for feature selection. The rank of a feature indicates how important it is when constructing a model. The results of statistical tests are used to rank the features, and each test measures the relationship between a feature and the outcome variable. Pearson's correlation, linear discriminant analysis, analysis of variance, chi-square, variance threshold, and information gain are only some of the many statistical tests available.
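One of the simplest filter tests named above is the variance threshold, which drops features that barely vary and ranks the rest. The sketch below is illustrative only. The feature values are made up, the feature names merely follow NSL-KDD naming, and the default threshold of zero is an assumption.

```python
def variance(column):
    """Population variance of a numeric column."""
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)


def rank_by_variance(features, threshold=0.0):
    """Drop features with variance <= threshold, then rank the rest, highest first."""
    scored = [(name, variance(values)) for name, values in features.items()]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Illustrative columns; the constant column carries no information for a classifier.
features = {
    "duration": [0.0, 2.0, 4.0, 6.0],
    "num_outbound_cmds": [0.0, 0.0, 0.0, 0.0],
    "src_bytes": [1.0, 1.0, 2.0, 2.0],
}
ranking = rank_by_variance(features)
print(ranking)
```

Variance is scale dependent, so this test is usually applied after the features have been normalised to a common range.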

Transformation and Normalization Operations: Nominal attributes of the NSL-KDD dataset, such as protocol type, service, and flag, are transformed into numeric values (TCP = 1, UDP = 2, ICMP = 3, service = 1, flag = 2). Dataset normalisation is a crucial preprocessing technique that rescales attribute values to a consistent range for effective classification (Saheed et al., [70]). Machine learning algorithms benefit from normalised data, which helps prediction models generalise. Min-max normalisation is used to rescale the dataset to the range 0 to 1, so that features with large numeric ranges do not overshadow features with lower numeric ranges (Mojtaba et al., [50]). Normalisation also eliminates numerical issues in calculations, improving overall performance (Mojtaba et al., [50]; Saheed et al., [70]).
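The two preprocessing operations can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the protocol codes follow the mapping quoted above, while the function names and the handling of constant columns are assumptions. Min-max normalisation maps each value x to (x − min) / (max − min).

```python
# Numeric codes for the protocol type, as quoted in the text.
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}


def encode_protocol(name):
    """Map a nominal protocol name to its numeric code."""
    return PROTOCOL_CODES[name.lower()]


def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


print(encode_protocol("TCP"))        # 1
print(min_max_scale([0, 5, 10]))     # [0.0, 0.5, 1.0]
```

In practice the minimum and maximum would be taken from the training set only and reused for the test set, so that no information leaks from the test data.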

5.3. Machine learning and classification techniques

Data mining operations can utilise various data mining techniques, including hybrid machine learning methods such as Naive Bayes (NB), Artificial Neural Networks (ANN), Decision Tree (DT), Extreme Gradient Boosting (XGB), Extra Tree (ET), Random Forest (RF), Ensemble Stacking (ES), and cluster labelling K-Means (CLK-M) for attack detection and categorisation (Intelligence et al., [34]). Machine learning enables systems to learn and improve from experience without explicit programming, enhancing reliability, efficiency, and cost-effectiveness in computational procedures (Praveen Kumar et al., [59]). ML models are developed using automated and accurate processing of complex data with the extracted or chosen feature set used in conjunction with machine learning methods to create classification algorithms. Supervised learning is suitable for standard fingerprinting data, while unsupervised or semi-supervised learning can be appropriate for crowdsourced data. Selecting the right data mining algorithm based on the dataset's structure is crucial for optimal performance. This overview outlines the use of data mining methods in classification procedures on representative benchmark datasets.

5.3.1. Naive Bayes

Naive Bayes is a simple probabilistic classification technique that assigns each sample to its most probable class. The model can be built in a single scan of the training data, which keeps it straightforward. The technique is based on a simplified form of Bayes' theorem in which the features are assumed to be conditionally independent given the class.

Conditional probability theory is used to predict the class to which a given sample from a dataset belongs. Classes for test dataset samples are determined using the knowledge gained during training on the training dataset. Despite its apparent simplicity, the Naive Bayes algorithm is highly effective. The probabilities involved in Bayes' theorem are given in Equation (7).

$P(c \mid x_{1}, x_{2}, \ldots, x_{n}) = \frac{P(x_{1}, x_{2}, \ldots, x_{n} \mid c) \, P(c)}{P(x_{1}, x_{2}, \ldots, x_{n})}$ (7)

Where P(·) denotes probability, c is the class to be predicted, and $x_{1}, x_{2}, \ldots, x_{n}$ are the feature values of the sample.
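To make Equation (7) concrete, the sketch below trains and applies a Naive Bayes classifier on categorical features. It is an illustrative example under stated assumptions: the likelihood is factorised feature by feature (the naive independence assumption), add-one smoothing is used to avoid zero probabilities, and the toy records and function names are hypothetical. The denominator of Equation (7) is the same for every class, so it is omitted when comparing classes.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (class, feature index) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for x, c in zip(samples, labels):
        for j, v in enumerate(x):
            value_counts[(c, j)][v] += 1
            feature_values[j].add(v)
    return class_counts, value_counts, feature_values


def predict_naive_bayes(model, x):
    """Return the class maximising log P(c) + sum_j log P(x_j | c)."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for j, v in enumerate(x):
            # Add-one smoothing (an assumption, not taken from the paper).
            num = value_counts[(c, j)][v] + 1
            den = n_c + len(feature_values[j])
            score += math.log(num / den)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical (protocol, flag) records with their labels.
records = [("tcp", "SF"), ("tcp", "SF"), ("udp", "SF"),
           ("tcp", "S0"), ("icmp", "S0"), ("tcp", "S0")]
classes = ["normal", "normal", "normal", "attack", "attack", "attack"]
model = train_naive_bayes(records, classes)
print(predict_naive_bayes(model, ("tcp", "SF")))
```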

Based on Naive Bayes, an optimal cluster head selection method is utilised for safe and low-power routing in WSNs. An ideal collection of CHs will always maximise the network's lifetime while minimising the energy drain on individual sensor nodes. Naive Bayes ensures continued network flexibility in the face of dynamically added or modified features. This study describes a new adaptive integrated routing architecture for data collecting using a Bayesian approach.

5.3.2. Random forest

A random forest (RF) is an ensemble of decision trees, each trained on a different subset of the data. Breiman introduced this ensemble classifier in 2001. The random forest algorithm generates training subsets, and bootstrap sampling is used to form each subset. The trees are grown with a mechanism in which the attributes are picked at random: at each node, the algorithm makes a random selection and uses it as the basis for a branch, as shown in Figure 16. The derived trees are therefore produced from randomly selected factors. The sampled datasets are used as input to the Classification And Regression Trees (CART) algorithm for tree building. Each created tree labels the sample, and the classes assigned by the trees are then compiled. The instance to be classified is assigned to the class that the trees select most often. The RF method does not include pruning, while the CART algorithm does. An important reason why the RF algorithm outperforms the other decision tree approaches is that it does not rely on pruning.

DIAGRAM: Figure 16. Block diagram of random forest operation for training and testing benchmark dataset.

The RF algorithm is quick, flexible, and more effective than alternative decision tree approaches despite using numerous tree topologies. The CART algorithm uses the Gini index value to decide which branch to create from each node. Tree development parameters include the number of trees and the number of variables considered per node. The basic operation of the RF algorithm for attack detection and classification, using training and testing on the benchmark datasets, is depicted in Figure 16.
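The Gini-based branching decision can be illustrated with a short sketch. It assumes a single numeric feature and a binary threshold split; the function names and the toy values are hypothetical, and a full random forest would repeat this search over a random subset of features at every node of every tree.

```python
from collections import Counter


def gini(labels):
    """Gini index of a set of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Weighted Gini index after splitting on value <= threshold."""
    left = [c for v, c in zip(values, labels) if v <= threshold]
    right = [c for v, c in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values, labels):
    """Return the threshold with the lowest weighted Gini index, or None."""
    candidates = sorted(set(values))[:-1]   # the largest value cannot split
    if not candidates:
        return None
    return min(candidates, key=lambda t: split_gini(values, labels, t))


# Hypothetical feature values for two normal and two attack samples.
print(best_threshold([1, 2, 8, 9], ["normal", "normal", "attack", "attack"]))
```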

5.3.3. Decision trees

Decision trees (DTs) are a supervised ML technique for classification that uses a set of if–then rules to simplify the process and improve human comprehension. A decision tree has two types of nodes: leaf nodes, which represent the outcomes, and decision nodes, which represent the choices between alternatives that lead to those outcomes. A decision tree can be used to predict a class or target by inferring decision rules from training data. Decision trees have the benefits of being easy to understand, helping eliminate confusion while making choices, and facilitating in-depth research. Connectivity, anomaly detection, data aggregation, and mobile sink path selection are just a few of the many WSN problems that can be addressed with the help of decision trees.

The algorithm employs a divide-and-conquer technique. In contrast to ID3, it incorporates a normalisation step: it computes a gain ratio from the information gain values. Intermediate trees can be built and restructured while the decision tree is being constructed. The decision tree method also employs branch pruning to eliminate potentially incorrect data and lower the error rate. Tree building begins from a single node. If all of the samples at that node belong to the same class, the node is labelled as a leaf representing that class; otherwise, an optimal splitting attribute is chosen and the tree expands from there.

Each feature's information gain is computed, and the feature with the highest value is chosen as the tree's decision node. During the election of the cluster leader, this is the best time to identify and remove any malicious nodes. After identifying a decision node, the procedure continues by creating a child branch off that node. If all the elements in the subgroups listed above have the same value, the procedure ends, and that value is used as the output. The process ends if the subset contains exactly one node and no distinguishing features are identified.
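The selection of the decision node can be sketched as follows. The example computes the entropy of the labels and the information gain of each categorical feature, then picks the feature with the highest gain. The function names and the toy table are assumptions for illustration; the gain-ratio normalisation and pruning mentioned above are not included.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    groups = defaultdict(list)
    for v, c in zip(feature_values, labels):
        groups[v].append(c)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def choose_decision_feature(table, labels):
    """Return the feature name with the highest information gain."""
    return max(table, key=lambda name: information_gain(table[name], labels))


# Hypothetical table: "flag" separates the classes, "protocol" does not.
table = {
    "protocol": ["tcp", "udp", "tcp", "udp"],
    "flag": ["SF", "SF", "S0", "S0"],
}
labels = ["normal", "normal", "attack", "attack"]
print(choose_decision_feature(table, labels))
```

Applying `choose_decision_feature` recursively to each resulting subset, and stopping when a subset is pure, yields the tree-growing procedure described above.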

5.3.4. K-Means clustering

The k-means method can divide a data set into a specified number of groups with minimal effort. Starting from a random sample of k points as initial centres, each remaining point is assigned to its nearest centre. After the data is partitioned into clusters, the centroid of each cluster is recalculated. The centroids shift at each iteration until the algorithm converges and no centroid moves. If n is the total number of points, k the number of centroids, i the number of iterations, and d the number of attributes, then the time complexity of the k-means algorithm is O(n·k·i·d). Equation (8) gives the minimisation function for the sum of squared errors.

$\min f(x) = \sum_{i = 1}^{k} \sum_{j = 1}^{N} {\| x_{i} - y_{j} \|}^{2}$ (8)

Where N is the number of data points in the ith cluster and $\| x_{i} - y_{j} \|$ is the Euclidean distance between $x_{i}$ and $y_{j}$. The simplest clustering method, k-means, is also useful in WSNs for identifying ideal cluster heads (CHs) and for detecting malicious nodes when transmitting data to the base station. This method also works well for locating productive mobile sink rendezvous spots. Choosing a different value of k can affect the outcomes in some situations, so getting the value of k right is crucial to obtaining the best results from the analysed data. Euclidean, Manhattan, and Minkowski are just a few of the distance and neighbour node formulas that can be applied. The relevant formulas are given by Equation (9).

$\begin{array}{l} \text{Euclidean} \to \sqrt{\sum_{i = 1}^{k} {(x_{i} - y_{i})}^{2}} \\ \text{Manhattan} \to \sum_{i = 1}^{k} | x_{i} - y_{i} | \\ \text{Minkowski} \to {\left[ \sum_{i = 1}^{k} {| x_{i} - y_{i} |}^{q} \right]}^{1 / q} \end{array}$ (9)
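The clustering loop and the distance formulas of Equation (9) can be combined in a short sketch. The Minkowski distance reduces to the Manhattan distance for q = 1 and to the Euclidean distance for q = 2. The function names, the fixed random seed, the iteration cap, and the toy node coordinates are assumptions made for illustration; the cluster labelling step of CLK-M is not reproduced.

```python
import random


def minkowski(x, y, q=2):
    """Minkowski distance; q=1 gives Manhattan, q=2 gives Euclidean."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def k_means(points, k, q=2, max_iter=100, seed=0):
    """Plain k-means: assign points to the nearest centroid, then re-centre."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)     # k random points as initial centres
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda c: minkowski(p, centroids[c], q))
            for p in points
        ]
        if new_assignment == assignment:  # no point changed cluster: converged
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:                   # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignment


# Hypothetical node coordinates forming two well-separated groups.
nodes = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, groups = k_means(nodes, k=2)
print(centres, groups)
```

Each pass over the data costs on the order of n·k·d distance operations, which matches the O(n·k·i·d) complexity stated above for i iterations.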

5.3.5. Hybrid-ensemble machine learning techniques

When multiple machine learning algorithms are combined into an ensemble, the resulting classification is both more accurate and faster. This approach runs several learning procedures using various machine learning methods and then combines their results for classification. The underlying algorithm performs two basic steps. First, the original dataset is partitioned and base models are built on those subsets. The base models are then aggregated into a single model, from which the results are obtained. The stacking strategy differs from standard machine learning methods because it involves a model production step in which the models built from the training set are combined. The algorithm works as follows.

Models are created during training by employing the dataset and the training method.

Each derived model has full annotations for all the dataset's training samples.

The final model is built from the other models in the training dataset using the combiner method. After a final model is obtained, it categorises and tests dataset samples.

A final prediction is made using the final model once all test dataset samples have been classified and the class predicted by the stacking algorithm of the sample is chosen.

The term ensemble technique is used to describe three distinct approaches: bagging, boosting, and stacking. They differ in the data mining approaches they use and in the capabilities of their combiner models. Whereas boosting aims to maximise predictive power and bagging aims to minimise variance, stacking strives to do both. The function that generates a single model uses the average weight in the bagging strategy, the weighted majority vote in the boosting approach, and logistic regression in the stacking approach.
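The three combiner functions named above can be sketched side by side. The inputs are the outputs of already-trained base models for one sample; the function names, the example weights, and the logistic-regression coefficients are hypothetical, since in stacking the coefficients would be learnt from the base-model predictions on the training set.

```python
import math
from collections import defaultdict


def bagging_average(probabilities):
    """Bagging: average the attack probabilities of the base models."""
    return sum(probabilities) / len(probabilities)


def weighted_majority_vote(predictions, weights):
    """Boosting: return the label backed by the largest total model weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def stacking_logistic(probabilities, coefficients, intercept):
    """Stacking: feed base-model outputs into a logistic-regression combiner."""
    z = intercept + sum(w * p for w, p in zip(coefficients, probabilities))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical outputs of three base models for a single traffic record.
print(bagging_average([0.2, 0.4, 0.9]))
print(weighted_majority_vote(["attack", "normal", "normal"], [0.7, 0.2, 0.3]))
print(stacking_logistic([0.2, 0.4, 0.9], [1.5, 0.5, 2.0], -1.0))
```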

For the proposed strategy, a tree-structured Parzen estimator (TPE) is employed with hyperparameter and Bayesian optimisation (BO) techniques to further enhance the classification performance of the hybrid machine learning models on the benchmark dataset. Fine-tuning the hyperparameters is integral to every machine-learning process. Hyperparameter optimisation (HPO) enhances ML performance with decreased practitioner involvement (Feurer & Hutter, [26]). HPO is treated as a black-box, global optimisation problem in which each function evaluation is expensive. Bayesian optimisation (BO) is becoming increasingly prominent in HPO for deep neural networks as a framework for the global optimisation of expensive black-box functions. It is an iterative method that uses a probabilistic surrogate model, such as a Gaussian process, and an acquisition function to decide which configuration to evaluate next. Random forests and tree Parzen estimators are two tree-based alternatives for modelling the hyperparameters. This work combines Bayesian optimisation with tree Parzen estimators (TPE) to determine the optimal evaluation point for fully automated machine learning.

5.1. Performance evaluation metrics

Detection rate [18], precision, false-negative rate, and the receiver operating characteristic (ROC) curve are the performance parameters that may be measured and analysed. These indicators are used to evaluate the system's performance, generate a classification report, and compare the results to those of other studies. We used a confusion matrix as one of our evaluation criteria (Intelligence et al., [34]). Some of the metrics used in this paper are listed as follows:

Routing Attack Detection Rate: This metric indicates the percentage of detected routing attacks, including blackhole, wormhole, Sybil and misdirection assaults, out of the total number of simulated attacks. A higher detection rate signifies the effectiveness of the hybrid technique in identifying malicious routing behaviours.

False Positive Rate: This metric measures the proportion of legitimate network activities falsely identified as routing attacks. A low false positive rate is desirable to avoid unnecessary alarms and ensure the reliability of the detection system.

False Negative Rate: This metric represents the percentage of actual routing attacks that were not detected by the hybrid technique. A low false negative rate indicates that the method can effectively capture most routing attacks without missing significant ones.

Precision: Precision is the ratio of true positive detections to the sum of true positive and false positive detections. A higher precision indicates that the reported routing attacks are more likely to be genuine, reducing the chances of false alarms.

Recall (Sensitivity): Recall measures the proportion of true positive detections to the sum of true positive and false negative detections. A higher recall signifies that the hybrid technique successfully identifies a larger portion of routing attacks, ensuring better coverage.

F1 Score: The F1 score is the harmonic mean of precision and recall. It provides a balanced assessment of the algorithm's accuracy in detecting routing attacks. A higher F1 score indicates a better trade-off between precision and recall.

Execution Time: This metric measures the time taken by the hybrid technique to analyse network data and detect routing attacks. A shorter execution time is preferred to enable real-time or near real-time response to potential threats.

Resource Utilization: This metric evaluates the amount of computational resources, memory, or network bandwidth required by the hybrid technique to perform routing attack detection. Efficient resource utilisation including time and energy is essential for practical deployment.

The evaluation criteria are computed from the values of the confusion matrix, which are defined as follows.

True positive (TP): the number of intrusion samples in the dataset that were correctly predicted to be intrusions.

True negative (TN): the number of normal samples that were correctly predicted to be normal.

False negative (FN): the number of intrusion samples that were wrongly classified as normal.

False positive (FP): the number of normal samples in the dataset that were wrongly classified as intrusions.

The detection rate is calculated by dividing the TP value by the total number of actual intrusion samples (TP + FN). The accuracy measures how well a system classifies data as the fraction of data points that were correctly labelled by the system out of the total number of data points. To demonstrate the system's efficiency, we employ the formulas in Equation (10).

$\begin{aligned} \text{Detection Rate} & = \frac{TP}{TP + FN} \\ \text{False alarm rate} & = \frac{FP}{TN + FP} \\ \text{Precision} & = \frac{TP}{TP + FP} \\ \text{Recall} & = \frac{TP}{TP + FN} \\ \text{Accuracy (Acc)} & = \frac{TN + TP}{TN + TP + FN + FP} \\ \text{F1-score} & = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \end{aligned}$ (10)
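The formulas of Equation (10) translate directly into a small helper. The function name, the zero-denominator guard, and the example counts are assumptions for illustration and are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute the metrics of Equation (10) from confusion-matrix counts."""
    def ratio(num, den):
        # Assumption: report 0.0 when a denominator is empty.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)            # identical to the detection rate
    precision = ratio(tp, tp + fp)
    return {
        "detection_rate": recall,
        "false_alarm_rate": ratio(fp, tn + fp),
        "precision": precision,
        "recall": recall,
        "accuracy": ratio(tn + tp, tn + tp + fn + fp),
        "f1_score": ratio(2 * recall * precision, recall + precision),
    }


# Made-up counts: 90 attacks caught, 10 missed, 20 false alarms, 80 true normals.
for name, value in detection_metrics(tp=90, tn=80, fp=20, fn=10).items():
    print(f"{name}: {value:.4f}")
```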

The designed IDS can produce four different outcomes for each traffic operation, which correspond to the entries of the confusion matrix. First, a True Positive (TP) occurs when the intrusion detection system (IDS) correctly reports malicious activity on the network (Mohd et al., [49]); second, a True Negative (TN) occurs when the IDS correctly reports no malicious activity for benign traffic; third, a False Positive (FP) occurs when the IDS reports malicious activity although the traffic is benign; and fourth, a False Negative (FN) occurs when the IDS reports no malicious activity although an attack is taking place.

Mean squared error (MSE). Mean squared error (MSE) measures the amount of error in statistical machine learning models for computing the position and distance of the wormhole attack between two points as in Equation (11). It assesses the average squared difference between the observed and predicted values of each sensor node's position and location, having its unique identity to detect routing attacks. When a model has no error, the MSE equals zero. As model error increases, its value increases. The mean squared error is also known as the mean squared deviation (MSD).

$MSE = \frac{\sum_{i = 1}^{n} \left[ {(x_{i} - {\bar{x}}_{i})}^{2} + {(y_{i} - {\bar{y}}_{i})}^{2} \right]}{n}$ (11)

Where:

• $x_{i}$ and $y_{i}$ are the ith observed values.

• ${\bar{x}}_{i}$ and ${\bar{y}}_{i}$ are the corresponding predicted values.

• n is the number of observations.

The formula for the mean squared error is quite close to that of the variance. The MSE is calculated by squaring the difference between the observed and predicted values for every observation, summing the squared differences, and dividing the total by the number of observations.
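Equation (11) can be checked with a few lines of code. The sketch below assumes that node positions are given as (x, y) pairs; the function name and the coordinates are hypothetical.

```python
def position_mse(observed, predicted):
    """Mean squared error between observed and predicted (x, y) positions."""
    n = len(observed)
    total = sum(
        (x - px) ** 2 + (y - py) ** 2
        for (x, y), (px, py) in zip(observed, predicted)
    )
    return total / n


# Hypothetical positions: one node located exactly, one off by (3, 4).
observed = [(0.0, 0.0), (3.0, 4.0)]
predicted = [(0.0, 0.0), (0.0, 0.0)]
print(position_mse(observed, predicted))   # (0 + 25) / 2 = 12.5
```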

5.2. Simulation and environmental setup

Network design and model simulations were executed in MATLAB R2021a on a Windows 10 64-bit x64-based processor running an Intel Xeon Silver 4214 CPU at 2.20 GHz 2.19 GHz (2 processors), with 128 GB (128 GB usable) of installed RAM. Data processing and analysis with machine learning classifiers were performed with Python libraries, including Keras, NumPy, scikit-learn, Seaborn, and pandas, using Anaconda Navigator and MATLAB R2021a. The simulation parameters for running network attack scenarios are listed in Table 7, along with their values. This study assumes that Node-0 is the final destination for network traffic. A total of 5 s is allotted for the simulation.

Table 7. WSN configuration of simulation setting.

Parameter	Setting	Parameter	Setting
Base Station	1	Topology	Hierarchical
Field size	1500 m × 1500 m	Number of attacks	2
Number of Nodes	200	Mobility Model	Random
Protocol type	Routing	Number layers	10
Cluster size	10	Max epochs	200
Attack type	wormhole	Data size	5000 Kb
Number iterations	200	Simulation Time	5 s

Simulations of wormhole routing attacks are undertaken, with the results produced by an artificial neural network optimised with a genetic algorithm (GA-ANN). After that, a malicious node is added to the network in order to generate and extract features for both benign and malicious network traffic. This procedure is then repeated for another five seconds of simulation time in order to develop a new database. Table 7 outlines the simulation scenarios, which use a mobility and routing protocol based on a random selection of intermediate and mobile nodes to identify and extract features from probable routes. A wormhole attack is injected between two malicious nodes, which results in the creation of a tunnel.

6. Experimental results and analysis

In the simulations reported in this section, each sensor node in the network is capable of establishing connections with all other nodes, resulting in a fully connected network topology. This dynamic connectivity allows the formation of wormhole tunnels, which are virtual tunnels or channels created between two malicious nodes in the network, as in Figure 17 (a) and (b). These tunnels bypass normal routing mechanisms and enable attackers to disrupt the routing discovery process. By exploiting these tunnels, attackers can carry out various routing attacks, such as selective forwarding, blackhole attacks, or Sybil attacks. The purpose of this study is to investigate the detection and prevention of such routing attacks in wireless sensor networks. By simulating the fully connected network topology and incorporating wormhole tunnels, we aim to develop effective mechanisms to discover and mitigate routing attacks.

Graph: Figure 17. Wireless sensor network deployment and routing discovery for analysing energy consumption and time elapsed. (a) Dynamic network deployment. (b) Routing discovery and feature extraction. (c) Energy consumed for nodes with time and (d) Time elapsed for each node.

The routing wormhole attack highly affects the sensor nodes' energy consumption and timing operation for effective communication, as shown in Figure 17 (c) and (d).

The simulation results show that the proposed attack detection and classification techniques are effective, with an average detection accuracy of 99.46% when the hop count and wormhole tunnel of the routing attacks are varied across the network. The results also show that hybrid techniques reduce the prediction error and maximise the performance, as shown in the histograms of Figure 18 (a) and (b). The figures show the differences between targets and actual outputs used to compute the errors of the unknown nodes. Targets represent expected outputs, and outputs represent the actual outputs (Constraints, [17]). The error on the training data is almost 0, whereas the error on the testing data is higher. This indicates that the proposed technique is effective for detecting the wormhole attack in WSNs. The validation and efficiency of the proposed system for the detection of routing attacks are depicted in Figure 18 (c) and (d) at epochs 7 and 200, with minimum mean squared errors (MSE) of 0.0067 and $2.143 \times 10^{-8}$, respectively.

Graph: Figure 18. The proposed system's performance evaluation using hybrid GA-ANN, taking 100 samples and varying the number of epochs. (a) Histogram with 300 instances. (b) Histogram with 300 instances. (c) Best performance at epoch 7 and (d) Best performance at epoch 100.

6.1. Attack detection analysis

The samples from the reference datasets have been put through training and testing processes (Panigrahi et al., [58]). First, we randomly assign each sample to one of two groups: the training set and the test set. Step two involves using the whole training set for both training and testing. Finally, cross-validation was utilised to test how well the proposed model actually worked. The area under the curve, false alarm rate, precision, and classification accuracy are used to evaluate performance. Machine learning models are used to assess how well the proposed method performs on benchmark datasets that simulate a variety of attacks against wireless sensor networks. When assessing the efficacy of the proposed system for detecting routing attacks in WSNs, the hybrid optimised machine learning models also use the same benchmark datasets. Cluster labelling (CL) k-means binary classification methods are used to boost the proposed system's performance further. Table 8 compares the results obtained with the various hybrid machine-learning techniques.
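The evaluation protocol, a random hold-out split followed by cross-validation, can be sketched as follows. The test fraction, the number of folds, the random seed, and the function names are assumptions, since the paper does not state these values here.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Randomly assign each sample to the training set or the test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(samples) * test_fraction))
    test = [samples[i] for i in indices[:n_test]]
    train = [samples[i] for i in indices[n_test:]]
    return train, test


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k disjoint folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


train, test = train_test_split(list(range(10)))
print(len(train), len(test))
for train_idx, val_idx in k_fold_indices(10, k=5):
    print(sorted(val_idx))
```

Each sample serves as validation data exactly once across the k folds, so the averaged score is less sensitive to one particular split than a single hold-out estimate.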

Table 8. Evaluation of hybrid machine learning models on standard datasets.

	Results against the UNSW_NB15 dataset	Results against the CICIDS2017 dataset
Classifier	Accuracy	Precision	Recall	F1-Score	Accuracy	Precision	Recall	F1-Score
XGBoost	99.78	99.74	99.78	99.75	99.82	99.86	99.82	99.83
Random Forest	99.75	99.67	99.75	99.70	99.82	99.82	99.82	99.80
Decision Tree	99.68	99.63	99.68	99.66	99.82	99.91	99.82	99.85
Extra Tree	99.72	99.66	99.72	99.68	99.82	99.80	99.82	99.80
Ensemble Stacking	99.78	99.74	99.78	99.75	99.82	99.91	99.82	99.85
CLK-Means	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00

Hyperparameter optimisation based on Bayesian optimisation (BO) with the tree-structured Parzen estimator (BO-TPE) is used to boost the performance of the hybrid machine learning models for the proposed system. Table 8 displays the results of evaluating the proposed scheme's performance using the UNSW_NB15 and CICIDS2017 benchmark datasets and several machine-learning models. When applied to the benchmark datasets, the binary classification method employing hybrid cluster labelling K-means obtains a classification accuracy of 100%.

Table 9 shows how merging different machine-learning models into hybrids, and passing data frames from one machine-learning model to the other, improves the proposed system's performance even further. The results demonstrate that hybrid ML models outperform their standalone counterparts in terms of validation, accuracy, precision, recall, and F1-score, with training times as reported in the table. Based on these results and the model validation, the created system has high classification and detection accuracy against DoS attacks in WSNs.

Table 9. Comparison of various hybrid ML models using NSL-KDD dataset for attack detection and classification.

	Performance evaluation metrics
ML Classifiers	validation	Accuracy	Precision	Recall	F1-score	Training time
Naive Bayes (NB)	83.94	83.71	90.35	83.71	85.68	0.025
Decision Tree (DT)	87.94	88.10	88.34	88.10	88.07	0.22
XG Boost (XGB)	99.16	99.34	99.32	99.34	99.32	1.83
Random Forest (RF)	98.66	99.46	99.44	99.46	99.45	0.17
DT-XGBoost	99.57	99.80	99.80	99.80	99.80	1.24
RF-XGBoost	99.57	99.80	99.80	99.80	99.80	1.44
RF-DT	99.57	99.79	99.79	99.79	99.79	0.327

The results show that hybrid machine learning techniques perform better in attack detection and classification on the NSL-KDD benchmark dataset, as shown in Figure 19 (a) and (b). For attack detection and classification, the hybrids that enhance a random forest or a decision tree with extreme gradient boosting (XGB) achieve better results than either the random forest or the decision tree alone in terms of validation, accuracy, precision, recall, and F1-score.

Graph: Figure 19. Performance comparison of various machine learning models using NSL-KDD. (a) RF-based comparative analysis and (b) DT-based comparative analysis

The proposed technique performs well compared with related work. L. Yang et al. (Yang et al., [91]) developed a multi-tiered hybrid intrusion detection system (MTH-IDS) for secure vehicular networks using the benchmark dataset CICIDS2017 for known and unknown attacks. They achieved an average detection accuracy of 99.88% using binary classification. P. Sun et al. (Sun et al., [82]) developed a hybrid deep learning-based intrusion detection system (DL-IDS) using a convolutional neural network and a long short-term memory network (CNN-LSTM). The scheme achieved an average detection accuracy of 98.67% by extracting features from the network traffic. This indicates that the proposed scheme effectively detects DoS attacks using the benchmark dataset in wireless sensor networks, as shown in Figure 20 (b), using attack detection performance metrics. Kasongo (Kasongo, [39]) presented an intrusion detection system for the Internet of Things using a random forest based on a genetic algorithm (RF-GA) for feature selection, as shown in Figure 20. This achieved an average detection accuracy of 87.61%, which is lower than the 100% obtained with hybrid binary classification. Suleiman and Issac (Suleiman & Issac, [81]) evaluated six machine learning classifiers using the UNSW_NB15, phishing, and NSL-KDD benchmark datasets for intrusion detection systems. Their random forest based intrusion detection system (RF-IDS) produced better detection accuracy using UNSW_NB15. Temporal and spatial features were used to enhance attack detection and classification. This confirms that the proposed technique is effective for the detection and localisation of attacks, as shown in Figure 20 (a).

Graph: Figure 20. Performance comparison of the proposed scheme-based hybrid machine learning techniques using benchmark datasets. (a) Comparison based on UNSW_NB15 and (b) Comparison-based CICIDS2017

In order to achieve high-performance intrusion detection across a wide range of attack types, B. Media et al. (Intelligence et al., [34]) proposed a hybrid-layered IDS (HL-IDS) that employs several distinct machine learning and feature selection approaches as shown in Figure 20 (b). The size of the NSL-KDD dataset is decreased in the created system by first performing data preprocessing on the dataset using various feature selection algorithms.

G. H. Lai (Lai, [42]) proposed detecting wormhole attacks in WSNs using a low-power routing protocol and achieved 100% accuracy with a fixed range and fixed wormhole tunnel points. This confirms that the proposed technique is effective for localising and detecting routing attacks in wireless sensor networks using a benchmark dataset. Y. Yuan et al. (Yuan et al., [93]) presented a novel lightweight method for Sybil attack detection in distributed WSNs using the approximate point in triangle (APIT) localisation approach. They achieved an average detection rate of 90%, which is lower than that of the proposed work. D. Upadhyay et al. (Upadhyay et al., [86]) proposed a framework for intrusion detection systems in smart grids using gradient boosting feature selection together with machine learning classification techniques. The scheme combines feature engineering with machine learning classifiers and achieves the performance shown in Figure 21 (a). This confirms that the proposed method is effective for various applications involving DoS attacks in wireless sensor networks.

Graph: Figure 21. Performance comparison of the proposed technique with other previous works. (a) Comparative analysis of the proposed scheme and (b) Comparative analysis based on recall

A unique feature selection algorithm, the dynamic recursive feature selection algorithm, was introduced by Nancy P et al. (Nancy et al., [53]), which chooses an optimal number of features from the data set. Moreover, a sophisticated intrusion-detection system based on a fuzzy logic algorithm (IF-IDS) using the NSL-KDD dataset is shown in Figure 20. Extending the decision tree approach and including convolution neural networks are also presented as means by which to detect the invaders efficiently. The technique of intelligent feature selection algorithm named dynamic recursive feature selection algorithm (DRFSA) has been proposed in this work, which picks the important features to construct the data set. G. Qi, J. Zhou et al. (Qi et al., [61]) presented a new ECABC-BPNN, a combination of back propagation neural networks (BPNNs) and elite clone artificial bee colonies (ECABCs), that improves upon the standard BPNN's weight and threshold settings as depicted in Figure 21 (b).

The proposed system is evaluated using benchmark datasets based on accuracy, precision, recall, F1-score and AUC criteria. The proposed scheme's performance is effective compared to R. Khilar et al. (Khilar et al., [40]) and (Saheed et al., [70]) in terms of various evaluation metrics. This proves the system effectively detects and localises DoS assaults in WSNs, as in Figure 22 (a) and (b).

Graph: Figure 22. Examining the proposed system's performance of similar works using standard benchmarks. (a) Comparison of the proposed system with recent works and (b) AUC-based performance comparison using the UNSW-NB15 benchmark dataset.

The ECABC-BPNN is then used to identify threats in a computer network. The experiments conducted to compare attack classification on the benchmark dataset are shown in Table 10.

The proposed scheme is further compared with previous works on detecting and localising routing attacks in WSNs for validation. S. Jiang et al. (Jiang et al., [37]) proposed an intrusion detection system based on a secure light gradient boosting machine (IDS-SLGBM) in wireless sensor networks, using the WSN-DS benchmark dataset with its class of routing attacks.

Table 10. Performance evaluation of the proposed system using the NSL-KDD benchmark dataset.

Class of attacks	AIDS-HML (proposed) Precision	AIDS-HML (proposed) Recall	AIDS-HML (proposed) F1-Score	Intelligent fuzzy-based IDS Precision	Intelligent fuzzy-based IDS Recall	Intelligent fuzzy-based IDS F1-Score	ECABC-BPNN Precision
DoS	99.89	99.93	99.91	99.99	97.23	98.72	98.56
Probes	99.30	98.87	99.08	92.67	97.97	93.34	86.70
R2L	98.91	97.84	98.38	57.39	28.32	42.97	97.20
U2R	80.00	66.66	72.72	95.23	63.56	59.03	83.67
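
The F1-score is the harmonic mean of precision and recall, so the F1 column for the proposed approach can be cross-checked from the other two columns. The following is a minimal sketch of that check; the function and variable names are illustrative and not taken from the authors' implementation, and the input values are copied from Table 10.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (%) of the proposed AIDS-HML approach, copied from Table 10.
aids_hml = {
    "DoS": (99.89, 99.93),
    "Probes": (99.30, 98.87),
    "R2L": (98.91, 97.84),
    "U2R": (80.00, 66.66),
}

# Recompute the F1-score per attack class, rounded to two decimals like the table.
recomputed = {
    attack: round(f1_score(p, r), 2) for attack, (p, r) in aids_hml.items()
}

for attack, f1 in recomputed.items():
    print(f"{attack}: F1 = {f1:.2f}")
```

The recomputed values should agree with the reported F1-scores of the proposed approach to within about 0.01; small differences are expected because the published precision and recall are themselves rounded.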

The experimental results and analysis show that designing advanced intrusion detection systems (IDS) based on hybrid machine learning techniques in hierarchically wireless sensor networks introduces several novel aspects and contributions. Here are some key points that highlight the novelty of this design:

Hierarchical Wireless Sensor Networks: Hierarchical architecture in wireless sensor networks introduces an additional layer of complexity and organisation. The network is divided into multiple levels or tiers, with different roles assigned to nodes at each level. This hierarchical structure helps in efficient data aggregation, routing, and management in large-scale sensor networks.

Hybrid Machine Learning Techniques: The IDS design incorporates hybrid machine learning techniques, which combine multiple algorithms or approaches to enhance the accuracy and effectiveness of intrusion detection. This hybridisation can involve integrating different machine learning models, such as combining supervised and unsupervised learning methods or combining traditional rule-based techniques with machine learning algorithms.

Advanced Intrusion Detection: The IDS focuses on advanced intrusion detection, aiming to detect sophisticated attacks beyond simple rule-based or signature-based detection methods. Advanced attacks often exhibit complex patterns or behaviours that can be challenging to identify using traditional approaches. By leveraging machine learning techniques, the IDS can learn and adapt to evolving attack patterns, enabling the detection of novel and unknown attacks.

Novelty in Feature Selection: The design may introduce novel approaches to feature selection, which involves identifying the most relevant and discriminative features from the sensor data to train the machine learning models. Effective feature selection plays a crucial role in improving the accuracy and efficiency of intrusion detection systems, especially in resource-constrained wireless sensor networks.

Scalability and Efficiency Considerations: The design considers wireless sensor networks' scalability and efficiency requirements. Hierarchical structures and optimised machine-learning techniques are employed to reduce the network's computational overhead, energy consumption, and communication overhead. These considerations ensure that the IDS is suitable for deployment in resource-constrained environments.
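
One way to combine unsupervised and supervised learning, as mentioned under hybrid machine learning techniques, is cluster labelling: traffic records are first grouped with K-Means, and each cluster is then named after the majority class of the few labelled records it contains. The sketch below illustrates this idea only. It is not the authors' CLK-Means implementation, and the two toy features (packet rate, drop ratio), the data values and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


def label_clusters(assignment, labels, k):
    """Name each cluster after the majority label among its labelled members."""
    names = {}
    for c in range(k):
        votes = Counter(
            lab for a, lab in zip(assignment, labels) if a == c and lab is not None
        )
        names[c] = votes.most_common(1)[0][0] if votes else "unknown"
    return names


def predict(point, centroids, names):
    """Classify a new record by the label of its nearest centroid."""
    nearest = min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))
    return names[nearest]


# Hypothetical records: (packet rate, drop ratio). Only some carry a label.
points = [(1.0, 0.1), (9.0, 0.9), (1.2, 0.2), (0.8, 0.1), (8.5, 0.8), (9.5, 0.95)]
labels = ["normal", "attack", None, None, None, "attack"]

centroids, assignment = kmeans(points, k=2)
names = label_clusters(assignment, labels, k=2)
print(names)
print(predict((8.8, 0.7), centroids, names))
```

Seeding the centroids with the first k points keeps the example reproducible; a practical system would more likely use a randomised initialisation and scale the features before clustering.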

6.2. Limitation of the proposed system

While the design of advanced intrusion detection systems based on hybrid machine learning techniques in hierarchically wireless sensor networks presents a promising approach to intrusion detection, it also has certain limitations that should be acknowledged:

Scalability: The performance of the proposed advanced intrusion detection system may degrade with the increase in the size of the wireless sensor network. Handling a large number of nodes and data traffic could pose challenges in terms of computational resources and memory requirements.

Complexity and Overhead: The hybrid machine learning techniques used in the system may introduce additional complexity and computational overhead, particularly for resource-constrained sensor nodes. This could impact the real-time responsiveness and energy efficiency of the overall network.

Training Data Collection: Obtaining labelled training data for machine learning algorithms in wireless sensor networks can be challenging. Collecting a diverse and representative dataset of intrusion scenarios, including rare attacks, might be difficult due to the limited resources and controlled environment.

Intrusion Diversity: As new intrusion techniques and attack patterns emerge, the detection system may face challenges in generalising and adapting to previously unseen or zero-day attacks, especially when using pre-trained machine learning models.

Security and Privacy: An intrusion detection system deployed in the network could itself become a target for attacks. Adversaries might try to manipulate the system's behaviour or exploit its vulnerabilities to evade detection.

Adaptability to Network Changes: As the wireless sensor network topology changes due to node failures, additions, or mobility, the intrusion detection system should be able to adapt and maintain its effectiveness.

False Alarms: Hybrid machine learning techniques might lead to false alarms in certain situations, triggering unnecessary responses and consuming valuable resources for investigating non-existent attacks.
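
The false-alarm limitation can be quantified with the false positive rate, FP / (FP + TN), alongside the true positive rate and precision. The following minimal sketch computes these from confusion-matrix counts; the counts are hypothetical and are not results from this study.

```python
def detection_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return TP rate, FP rate and precision from confusion-matrix counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Guard against empty classes to avoid division by zero.
        return numerator / denominator if denominator else 0.0

    return {
        "tp_rate": ratio(tp, tp + fn),    # share of attacks that are detected
        "fp_rate": ratio(fp, fp + tn),    # share of normal traffic flagged as attack
        "precision": ratio(tp, tp + fp),  # share of alarms that are real attacks
    }


# Hypothetical counts: 100 attack records and 1000 normal records.
counts = {"tp": 95, "fp": 20, "tn": 980, "fn": 5}
rates = detection_rates(**counts)

for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts the false positive rate is only 2%, yet 20 of the 115 alarms raised are false, which shows why even a low false positive rate can consume resources when normal traffic dominates.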

7. Conclusion and future work

The proposed advanced intrusion detection system based on hybrid machine learning (AIDS-HML) effectively detects and classifies attacks in a scalable and manageable way in hierarchically distributed wireless sensor networks. This research aimed to create a classification model for an advanced intrusion detection system based on hybrid machine learning, specifically tailored to detecting intrusions in wireless sensor networks. Each sensor node collects information on the state of its features and reports it to the cluster's central processing node. The cluster leader checks the data and then forwards it to the main cluster head. The proposed hybrid machine learning models use training and testing data to identify attacks. In a simulated attack on a WSN, the proposed AIDS-HML outperforms the state-of-the-art systems it was compared against in detection and localisation accuracy. Comparing the simulated results with earlier research indicates that they are credible. The simulation results show that the proposed system is effective for detecting routing attacks, with a localisation accuracy of 99.46% for wormhole routing attacks. The effectiveness of the proposed system was measured in terms of accuracy, precision, TP rate, FP rate, F-measure, mean squared error, and time. Using the CICIDS2017 dataset as a benchmark for multiclass and binary classification in the presence of normal and intrusion traffic, the designed AIDS-HML achieved 99.82%, 99.91%, 99.85%, and 99.82% for average detection accuracy, precision, F1-score, and recall, respectively, and 100% with CLK-Means. This work was implemented using MATLAB for network planning and simulation of attack scenarios, and Python libraries for the hybrid machine-learning classification techniques. The model uses logic rules for decision-making and interpretable predictive models.
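
The reporting path described above (sensor node, then cluster leader, then main cluster head) can be sketched as follows. The paper does not specify how the cluster leader checks the data, so the simple range check, the class names and the sample readings are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    node_id: int
    value: float  # hypothetical monitored feature, e.g. packets forwarded per second


class ClusterHead:
    """Checks member readings and forwards a summary (the range check is an assumption)."""

    def __init__(self, cluster_id: int, low: float = 0.0, high: float = 100.0):
        self.cluster_id = cluster_id
        self.low = low
        self.high = high

    def summarise(self, readings):
        # Readings inside the plausible range are aggregated; the rest are flagged.
        accepted = [r.value for r in readings if self.low <= r.value <= self.high]
        suspects = [r.node_id for r in readings if not self.low <= r.value <= self.high]
        return {
            "cluster": self.cluster_id,
            "mean": mean(accepted) if accepted else None,
            "suspect_nodes": suspects,
        }


class MainClusterHead:
    """Collects the summaries forwarded by the cluster heads."""

    def __init__(self):
        self.summaries = []

    def receive(self, summary):
        self.summaries.append(summary)

    def flagged_nodes(self):
        return sorted(n for s in self.summaries for n in s["suspect_nodes"])


main = MainClusterHead()
main.receive(ClusterHead(1).summarise([Reading(1, 20.0), Reading(2, 22.0), Reading(3, 500.0)]))
main.receive(ClusterHead(2).summarise([Reading(4, 35.0), Reading(5, 37.0)]))
print(main.flagged_nodes())
```

Forwarding only a per-cluster summary rather than every raw reading reflects the data aggregation role that the hierarchical structure is meant to provide.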

Overall, the novelty lies in the combination of hierarchical wireless sensor networks, hybrid machine learning techniques, advanced intrusion detection capabilities, novel feature selection approaches, and considerations for scalability and efficiency. These elements contribute to the development of a robust and effective IDS for wireless sensor networks.

Although the proposed method performs well, it is essential to note that IoT-based WSNs are still susceptible to attacks not addressed in this study. Since the dynamics of attacks change with time, the topology and design must adapt to evolving attack scenarios. The countermeasures module is only provided in concept, which is another shortcoming. Therefore, we plan to investigate and eventually offer specialised advanced hybrid intrusion detection systems for each type of attack, utilising benchmark datasets to evaluate hybrid machine learning techniques. In future work, we will explore collaborative advanced intrusion detection systems based on machine learning in IoT-based wireless sensor networks for different applications, using benchmark datasets for evaluation.

Declarations

Conflict of interest: The authors declare that no competing interests or personal relationships could affect this work.

Data availability: The corresponding author can provide access to the data utilised to support this study's findings upon request.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1 Abdulganiyu, O. H., Ait Tchakoucht, T., & Saheed, Y. K. (2023). A systematic literature review for network intrusion detection system (IDS). International Journal of Information Security. https://doi.org/10.1007/s10207-023-00682-2
2 Abduvaliyev, A., Lee, S., & Lee, Y. K. (2010). Energy efficient hybrid intrusion detection system for wireless sensor networks. ICEIE 2010 - 2010 International Conference on Electronics and Information Engineering, Proceedings, 2 (Iceie), 25–29. https://doi.org/10.1109/ICEIE.2010.5559708
3 Abualigah, L., Diabat, A., Mirjalili, S., Elaziz, M. A., & Gandomi, A. H. (2021). The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 376, 113609. https://doi.org/10.1016/j.cma.2020.113609
4 Abualigah, L., Elaziz, M. A., Sumari, P., Geem, Z. W., & Gandomi, A. H. (2022). Reptile Search Algorithm (RSA): A nature-inspired meta-heuristic optimizer. Expert Systems with Applications, 191 (November 2021), 116158. https://doi.org/10.1016/j.eswa.2021.116158
5 Abualigah, L., Yousri, D., Elaziz, M. A., Ewees, A. A., Al-qaness, M. A. A., & Gandomi, A. H. (2021). Aquila optimizer: A novel meta-heuristic optimization algorithm. Computers and Industrial Engineering, 157 (October 2020), 107250. https://doi.org/10.1016/j.cie.2021.107250
6 Agushaka, J. O., Ezugwu, A. E., & Abualigah, L. (2022). Dwarf mongoose optimization algorithm. Computer Methods in Applied Mechanics and Engineering, 391, 114570. https://doi.org/10.1016/j.cma.2022.114570
7 Ahmad, B., Jian, W., Ali, Z. A., Tanvir, S., & Ali Khan, M. S. (2019). Hybrid anomaly detection by using clustering for wireless sensor network. Wireless Personal Communications, 106 (4), 1841–1853. https://doi.org/10.1007/s11277-018-5721-6
8 Alghamdi, M. I. (2022). A hybrid model for intrusion detection in IoT applications. Wireless Communications and Mobile Computing, 2022. https://doi.org/10.1155/2022/4553502
9 Alsaedi, N., Hashim, F., Sali, A., & Rokhani, F. Z. (2017). Detecting Sybil attacks in clustered wireless sensor networks based on energy trust system (ETS). Computer Communications, 110, 75–82. https://doi.org/10.1016/j.comcom.2017.05.006
10 Anbarasan, M., Muthu, B. A., Sivaparthipan, C. B., Sundarasekar, R., Kadry, S., Krishnamoorthy, S., Samuel, D. J., & Antony Dasel, A. (2020). Detection of flood disaster system based on IoT, Big data and convolutional deep neural network. Computer Communications, 150 (November 2019), 150–157. https://doi.org/10.1016/j.comcom.2019.11.022
11 Bansal, A., & Kaur, S. (2018). Extreme gradient boosting based tuning for classification in intrusion detection systems. Vol. 905. Springer Singapore.
12 Biswas, P., Charitha, R., Gavel, S., & Raghuvanshi, A. S. (2019). Fault detection using hybrid of KF-ELM for wireless sensor networks. Proceedings of the International Conference on Trends in Electronics and Informatics, ICOEI, 2019 (Icoei), 746–750. https://doi.org/10.1109/ICOEI.2019.8862687
13 Blywis, B. (2009). A real-time and energy-efficient MAC protocol for wireless sensor networks. International Journal of Ultra Wideband Communications and Systems, 1 (2), 128–142. https://doi.org/10.1504/IJUWBCS.2009.029002
14 Cao, B., Li, C., Song, Y., Qin, Y., & Chen, C. (2022). Applied Sciences Network Intrusion Detection Model Based on CNN and GRU.
15 Cepheli, Ö., Büyükçorak, S., & Karabulut Kurt, G. (2016). Hybrid intrusion detection system for DDoS attacks. Journal of Electrical and Computer Engineering, 2016. https://doi.org/10.1155/2016/1075648
16 Choi, W., Shah, P., & Das, S. K. (2004). A framework for energy-saving data gathering using Two-phase clustering in wireless sensor networks. Proceedings of MOBIQUITOUS 2004 - 1st Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services, 203–212. https://doi.org/10.1109/MOBIQ.2004.1331727
17 Constraints, D. (2016). Source anonymity in WSNs against Global Adversary Utilizing Low Transmission Rates With. https://doi.org/10.3390/s16070957
18 Das, S., & Namasudra, S. (2022). A novel hybrid encryption method to secure healthcare data in IoT-enabled healthcare infrastructure. Computers and Electrical Engineering, 101 (September 2021), 107991. https://doi.org/10.1016/j.compeleceng.2022.107991
19 Davahli, A., Shamsi, M., & Abaei, G. (2020). Hybridizing genetic algorithm and grey wolf optimizer to advance an intelligent and lightweight intrusion detection system for IoT wireless networks. Journal of Ambient Intelligence and Humanized Computing, 11 (11), 5581–5609. https://doi.org/10.1007/s12652-020-01919-x
20 Deepa, C., & Latha, B. (2019). HHSRP: A cluster based hybrid hierarchical secure routing protocol for wireless sensor networks. Cluster Computing, 22 (s5), 10449–10465. https://doi.org/10.1007/s10586-017-1065-3
21 Devi, P. P., & Jaison, B. (2020). Protection on wireless sensor network from clone attack using the SDN-enabled hybrid clone node detection mechanisms. Computer Communications, 152 (January), 316–322. https://doi.org/10.1016/j.comcom.2020.01.064
22 Elsaid, S. A., & Albatati, N. S. (2020). An optimized collaborative intrusion detection system for wireless sensor networks. Soft Computing, 24 (16), 12553–12567. https://doi.org/10.1007/s00500-020-04695-0
23 Ezugwu, A. E., Agushaka, J. O., Abualigah, L., Mirjalili, S., & Gandomi, A. H. (2022). Prairie Dog optimization algorithm. Vol. 34. Springer.
24 Farooqi, A. H., Khan, F. A., Wang, J., & Lee, S. (2013). A novel intrusion detection framework for wireless sensor networks. Personal and Ubiquitous Computing, 17 (5), 907–919. https://doi.org/10.1007/s00779-012-0529-y
25 Faysal, J. A., Mostafa, S. T., Tamanna, J. S., Mumenin, K. M., Arifin, M. M., Awal, M. A., Shome, A., & Mostafa, S. S. (2022). XGB-RF: A hybrid machine learning approach for IoT intrusion detection. Telecom, 3 (1), 52–69. https://doi.org/10.3390/telecom3010003
26 Feurer, M., & Hutter, F. (2019). Hyperparameter optimization. Automated Machine Learning: Methods, Systems, Challenges, 3–33. https://doi.org/10.1007/978-3-030-05318-5_1
27 Gandhimathi, L., & Murugaboopathi, G. (2020). A novel hybrid intrusion detection using flow-based anomaly detection and cross-layer features in wireless sensor network. Automatic Control and Computer Sciences, 54 (1), 62–69. https://doi.org/10.3103/S0146411620010046
28 Gao, P. (2014). Mechatronics and automatic control systems. Lecture Notes in Electrical Engineering, 237, 1063–1071. https://doi.org/10.1007/978-3-319-01273-5_120
29 Gebreyesus, G. (2021). Secure intrusion detection system for hierarchically distributed wireless sensor networks. 9–14.
30 Ghugar, U., Pradhan, J., Bhoi, S. K., & Sahoo, R. R. (2019). LB-IDS: Securing wireless sensor network using protocol layer trust-based intrusion detection system. Journal of Computer Networks and Communications, 2019. https://doi.org/10.1155/2019/2054298
31 Godala, S., & Vaddella, R. P. V. (2020). A study on intrusion detection system in wireless sensor networks. International Journal of Communication Networks and Information Security, 12 (1), 127–141.
32 Gupta, S. K., Tripathi, M., & Grover, J. (2022). Hybrid optimization and deep learning based intrusion detection system. Computers and Electrical Engineering, 100 (May), 107876. https://doi.org/10.1016/j.compeleceng.2022.107876
33 Han, L., Zhou, M., Jia, W., Dalil, Z., & Xu, X. (2019). Intrusion detection model of wireless sensor networks based on game theory and an autoregressive model. Information Sciences, 476, 491–504. https://doi.org/10.1016/j.ins.2018.06.017
34 Intelligence, A., Science, S., Media, B., & Nature, S. (2019). A new hybrid approach for intrusion detection using machine learning. Applied Intelligence, 2735–2761.
35 Ismail, A., & Amin, R. (2019). Malicious cluster head detection mechanism in wireless sensor networks. Wireless Personal Communications, 108 (4), 2117–2135. https://doi.org/10.1007/s11277-019-06512-w
36 Jatti, S. A. V., & Kishor Sontif, V. J. K. (2019). Intrusion detection systems. International Journal of Recent Technology and Engineering, 8 (2 Special Issue 11), 3976–3983. https://doi.org/10.35940/ijrte.B1540.0982S1119
37 Jiang, S., Zhao, J., & Xu, X. (2020). SLGBM: An intrusion detection mechanism for wireless sensor networks in smart environments. IEEE Access, 8, 169548–58. https://doi.org/10.1109/ACCESS.2020.3024219
38 Kanna, P. R., & Santhi, P. (2022). Hybrid intrusion detection using MapReduce based black widow optimized convolutional long short-term memory neural networks. Expert Systems with Applications, 194 (October 2021), 116545. https://doi.org/10.1016/j.eswa.2022.116545
39 Kasongo, S. M. (2021). An advanced intrusion detection system for IIoT based on GA and tree based algorithms. IEEE Access, 9, 113199–113212. https://doi.org/10.1109/ACCESS.2021.3104113
40 Khilar, R., Mariyappan, K., Christo, M. S., Amutharaj, J., Anitha, T., Rajendran, T., & Batu, A. (2022). Artificial intelligence-based security protocols to resist attacks in Internet of things. Wireless Communications and Mobile Computing, 2022. https://doi.org/10.1155/2022/1440538
41 Kumar, S. M. (2022). Hybrid optimized deep neural network with enhanced conditional random field based intrusion detection on wireless sensor network. Neural Processing Letters. https://doi.org/10.1007/s11063-022-10892-9
42 Lai, G. H. (2016). Detection of wormhole attacks on IPv6 mobility-based wireless sensor network. Eurasip Journal on Wireless Communications and Networking, 2016 (1). https://doi.org/10.1186/s13638-016-0776-0
43 Li, L., Yu, Y., Bai, S., Cheng, J., & Chen, X. (2018). Towards effective network intrusion detection: A Hybrid Model Integrating Gini Index and GBDT with PSO. 2018.
44 Liu, W., Cheng, J., Wang, X., Lu, X., & Yin, J. (2022). Hybrid differential privacy based federated learning for Internet of things. Journal of Systems Architecture, 124 (July 2021), 102418. https://doi.org/10.1016/j.sysarc.2022.102418
45 Mahajan, S., Harikrishnan, R., & Kotecha, K. (2022). Prediction of network traffic in wireless Mesh networks using hybrid deep learning model. IEEE Access, 10, 7003–7015. https://doi.org/10.1109/ACCESS.2022.3140646
46 Mahbooba, B., Timilsina, M., Sahal, R., & Serrano, M. (2021). Explainable artificial intelligence (XAI) to enhance trust management in intrusion detection systems using decision tree model. Complexity, 2021. https://doi.org/10.1155/2021/6634811
47 Mangrulkar, R. S., & Negandhi, P. D. (2018). Applications of machine learning in wireless sensor networks. Soft Computing in Wireless Sensor Networks, 51–74. https://doi.org/10.1201/9780429438639-3
48 Meena, G., & Choudhary, R. R. (2017). A review paper on IDS classification using KDD 99 and NSL KDD dataset in WEKA. 2017 International Conference on Computer, Communications and Electronics, COMPTELIX, 2017, 553–558. https://doi.org/10.1109/COMPTELIX.2017.8004032
49 Mohd, N., Singh, A., & Bhadauria, H. S. (2020). A novel SVM based IDS for distributed denial of sleep strike in wireless sensor networks. Wireless Personal Communications, 111 (3), 1999–2022. https://doi.org/10.1007/s11277-019-06969-9
50 Mojtaba, S., Bamakan, H., Wang, H., Yingjie, T., & Shi, Y. (2016). Neurocomputing An effective intrusion detection framework based on MCLP / SVM optimized by time-varying chaos particle swarm optimization. Neurocomputing, 199, 90–102. https://doi.org/10.1016/j.neucom.2016.03.031
51 Moon, P. S., & Ingole, P. K. (2015). An overview on: Intrusion detection system with secure hybrid mechanism in wireless sensor network. Conference Proceeding - 2015 International Conference on Advances in Computer Engineering and Applications, ICACEA, 2015, 272–277. https://doi.org/10.1109/ICACEA.2015.7164714
52 Moulad, L., Belhadaoui, H., & Rifi, M. (2019). Implementation of an hierarchical hybrid intrusion detection mechanism in wireless sensor network based on energy management. Advances in Intelligent Systems and Computing, 756, 360–377. https://doi.org/10.1007/978-3-319-91337-7_33
53 Nancy, P., Muthurajkumar, S., Ganapathy, S., Santhosh Kumar, S. V. N., Selvi, M., & Arputharaj, K. (2020). Intrusion detection using dynamic feature selection and fuzzy temporal decision tree classification for wireless sensor networks. IET Communications, 14 (5), 888–895. https://doi.org/10.1049/iet-com.2019.0172
54 Oyelade, O. N., & Ezugwu, A. E. (2021). Ebola optimization search algorithm: A New nature-inspired metaheuristic algorithm for global optimization problems. International Conference on Electrical, Computer, and Energy Technologies, ICECET, 2021 (December), 9–10. https://doi.org/10.1109/ICECET52533.2021.9698813
55 Pajila, P. J. B., Julie, E. G., & Robinson, Y. H. (2021). FBDR-Fuzzy Based DDoS attack detection and recovery mechanism for wireless sensor networks. Springer US.
56 Paliwal, P., & Kumar, D. (2018). ABC Based Neural Network Approach for Churn Prediction in Telecommunication Sector. Vol. 84.
57 Pande, S., Khamparia, A., & Gupta, D. (2021). Feature selection and comparison of classification algorithms for wireless sensor networks. Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-021-03411-6
58 Panigrahi, R., Borah, S., Pramanik, M., Bhoi, A. K., Barsocchi, P., Nayak, S. R., & Alnumay, W. (2022). Intrusion detection in cyber–physical environment using hybrid naïve Bayes—decision table and multi-objective evolutionary feature selection. Computer Communications, 188 (September 2021), 133–144. https://doi.org/10.1016/j.comcom.2022.03.009
59 Praveen Kumar, D., Amgoth, T., & Annavarapu, C. S. R. (2019). Machine learning algorithms for wireless sensor networks: A survey. Information Fusion, 49 (April 2018), 1–25. https://doi.org/10.1016/j.inffus.2018.09.013
60 Pundir, S., Wazid, M., Singh, D. P., Das, A. K., Rodrigues, J. J. P. C., & Park, Y. (2020). Intrusion detection protocols in wireless sensor networks integrated to Internet of things deployment: Survey and future challenges. IEEE Access, 8, 3343–3363. https://doi.org/10.1109/ACCESS.2019.2962829
61 Qi, G., Zhou, J., Jia, W., Liu, M., Zhang, S., & Xu, M. (2021). Intrusion detection for network based on elite clone artificial Bee colony and back propagation neural network. Wireless Communications and Mobile Computing, 2021. https://doi.org/10.1155/2021/9956371
62 Rabbani, M., Wang, Y. L., Khoshkangini, R., Jelodar, H., Zhao, R., & Hu, P. (2020). A hybrid machine learning approach for malicious behaviour detection and recognition in cloud computing. Journal of Network and Computer Applications, 151 (May 2019), 102507. https://doi.org/10.1016/j.jnca.2019.102507
63 Regan, R., & Leo Manickam, J. M. (2019). An optimized energy saving model for hybrid security protocol in WMN. National Academy Science Letters, 42 (6), 489–501. https://doi.org/10.1007/s40009-019-0789-4
64 Ren, J., Guo, J., Qian, W., Yuan, H., Hao, X., & Jingjing, H. (2019). Building an effective intrusion detection system by using hybrid data optimization based on machine learning algorithms. Security and Communication Networks, 2019. https://doi.org/10.1155/2019/7130868
65 Reshma, V. K., Khan, I. R., Niranjanamurthy, M., Aggarwal, P. K., Hemalatha, S., Almuzaini, K. K., & Amoatey, E. T. (2022). Hybrid block-based lightweight machine learning-based predictive models for quality preserving in the Internet of Things- (IoT-) based medical images with diagnostic applications. Computational Intelligence and Neuroscience, 2022. https://doi.org/10.1155/2022/8173372
66 Rose, T., Kifayat, K., Abbas, S., & Asim, M. (2020). A hybrid anomaly-based intrusion detection system to improve time complexity in the Internet of energy environment. Journal of Parallel and Distributed Computing, 145, 124–139. https://doi.org/10.1016/j.jpdc.2020.06.012
67 Rouissi, N., Gharsellaoui, H., & Bouamama, S. (2019). Improvement of watermarking-LEACH algorithm based on trust for wireless sensor networks. Procedia Computer Science, 159, 803–813. https://doi.org/10.1016/j.procs.2019.09.239
68 Roy, P., & Chowdhury, C. (2021). A survey of machine learning techniques for indoor localization and navigation systems. Journal of Intelligent and Robotic Systems: Theory and Applications, 101 (3). https://doi.org/10.1007/s10846-021-01327-z
69 Sadikin, F., van Deursen, T., & Kumar, S. (2020). A ZigBee intrusion detection system for IoT using secure and efficient data collection. Internet of Things, 12, 100306. https://doi.org/10.1016/j.iot.2020.100306
70 Saheed, Y. K., Abdulganiyu, O. H., Tchakoucht, T. A., & Rakshit, S. (2022). A Novel Wrapper and Filter-Based Feature Dimensionality Reduction Methods for Anomaly Intrusion Detection in Wireless Sensor Networks. 1–22.
71 Saidi, A., Benahmed, K., & Seddiki, N. (2020). Secure cluster head election algorithm and misbehavior detection approach based on trust management technique for clustered wireless sensor networks. Ad Hoc Networks, 106. https://doi.org/10.1016/j.adhoc.2020.102215
72 Saif, S., Das, P., Biswas, S., Khari, M., & Shanmuganathan, V. (2022). HIIDS: Hybrid intelligent intrusion detection system empowered with machine learning and metaheuristic algorithms for application in IoT based healthcare. Microprocessors and Microsystems, 104622. https://doi.org/10.1016/j.micpro.2022.104622
73 Sakthivel, T., & Chandrasekaran, R. M. (2018). A dummy packet-based hybrid security framework for mitigating routing misbehavior in multi-Hop wireless networks. Wireless Personal Communications, 101 (3), 1581–1618. https://doi.org/10.1007/s11277-018-5778-2
74 Saravana Kumar, N. M., Suryaprabha, E., & Hariprasath, K. (2021). Machine learning based hybrid model for energy efficient secured transmission in wireless sensor networks. Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-021-02946-y
75 Shi, L., & Li, K. (2022). Privacy protection and intrusion detection system of wireless sensor network based on artificial neural network. Computational Intelligence and Neuroscience, 2022, 1795454. https://doi.org/10.1155/2022/1795454
76 Singh, A., Amutha, J., Nagar, J., & Sharma, S. (2023). A deep learning approach to predict the number of K-barriers for intrusion detection over a circular region using wireless sensor networks. Expert Systems with Applications, 211 (July 2021), 118588. https://doi.org/10.1016/j.eswa.2022.118588
77 Singh, A., Amutha, J., Nagar, J., Sharma, S., & Lee, C. C. (2022a). AutoML-ID: Automated machine learning model for intrusion detection using wireless sensor network. Scientific Reports, 12 (1), 1–14. https://doi.org/10.1038/s41598-021-99269-x
78 Singh, A., Amutha, J., Nagar, J., Sharma, S., & Lee, C. C. (2022b). LT-FS-ID: Log-transformed feature learning and feature-scaling-based machine learning algorithms to predict the k-barriers for intrusion detection using wireless sensor network. Sensors, 22 (24), 3. https://doi.org/10.1109/JSEN.2022.3226962
79 Singh, A., Nagar, J., Sharma, S., & Kotiyal, V. (2021). A Gaussian process regression approach to predict the K-barrier coverage probability for intrusion detection in wireless sensor networks. Expert Systems with Applications, 172 (October 2020), 114603. https://doi.org/10.1016/j.eswa.2021.114603
80 Singh, R., Singh, J., & Singh, R. (2017). Fuzzy based advanced hybrid intrusion detection system to detect malicious nodes in wireless sensor networks. Wireless Communications and Mobile Computing, 2017. https://doi.org/10.1155/2017/3548607
81 Suleiman, M. F., & Issac, B. (2018). Performance comparison of intrusion detection machine learning classifiers on benchmark and new datasets. 28th International Conference on Computer Theory and Applications, ICCTA 2018 - Proceedings, 19–23. https://doi.org/10.1109/ICCTA45985.2018.9499140
82 Sun, P., Liu, P., Li, Q., Liu, C., Lu, X., Hao, R., & Chen, J. (2020). DL-IDS: Extracting features using CNN-LSTM hybrid network for intrusion detection system. Security and Communication Networks, 2020. https://doi.org/10.1155/2020/8890306
83 Swarna Priya, R. M., Maddikunta, P. K. R., Parimala, M., Koppu, S., Gadekallu, T. R., Chowdhary, C. L., & Alazab, M. (2020). An effective feature engineering for DNN using hybrid PCA-GWO for intrusion detection in IoMT architecture. Computer Communications, 160 (May), 139–149. https://doi.org/10.1016/j.comcom.2020.05.048
84 Ullah, S., Khan, M. A., Ahmad, J., Jamal, S. S., Huma, Z. E., Hassan, M. T., Nikolaos Pitropakis, A., & Buchanan, W. J. (2022). HDL-IDS: A hybrid deep learning architecture for intrusion detection in the Internet of vehicles. Sensors, 22 (4), 1–20. https://doi.org/10.3390/s22041340
85 Umarani, C., & Kannan, S. (2020). Intrusion detection system using hybrid tissue growing algorithm for wireless sensor network. Peer-to-Peer Networking and Applications, 13 (3), 752–761. https://doi.org/10.1007/s12083-019-00781-9
86 Upadhyay, D., Manero, J., Zaman, M., & Sampalli, S. (2021). Learning classifiers for intrusion detection on power grids. IEEE Transactions on Network and Service Management, 18 (1), 1104–1116. https://doi.org/10.1109/TNSM.2020.3032618
87 VenkataRao, S., & Ananth, V. (2021). A hybrid optimization algorithm and shamir secret sharing based secure data transmission for IoT based WSN. International Journal of Intelligent Engineering and Systems, 14 (6), 498–506. https://doi.org/10.22266/ijies2021.1231.44
88 Vinitha, A., Rukmini, M. S. S., & Dhirajsunehra. (2019). Secure and energy aware multi-Hop routing protocol in WSN using Taylor-based hybrid optimization algorithm. Journal of King Saud University - Computer and Information Sciences (xxxx). https://doi.org/10.1016/j.jksuci.2019.11.009
89 Wang, Y., Ma, J., Sharma, A., Singh, P. K., Gaba, G. S., Masud, M., & Baz, M. (2021). An exhaustive research on the application of intrusion detection technology in computer network security in sensor networks. Journal of Sensors, 2021. https://doi.org/10.1155/2021/5558860
90 Wu, Y., Wei, D., & Feng, J. (2020). Network attacks detection methods based on deep learning techniques: A survey. Security and Communication Networks, 2020. https://doi.org/10.1155/2020/8872923
91 Yang, L., Moubayed, A., & Shami, A. (2022). MTH-IDS: A multitiered hybrid intrusion detection system for Internet of vehicles. IEEE Internet of Things Journal, 9 (1), 616–632. https://doi.org/10.1109/JIOT.2021.3084796
92 Yin, C., Zhang, S., Yin, Z., & Wang, J. (2019). Anomaly detection model based on data stream clustering. Cluster Computing, 22 (s1), 1729–1738. https://doi.org/10.1007/s10586-017-1066-2
93 Yuan, Y., Huo, L., Wang, Z., & Hogrefe, D. (2018). Secure APIT localization scheme against Sybil attacks in distributed wireless sensor networks. IEEE Access, 6, 27629–27636. https://doi.org/10.1109/ACCESS.2018.2836898
94 Zhang, W., Han, D., Li, K. C., & Massetto, F. I. (2020). Wireless sensor network intrusion detection system based on MK-ELM. Soft Computing, 24 (16), 12361–12374. https://doi.org/10.1007/s00500-020-04678-1
95 Zhao, G., Wang, Y., & Wang, J. (2023). Lightweight intrusion detection model of the Internet of Things with Hybrid Cloud-Fog Computing. 2023.
96 Zou, Y., Zhu, J., Wang, X., & Hanzo, L. (2016). A survey on wireless security: Technical challenges, recent advances, and future trends. Proceedings of the IEEE, 104 (9), 1727–1765. https://doi.org/10.1109/JPROC.2016.2558521

By Gebrekiros Gebreyesus Gebremariam; J. Panda and S. Indu


Title:	Design of advanced intrusion detection systems based on hybrid machine learning techniques in hierarchically wireless sensor networks
Author / Contributor:	Gebrekiros Gebreyesus Gebremariam ; Panda, J. ; Indu, S.
Link:	Full text (PDF); view record in DOAJ (full text): https://doaj.org/toc/0954-0091 https://doaj.org/toc/1360-0494
Journal:	Connection Science, vol. 35 (2023-12-01), issue 1
Publication:	Taylor & Francis Group, 2023
Media type:	academicJournal
ISSN:	0954-0091 (print) ; 1360-0494 (print)
DOI:	10.1080/09540091.2023.2246703
Keywords:	hybrid security technique; intrusion detection system; wireless sensor networks; malicious nodes; detection and classification; hybrid machine learning models; Electronic computers. Computer science; QA75.5-76.95
Other:	Indexed in: Directory of Open Access Journals; Languages: English; Collection: LCC:Electronic computers. Computer science; Document Type: article; File Description: electronic resource; Language: English
