Abstract:
Accurate detection of road defects is crucial for the safety and efficiency of road maintenance. This study focuses on six common types of pavement defects: transverse cracks, longitudinal cracks, alligator cracking, oblique cracks, potholes, and repair marks. In real-world scenarios, key challenges include effectively distinguishing the foreground from the background and accurately identifying small defects (e.g., fine cracks, dense alligator cracking, and clustered potholes) and overlapping defects (e.g., intersecting cracks or clustered damage areas where multiple defects appear close together). To address these issues, this paper proposes Pavement-DETR, a model based on the Real-Time Detection Transformer (RT-DETR), with the aim of improving overall detection accuracy. Three main improvements are proposed: (1) the Channel-Spatial Shuffle (CSS) attention mechanism is introduced in the third (S3) and fourth (S4) stages of the ResNet backbone, which correspond to the mid-level and high-level feature layers, enabling the model to focus more precisely on road defect features; (2) the Conv3XC structure is adopted for feature fusion, using multi-level convolutions, channel expansion, and skip connections to strengthen the model's ability to differentiate foreground from background while also improving gradient flow and training stability; (3) a loss function is proposed that takes a weighted average of Powerful-IoU v2 (PIoU v2) and Normalized Wasserstein Distance (NWD), where PIoU v2 focuses on optimizing overlapping regions and NWD targets small objects. The combined loss function enables comprehensive optimization of the bounding boxes, improving the model's accuracy and convergence speed.
Experimental results show that on the UAV-PDD2023 dataset, Pavement-DETR improves the mean average precision (mAP) by 7.7% at IoU = 0.5, increases mAP by 8.9% at IoU = 0.5-0.95, and improves F1 Score by 7%. These results demonstrate that Pavement-DETR exhibits better performance in road defect detection, making it highly significant for road maintenance work.
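As a rough illustration of the weighted loss in item (3), the sketch below combines an IoU term with the Normalized Wasserstein Distance for two axis-aligned boxes given as (cx, cy, w, h). It is a minimal sketch under stated assumptions, not the paper's implementation: plain IoU stands in for PIoU v2, and the weight `alpha`, the normalizing constant `c`, and all function names are illustrative choices.

```python
import math


def iou(a, b):
    """Plain IoU of two boxes given as (cx, cy, w, h). Stand-in for PIoU v2."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein Distance between two boxes modelled as 2D Gaussians.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). The constant c is dataset dependent; 12.8 is
    only an assumed placeholder.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


def combined_loss(pred, target, alpha=0.5, c=12.8):
    """Weighted average of an IoU-based loss and an NWD-based loss."""
    return alpha * (1.0 - iou(pred, target)) + (1.0 - alpha) * (1.0 - nwd(pred, target, c))
```

For two small boxes that do not overlap, the IoU term is saturated at 1 while the NWD term still varies smoothly with the distance between them, which is the usual motivation for adding NWD when small objects matter.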
Abstract:
This article is concerned with the rth moment global exponential stabilization of delayed memristive neural networks (DMNNs). Using a comparison strategy, the theory of differential inclusions, and inequality techniques, the exponential stabilization of DMNNs is investigated. To this end, a state feedback controller and an adaptive controller are designed. The comparison strategy is a new analysis method that does not rely on Lyapunov stability theory and relaxes the constraints on time delays. In addition, the results are expressed as algebraic criteria, which are convenient to test. Finally, a numerical simulation is given to show the validity of the derived criteria.
Abstract:
Camouflaged Object Detection (COD) aims to segment objects that are highly integrated with their background, presenting significant challenges such as low contrast, complex textures, and blurred boundaries. Existing deep learning methods often struggle to achieve robust segmentation under these conditions. To address these limitations, this paper proposes a novel COD network, SAM2-DFBCNet, built upon the SAM2 Hiera architecture. Our network incorporates three key modules: (1) the Camouflage-Aware Context Enhancement Module (CACEM), which fuses local and global features through an attention mechanism to enhance contextual awareness in low-contrast scenes; (2) the Cross-Scale Feature Interaction Bridge (CSFIB), which employs a bidirectional convolutional GRU for the dynamic fusion of multi-scale features, effectively mitigating representation inconsistencies caused by complex textures and deformations; and (3) the Dynamic Boundary Refinement Module (DBRM), which combines channel and spatial attention mechanisms to optimize boundary localization accuracy and enhance segmentation details. Extensive experiments on three public datasets (CAMO, COD10K, and NC4K) demonstrate that SAM2-DFBCNet outperforms twenty state-of-the-art methods, achieving maximum improvements of 7.4%, 5.78%, and 4.78% in key metrics such as S-measure (Sα), F-measure (Fβ), and mean E-measure (Eϕ), respectively, while reducing the Mean Absolute Error (M) by 37.8%. These results validate the superior performance and robustness of our approach in complex camouflage scenarios.
Journal:
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46(4): 8411-8424, ISSN: 1064-1246
Corresponding author:
Yuan, Cao
Author affiliations:
[Yaqin Li; Ziyi Zhang; Cao Yuan; Jing Hu] School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan, China
Keywords:
Small target;deep learning;model compression;traffic sign detection
Abstract:
Traffic sign detection technology plays an important role in driver assistance systems and automated driving systems. This paper proposes DeployEase-YOLO, a real-time high-precision detection scheme based on an adaptive scaling channel pruning strategy, to facilitate the deployment of detectors on edge devices. More specifically, based on the characteristics of small traffic signs and complex backgrounds, this paper first adds a small target detection layer to the basic architecture of YOLOv5 in order to improve the detection accuracy of small traffic signs. Then, when capturing specific scenes with large fields of view, higher resolution and richer pixel information are preserved instead of directly scaling the image size. Finally, the network structure is pruned and compressed using an adaptive scaling channel pruning strategy, and the pruned network is subjected to a secondary sparse pruning operation. The number of parameters and computations is greatly reduced without increasing the depth of the network structure or the influence of the input image size, thus compressing the model to the minimum within the compressible range. Experimental results show that the model trained by DeployEase-YOLO achieves higher accuracy and a smaller size on TT100k, a challenging traffic sign detection dataset. Compared to existing methods, DeployEase-YOLO achieves an average accuracy of 93.3%, representing a 1.3% improvement over the state-of-the-art YOLOv7 network, while reducing the number of parameters and computations to 41.69% and 59.98% of the original, respectively, and compressing the model to 53.22% of its previous size. This shows that DeployEase-YOLO has considerable potential for small traffic sign detection.
The algorithm outperforms existing methods in terms of accuracy and speed, and has the advantage of a compressed network structure that facilitates deployment of the model on resource-limited devices.
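The abstract does not spell out how the adaptive scaling channel pruning strategy selects channels. As a hedged illustration of the general idea behind scaling-factor pruning, the sketch below ranks channels by the magnitude of a per-channel scaling factor (for example a batch-normalization weight) and drops the smallest fraction globally. The function name, the global threshold rule, and the safeguard that keeps at least one channel per layer are assumptions, not the authors' method.

```python
def select_channels(scales, prune_ratio):
    """Choose which channels to keep from per-channel scaling factors.

    scales: dict mapping a layer name to a list of scaling factors.
    prune_ratio: fraction of all channels to remove, between 0 and 1.
    Returns a dict mapping each layer name to the indices of kept channels.
    """
    magnitudes = sorted(abs(g) for layer in scales.values() for g in layer)
    cut = int(len(magnitudes) * prune_ratio)
    # Channels whose magnitude falls below this global threshold are pruned.
    threshold = magnitudes[cut] if cut < len(magnitudes) else float("inf")

    kept = {}
    for name, layer in scales.items():
        indices = [i for i, g in enumerate(layer) if abs(g) >= threshold]
        if not indices:
            # Safeguard (assumed): never remove a layer entirely.
            indices = [max(range(len(layer)), key=lambda i: abs(layer[i]))]
        kept[name] = indices
    return kept
```

With `scales = {"conv1": [0.9, 0.05, 0.4, 0.7], "conv2": [0.01, 0.02]}` and `prune_ratio=0.5`, the global threshold is 0.4, so `conv1` keeps three channels and `conv2` falls back to its single strongest channel.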
Abstract:
Melanoma is a malignant skin tumor that threatens human life and health. Early detection is essential for effective treatment. However, the low contrast between melanoma lesions and normal skin and the irregularity in size and shape make skin lesions difficult to detect with the naked eye in the early stages, making the task of skin lesion segmentation challenging. Traditional U-shaped encoder-decoder networks built on convolutional neural networks (CNNs) have limitations in establishing long-range dependencies and global contextual connections, while the Transformer architecture is limited in its application to small medical datasets. To address these issues, we propose a new skin lesion segmentation network, SUTrans-NET, which combines CNN and Transformer in a parallel fashion to form a dual encoder, where both CNN and Transformer branches perform dynamic interactive fusion of image information in each layer. At the same time, we introduce our designed multi-grouping module Spatial Group Attention (SGA) to complement the spatial and texture information of the Transformer branch, and utilize the Focus idea of YOLOv5 to construct the Patch Embedding module in the Transformer to prevent the loss of pixel accuracy. In addition, we design a decoder with full-scale information fusion capability to fully fuse shallow and deep features at different stages of the encoder. The effectiveness of our method is demonstrated on the ISIC 2016, ISIC 2017, ISIC 2018 and PH2 datasets and its advantages over existing methods are verified. Copyright 2024 Li et al.
Abstract:
The immense representation power of deep learning frameworks has kept them in the spotlight in hyperspectral image (HSI) classification. Graph Convolutional Neural Networks (GCNs) can be used to compensate for the lack of spatial information in Convolutional Neural Networks (CNNs). However, most GCNs construct graph data structures based on pixel points, which requires the construction of neighborhood matrices on all data. Meanwhile, the setting of GCNs to construct similarity relations based on spatial structure is not fully applicable to HSIs. To make the network more compatible with HSIs, we propose a staged feature fusion model called SFFNet, a neural network framework connecting CNN and GCN models. The CNN performs the first stage of feature extraction, assisted by adding neighboring features and overcoming the defects of local convolution; then, the GCN performs the second stage for classification, and the graph data structure is constructed based on spectral similarity, optimizing the original connectivity relationships. In addition, the framework enables the batch training of the GCN by using the extracted spectral features as nodes, which greatly reduces the hardware requirements. The experimental results on three publicly available benchmark hyperspectral datasets show that our proposed framework outperforms other relevant deep learning models, with an overall classification accuracy of over 97%.
Abstract:
Highway accidents are frequent in the Guizhou region, and poor visibility due to fog is one of the main causes. In this region, traditional large-scale fog monitoring systems that demand high computational power are difficult and costly to install because of complex terrain, high altitudes, and winding roads. As a result, traffic management departments cannot obtain accurate and timely fog information, which creates a significant safety hazard. To solve this problem, this study proposes a fog monitoring solution based on the lightweight deep learning model ABNet. The solution first preprocesses the input images, including generating the fog concentration distribution map using the fog imaging model, and obtaining the high-frequency component image using filters based on the 2D discrete wavelet transform. Subsequently, these two processed images and the original image are fed into the three branches of ABNet for training to fully extract fog concentration and high-frequency information, thereby improving model performance and prediction accuracy. The ABNet model parameters require only 38.52 MB, and the computational cost is just 1.71 GFLOPs, which addresses the limited storage and computational resources of edge computing. The model was evaluated on the Guizhou highway fog weather dataset, where ABNet achieved a composite classification accuracy of 92.3%, an average precision of 92.4%, and an average recall of 92.3%. By comparison, VisNet, VGG16, EfficientNetV2, and Swin Transformer V2 performed worse. The experimental results validated the performance of the ABNet model in terms of accuracy and efficiency.
The ABNet model in this study, with its lightweight deep learning design, small parameter scale, and lower computational power requirements, provides a solution suitable for complex terrains and practical environments of edge computing devices, and it provides vital technological support to improve traffic safety on the highways in the Guizhou region.
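To make the preprocessing step more concrete, the following sketch computes a one-level 2D discrete wavelet transform and returns the approximation band together with three detail (high-frequency) bands. It is a minimal sketch, assuming the orthonormal Haar wavelet and a grayscale image stored as a list of rows with even height and width; the abstract does not name the wavelet or filters actually used, and the function name is illustrative.

```python
def haar_dwt2(img):
    """One-level 2D Haar transform of an image given as a list of rows.

    Returns (approx, d_rows, d_cols, d_diag), each half the input size.
    d_rows responds to differences between the top and bottom rows of a
    2x2 block, d_cols to differences between its left and right columns,
    and d_diag to diagonal differences.
    """
    height, width = len(img), len(img[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")

    approx, d_rows, d_cols, d_diag = [], [], [], []
    for r in range(0, height, 2):
        row_a, row_r, row_c, row_d = [], [], [], []
        for c in range(0, width, 2):
            a, b = img[r][c], img[r][c + 1]
            e, f = img[r + 1][c], img[r + 1][c + 1]
            row_a.append((a + b + e + f) / 2)
            row_r.append((a + b - e - f) / 2)
            row_c.append((a - b + e - f) / 2)
            row_d.append((a - b - e + f) / 2)
        approx.append(row_a)
        d_rows.append(row_r)
        d_cols.append(row_c)
        d_diag.append(row_d)
    return approx, d_rows, d_cols, d_diag
```

A flat region produces zero detail coefficients, while an edge produces non-zero ones, so the detail bands carry the high-frequency content of the image.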
Abstract:
The Segment Anything Model (SAM) is a versatile image segmentation model that enables zero-shot segmentation of various objects in any image using prompts, including bounding boxes, points, texts, and more. However, studies have shown that the SAM performs poorly in agricultural tasks like crop disease segmentation and pest segmentation. To address this issue, the agricultural SAM adapter (ASA) is proposed, which incorporates agricultural domain expertise into the segmentation model through a simple but effective adapter technique. By leveraging the distinctive characteristics of agricultural image segmentation and suitable user prompts, the model enables zero-shot segmentation, providing a new approach for zero-sample image segmentation in the agricultural domain. Comprehensive experiments are conducted to assess the efficacy of the ASA compared to the default SAM. The results show that the proposed model achieves significant improvements on all 12 agricultural segmentation tasks. Notably, the average Dice score improved by 41.48% on two coffee-leaf-disease segmentation tasks.
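For reference, the Dice score used to report the results above measures the overlap between a predicted mask and a ground-truth mask as 2|A∩B| / (|A| + |B|). The sketch below is a minimal illustration on flattened binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_score(pred, truth):
    """Dice coefficient of two equal-length binary masks (sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total
```

For example, a prediction that marks two pixels, one of which is in a one-pixel ground truth, scores 2 × 1 / (2 + 1) ≈ 0.667.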
Keywords:
day-ahead prediction;mutation rate;data augmentation;GAN model
Abstract:
This study introduces a data augmentation technique based on generative adversarial networks (GANs) to improve the accuracy of day-ahead wind power predictions. To address the peculiarities of abrupt weather data, we propose a novel method for detecting mutation rates (MR) and local mutation rates (LMR). By analyzing historical data, we curated datasets that met specific mutation rate criteria. These transformed wind speed datasets were used as training instances, and using GAN-based methodologies, we generated a series of augmented training sets. The enriched dataset was then used to train the wind power prediction model, and the resulting prediction results were meticulously evaluated. Our empirical findings clearly demonstrate a significant improvement in the accuracy of day-ahead wind power prediction due to the proposed data augmentation approach. A comparative analysis with traditional methods showed an approximate 5% increase in monthly average prediction accuracy. This highlights the potential of leveraging mutated wind speed data and GAN-based techniques for data augmentation, leading to improved accuracy and reliability in wind power predictions. In conclusion, this paper presents a robust data augmentation method for wind power prediction, contributing to the potential enhancement of day-ahead prediction accuracy. Future research could explore additional mutation rate detection methods and strategies to further enhance GAN models, thereby amplifying the effectiveness of wind power prediction.
Corresponding institution:
[Cao Yuan] School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430024, China
Keywords:
end to end;deep learning;point cloud completion;squeeze and excitation;trilinear interpolation
Abstract:
We propose a conceptually simple, general framework and end-to-end approach to point cloud completion, entitled PCA-Net. This approach differs from the existing methods in that it does not require a "simple" network, such as multilayer perceptrons (MLPs), to generate a coarse point cloud and then a "complex" network, such as auto-encoders or transformers, to enhance local details. It can directly learn the mapping between missing and complete points, ensuring that the structure of the input missing point cloud remains unchanged while accurately predicting the complete points. This approach follows the minimalist design of U-Net. In the encoder, we encode the point clouds into point cloud blocks by iterative farthest point sampling (IFPS) and k-nearest neighbors and then extract the depth interaction features between the missing point cloud blocks by the attention mechanism. In the decoder, we introduce a new trilinear interpolation method to recover point cloud details, with the help of the coordinate space and feature space of low-resolution point clouds, and missing point cloud information. This paper also proposes a method to generate multi-view missing point cloud data using a 3D point cloud hidden point removal algorithm, so that each 3D point cloud model generates a missing point cloud through eight uniformly distributed camera poses. Experiments validate the effectiveness and superiority of PCA-Net in several challenging point cloud completion tasks, and PCA-Net also shows great versatility and robustness in real-world missing point cloud completion.
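The encoder's grouping step, iterative farthest point sampling followed by k-nearest neighbors, can be sketched as follows. This is an illustrative sketch only: the function names, the fixed starting index, and the use of squared Euclidean distance are assumptions, and it does not reproduce PCA-Net's actual encoder.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_sampling(points, m, start=0):
    """Pick up to m indices so that each new point is farthest from those chosen."""
    if m <= 0 or not points:
        return []
    chosen = [start]
    # Distance from every point to its nearest already chosen centre.
    min_d = [sq_dist(p, points[start]) for p in points]
    while len(chosen) < min(m, len(points)):
        nxt = max(range(len(points)), key=lambda i: min_d[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_d[i] = min(min_d[i], sq_dist(p, points[nxt]))
    return chosen


def knn(points, centre, k):
    """Indices of the k points closest to points[centre], the centre included."""
    order = sorted(range(len(points)), key=lambda i: sq_dist(points[i], points[centre]))
    return order[:k]


def group_into_blocks(points, m, k):
    """Form m point cloud blocks: one k-neighborhood around each sampled centre."""
    return [knn(points, c, k) for c in farthest_point_sampling(points, m)]
```

Farthest point sampling spreads the block centres across the shape, and the k-nearest-neighbor query then gathers a local patch around each centre.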
Corresponding institution:
[Shengyong Xu] College of Engineering, Huazhong Agricultural University, Wuhan 430070, China; Key Laboratory of Agricultural Equipment for the Middle and Lower Reaches of the Yangtze River, Ministry of Agriculture, Wuhan 430070, China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518000, China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Shenzhen 518000, China
Keywords:
point cloud segmentation;point cloud completion;leaf area measurement;MIX-Net;seedlings;deep learning
Abstract:
In this paper, a novel point cloud segmentation and completion framework is proposed to achieve high-quality leaf area measurement of melon seedlings. In particular, the input of our algorithm is the point cloud data collected by an Azure Kinect camera from the top view of the seedlings, and our method can enhance measurement accuracy from two aspects based on the acquired data. On the one hand, we propose a neighborhood space-constrained method to effectively filter out the hover points and outlier noise of the point cloud, which can enhance the quality of the point cloud data significantly. On the other hand, by leveraging the purely linear mixer mechanism, a new network named MIX-Net is developed to achieve segmentation and completion of the point cloud simultaneously. Unlike previous methods that separate these two tasks, the proposed network balances them in a more definite and effective way, leading to satisfactory performance on both. The experimental results show that our methods outperform competing approaches and provide more accurate measurement results. Specifically, for the seedling segmentation task, our method obtains performance gains of 3.1% and 1.7% compared with PointNet++ and DGCNN, respectively. Meanwhile, the R² of leaf area measurement improved from 0.87 to 0.93 and the MSE decreased from 2.64 to 2.26 after leaf shading completion.
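The two measurement metrics quoted above can be computed as in the sketch below, where the coefficient of determination is 1 minus the ratio of the residual sum of squares to the total sum of squares, and the MSE is the mean of the squared residuals. The function names are illustrative, and the sample values in the usage note are invented for demonstration, not taken from the paper.

```python
def mean_squared_error(measured, predicted):
    """Mean of squared differences between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("R squared is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot
```

As a usage example with made-up leaf areas, `r_squared([1, 2, 3], [1, 2, 3])` gives 1.0 for a perfect fit, while predicting the mean everywhere, `r_squared([1, 2, 3], [2, 2, 2])`, gives 0.0.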
Corresponding institution:
[Yaqin Li] School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430024, China
Keywords:
deep learning;generative adversarial network;deep generative model;super-resolution;feature transform;multiscale feature extraction