Author affiliations:
[Hao, Sen; Wang, Heng; Wang, Xiaohu] Wuhan Polytechnic University, School of Mathematics and Computer, Wuhan, 430048, China;[Zhang, Cong] Wuhan Polytechnic University, School of Electrical and Electronic Engineering, Wuhan, 430048, China;[Chen, Yilin] Wuhan Institute of Technology, Hubei Key Laboratory of Intelligent Robot, Wuhan, 430073, China
Corresponding institution:
[Hao, S.] Wuhan Polytechnic University, China
Keywords:
Deep learning;music generation;recurrent neural network;transformer
Corresponding institution:
[Kang Zhou] College of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan, China
Keywords:
Food quality assessment;Neural network rough-refinement optimization;Metaheuristic algorithm based on NNs;Data mining
Abstract:
Food quality assessment is an important part of the food industry. Traditional food quality assessment technologies give inconsistent results, and each method has its own technical shortcomings. Data mining technology has significant advantages in dealing with uncertainty and fuzziness. Therefore, this study proposes a food quality assessment model based on data mining, which aims to standardize food quality assessment and make it consistent while matching or exceeding the accuracy of existing technologies, so as to solve the obvious problems of traditional assessment methods. The core of the proposed model is a deep learning framework based on double-layer rough-refinement optimization. The first layer is rough optimization, which introduces the idea of multi-objective optimization to optimize the topological structure of neural networks with various candidate types and candidate depths. The second layer is refinement adjustment, which uses a metaheuristic algorithm to globally optimize the weight parameters of the network model. The combination of rough and refinement optimization greatly reduces the computation compared with optimizing everything simultaneously, and globally optimizes the neural network model for the highest accuracy across network type, topology, and network parameters. Two kinds of food quality assessment problems are used to simulate and verify the proposed deep learning framework. The results show that the framework is effective, feasible, and adaptable, and that the proposed assessment model can solve different types of food quality assessment problems well.
Abstract:
Image segmentation is a crucial part of the automatic detection of rice appearance quality. Due to the morphological characteristics of rice grains, missed detections and non-smooth boundaries may occur in the image segmentation of adhesive rice. To address these issues, this study proposes a novel model named Swgan, which combines generative adversarial networks (GANs) with nested skip connections to obtain accurate masks. To learn the mask distribution of each object in an adhesive rice image and further avoid missed detections, the discriminator in the GAN is used as a modifier of the Cascade Mask R-CNN model, allowing the generator to overcome a limitation of mask generation training, namely detecting multiple objects as a single target. Moreover, Swgan uses Swin-Transformer as the backbone network and incorporates the Cascade Mask R-CNN framework and a Nested Feature Pyramid Network (Nested-FPN) to maintain the smoothness of mask boundaries during forward propagation. Experimental results indicate that Swgan obtains better object detection and segmentation results under complex conditions of adhesive rice than state-of-the-art algorithms. Overall, Swgan, with satisfactory accuracy in image segmentation of adhesive rice, combined with physical indicator detection, provides reliable quality assessment of rice grains.
Abstract:
The Segment Anything Model (SAM) is a versatile image segmentation model that enables zero-shot segmentation of various objects in any image using prompts, including bounding boxes, points, texts, and more. However, studies have shown that the SAM performs poorly in agricultural tasks like crop disease segmentation and pest segmentation. To address this issue, the agricultural SAM adapter (ASA) is proposed, which incorporates agricultural domain expertise into the segmentation model through a simple but effective adapter technique. By leveraging the distinctive characteristics of agricultural image segmentation and suitable user prompts, the model enables zero-shot segmentation, providing a new approach for zero-sample image segmentation in the agricultural domain. Comprehensive experiments are conducted to assess the efficacy of the ASA compared to the default SAM. The results show that the proposed model achieves significant improvements on all 12 agricultural segmentation tasks. Notably, the average Dice score improved by 41.48% on two coffee-leaf-disease segmentation tasks.
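The Dice score quoted in the abstract above is a standard overlap measure between a predicted mask and a ground-truth mask. The following is a minimal sketch of how it is commonly computed for binary masks; the function name, the toy masks, and the convention for two empty masks are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treated here as perfect agreement (a convention).
        return 1.0
    return float(2.0 * intersection / total)

# Toy masks (hypothetical): 3 predicted pixels, 3 true pixels, 2 shared.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_score(pred, target))
```

A score of 1 means the two masks coincide and 0 means they do not overlap at all.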
Abstract:
During the rice quality testing process, the precise segmentation and extraction of grain pixels is a key technique for accurately determining the quality of each seed. Due to the similar physical characteristics, small particles and dense distributions of rice seeds, properly analysing rice is a difficult problem in the field of target segmentation. In this paper, a network called SY-net, which consists of a feature extractor module, a feature pyramid fusion module, a prediction head module and a prototype mask generation module, is proposed for rice seed instance segmentation. In the feature extraction module, a transformer backbone is used to improve the ability of the network to learn rice seed features; in the pyramid fusion module and the prediction head module, a six-layer feature fusion network and a parallel prediction head structure are employed to enhance the utilization of feature information; and in the prototype mask generation module, a large feature map is used to generate high-quality masks. Training and testing were performed on two public datasets and one private rice seed dataset. The results showed that SY-net achieved a mean average precision (mAP) of 90.71% for the private rice seed dataset and an average precision (AP) of 16.5% with small targets in COCO2017. The network improved the efficiency of rice seed segmentation and showed excellent application prospects in performing rice seed quality testing.
Abstract:
Image stitching aims to produce a large panoramic image that captures extensive information. However, artifacts such as ghosting or geometric misalignment are inevitably generated. As a practical measure, optimal seamline detection strategies use spatial information to obtain the optimal seam in RGB image stitching, but they cannot be directly used in hyperspectral image (HSI) stitching. Since the spatial information differs across the numerous continuous bands of an HSI, the seam detected by a traditional RGB-based method diverges from band to band, which causes visual differences and spectral distortion. To solve this problem, we propose a novel optimal seamline detection strategy via graph cuts for HSI stitching. First, we use robust feature matching and elastic warping to align multiple adjacent images under a common geometric transformation. After that, we design a novel energy function combining both the spatial and spectral information of the HSI to determine an optimal seam in continuous regions with high texture consistency. Finally, we use the graph cuts method to eliminate visible artifacts. Our method determines a unique optimal seam for the whole HSI, yielding a high-quality panoramic HSI without artifacts and with reduced spectral distortion. A series of experiments verifies the effectiveness of the proposed method and its superiority over several advanced approaches in HSI stitching.
Abstract:
In this paper we compare Plateau's problem with Čech and singular homological boundary conditions, and we also compare these with the size-minimizing problem for integral currents with a given boundary. Finally, we obtain agreement of the infimum values for all these Plateau's problems. © 2023 Elsevier Inc. All rights reserved.
Corresponding institution:
[Shan Zeng] School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan, Hubei 430023, China
Keywords:
Data augmentation;Anomaly detection;Industrial data
Abstract:
Detecting anomalies in large amounts of high-dimensional data has been a challenging task. In the Industry 4.0 environment, large-scale high-dimensional monitoring data feature complex patterns of high-level semantics. In order to provide enterprise-wide monitoring solutions, it is necessary to identify the high-level semantic patterns of the anomalies in these data without splitting them. Existing end-to-end deep neural networks for time series are capable of recognizing the high-level semantics in natural language or speech signals, but they are rarely applied to real-time anomaly detection in industrial data because of their large time costs. In this paper, we leverage the self-supervised contrastive learning methodology and propose a Composite Semantic Augmentation Encoder (CSAE) to provide an appropriate representation of industrial data and to detect anomalies quickly in industrial application environments. CSAE is a non-sequential deep neural network with two augmentation layers and a mandatory layer. The two data-augmentation layers are built to expand the sample size of both low-level and high-level semantic anomalies, which enables CSAE to discover diverse anomalies and improves its accuracy in high-level semantic pattern recognition. The mandatory layer is built to compress and preserve the temporal information in the industrial data to accelerate anomaly detection. Therefore, as a non-sequential contrastive learning model, CSAE has faster training convergence than the usual sequence models. The experimental results verify that CSAE can achieve higher prediction accuracy with less time consumption than existing machine learning models in high-dimensional anomaly pattern detection tasks. (C) 2022 Elsevier B.V. All rights reserved.
Author affiliations:
[Deng, Guotai] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China.;[Deng, Guotai] Cent China Normal Univ, Hubei Key Lab Math Sci, Wuhan 430079, Peoples R China.;[Liu, Chuntai] Wuhan Polytech Univ, Sch Math & Comp Sci, Wuhan 430023, Peoples R China.;[Ngai, Sze-Man] Hunan Normal Univ, Coll Math & Stat, Key Lab High Performance Comp & Stochast Informat, Minist Educ China, Changsha 410081, Hunan, Peoples R China.;[Ngai, Sze-Man] Georgia Southern Univ, Dept Math Sci, Statesboro, GA 30460 USA.
Corresponding institution:
[Ngai, SM] Hunan Normal Univ, Coll Math & Stat, Key Lab High Performance Comp & Stochast Informat, Minist Educ China, Changsha 410081, Hunan, Peoples R China.;Georgia Southern Univ, Dept Math Sci, Statesboro, GA 30460 USA.
Abstract:
The semantic segmentation of outdoor images is the cornerstone of scene understanding and plays a crucial role in the autonomous navigation of robots. Although RGB-D images can provide additional depth information for improving the performance of semantic segmentation tasks, current state-of-the-art methods directly use ground truth depth maps for depth information fusion, which relies on highly developed and expensive depth sensors. To solve this problem, we propose a self-calibrated RGB-D image semantic segmentation neural network model based on an improved residual network that does not rely on depth sensors. It fuses multi-modal information from RGB images and from depth maps predicted by depth estimation models to enhance the understanding of a scene. First, we designed a novel convolutional neural network (CNN) with an encoding and decoding structure as our semantic segmentation model. The encoder was constructed using IResNet to extract the semantic features of the RGB image and the predicted depth map and then effectively fuse them with the self-calibration fusion structure. The decoder restored the resolution of the output features with a series of successive upsampling structures. Second, we presented a feature pyramid attention mechanism to extract the fused information at multiple scales and obtain features with rich semantic information. The experimental results on the publicly available Cityscapes dataset and on collected forest scene images show that our model, trained with estimated depth information, achieves performance comparable to that obtained with ground truth depth maps in improving the accuracy of semantic segmentation, and even outperforms some competitive methods.
Corresponding institution:
[Zhang, C] Wuhan Polytech Univ, Sch Elect & Elect Engn, Wuhan 430048, Peoples R China.
Keywords:
personal audio system;sound field control;acoustic contrast;reconstruction error;array effort
Abstract:
Personal audio systems, which can be implemented with sound field control technology, have wide application prospects in people's lives. However, current sound field control technology is mainly based on sound pressure or improvements on it, ignoring another physical property of sound, particle velocity, which is not conducive to the stability of the entire reconstruction system. To address this problem, a sound field control method is constructed in this paper, which minimizes the reconstruction error in the bright zone, minimizes the loudspeaker array effort in the reconstruction system, and at the same time controls the particle velocity and sound pressure of the dark zone. Five unevenly placed loudspeakers were used as the initial setup for the computer simulation experiment. Simulation results suggest that the proposed method is better than the PM (pressure matching) and EDPM (eigen decomposition pseudoinverse method) methods in the bright zone in terms of the acoustic contrast index, better than the ACC (acoustic contrast control) method in terms of the reconstruction error index, and better than the ACC, PM, and EDPM methods in the bright zone in terms of the loudspeaker array effort index. The average array effort of the proposed method is the smallest, about 9.4790, 8.0712, and 4.8176 dB less than that of the ACC method, the PM method in the bright zone, and the EDPM method in the bright zone, respectively, so the proposed method can produce the most stable reconstruction system when the loudspeakers are not evenly placed. The results of the computer experiments demonstrate the performance of the proposed method and suggest that, compared with traditional methods, it achieves more balanced results overall across the three indexes of acoustic contrast, reconstruction error, and loudspeaker array effort.
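The three indexes named in the abstract above can be illustrated with a small simulation. The sketch below is not the proposed method: it implements the regularized pressure matching (PM) baseline and the definitions of acoustic contrast, reconstruction error, and array effort that are commonly used in the sound field control literature. The free-field monopole model, loudspeaker and zone positions, frequency, regularization weight `lam`, and reference source strength `q_ref` are all assumptions chosen for illustration, and particle velocity is not modelled.

```python
import numpy as np

def green(src, mic, k):
    """Free-field monopole transfer functions, shape (n_mics, n_sources)."""
    r = np.linalg.norm(mic[:, None, :] - src[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def pressure_matching(G_b, G_d, p_des, lam=1e-3):
    """Regularized least squares: match p_des in the bright zone, zero in the dark zone."""
    G = np.vstack([G_b, G_d])
    target = np.concatenate([p_des, np.zeros(G_d.shape[0], dtype=complex)])
    A = G.conj().T @ G + lam * np.eye(G.shape[1])
    return np.linalg.solve(A, G.conj().T @ target)

def acoustic_contrast_db(G_b, G_d, q):
    """Ratio of mean squared pressure in the bright zone to that in the dark zone."""
    e_b = np.mean(np.abs(G_b @ q) ** 2)
    e_d = np.mean(np.abs(G_d @ q) ** 2)
    return 10 * np.log10(e_b / e_d)

def reconstruction_error_db(G_b, q, p_des):
    """Normalized squared error between reproduced and desired bright-zone pressure."""
    err = np.linalg.norm(G_b @ q - p_des) ** 2
    return 10 * np.log10(err / np.linalg.norm(p_des) ** 2)

def array_effort_db(q, q_ref):
    """Total squared source strength relative to a single reference source."""
    return 10 * np.log10(np.vdot(q, q).real / abs(q_ref) ** 2)

# Hypothetical setup: five unevenly spaced loudspeakers on a line, 500 Hz.
c, f = 343.0, 500.0
k = 2 * np.pi * f / c
src = np.array([[x, 0.0, 0.0] for x in (-1.0, -0.6, 0.1, 0.5, 1.2)])
offsets = np.array([[dx, dy, 0.0] for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1)])
bright = np.array([-0.5, 2.0, 0.0]) + offsets
dark = np.array([0.5, 2.0, 0.0]) + offsets

G_b, G_d = green(src, bright, k), green(src, dark, k)
p_des = G_b[:, 2]  # desired field: the third loudspeaker driven alone with unit strength
q = pressure_matching(G_b, G_d, p_des)

print(f"acoustic contrast:    {acoustic_contrast_db(G_b, G_d, q):.2f} dB")
print(f"reconstruction error: {reconstruction_error_db(G_b, q, p_des):.2f} dB")
print(f"array effort:         {array_effort_db(q, q_ref=1.0):.2f} dB")
```

Raising `lam` trades reconstruction accuracy for lower array effort, which is the kind of balance among the three indexes that the abstract discusses.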
Abstract:
As research on deep learning methods progresses, more and more classification models are applied to hyperspectral image (HSI) classification. The high-dimensional and low-resolution characteristics of HSI, however, make it difficult for conventional models to process its data effectively. In this article, a novel HSI classification model, namely, the spatial–spectral pyramid network (SSPN), is designed by combining a 3-D convolutional neural network (3D CNN) with a feature pyramid structure. SSPN takes advantage of 3-D convolution coupled with multiscale convolutional extraction to obtain a large set of diverse spatial–spectral features. Multiscale interfusion is also applied in SSPN to enrich the features contained in a single feature map and to improve the sensitivity to HSI spatial–spectral information, allowing it to better learn spatial–spectral features. Moreover, the losses of each combination based on multiscale interfusion are combined via a weighted average, which prevents any single combination from having an excessive influence on the updating of model parameters. Four public HSI datasets and several comparison models are employed to validate the classification performance of SSPN. Experimental results show that SSPN achieves the highest overall accuracy (OA) on all datasets compared with other classification models, with 100%, 98.8%, 99.8%, and 98.7% on the Chikusei, Pavia University, Botswana, and Houston 2013 datasets, respectively. SSPN is demonstrated to possess higher classification accuracy and better generalization performance on HSI.
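Two quantities mentioned in the abstract above are simple enough to sketch: a weighted average of per-combination losses, and overall accuracy (OA). The abstract does not give the weights or the number of combinations, so the values below are purely hypothetical, and the function names are illustrative.

```python
import numpy as np

def weighted_average_loss(losses, weights):
    """Combine per-branch losses so that no single branch dominates the update."""
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * losses) / np.sum(weights))

def overall_accuracy(y_true, y_pred):
    """Overall accuracy (OA): fraction of samples whose predicted label is correct."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Hypothetical losses from three multiscale combinations, with hypothetical weights.
print(weighted_average_loss([1.0, 2.0, 3.0], [1.0, 1.0, 2.0]))  # (1 + 2 + 6) / 4 = 2.25
print(overall_accuracy([0, 1, 2, 2], [0, 1, 1, 2]))             # 3 of 4 correct = 0.75
```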