Mainstream sentence classification algorithms rely on a single word vector model to obtain the feature vector representation of text, which leads to insufficient text mapping ability. Therefore, a multi-kernel learning method is used to fuse multiple text representations based on different word vectors to improve the accuracy of sentence classification. When fusing different kernel functions, traditional kernel-coefficient optimization methods often require long training times and tend to become trapped in local optima. To address this problem, a new kernel-coefficient optimization method was developed that continuously approximates the optimal coefficient values based on parameter-space segmentation and breadth-first search. In this study, a support vector machine (SVM) was used as the classifier in experiments on seven text datasets, and the results showed that multi-kernel learning classified significantly better than single-kernel learning. Moreover, the proposed optimization method outperformed traditional methods at a lower training cost.
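As a minimal illustration of the two ideas above, the sketch below combines two Gram matrices with a convex coefficient and searches for that coefficient by repeatedly segmenting the parameter interval and scoring segments in breadth-first order. The segment counts, depth, and objective are illustrative assumptions, not the paper's exact procedure.

```python
from collections import deque
import numpy as np

def combined_kernel(K1, K2, mu):
    """Convex combination of two base Gram matrices (multi-kernel fusion)."""
    return mu * K1 + (1.0 - mu) * K2

def bfs_coefficient_search(objective, lo=0.0, hi=1.0, depth=4, splits=4):
    """Coarse-to-fine coefficient search: split the current interval, score
    every sub-segment center breadth-first, then refine only the most
    promising segment (a sketch of segmentation + breadth-first search)."""
    best_mu, best_val = None, float("inf")
    queue = deque([(lo, hi, 0)])
    while queue:
        a, b, d = queue.popleft()
        centers = [a + (b - a) * (i + 0.5) / splits for i in range(splits)]
        vals = [objective(c) for c in centers]
        i = int(np.argmin(vals))
        if vals[i] < best_val:
            best_val, best_mu = vals[i], centers[i]
        if d + 1 < depth:  # refine around the best center only
            w = (b - a) / splits
            queue.append((centers[i] - w / 2, centers[i] + w / 2, d + 1))
    return best_mu, best_val
```

In practice the objective would be a cross-validation score of the SVM run on the combined Gram matrix.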
We propose a neural network training framework called momentum-updated representation with reconstruction constraint for 3D (three-dimensional) object recognition using 2D (two-dimensional) images without angle labels. First, self-supervised learning is employed to address the lack of angle labels. Second, we use momentum updating based on a dynamic queue to maintain the stability of the object representation. Furthermore, a reconstruction constraint is applied to the learning process with an auto-encoder module, which enables the representation to capture more of the objects' semantic information. Finally, during training, a dynamic queue reduction strategy is proposed to handle the imbalanced data distribution. Experiments on two popular multi-view datasets, ModelNet and ShapeNet, demonstrate that the proposed method outperforms existing methods.
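The momentum update and the dynamic queue can be sketched as follows. This is a MoCo-style illustration under assumed values (e.g., the coefficient `m` and the shrink rule), not the paper's exact implementation.

```python
from collections import deque
import numpy as np

def momentum_update(key_params, query_params, m=0.999):
    """EMA update of the key encoder's parameters from the query encoder,
    which keeps the queued representations slowly varying and consistent."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

class DynamicQueue:
    """Fixed-capacity FIFO of representation vectors; `shrink` sketches a
    queue-reduction step for handling imbalanced data distributions."""
    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)

    def enqueue(self, batch):
        for v in batch:
            self.buf.append(v)  # oldest entries fall off automatically

    def shrink(self, new_capacity):
        self.buf = deque(list(self.buf)[-new_capacity:], maxlen=new_capacity)

    def as_matrix(self):
        return np.stack(list(self.buf)) if self.buf else np.empty((0,))
```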
Cycling-trajectory optimization is hindered by positioning-equipment errors, the riding habits of non-motor-vehicle users, and other factors. These lead to quality problems, such as abnormal data and missing positioning information in the riding trajectory, which impair trajectory-based riding-map inference and riding-path planning. To solve these problems, this paper builds a framework for improving the quality of cycling-trajectory data, based on the construction of a grid index, the screening of abnormal trajectory points, the elimination of wandering and illegal trajectory segments, the calibration of drift trajectory segments, and the recovery of missing trajectory segments. Comparative and ablation experiments were conducted on a real non-motor-vehicle cycling-trajectory dataset, and the results verify that the proposed method improves the accuracy of cycling-map inference.
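The grid index at the base of such a pipeline can be sketched in a few lines: each GPS point is bucketed by cell so that neighborhood checks (e.g., abnormal-point screening) only touch a few cells. The cell size and origin are illustrative assumptions.

```python
from collections import defaultdict

def grid_cell(lon, lat, origin=(0.0, 0.0), cell=0.001):
    """Map a GPS point to a grid cell id (cell size of ~0.001 degrees,
    roughly 100 m, is an assumed parameter)."""
    return (int((lon - origin[0]) // cell), int((lat - origin[1]) // cell))

def build_grid_index(points, **kw):
    """Bucket trajectory point indices by grid cell for fast spatial lookup."""
    index = defaultdict(list)
    for i, (lon, lat) in enumerate(points):
        index[grid_cell(lon, lat, **kw)].append(i)
    return index
```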
Shared-nothing distributed databases are designed for the high-scalability and high-availability requirements of Internet-based applications. There have been significant achievements in shared-nothing distributed databases, but for some with stateless computation layers, long conflict-detection paths limit database performance under high-contention workloads. To solve this problem, we design two methods, pre-lock and local cache, together with a high-contention detection module, allowing high contention to be detected quickly and the corresponding handling strategy to be applied. Experiments show that our design and optimization of the high-contention transaction-processing architecture improve the performance of distributed databases under high-contention workloads.
The diagnostic method based on dual-view fundus imaging is widely used in diabetic retinopathy (DR) screening. This method effectively solves the problems of image occlusion and limited field of view under a single view. This paper proposes a feature-fusion learning method for dual-view images based on the attention mechanism, which improves the accuracy of DR classification by effectively integrating information from the different views. Because lesions occupy only a small proportion of fundus images, a self-attention mechanism is introduced to enhance the learning of local lesion features. Moreover, a cross-attention mechanism is proposed to effectively exploit the information shared between dual-view images and improve their classification. Experiments were performed on the internal DFiD dataset and the public DeepDRiD dataset. The proposed method effectively improves the accuracy of DR classification and can be used in large-scale DR screening to assist doctors in achieving an efficient diagnosis.
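Cross-attention between two views can be sketched in its generic parameter-free form: queries come from one view and keys/values from the other, so each feature of view A attends over view B. The learned projections of the actual model are omitted here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(Fa, Fb):
    """Fuse view-B features into view A: each row of Fa (queries) attends
    over the rows of Fb (keys/values); a generic sketch, not the paper's
    exact parameterization."""
    d = Fa.shape[-1]
    attn = softmax(Fa @ Fb.T / np.sqrt(d))  # (Na, Nb) attention weights
    return attn @ Fb                        # (Na, d) fused features
```

Running the same operation in the other direction (queries from view B) yields the symmetric fusion; concatenating both gives a dual-view feature.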
The popularity of intelligent devices such as smartphones and surveillance cameras has led to serious face privacy problems. Face de-identification is considered an effective tool for protecting face privacy by concealing identity information. However, most de-identification methods lack explicit and controllable changes to the identities of de-identified face images, rendering the de-identified images inapplicable to face authentication, retrieval, and other identity-related tasks. Therefore, this study proposes an identity inter-relationship-consistent face de-identification task, in which the identity inter-relationship between two arbitrary de-identified faces is maintained the same as before de-identification. To this end, a task-driven identity inter-relationship-consistent generative adversarial network is introduced to generate de-identified faces with a consistent identity inter-relationship. A rotation-based de-identifier is designed to modify the original identity features into de-identified ones with identity inter-relationship consistency. In addition, an identity control loss is introduced to guarantee precise identity generation by the de-identified generator. Qualitative and quantitative results show that our method improves on existing methods both in de-identifying faces and in keeping their identity inter-relationships consistent.
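Why a rotation preserves the inter-relationship can be seen directly: an orthogonal transform changes every identity feature while leaving all pairwise inner products (and hence cosine similarities) intact. The sketch below uses a random orthogonal matrix as a hypothetical stand-in for the learned rotation-based de-identifier.

```python
import numpy as np

def random_rotation(d, seed=0):
    """Orthogonal matrix via QR decomposition (a hypothetical stand-in for
    the learned rotation; sign fix gives a well-defined basis)."""
    rng = np.random.default_rng(seed)
    Q, R = np.linalg.qr(rng.standard_normal((d, d)))
    return Q * np.sign(np.diag(R))

def de_identify(identity_features, Q):
    """Rotate identity features: each identity changes, but all pairwise
    inner products are preserved because Q @ Q.T = I."""
    return identity_features @ Q
```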
Review text contains comprehensive user and item information and strongly influences users’ purchase decisions. When users interact with different target items, they may show complex interests. Therefore, accurately extracting review semantic features and modeling the contextual interaction between items and users is critical for learning user preferences and item attributes. Focusing on enhancing the personalization capture and dynamic interest modeling abilities of recommender systems, and considering the usefulness of different features, we propose a hierarchical description-aware personalized recommendation (DAPR) algorithm. At the word level of review text, we design a personalized information selection network to extract important word semantic features. At the review level, we design a neural network based on a cross-attention mechanism to dynamically learn the usefulness of reviews, concatenate review summaries as descriptions, and devise a co-attention network to capture rich context-aware features. The analysis of five Amazon datasets reveals that the proposed method achieves recommendation performance comparable to the baseline models.
Chinese sentiment analysis, which aims to discover the sentimental tendencies in Chinese text, is an important research area in natural language processing. In recent years, research on Chinese text sentiment analysis has made great progress in efficiency, but few studies have explored the characteristics of the language and the requirements of downstream tasks. Therefore, in view of the particularity of Chinese text and the requirements of sentiment analysis, a Chinese text sentiment analysis method is proposed that integrates multi-granularity semantic features, such as characters, words, radicals, and parts of speech. The method introduces radical features and emotional part-of-speech features on top of character and word features, and integrates them using a bidirectional long short-term memory (BLSTM) network, an attention mechanism, and a recurrent convolutional neural network (RCNN). The softmax function then predicts the sentimental tendency from the integrated multi-granularity semantic features. Comparative experiments on the NLPECC (natural language processing and Chinese computing) dataset showed that the proposed method achieved an F1 score of 84.80%, improving on existing methods to some extent and completing the Chinese text sentiment analysis task.
Acute kidney injury is a clinical disease with a high morbidity rate, and early identification of potential patients can facilitate medical interventions to reduce morbidity and mortality. In recent years, electronic health records have been widely used to predict an individual’s potential risk. Most existing acute kidney injury prediction models tackle the sparsity and irregularity of physiological-variable data by aggregating data or imputing missing values, but ignore the patient health status implied by the missing information. Moreover, they do not consider the characteristics of, and correlations between, the various modalities. To solve these issues, we present a multi-modal disease prediction model for acute kidney injury. The proposed model considers a variety of modal data, including physiological variables, disease, and demographic data. A new mask- and time-span-based long short-term memory (LSTM) network is designed to learn the time span and missing information of individual physiological variables and, furthermore, to capture their numerical and frequency changes. A multi-head self-attention mechanism is introduced to promote interaction learning across the modality representations. Experiments on the real-world applications of acute kidney injury risk prediction and mortality risk prediction demonstrate the effectiveness and rationality of the proposed model.
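The mask and time-span channels that such a network consumes can be sketched as feature construction: the mask marks observed values, and the span counts the steps since the last observation, so the model sees *that* and *for how long* a variable was missing rather than a silently imputed value. This is the input construction only, under the assumption of unit-spaced time steps, not the full model.

```python
import numpy as np

def mask_and_span(values):
    """Build mask (observed=1) and time-span (steps since last observation)
    channels for one physiological variable; NaN marks a missing value."""
    values = np.asarray(values, dtype=float)
    mask = (~np.isnan(values)).astype(float)
    span = np.zeros(len(values))
    for t in range(1, len(values)):
        span[t] = 1.0 if mask[t - 1] == 1.0 else span[t - 1] + 1.0
    filled = np.where(np.isnan(values), 0.0, values)  # placeholder fill
    return filled, mask, span
```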
In real-world scenarios, the various events in the news are not only too nuanced and complex to distinguish, but also involve multiple entities. Previous event-centric methods detect events first and then extract arguments, relying on the imperfect performance of event-trigger detection; this process is also ill suited to the sheer volume of news in the real world. Given that the performance of named entity recognition (NER) is satisfactory, we shift our perspective from an event-centric to a target-centric view. This paper proposes a new task: target-dependent event detection (TDED), which aims to extract target entities and detect their corresponding events. We also propose a semantic- and syntactic-aware approach that first extracts thousands of target entities and subsequently detects dozens of event types; this approach can be applied to massive corpora. Experimental results on a real-world Chinese financial dataset demonstrated that our model outperformed previous methods, particularly in complex scenarios.
In the era of big data and with the continuous expansion of data, efficient data access poses significant challenges, so designing an efficient index structure is of great significance. ALEX (updatable adaptive learned index) is a learned index that uses a machine learning model to replace the traditional B-tree index structure. Although it offers good time and space performance, it suffers from frequent page faults. To solve this problem and further improve performance, a memory pre-allocation strategy based on huge pages is proposed on top of ALEX, which reduces the page-fault rate and improves the overall performance of ALEX. A pre-allocation strategy is adopted in the memory-allocation phase, and a delayed-release strategy in the memory-freeing phase. Experiments on the Longitudes dataset show that this strategy offers good performance.
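The allocate-ahead/release-late idea can be sketched as a block pool: blocks are grabbed in bulk before they are demanded, and frees are deferred until a watermark is exceeded, trading memory for fewer faults. This is a pure-Python illustration with assumed refill and watermark parameters; the real strategy operates at the huge-page/mmap level.

```python
class PreAllocPool:
    """Sketch of pre-allocation plus delayed release for fixed-size blocks."""

    def __init__(self, block_size, prealloc=4, high_watermark=8):
        self.block_size = block_size
        self.high_watermark = high_watermark
        # pre-allocation phase: grab blocks before any demand arrives
        self.free_list = [bytearray(block_size) for _ in range(prealloc)]

    def alloc(self):
        if not self.free_list:  # refill in bulk rather than one at a time
            self.free_list = [bytearray(self.block_size) for _ in range(4)]
        return self.free_list.pop()

    def release(self, block):
        # delayed release: keep blocks cached, trim only past the watermark
        self.free_list.append(block)
        if len(self.free_list) > self.high_watermark:
            del self.free_list[self.high_watermark:]
```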
Traditional virtual terrain modeling commonly uses a procedural generation method based on manual design, which is not competent for simulation modeling tasks that must reproduce real environments, such as military applications. In this paper, we propose a landscape simulation modeling method based on remote sensing images. Its core is a landscape blended texture generation network (LBTG-Net), which uses a blended texture generator (BTG) to generate landscape blended textures under the supervision of a style discriminator (SD) and a multi-stage classification loss. We then procedurally build the complete virtual environment from the blended texture generated by LBTG-Net. Our method has two main features: (1) accurate land-cover classification of remote sensing image inputs; and (2) high-quality landscape blended texture outputs that guarantee virtual landscape modeling quality. We used multispectral image data from the Sentinel-2 satellite as the experimental dataset. The experimental results showed that our method performs well on mainstream land-cover classification evaluation indicators and can accurately reproduce the environmental distribution of input remote sensing images while completing high-quality virtual terrain simulation modeling.
In this paper, we propose a method for the fast splicing of three-dimensional point clouds of container-terminal lock pins using highly overlapping views. An Azure Kinect depth camera first collects scene point clouds, which are then preprocessed to obtain the target point cloud. For lock pins with slightly different views, the sample consensus initial alignment (SAC-IA) algorithm is used on top of the classic iterative closest point (ICP) algorithm to determine the overlapping positional relationship between two point clouds. In the overall splicing process, the relative size of the bounding-box area projected by the lock pin along the camera's z-direction is used to estimate the general shape of the lock pin. By comparing the area difference between adjacent views, it is also used to select an appropriate number of highly overlapping point cloud views, ensuring registration accuracy while reducing processing time. The experimental results show that the proposed method achieves a lower relative registration error for the lock pin and can quickly establish a workpiece model suitable for type matching.
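At the core of each ICP iteration is the closed-form rigid alignment of matched point pairs (the Kabsch/SVD solution). The sketch below assumes the correspondences are already matched, which in ICP is done by nearest-neighbor search before each call.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i,
    given matched point rows (the SVD step inside one ICP iteration)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)          # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t
```

SAC-IA provides the coarse initial pose; this step is then iterated with re-matched correspondences until the registration error converges.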
The popularity of positioning devices has generated a large volume of vehicle driving data, making it possible to use historical data to predict vehicle travel times. Vehicle driving data consist of two parts: the sequence of road segments the vehicle travels through, and external information such as the departure time and the total path length. How to extract sequence features from the road segments and how to effectively fuse them with the external features are the key issues in travel time prediction. To solve these problems, a transformer-based travel time prediction model is proposed, consisting of two parts: a road-segment sequence processing module and a feature fusion module. First, the sequence processing module uses the self-attention mechanism to process the road-segment sequence and extract its features. The model not only fully considers the spatiotemporal correlation of road speeds between each road segment and the others, but also allows data to enter the model in parallel, avoiding the inefficiency caused by the sequential input required by recurrent neural networks. The feature fusion module then fuses the sequence features with external information, such as the departure time, to obtain the predicted travel time. On this basis, the number of road segments connected at each intersection is determined from the segment's upstream and downstream intersection features and fed into the model together with the road-segment characteristics to further improve prediction accuracy. Comparative experiments against mainstream prediction methods on real datasets show that the model improves both prediction accuracy and training speed, reflecting the effectiveness of the proposed method.
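The two modules can be sketched in their simplest forms: parameter-free scaled dot-product self-attention over the segment sequence, followed by pooling and concatenation with the external features. The learned Q/K/V projections and the prediction head of the real model are omitted.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over the road-segment sequence:
    every segment attends to every other in one parallel matrix product."""
    d = X.shape[-1]
    w = np.exp(X @ X.T / np.sqrt(d))
    w /= w.sum(axis=-1, keepdims=True)     # rows are attention distributions
    return w @ X

def fuse(seq_features, external):
    """Pool the attended sequence and concatenate external features such as
    departure time (a stand-in for the paper's feature-fusion module)."""
    return np.concatenate([seq_features.mean(axis=0), external])
```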
With the development of the Internet of Things, a large number of sensor devices can be connected to a network. Anomaly detection of the data generated by these devices is related to the stability of system services. A time series database is a database system optimized for time series data. As an important component of a monitoring system, time series databases are responsible for storing and querying continuous streams of time series data. Current time series databases, however, cannot fully utilize system computing resources and cannot meet latency requirements when coping with queries from multiple data sources. To address these drawbacks, we redesigned the query execution model of the well-known InfluxDB and propose InfluxDB-PP (parallel processing). The experimental results show that InfluxDB-PP reduces query latency by about 85.7% compared to InfluxDB for real-time anomaly data query scenarios.
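The core idea of parallelizing per-source queries can be sketched with a thread pool: one query runs per data source concurrently, and results are gathered in source order. This is an illustration of the execution pattern, not InfluxDB-PP's actual engine code.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_query(sources, run_query, max_workers=4):
    """Execute one query per data source concurrently and collect the
    results in source order; overlapping the per-source work is what lets
    the executor saturate otherwise idle computing resources."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, sources))
```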
With the continuous development of industrial intelligent inspection technology, equipment element state recognition systems based on digital image processing are widely used. To improve the accuracy of power distribution cabinet (PDC) equipment element state recognition in a distribution room, a ResNet (residual network)-based equipment element state recognition method is proposed. First, the data acquisition system is set up and the dataset is constructed. Then, for each PDC image, the preset device-component target area is cropped to generate device-component images. For these images, a ResNet-based component state recognition model is constructed and trained, and the trained model is used to identify component states. Taking the PDC equipment element dataset from substation distribution rooms as the research object, a network with a single prediction head is adopted for components with complex features, and a network with multiple prediction heads for components with simple features. A compact-and-prune model compression method is then used to reduce the number of parameters and the amount of computation with little loss of accuracy. Finally, the architecture of the inspection system is introduced; a Jetson Nano edge terminal runs the algorithm module to reduce communication costs.
With the continuous development of China’s electric power system, the security and reliability of the power supply directly affect regional production output and people’s economic life. As an important part of the power dispatch system, traditional fault location relies on the accumulated experience and manual judgment of dispatchers. Faced with increasing demands, fault location that relies solely on the traditional method is likely to suffer rising misjudgment rates and pose a threat to the stable operation of the power system. To address this challenge, this paper proposes a grid fault-location algorithm based on Boolean equations derived from Kirchhoff’s laws. The fault-location problem is effectively converted into a Boolean mixed linear programming problem and solved with a simulated annealing algorithm. Applied to the grid network for the fast positioning of small faults, the approach reduces the scheduling error rate and shortens the time from fault occurrence to fault isolation and processing; in turn, this saves human resources and improves scheduling efficiency.
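A generic simulated annealing loop over Boolean fault-indicator vectors can be sketched as follows; the cost function, cooling schedule, and step counts here are illustrative assumptions, whereas the paper's cost would encode the Kirchhoff-law Boolean equations.

```python
import math
import random

def simulated_annealing(cost, n_bits, steps=2000, t0=1.0, seed=0):
    """Minimize `cost` over Boolean vectors: flip one bit per step, accept
    worse states with probability exp(-delta/T) under a linear cooling."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n_bits)]
    best, best_cost = state[:], cost(state)
    cur_cost = best_cost
    for step in range(steps):
        T = t0 * (1.0 - step / steps) + 1e-9
        i = rng.randrange(n_bits)
        state[i] ^= 1                       # propose a single bit flip
        c = cost(state)
        if c <= cur_cost or rng.random() < math.exp((cur_cost - c) / T):
            cur_cost = c
            if c < best_cost:
                best, best_cost = state[:], c
        else:
            state[i] ^= 1                   # reject: revert the flip
    return best, best_cost
```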
Adaptive learning is an educational method that uses computer algorithms to coordinate interaction with learners, and provides customized learning resources and learning activities to address the unique needs of each learner. With the impact of COVID-19, adaptive learning has become increasingly important. One of the challenges with adaptive learning is how to provide personalized learning resources for learners—i.e., how to generate personalized recommendations for learners from a large set of learning resources. Existing methodologies mainly generate recommendations based on a learner’s knowledge level; however, this approach has some limitations. Firstly, when assessing a learner’s knowledge level, the learner's forgetting phenomenon has to date not been well modeled. Secondly, recommendations are generated separately from knowledge tracing tasks, ignoring the interconnectedness between these aspects. In addition, learners’ preferences for the type of learning resources and learning strategies are normally ignored if the knowledge level alone is used. To solve the aforementioned problems, this paper proposes a knowledge and personality incorporated multi-task learning framework (KPM) to boost course recommendations (i.e., the above-mentioned learning resources); the proposed method regards an enhanced knowledge tracing task (EKTT) as an auxiliary task to assist the primary course recommendation task (CRT). Specifically, using EKTT, we design a personalized forgetting controller to enhance the deep knowledge tracing model for accurately assessing a learner’s knowledge level. With CRT, we combine the learner’s knowledge level and sequential behavior with their personality adapted to the specific context to obtain the learner’s profile; this data is subsequently used to generate a course recommendation list.
Experimental results on real-world educational datasets demonstrate the superiority of our proposed method in terms of hit ratio (HR), normalized discounted cumulative gain (NDCG), and precision, indicating that our method can generate more personalized recommendations.
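The role of a forgetting controller can be illustrated with the classic Ebbinghaus exponential decay applied to a mastery estimate; the fixed `strength` parameter here is an assumption, whereas KPM learns a personalized decay per learner.

```python
import math

def decayed_mastery(mastery, hours_since_review, strength=24.0):
    """Ebbinghaus-style forgetting: the mastery estimate decays
    exponentially with the time elapsed since the last review."""
    return mastery * math.exp(-hours_since_review / strength)
```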
Distantly supervised relation extraction reduces the burden of manual annotation but captures noisy instances, which hinder the training and testing process. To alleviate this problem, we propose a denoising method based on the influence function. The influence function measures the effect of each training point, defined as the change in test loss after that point is removed. We observed that this property can be used to determine whether a training instance is noisy. First, we designed a scoring function based on the influence function. Then, we integrated it into a bootstrapping framework to obtain the final denoised dataset from a small clean set. With this preprocessing, any distantly supervised dataset can be denoised by our method. Experimental results showed that the denoised dataset achieves good performance on a public dataset.
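The quantity being approximated can be made concrete with exact leave-one-out retraining: score each training point by how the clean-set loss changes when that point is removed. The influence function is a first-order approximation of exactly this (retraining is shown here only because it is easy to verify; it is far too slow for real datasets).

```python
import numpy as np

def loo_influence_scores(train_X, train_y, test_X, test_y, fit, loss):
    """Score each training point by the change in clean-set loss when it is
    dropped; a strongly negative score means removing the point helps,
    flagging it as likely noise."""
    base = loss(fit(train_X, train_y), test_X, test_y)
    scores = []
    for i in range(len(train_y)):
        keep = np.arange(len(train_y)) != i
        model = fit(train_X[keep], train_y[keep])
        scores.append(loss(model, test_X, test_y) - base)
    return np.array(scores)
```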
Knowledge tracing tracks changes in a student’s knowledge level based on historical question records and other auxiliary information, and predicts the result of the student’s next answer. Since the performance of existing neural network knowledge tracing models needs improvement, this paper proposes a deep residual network based on stacked gated recurrent unit (GRU) networks, named the stacked-gated-recurrent-unit-residual (S-GRU-R) network. To address the overfitting caused by the many parameters of a long short-term memory (LSTM) network, the solution uses GRUs instead of LSTM to learn the question-sequence information. Stacking GRUs expands the sequence learning capacity, and residual connections reduce the difficulty of model training. Experiments on the Statics2011 dataset used AUC (area under the curve) and F1-score as evaluation metrics, and the results showed that S-GRU-R surpassed other similar recurrent neural network models on both indicators.
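The stacked-GRU-with-residuals structure can be sketched with a minimal GRU cell: each layer processes the whole sequence, and a skip connection adds the layer's input back to its output. Biases are omitted and input and hidden sizes are assumed equal so the residual addition is well defined; the real model is trained end to end with learned weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, Wz, Wr, Wh):
    """One GRU step with concatenated [x, h] weight matrices (no biases)."""
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                          # update gate
    r = sigmoid(Wr @ xh)                          # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h]))
    return (1 - z) * h + z * h_tilde

def stacked_gru_residual(xs, layers):
    """Run a sequence through stacked GRU layers, adding a residual
    shortcut around each layer as in the S-GRU-R design."""
    seq = xs
    for Wz, Wr, Wh in layers:
        h = np.zeros(Wz.shape[0])
        out = []
        for x in seq:
            h = gru_cell(x, h, Wz, Wr, Wh)
            out.append(h)
        seq = [o + x for o, x in zip(out, seq)]   # residual connection
    return seq
```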