Abstract
Data mapping plays an important role in data integration and exchange among institutions and organizations that use different data standards. However, traditional rule-based approaches and machine learning methods fail to achieve satisfactory results on the data mapping problem. In this paper, we propose a novel deep learning framework for data mapping called the mixture feature embedding convolutional neural network (MfeCNN). The MfeCNN model converts the data mapping task into a multiclass classification problem. In the model, we incorporated multimodal learning and multiview embedding into a CNN for mixture feature tensor generation and classification prediction. Multimodal features were extracted from various linguistic spaces with a medical natural language processing package, and feature embeddings were then learned by the CNN. A softmax prediction layer based on multiview embedding could classify as many as 10 classes simultaneously. On unbalanced data, MfeCNN achieved the best results (average F1 score, 82.4%) compared with traditional state-of-the-art machine learning models and a CNN without mixture feature embedding. Our model also outperformed a very deep CNN with 29 layers that took free text as input. The combination of mixture feature embedding and a deep neural network can achieve high accuracy for data mapping and multiclass classification.
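The abstract names two generic building blocks: a softmax layer that picks one of several classes, and an F1 score averaged over classes on unbalanced data. The following is a minimal, dependency-free sketch of both, not the authors' implementation. The abstract does not say which averaging scheme was used, so the macro average (unweighted mean of per-class F1) is an assumption here, and the function names and toy labels are hypothetical.

```python
import math
from collections import Counter


def softmax(scores):
    """Turn raw class scores into probabilities (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the index of the class with the highest softmax probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def per_class_f1(y_true, y_pred, labels):
    """Compute F1 = 2TP / (2TP + FP + FN) for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    result = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        result[label] = 2 * tp[label] / denom if denom else 0.0
    return result


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Toy unbalanced data: class 0 dominates, and the classifier always predicts 0.
labels = [0, 1, 2]
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro = macro_f1(y_true, y_pred, labels)
```

On this toy data the accuracy is 0.8 while the macro F1 is roughly 0.30, because the two minority classes are never predicted. That gap is the usual reason for reporting an averaged F1 rather than accuracy when classes are unbalanced.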
Original language | English (US) |
---|---|
Article number | 8368078 |
Pages (from-to) | 165-171 |
Number of pages | 7 |
Journal | IEEE Transactions on Nanobioscience |
Volume | 17 |
Issue number | 3 |
State | Published - Jul 2018 |
Keywords
- Data mapping
- Convolutional neural network
- Deep learning
- Mixture feature embedding
- Multimodal
- Multiview
ASJC Scopus subject areas
- Biotechnology
- Bioengineering
- Medicine (miscellaneous)
- Biomedical Engineering
- Pharmaceutical Science
- Computer Science Applications
- Electrical and Electronic Engineering