Predicting Pseudogene–miRNA Associations Based on Feature Fusion and Graph Auto-Encoder

Zhou, Shijia and Sun, Weicheng and Zhang, Ping and Li, Li (2021) Predicting Pseudogene–miRNA Associations Based on Feature Fusion and Graph Auto-Encoder. Frontiers in Genetics, 12. ISSN 1664-8021

pubmed-zip/versions/1/package-entries/fgene-12-781277/fgene-12-781277.pdf - Published Version

Abstract

Pseudogenes were originally regarded as non-functional components scattered across the genome during evolution. Recent studies have shown that pseudogenes can be transcribed into long non-coding RNAs and play key roles at multiple functional levels in different physiological and pathological processes. MicroRNAs (miRNAs) are a class of non-coding RNAs that play important regulatory roles in cells. Numerous studies have shown that pseudogenes and miRNAs interact and form a competing endogenous RNA (ceRNA) network with mRNAs that regulates biological processes and is involved in diseases. Exploring the associations between pseudogenes and miRNAs will facilitate the clinical diagnosis of some diseases. Here, we propose a prediction model, PMGAE (Pseudogene–MiRNA association prediction based on the Graph Auto-Encoder), which combines feature fusion, a graph auto-encoder (GAE), and eXtreme Gradient Boosting (XGBoost). First, we calculated three types of similarity between nodes (Jaccard, cosine, and Pearson similarity) based on the biological characteristics of pseudogenes and miRNAs. Next, we fused these similarities to construct a similarity profile that serves as the initial feature representation of each node. We then used a GAE to aggregate the similarity profiles and the known associations of the nodes into low-dimensional representation vectors. Finally, we fed these representation vectors into an XGBoost classifier to predict new pseudogene–miRNA associations (PMAs). In five-fold cross-validation, PMGAE achieved a mean AUC of 0.8634 and a mean AUPR of 0.8966. Case studies further substantiated the reliability of PMGAE for mining PMAs and for studying endogenous RNA networks in relation to diseases.
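
The similarity and fusion steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which biological characteristics are used or how the three similarities are fused, so the toy binary association matrix, the function names, and the element-wise mean used for fusion are all assumptions made for illustration only.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two binary vectors: |A and B| / |A or B|."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b) if var_a and var_b else 0.0


def similarity_matrix(rows, measure):
    """Pairwise similarity between all row vectors under one measure."""
    return [[measure(r, s) for s in rows] for r in rows]


def fuse(matrices):
    """ASSUMPTION: fuse by element-wise mean; the abstract does not specify the rule."""
    n = len(matrices[0])
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)] for i in range(n)]


# Hypothetical toy data: rows are pseudogenes, columns are miRNAs,
# 1 marks a known association. Not taken from the paper.
associations = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]

# Row i of `fused` is the similarity profile of pseudogene i.
fused = fuse([similarity_matrix(associations, f) for f in (jaccard, cosine, pearson)])
```

In this sketch each row of `fused` plays the role of a node's similarity profile, the initial feature vector that the abstract says is passed, together with the known associations, to the GAE. The same functions could be applied to the transposed matrix to obtain miRNA profiles.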

Item Type: Article
Subjects: South Asian Archive > Medical Science
Depositing User: Unnamed user with email support@southasianarchive.com
Date Deposited: 07 Jan 2023 10:23
Last Modified: 29 Apr 2024 07:49
URI: http://article.journalrepositoryarticle.com/id/eprint/3
