Prediction of protein subcellular localization based on multilayer sparse coding

Fund Project:

National Key R&D Program of China (No. 2017YFD0800204), the Fundamental Research Funds for the Central Universities (No. KYZ201600175).

    Abstract:

    To provide a theoretical basis for better understanding the function and properties of proteins, we propose a simple and effective method for predicting protein subcellular localization. First, sparse coding is combined with amino acid composition information to extract features from protein sequences. The resulting features are then integrated by multilayer pooling over dictionaries of different sizes. Finally, the pooled features are fed into a support vector machine for classification. Under the Jackknife test, the prediction success rates on the data sets ZD98, CH317 and Gram1253 reached 95.9%, 93.4% and 94.7%, respectively. The experiments show that the classification algorithm based on multilayer sparse coding can markedly improve the accuracy of protein subcellular localization prediction.
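The feature-extraction steps named in the abstract (amino acid composition, sparse coding, pooling over dictionaries of different sizes) can be illustrated with a minimal sketch. Everything specific here is an assumption and not the authors' implementation: the sliding-window scheme, the window width and step, the dictionary built from randomly sampled windows, the use of greedy matching pursuit as the sparse coder, max pooling, the dictionary sizes (8, 16), and the two toy sequences. The abstract does not state any of these details.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def window_composition(seq, width=10, step=5):
    """Amino acid composition (20-dim frequency vector) of each sliding window."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for start in range(0, max(len(seq) - width, 0) + 1, step):
        counts = np.zeros(len(AMINO_ACIDS))
        for aa in seq[start:start + width]:
            if aa in index:  # skip non-standard residues such as X
                counts[index[aa]] += 1
        total = counts.sum()
        rows.append(counts / total if total else counts)
    return np.array(rows)

def build_dictionary(windows, size, rng):
    """Hypothetical dictionary: `size` randomly chosen windows scaled to unit norm."""
    picks = rng.choice(len(windows), size=size, replace=len(windows) < size)
    atoms = windows[picks].astype(float)
    norms = np.linalg.norm(atoms, axis=1, keepdims=True)
    return atoms / np.where(norms == 0, 1.0, norms)

def matching_pursuit(x, dictionary, n_nonzero=3):
    """Greedy sparse code of x over unit-norm atoms, using at most n_nonzero atoms."""
    residual = x.astype(float).copy()
    code = np.zeros(dictionary.shape[0])
    for _ in range(n_nonzero):
        scores = dictionary @ residual
        k = int(np.argmax(np.abs(scores)))
        if abs(scores[k]) < 1e-12:
            break
        code[k] += scores[k]
        residual -= scores[k] * dictionary[k]
    return code

def multilevel_features(seq, dictionaries):
    """Max-pool absolute sparse codes over windows, one block per dictionary size."""
    windows = window_composition(seq)
    blocks = []
    for dictionary in dictionaries:
        codes = np.array([matching_pursuit(w, dictionary) for w in windows])
        blocks.append(np.abs(codes).max(axis=0))
    return np.concatenate(blocks)

# Toy sequences for illustration only; they are not taken from ZD98, CH317 or Gram1253.
rng = np.random.default_rng(0)
train = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAG"]
all_windows = np.vstack([window_composition(s) for s in train])
dictionaries = [build_dictionary(all_windows, size, rng) for size in (8, 16)]
features = multilevel_features(train[0], dictionaries)
print(features.shape)  # (24,) = 8 + 16 pooled coefficients
```

Each protein ends up with one fixed-length vector regardless of sequence length, which is what allows it to be passed to a classifier such as the support vector machine mentioned in the abstract.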

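Cite this article

Chen Xingjian (陈行健), Hu Xuejiao (胡雪娇), Xue Wei (薛卫). Prediction of protein subcellular localization based on multilayer sparse coding[J]. Chinese Journal of Biotechnology (生物工程学报), 2019, 35(4): 687-696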

History
  • Received: 2018-09-30
  • Published online: 2019-04-18