Improving the prediction of sub-cellular locations of proteins with a particle swarm optimization-based boosting strategy

Closed Access
Minciencias ID: ART-0000043222-29694
Ranking: ART-GC_ART

Abstract:

Learning from imbalanced data sets presents an important challenge to the machine learning community. Traditional classification methods, which seek to minimize the overall error rate on the whole training set, do not perform well on imbalanced data because they assume a relatively balanced class distribution and place too much emphasis on the majority class. This is a common scenario when predicting sub-cellular locations of proteins, since proteins belonging to certain locations are naturally more abundant or have been studied more extensively. In this work, a new method for learning from imbalanced data, called SwarmBoost, is proposed in order to reduce the overlap and noise in imbalanced datasets and improve prediction performance. The method combines oversampling, subsampling based on particle swarm optimization, and ensemble methods. Our results show that SwarmBoost matches and in several cases outperforms other common boosting algorithms such as DataBoost-Im and AdaBoost, making it a useful tool for improving sub-cellular location predictions.
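
The abstract's point about overall error rate can be illustrated with a small sketch. It is not taken from the paper and does not implement SwarmBoost. The class labels, the 95/5 split, and the helper functions `accuracy`, `recall`, and `balanced_accuracy` are all assumptions made up for illustration. It shows how a classifier that always predicts the majority class can reach a low overall error rate while never identifying a minority-class protein.

```python
# Hypothetical toy data (not from the paper): 95 proteins in a majority
# location and 5 proteins in a minority location.
y_true = ["cytoplasm"] * 95 + ["peroxisome"] * 5

# A degenerate classifier that always predicts the majority class.
y_pred = ["cytoplasm"] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly (1 - overall error rate)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    labels = sorted(set(y_true))
    return sum(recall(y_true, y_pred, c) for c in labels) / len(labels)


print("accuracy:", accuracy(y_true, y_pred))
print("recall (cytoplasm):", recall(y_true, y_pred, "cytoplasm"))
print("recall (peroxisome):", recall(y_true, y_pred, "peroxisome"))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

In this toy setting the overall accuracy is high even though the minority class is never recovered, which is why per-class measures are commonly reported alongside accuracy when evaluating methods on imbalanced data.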

Topic:

Imbalanced Data Classification Techniques

Citations:

1

Altmetrics:

Paperbuzz Score: 0

Source Information:

Source: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Quartile in publication year: Not available
Volume: Not available
Issue: Not available
Pages: 6313-6316
pISSN: Not available
ISSN: 2375-7477

Links and Identifiers:

Non-specialized editorial publications