Documents are now so easy to produce that the resulting volume of information is almost impossible to organize without automatic methods. Automatic document classification can be defined as an action executed by an artificial system on a set of structured or unstructured documents, in which the words contained in a document are used to determine the class to which it belongs. This paper presents several classification experiments on the Reuters-21578 dataset that compare the performance of naive Bayes classifiers, support vector machines (SVM), and logistic regression. The results show how the classifiers perform, and how they behave when cleaning techniques are applied to reduce document size and under different classification scenarios.
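To make the idea of classifying a document from its words concrete, the following is a minimal sketch of a multinomial naive Bayes classifier with a simple stop-word cleaning step. It is not the authors' implementation. The stop-word list, the function names, and the toy training texts are all assumptions made for illustration, and the examples are not taken from Reuters-21578.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list, used only to illustrate a "cleaning" step.
STOP_WORDS = {"the", "a", "of", "in", "to", "and", "on", "for"}

def tokenize(text, remove_stop_words=True):
    """Lowercase, split on whitespace, and optionally drop stop words."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens

def train_naive_bayes(labeled_docs):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary

def classify(text, model):
    """Return the class with the highest log-posterior, using Laplace smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen during training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set: made-up texts and labels, not Reuters-21578 data.
TRAIN = [
    ("wheat harvest rises in the north", "grain"),
    ("corn and wheat exports fall", "grain"),
    ("crude oil prices climb", "crude"),
    ("opec cuts crude oil output", "crude"),
]
model = train_naive_bayes(TRAIN)
print(classify("wheat exports", model))
print(classify("crude oil output", model))
```

Dropping stop words shrinks each document before counting, which is one simple form of the size-reducing cleaning the abstract mentions. An SVM or logistic regression classifier would replace the counting model with a learned linear decision function over the same word features.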
Topic:
Anomaly Detection Techniques and Applications
Citations:
1
Altmetrics:
0
Source information:
Source: Revista Colombiana De Tecnologías De Avanzada