In this article we propose an algorithm for classifying, tracking, and counting vehicles and pedestrians in video sequences. The algorithm has two parts: a classification stage based on convolutional neural networks and implemented with the You Only Look Once (YOLO) method, and a proposed tracking stage that follows regions of interest according to a well-defined taxonomy. For the classification stage, we train the network and evaluate its performance on a set of more than 50,000 labels, which we make available for use. The tracking algorithm is evaluated against manual counts in video sequences of different scenarios captured at the management center of the Secretaría Distrital de Movilidad of Bogotá.
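
As an illustration of what evaluating against manual counts could look like, the following minimal sketch computes a per-class relative counting error. The function name `counting_error`, the class names, and all numbers are hypothetical placeholders rather than results or code from this work, and the metric itself is an assumption about one reasonable way to compare automatic and manual counts.

```python
from typing import Dict


def counting_error(manual: Dict[str, int], automatic: Dict[str, int]) -> Dict[str, float]:
    """Return the relative absolute counting error for each class.

    Classes whose manual count is zero are skipped, because the relative
    error is undefined for them.
    """
    errors: Dict[str, float] = {}
    for cls, truth in manual.items():
        if truth == 0:
            continue
        # A class the automatic counter never reported counts as zero.
        predicted = automatic.get(cls, 0)
        errors[cls] = abs(predicted - truth) / truth
    return errors


# Hypothetical counts for one video sequence (not data from the article).
manual_counts = {"car": 120, "bus": 15, "pedestrian": 48}
automatic_counts = {"car": 114, "bus": 15, "pedestrian": 54}

errors = counting_error(manual_counts, automatic_counts)
for cls, err in errors.items():
    print(f"{cls}: {err:.1%} relative error")
```

Reporting the error per class, rather than as a single total, keeps over-counts in one class from cancelling under-counts in another.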