Several computer vision algorithms have been proposed to detect anomalous activities (robberies, murders, and vandalism, among others) in videos. According to their learning approach, they can be classified into methods based on probabilistic distribution modeling, sparse coding, and deep learning. The main drawbacks of these approaches are (i) reliance on low-level features that do not capture the complex behaviors of instances in the scene, (ii) generation of features from irrelevant regions, (iii) neglect of the relationships among objects, and (iv) omission of long-term dependencies. To address these issues, we propose a deep learning architecture that models the relationships among objects with an attention mechanism and learns long-term dependencies with a multilayer recurrent neural network (multilayer LSTM). The proposed algorithm achieves an AUC score of 0.749 on the UCF-Crime dataset, which is competitive with several state-of-the-art approaches for anomaly detection in surveillance videos. The architecture also explains the relationship between regions in the video frames and the detected anomalies.
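
As an illustration of how an attention mechanism can weight object regions within a frame, the following is a minimal sketch and not the proposed architecture itself. It assumes a dot-product soft attention, which the text above does not specify. The names `softmax`, `attend`, `regions`, and `query` are hypothetical, and the feature values are toy numbers. In a full model, the pooled vector of each frame would presumably be passed to the multilayer LSTM, which is not shown here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, query):
    """Soft attention over the object regions of one frame.

    Assumption: relevance is the dot product between each region's
    feature vector and a query vector (learned in a real model).
    Returns the attention weights and the weighted sum of the features.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in region_features]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, pooled


# Toy frame with three hypothetical object regions, two features each.
regions = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
query = [1.0, 0.0]
weights, pooled = attend(regions, query)
```

Because the weights are non-negative and sum to one, they can be read as the relative contribution of each region to the frame representation, which is one way such a model could relate regions to its detections.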