Spectral video has emerged as a non-invasive scientific tool for analyzing the behavior of dynamic scenes at high spectral resolution. In view of its importance, several compressive spectral imaging (CSI) systems have been adapted to acquire compressed projections of dynamic scenes at high frame rates. These acquisition techniques capture and encode the three-dimensional (3D) spatio-spectral information of the scene into a set of two-dimensional (2D) projections, one per time frame. Various computational reconstruction algorithms can be used to recover the underlying spectral video from the compressed measurements, after which processing tasks such as classification and detection can be performed. However, the main challenge of reconstruction-based approaches is their high computational cost, which increases with the number of frames. This paper presents a convolutional neural network (CNN)-based method to segment moving objects directly in the compressive domain. The method exploits spatio-temporal correlations and detects the prominent motion regions without recovering the spectral-temporal data set. Simulation results show that the proposed method outperforms a state-of-the-art approach that also detects motion in the compressive domain, and achieves segmentation performance comparable to that of a method that operates on the reconstructed data.
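
To make the acquisition step concrete, the following is a minimal sketch of how a 3D spatio-spectral cube can be encoded into one 2D projection per time frame. It assumes a coded-aperture system with a binary mask and a dispersion of one pixel per spectral band; neither the specific CSI architecture nor these parameters are stated above, and the function name `compress_spectral_video` is hypothetical.

```python
import numpy as np

def compress_spectral_video(video, mask):
    """Encode a spectral video into 2D compressive projections.

    Assumed (illustrative) model: each band is modulated by the same
    coded aperture, shifted horizontally by its band index, and all
    bands are summed on the detector.

    video: array of shape (T, L, H, W) = frames, bands, rows, columns
    mask:  array of shape (H, W), binary coded aperture
    returns: array of shape (T, H, W + L - 1), one projection per frame
    """
    T, L, H, W = video.shape
    meas = np.zeros((T, H, W + L - 1), dtype=video.dtype)
    for band in range(L):
        # Modulate the band with the mask, shift it by `band` columns,
        # and accumulate it on the detector for every frame at once.
        meas[:, :, band:band + W] += video[:, band] * mask
    return meas

# Synthetic data only: 4 frames, 8 bands, 16 x 16 pixels.
rng = np.random.default_rng(0)
video = rng.random((4, 8, 16, 16))
mask = (rng.random((16, 16)) > 0.5).astype(video.dtype)
y = compress_spectral_video(video, mask)
print(video.shape, "->", y.shape)
```

Under these assumptions each frame of 8 x 16 x 16 = 2048 samples is reduced to a single 16 x 23 projection of 368 samples, which is the kind of measurement a compressive-domain method would operate on without reconstructing the cube.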