Wind energy has emerged as a cornerstone of global efforts to transition to renewable energy, driven by its low environmental impact and significant generation potential. However, the inherent intermittency of wind, shaped by complex and dynamic atmospheric patterns, makes accurate wind speed prediction difficult. Existing approaches, including statistical methods, machine learning, and deep learning, often struggle with the non-linearity and non-stationarity of wind data, high computational demands, and the need for extensive, high-quality datasets. In response to these challenges, we propose a novel neighborhood-preserving cross-dataset data augmentation framework for high-horizon wind speed prediction. The proposed method addresses data variability and dynamic behavior through three key components: (i) uniform manifold approximation and projection (UMAP) is employed as a non-linear dimensionality reduction technique to encode local relationships in wind speed time-series data while preserving neighborhood structures; (ii) a localized cross-dataset data augmentation (DA) approach operates in the UMAP-reduced spaces to enhance data diversity and mitigate variability across datasets; and (iii) recurrent neural networks (RNNs) are trained on the augmented datasets to model temporal dependencies and non-linear patterns. The framework was evaluated on datasets from diverse geographical locations: the Argonne Weather Observatory (USA), Chengdu Airport (China), and Beijing Capital International Airport (China). Comparative tests using regression-based measures on RNN, GRU, and LSTM architectures showed that the proposed method improved the accuracy and generalizability of predictions and reduced prediction error on average.
Consequently, our study highlights the potential of integrating advanced dimensionality reduction, data augmentation, and deep learning techniques to address critical challenges in renewable energy forecasting.
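
To make the localized cross-dataset augmentation idea concrete, the following is a minimal sketch under stated assumptions, not the implementation evaluated in this work. It assumes that augmentation means adding, for each window of a target dataset, its k nearest windows from a second dataset in a shared embedding space. The function names, the selection rule, the window width, the horizon, and the synthetic series are all hypothetical. The embedding is an identity placeholder so the example runs with the standard library alone; a fitted non-linear reducer such as UMAP would take its place.

```python
import math
import random


def sliding_windows(series, width, horizon):
    """Pair each input window with the value `horizon` steps after it."""
    pairs = []
    for start in range(len(series) - width - horizon + 1):
        window = series[start:start + width]
        target = series[start + width + horizon - 1]
        pairs.append((window, target))
    return pairs


def identity_embed(windows):
    """Placeholder for a fitted reducer such as UMAP (assumption, not the paper's code)."""
    return [list(w) for w in windows]


def augment_from_neighbors(target_pairs, source_pairs, k, embed=identity_embed):
    """Append to the target set the k source windows closest to each target window."""
    windows = [w for w, _ in target_pairs] + [w for w, _ in source_pairs]
    codes = embed(windows)  # one joint embedding for both datasets
    n_target = len(target_pairs)
    target_codes, source_codes = codes[:n_target], codes[n_target:]
    chosen = set()
    for code in target_codes:
        # Rank source windows by Euclidean distance in the embedded space.
        ranked = sorted(range(len(source_codes)),
                        key=lambda i: math.dist(code, source_codes[i]))
        chosen.update(ranked[:k])
    return target_pairs + [source_pairs[i] for i in sorted(chosen)]


# Synthetic stand-ins for two sites; these are NOT the paper's datasets.
random.seed(0)
site_a = [5 + math.sin(t / 5) + random.gauss(0, 0.1) for t in range(60)]
site_b = [6 + math.sin(t / 4) + random.gauss(0, 0.3) for t in range(80)]

target_pairs = sliding_windows(site_a, width=8, horizon=3)
source_pairs = sliding_windows(site_b, width=8, horizon=3)
augmented = augment_from_neighbors(target_pairs, source_pairs, k=2)
print(len(target_pairs), len(source_pairs), len(augmented))
```

The augmented list of (window, target) pairs could then be fed to an RNN, GRU, or LSTM regressor. Restricting the borrowed samples to local neighbors is one plausible way to add diversity without importing windows that are unlike anything in the target site's data.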