Near-surface air temperature (Ta) and land surface temperature (Ts) are essential parameters for research in agriculture, hydrology, and ecological change, all of which require accurate datasets at a range of temporal and spatial resolutions. However, meteorological stations in Northwest China are sparsely distributed and may not provide Ta data of sufficient precision. It is also unclear whether improving the accuracy of Ts, the variable with the greatest influence on Ta, is necessary. To address these issues, this study estimates Ta for Northwest China using multiple linear regression (MLR) and random forest (RF) models, based on Landsat 8 images and auxiliary data collected from 2014 to 2019. Ts, the Normalized Difference Vegetation Index (NDVI), surface albedo, elevation, wind speed, and Julian day were the candidate variables; after analysis and adjustment, the selected variables were used to estimate the daily average Ta. In addition, the Radiative Transfer Equation (RTE) method for calculating Ts was corrected using NDVI (RTE-NDVI). The results show that: 1) RTE-NDVI improved the accuracy of Ts; 2) both the MLR and RF models are suitable for estimating Ta in areas with few meteorological stations; 3) the temporal and spatial distribution of errors indicates that the MLR model performs well in spring and summer and less well in autumn, and that accuracy is higher in plain areas away from mountains than in mountainous areas and their surroundings. This study shows that, with appropriate selection and combination of variables, the accuracy of pixel-scale Ta estimates from satellite remote sensing data can be improved in areas with limited meteorological data.
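To make the MLR step concrete, the following is a minimal sketch of fitting a multiple linear regression of Ta on the six candidate variables and scoring it on held-out samples. It is not the implementation used in this study. The samples are synthetic, the coefficients of the generating relation are invented for illustration, elevation is expressed in kilometres, and all function names are hypothetical. Only the Python standard library is used so that the sketch runs without dependencies.

```python
import math
import random

# Predictor order follows the candidate variables named in the abstract.
PREDICTORS = ["Ts", "NDVI", "albedo", "elevation_km", "wind_speed", "julian_day"]


def make_synthetic_samples(n, seed=0):
    """Generate made-up (predictors, Ta) pairs. These are NOT data from the study."""
    rng = random.Random(seed)
    rows, targets = [], []
    for _ in range(n):
        ts = rng.uniform(-10.0, 45.0)    # land surface temperature, deg C
        ndvi = rng.uniform(0.0, 0.8)
        albedo = rng.uniform(0.05, 0.40)
        elev = rng.uniform(0.5, 4.0)     # km (assumed unit, for conditioning)
        wind = rng.uniform(0.0, 10.0)    # m/s
        day = rng.uniform(1.0, 365.0)
        # Hypothetical linear relation plus Gaussian noise (sd = 1.0).
        ta = (2.0 + 0.7 * ts + 3.0 * ndvi - 5.0 * albedo - 1.5 * elev
              - 0.2 * wind + 0.005 * day + rng.gauss(0.0, 1.0))
        rows.append([ts, ndvi, albedo, elev, wind, day])
        targets.append(ta)
    return rows, targets


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, slope_1, ..., slope_k].
    """
    design = [[1.0] + r for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    """Apply the fitted linear model to one sample."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def rmse(coef, rows, targets):
    """Root-mean-square error of the model on the given samples."""
    sq = [(predict(coef, r) - t) ** 2 for r, t in zip(rows, targets)]
    return math.sqrt(sum(sq) / len(sq))


rows, targets = make_synthetic_samples(400)
train_x, test_x = rows[:300], rows[300:]
train_y, test_y = targets[:300], targets[300:]

coef = fit_mlr(train_x, train_y)
test_rmse = rmse(coef, test_x, test_y)
print(f"held-out RMSE: {test_rmse:.2f} deg C (synthetic noise sd = 1.0)")
for name, c in zip(["intercept"] + PREDICTORS, coef):
    print(f"{name:>13s}: {c: .4f}")
```

In practice a numerical library would replace the hand-written solver, and an RF model could be fitted to the same predictor table and compared on the same held-out split. Because the synthetic relation here is linear by construction, the sketch says nothing about the relative performance of MLR and RF on real data.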