Temperature effects over the range from -75°C to 125°C are incorporated into a compact modeling framework for gate-all-around (GAA) nanosheet (NS) MOSFETs that combines a fundamental device current equation with two separate artificial neural networks (ANNs). The first ANN predicts the key temperature-dependent physical coefficients embedded in the Grove-Frohman formulation, enabling the model to capture the dominant thermal trends of the drain current and the variations of the major parameters. The second ANN is a residual correction network that generates bias-dependent correction factors to compensate for nonlinearities not fully captured by the core physics equation. Applied to various circuit simulations and compared with raw data across different temperatures, this deep-learning model consistently yields excellent agreement.
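The two-network structure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the layer sizes, the untrained random weights, the names `coeff_net`, `resid_net`, and `drain_current`, the nominal values 0.3 V and 1e-3 A/V², the choice to feed temperature to the second network, and the multiplicative form of the correction. The function `core_current` is a generic square-law placeholder and not the Grove-Frohman formulation itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a small fully connected network."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def normalize_temp(temp_c):
    """Map the -75 C .. 125 C range onto [-1, 1]."""
    return (temp_c - 25.0) / 100.0

# Hypothetical network shapes, chosen only for illustration.
coeff_net = init_mlp([1, 8, 2])  # T -> (threshold shift, log gain scale)
resid_net = init_mlp([3, 8, 1])  # (Vgs, Vds, T) -> log correction factor

def core_current(vgs, vds, vth, beta):
    """Placeholder square-law current standing in for the physics equation."""
    vov = np.maximum(vgs - vth, 0.0)   # overdrive voltage, zero below threshold
    vde = np.minimum(vds, vov)         # clamp Vds at the saturation point
    return beta * (vov * vde - 0.5 * vde**2)

def drain_current(vgs, vds, temp_c):
    """Core equation with ANN-predicted coefficients, times an ANN correction."""
    t = normalize_temp(temp_c)
    # First ANN: temperature-dependent coefficients of the core equation.
    d_vth, log_beta = mlp(coeff_net, np.array([t]))
    vth = 0.3 + d_vth                  # assumed nominal threshold voltage (V)
    beta = 1e-3 * np.exp(log_beta)     # assumed nominal gain factor (A/V^2)
    i_core = core_current(vgs, vds, vth, beta)
    # Second ANN: bias-dependent residual correction, kept positive via exp.
    log_corr = mlp(resid_net, np.array([vgs, vds, t]))[0]
    return i_core * np.exp(log_corr)
```

Expressing the correction as `exp(log_corr)` is one possible design choice: the factor equals 1 when the residual network outputs zero, so the core equation is recovered unchanged, and the corrected current can never change sign.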