Large Language Models (LLMs) are widely deployed in cloud environments but face serious risks to model confidentiality and data privacy. Trusted Execution Environments (TEEs) provide confidential computing through strong isolation and memory encryption, yet most TEEs cannot meet the heavy computation and memory demands of LLM inference. In this work, we evaluate lightweight LLMs in TEE-enabled environments using Intel Trust Domain Extensions (TDX). We analyze the memory usage of LLMs to guide private-memory allocation and compare tokens per second (TPS) across CPU-only, CPU–GPU hybrid, and TEE-based settings. We also apply quantization techniques to further accelerate inference in the TDX environment. Our results show that for lightweight LLMs (up to 7B parameters), TPS on TDX is nearly 4x higher than on the CPU-only baseline. Moreover, INT4 quantization achieves a 3x throughput gain and 70% storage savings relative to FP16.
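To make the two headline metrics concrete, the sketch below shows one conventional way TPS can be computed (generated tokens divided by wall-clock time) and the ideal-case arithmetic behind the INT4 storage saving. This is a minimal illustration, not the paper's evaluation harness: `measure_tps`, `generate_fn`, and `prompt` are hypothetical names introduced here.

```python
import time

# Minimal sketch of a TPS measurement, assuming `generate_fn` is some
# generation callable that takes a prompt and returns the generated
# token ids. Both names are hypothetical placeholders for illustration.
def measure_tps(generate_fn, prompt: str) -> float:
    start = time.perf_counter()
    output_token_ids = generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return len(output_token_ids) / elapsed

# Ideal storage ratio of INT4 vs. FP16 weights: 4 bits / 16 bits = 0.25,
# i.e., a 75% reduction. The ~70% reported above is consistent with this
# once quantization metadata (per-group scales and zero-points) is stored
# alongside the weights.
print(1 - 4 / 16)  # 0.75
```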