NVIDIA and Google Cloud have announced a strategic partnership aimed at bringing Google's Gemini AI models, and in particular agentic AI capabilities, directly to customers through local deployments. Rather than operating solely in the cloud, these solutions will run on-premises, giving organizations greater control over their data and infrastructure.
This development is particularly significant for sectors that handle sensitive information, such as healthcare, finance, and government. Built on the NVIDIA Blackwell platform, including HGX and DGX systems, and secured with NVIDIA Confidential Computing, the collaboration provides enhanced protection for enterprise workloads.
The solution will operate on Google Distributed Cloud, allowing businesses to meet stringent data residency and compliance requirements while maintaining high performance. At the same time, the hybrid approach encourages enterprises to leverage both local and cloud infrastructure for greater flexibility and efficiency.
To support scalable deployment, Google Cloud has introduced new tools for managing AI in production environments. The GKE Inference Gateway simplifies and reduces the cost of deploying and scaling AI models, with integrated support for NVIDIA Triton Inference Server and NeMo Guardrails to improve load balancing and governance.
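As a rough illustration of how such a deployment might be wired together, the sketch below uses the `InferencePool` and `InferenceModel` resource kinds from the Kubernetes Gateway API Inference Extension, which GKE Inference Gateway builds on. All names, namespaces, and field values here are hypothetical assumptions for illustration, not details taken from the announcement.

```yaml
# Hypothetical sketch: routing inference traffic through GKE Inference Gateway.
# Resource kinds come from the Kubernetes Gateway API Inference Extension;
# every name and value below is an illustrative assumption.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: triton-pool             # hypothetical pool of Triton Inference Server pods
spec:
  selector:
    app: triton-server          # matches labels on a hypothetical model-server Deployment
  targetPortNumber: 8000        # Triton's default HTTP port
---
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: on-prem-model           # hypothetical model entry
spec:
  modelName: example-model      # model name clients reference in requests (assumed)
  criticality: Critical         # ask the gateway to prioritize this model under load
  poolRef:
    name: triton-pool           # serve it from the pool defined above
```

In this pattern, the gateway load-balances requests across the model-server pods in the pool and applies per-model scheduling policy, which is the kind of model-aware routing and governance the announcement attributes to GKE Inference Gateway.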
Looking ahead, the two companies are also working on advanced observability features with NVIDIA Dynamo. This initiative aims to streamline the scaling of reasoning-centric AI models across multiple infrastructures, laying the groundwork for broader adoption of agentic AI technologies.