
About this event
💡 Workshop focus: When inference cost becomes a ceiling for growth, your architecture is your competitive advantage. This session focuses on strategies for coordinating cloud and on-device models: when to run on-device, how to split the workload, and techniques like caching, compression, and routing to balance experience, privacy, and cost.
Workshop format: 1-to-10 enablement session
🚀 What You’ll Learn:
Understand the capabilities and trade-offs of cloud vs. on-device AI and common architectural patterns
Master the key levers for inference efficiency: latency, throughput, cost, and energy consumption
Design layered invocation strategies: small-model-first, routing, fallback, and caching
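The layered invocation strategy above can be sketched as a small decision chain: check a cache, try a small model first, and fall back to a larger model only when confidence is low. Everything here is a hypothetical stand-in (the model functions, the confidence threshold, and the cache size are assumptions for illustration, not a real API):

```python
# Sketch of a layered invocation strategy: cache -> small model -> large model.
# All model functions are stand-ins; the threshold is an assumed tuning knob.

from functools import lru_cache

CONFIDENCE_THRESHOLD = 0.8  # assumption: tuned per workload in practice


def small_model(prompt: str) -> tuple[str, float]:
    """Stand-in for a fast on-device model; returns (answer, confidence)."""
    if len(prompt) < 20:  # pretend short prompts are "easy" for the small model
        return f"small:{prompt}", 0.95
    return f"small:{prompt}", 0.4


def large_model(prompt: str) -> str:
    """Stand-in for a slower, costlier cloud model."""
    return f"large:{prompt}"


@lru_cache(maxsize=1024)  # caching layer: repeated prompts skip inference entirely
def answer(prompt: str) -> str:
    reply, confidence = small_model(prompt)  # small-model-first
    if confidence >= CONFIDENCE_THRESHOLD:
        return reply  # confident on-device answer: no cloud cost
    return large_model(prompt)  # fallback: route hard prompts to the cloud


print(answer("hi"))  # short prompt stays on-device
print(answer("a much longer and harder prompt"))  # routed to the cloud model
```

The design point the workshop topics imply: the router only pays for the large model when the cheap path fails, and the cache sits in front of both so repeated queries cost nothing.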
🤖 Who It’s For:
AI engineering leads, architects, and product/tech owners
Teams building agents, developer tools, mobile AI, or smart hardware
Projects hitting a wall with inference cost or latency
Topics & Tags
AI
Date & time
Sunday, March 22, 2026 · 9:00 PM – 11:00 PM
America/Los_Angeles
Location
620 Hansen Way, Palo Alto, CA 94304
Organised by
FounderGro Events