
About this event
💡 Workshop focus: When inference cost becomes a ceiling for growth, your architecture is your competitive advantage. This session focuses on strategies for coordinating cloud and on-device models: when to run on-device, how to split the workload, and techniques like caching, compression, and routing to balance experience, privacy, and cost.
Workshop format: 1-to-10 enablement session
🚀 What You’ll Learn:
Understand the capabilities and trade-offs of cloud vs. on-device AI and common architectural patterns
Master the key levers for inference efficiency: latency, throughput, cost, and energy consumption
Design layered invocation strategies: small-model-first, routing, fallback, and caching
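The layered invocation strategy above can be sketched as a small decision chain: check a cache, try a small model first, and fall back to a larger model only when confidence is low. Everything here is a hypothetical stand-in (the model functions, the confidence threshold, and the cache size are assumptions for illustration, not a real API):

```python
# Sketch of a layered invocation strategy: cache -> small model -> large model.
# All model functions are stand-ins; the threshold is an assumed tuning knob.

from functools import lru_cache

CONFIDENCE_THRESHOLD = 0.8  # assumption: tuned per workload in practice


def small_model(prompt: str) -> tuple[str, float]:
    """Stand-in for a fast on-device model; returns (answer, confidence)."""
    if len(prompt) < 20:  # pretend short prompts are "easy" for the small model
        return f"small:{prompt}", 0.95
    return f"small:{prompt}", 0.4


def large_model(prompt: str) -> str:
    """Stand-in for a slower, costlier cloud model."""
    return f"large:{prompt}"


@lru_cache(maxsize=1024)  # caching layer: repeated prompts skip inference entirely
def answer(prompt: str) -> str:
    reply, confidence = small_model(prompt)  # small-model-first
    if confidence >= CONFIDENCE_THRESHOLD:
        return reply  # confident on-device answer: no cloud cost
    return large_model(prompt)  # fallback: route hard prompts to the cloud


print(answer("hi"))  # short prompt stays on-device
print(answer("a much longer and harder prompt"))  # routed to the cloud model
```

The design point the workshop topics imply: the router only pays for the large model when the cheap path fails, and the cache sits in front of both so repeated queries cost nothing.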
🤖 Who It’s For:
AI engineering leads, architects, and product/tech owners
Teams building agents, developer tools, mobile AI, or smart hardware
Projects hitting a wall with inference cost or latency
Topics & Tags
AI
Date & time
Sunday, March 22, 2026 · 9:00 PM – 11:00 PM
America/Los_Angeles
Location
620 Hansen Way, Palo Alto, CA 94304
Organised by
FounderGro Events