Google adds Gemma 4 12B support to local laptop apps through AI Edge

Google expands local AI options for laptops

Google says its latest open model, Gemma 4 12B, can now run in a range of laptop-focused apps and developer tools through the Google AI Edge stack. The company is positioning the model as a way to bring multimodal and agentic AI capabilities onto everyday machines, with local processing rather than cloud-based execution.

In a June 3 blog post, Google highlighted several products and workflows that now support the model. The company said the combination of Gemma 4 12B and AI Edge makes it possible to build and test local AI features directly on a laptop, including code generation, data analysis, visual output creation, voice dictation, and other tool-use tasks.

Apps now support the model on macOS

Google AI Edge Gallery, a local AI showcase app, is now available on macOS and can use Gemma 4 12B for coding-related tasks. Google said the app can generate scripts on the fly, run them locally, and help users turn raw data into charts and visual summaries. In one example, the company described prompting the model to create a Python program that would generate a chart comparing baby-name data from two years.

The company also pointed to a more advanced coding example involving 3D rendering, saying the model was able to generate code, specify dependencies and correct itself within a single interaction.

Google AI Edge Eloquent, the company’s dictation and editing app, is also now available on macOS. According to Google, the desktop version runs entirely on-device, including transcription and editing features. A new Voice Edit feature lets users speak commands to rewrite or transform text in other applications. Google gave examples such as turning notes into an executive summary or translating text into Hindi.

Google said Gemma 4 12B improves the quality of these interactions, with stronger instruction following and tighter adherence to the user’s requested scope than earlier models. The company said it sees more than a 60% improvement in overall quality in these tasks.

Local serving for developers

Google is also expanding LiteRT-LM, a lightweight tool for running language models locally, with a new serve command. That command lets the CLI act as a local server with an interface compatible with standard AI tools and frameworks. Google said developers can connect external apps and extensions, including OpenClaw, Hermes, OpenCode, Pi, Continue and Aider, to a local Gemma 4 12B endpoint.

The blog post included example terminal commands showing how to import the model and start a local server. Google said the approach is intended to support fully local agentic tools, harnesses and workflows.

Focus on on-device AI

The company is framing the release around privacy, responsiveness and cost efficiency. By keeping data on the user’s device, Google says developers can build local agents and AI assistants without sending information to remote servers. It also said the model is designed to run on everyday laptops, though it directed readers to a model card for system requirements and benchmark details.

The announcement reflects a broader push to make more capable AI models usable outside the cloud. With Gemma 4 12B, Google is betting that laptop-class devices can handle more sophisticated local tasks, from writing and editing to code execution and local app integration.