Liquid AI has released LFM2.5-230M, a compact foundation model designed to run across a wide range of devices, from cloud GPUs to low-cost CPUs. The company says the model is its smallest LFM release so far and is aimed at developers building agentic workflows, data extraction systems, and other lightweight applications.
The model is built on Liquid AI’s LFM2 architecture and is available in both base and post-trained versions on Hugging Face. Liquid AI says the release is intended to make it easier to fine-tune and deploy models locally, with support for a broad set of inference tools including llama.cpp, MLX, vLLM, SGLang, and ONNX.
According to the company, LFM2.5-230M was pre-trained on 19 trillion tokens and includes a 32K context extension stage. Liquid AI then applied a three-step post-training process that combines supervised fine-tuning using distillation from its larger LFM2.5-350M model, direct preference optimization, and multi-domain reinforcement learning. The company says the result is a model that balances flexibility for downstream customization with stronger out-of-the-box performance than many small models.
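The direct preference optimization step mentioned above can be illustrated with the standard DPO objective for a single preference pair. This is a minimal sketch, not Liquid AI's training code: the function name, the summed log-probability inputs, and the `beta` value are assumptions chosen for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed token log-probability of a response under
    either the policy being trained or a frozen reference model.
    beta=0.1 is an illustrative default, not a value from the announcement.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written in a numerically stable form.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# If the policy matches the reference, the loss is log(2).
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# If the policy shifts probability toward the chosen response, the loss drops.
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)
print(baseline, improved)
```

The loss falls only when the policy raises the chosen response relative to the rejected one by more than the reference model does, which is what keeps the trained model anchored to its starting point.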
Liquid AI is positioning the release as a practical option for workloads where speed and efficiency matter more than deep reasoning. In its announcement, the company said the model is well suited to large-scale extraction pipelines and on-device agentic tasks, but not intended for advanced math, code generation, or creative writing. That framing suggests the company is targeting enterprise and embedded use cases rather than general-purpose chatbot deployments.
The company highlighted benchmark results across ten tasks covering knowledge, instruction following, extraction, and tool use. In those tests, Liquid AI says, LFM2.5-230M competes with, and in some cases outperforms, models more than twice its size. The benchmark set included GPQA Diamond, MMLU-Pro, IFEval, CaseReportBench, and several BFCL and τ²-Bench evaluations. While the model does not match larger systems in every category, Liquid AI argues that its efficiency profile gives it an advantage for edge deployments.
Liquid AI also pointed to measured inference speeds on consumer hardware. It said the model reached 213 tokens per second on a Samsung Galaxy S25 Ultra and 42 tokens per second on a Raspberry Pi 5. The company says that performance, combined with a small memory footprint, makes the model suitable for devices with limited compute and power budgets. For CPU inference, Liquid AI says it tuned the flash-attention setting differently depending on the device to improve performance.
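To put the quoted speeds in practical terms, the arithmetic below estimates how long each device would take to generate a response of a given length. It is a rough sketch that assumes the reported figures are sustained generation rates and ignores prompt processing and model load time. The 200-token output length and the function name are hypothetical.

```python
# Throughput figures quoted by Liquid AI, in tokens per second.
REPORTED_TOKENS_PER_SECOND = {
    "Samsung Galaxy S25 Ultra": 213,
    "Raspberry Pi 5": 42,
}


def generation_seconds(num_tokens, tokens_per_second):
    """Estimate generation time, assuming a constant token rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical workload: a 200-token structured extraction result.
OUTPUT_TOKENS = 200

for device, rate in REPORTED_TOKENS_PER_SECOND.items():
    seconds = generation_seconds(OUTPUT_TOKENS, rate)
    print(f"{device}: about {seconds:.2f} s for {OUTPUT_TOKENS} tokens")
```

Under these assumptions the phone finishes in just under a second and the Raspberry Pi 5 in just under five seconds, a gap of roughly five times.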
The release also includes an example of on-device robotics. Liquid AI said it deployed the model on a Unitree G1 humanoid robot running on an onboard NVIDIA Jetson Orin. In that setup, the model served as a skill-selection layer that converts a natural-language command into a sequence of tool calls using NVIDIA’s SONIC framework. The company said the demonstration showed how a 230-million-parameter model can be fine-tuned quickly to act as a control interface for a robot.
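A skill-selection layer of this kind can be pictured as a small validator that sits between the model's output and the robot's controller. The sketch below is illustrative only: the JSON tool-call format, the skill names, and their arguments are all hypothetical, and nothing here reflects the actual SONIC interface or Liquid AI's demo code.

```python
import json

# Hypothetical skill registry mapping each skill name to its allowed argument
# names. These are illustrative and not taken from SONIC or the demo.
SKILLS = {
    "walk_forward": {"meters"},
    "turn": {"degrees"},
    "wave": set(),
}


def parse_skill_plan(model_output):
    """Validate a JSON list of tool calls and return ordered (name, args) pairs."""
    calls = json.loads(model_output)
    if not isinstance(calls, list):
        raise ValueError("expected a JSON list of tool calls")
    plan = []
    for call in calls:
        if not isinstance(call, dict):
            raise ValueError("each tool call must be a JSON object")
        name = call.get("name")
        args = call.get("arguments", {})
        if not isinstance(name, str) or name not in SKILLS:
            raise ValueError(f"unknown skill: {name!r}")
        if not isinstance(args, dict):
            raise ValueError(f"arguments for {name} must be a JSON object")
        unexpected = set(args) - SKILLS[name]
        if unexpected:
            raise ValueError(f"unexpected arguments for {name}: {sorted(unexpected)}")
        plan.append((name, args))
    return plan


# Stand-in for what a fine-tuned model might emit for the command
# "walk two meters, turn, then wave".
example_output = (
    '[{"name": "walk_forward", "arguments": {"meters": 2}},'
    ' {"name": "turn", "arguments": {"degrees": 90}},'
    ' {"name": "wave", "arguments": {}}]'
)
plan = parse_skill_plan(example_output)
for name, args in plan:
    print(name, args)
```

Rejecting unknown skills and unexpected arguments before anything reaches the robot is a design choice of this sketch, made so that a malformed model output fails at the parsing stage.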
For developers, the release underscores Liquid AI’s broader push to offer AI that runs wherever it is needed, not only in the cloud. The company describes LFM2.5 as an open-weight family intended for use across Apple, AMD, Qualcomm, and NVIDIA hardware. With LFM2.5-230M, Liquid AI is extending that strategy to a smaller model designed for fast, local inference and easy deployment on constrained devices.