Milestone Systems introduces vision language model for intelligent video analysis

January 8, 2026

Milestone Systems has introduced a new Vision Language Model (VLM) specifically designed to understand traffic scenarios. The solution is based on Nvidia Cosmos Reason and addresses key challenges in modern video security: enormous amounts of data, manual evaluation and time-consuming analysis processes.

The new VLM forms the technological basis for two new products: Video Summarisation for XProtect Video Management Software and VLM as a Service (VLMaaS) for third-party providers and developers.

Video Summarisation for XProtect: Quick insights instead of hours of viewing

Modern video systems capture huge volumes of data every day. Manually reviewing video material ties up resources and delays decisions. With Video Summarisation for XProtect, Milestone uses generative AI to automatically summarise video content, identify relevant events and automate reports.

Users can search visual data in the form of structured summaries, enabling them to gain actionable insights more quickly. The tool can be installed directly in XProtect Smart Client in just a few minutes and is available to download free of charge. Users are charged only per query for their use of the VLM – a low-barrier entry point into AI-supported video analysis.

VLM as a Service: Production-ready video AI via API

With Hafnia VLM as a Service, Milestone is opening up its AI technology to developers, system integrators and technology partners. An API gives them access to a production-ready vision language model without having to build their own AI infrastructures or train models at great expense.

VLMaaS makes it possible to quickly add generative video intelligence to existing applications, regardless of how much analytics capability they already have. This significantly accelerates the development of new solutions: according to Milestone, the effort required is up to 70 times lower than for fine-tuning your own VLM.

Key features include:

  • High-precision, traffic-optimised vision language model based on Nvidia Cosmos Reason
  • Prompt-based control for traffic-related analysis tasks
  • API-first approach with easy integration via HTTPS
  • Fine-tuned models for US and EU markets, with other regions to follow
  • Flexible use as a standalone solution or integrated into the Milestone portfolio
  • 100% responsibly sourced training data with traceable data origin, compliant with GDPR and EU AI Act
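
As an illustration of what prompt-based control over an HTTPS API could look like, the following minimal Python sketch builds a request without sending it. The article does not publish the actual VLMaaS interface, so the endpoint URL, the JSON field names (`video_url`, `prompt`) and the bearer-token authentication are all assumptions made for this example.

```python
import json
import urllib.request

# Hypothetical endpoint: the real VLMaaS URL is not given in the article.
API_URL = "https://vlmaas.example.com/v1/analyze"


def build_request(video_url: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS POST carrying a prompt and a video reference."""
    # Assumed payload shape: one video reference plus a free-text prompt.
    payload = {"video_url": video_url, "prompt": prompt}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request(
        "https://example.com/clips/junction-12.mp4",
        "Summarise any near-miss events between vehicles and cyclists.",
        "YOUR_API_KEY",
    )
    # Inspect the request; sending it would require urllib.request.urlopen(req).
    print(req.get_method(), req.full_url)
```

In a pay-per-use model, each request sent this way would presumably count as one billable API call, so a client would want to keep prompts specific rather than issuing many exploratory queries.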

The pricing model is pay-per-use and based on API calls, so there are no high upfront investments or individual training costs.

Trustworthy AI for safety-critical applications

A key differentiator of the new VLM is its consistent focus on responsible AI. The training data used is fully auditable and specifically optimised for real-world traffic scenarios. This makes the solution particularly suitable for safety-critical applications such as traffic management, smart cities and public infrastructure.

Andrew Burnett, Acting Chief Technology Officer at Milestone Systems, explains:

‘With Video Summarisation for XProtect and the Vision Language Model as a service, we are addressing two of the biggest bottlenecks in video security: information overload and time-consuming manual work. Operators get instant insights directly in XProtect, developers get production-ready intelligence via API – without complex training or infrastructure projects.’

Conclusion

With the introduction of its Vision Language Model, Milestone Systems is taking an important step in the development of intelligent video analysis. The combination of generative AI, clear specialisation in traffic applications and regulatory compliance opens up new possibilities for converting video data into actionable decisions faster, more efficiently and more securely – for both operators and developers.
