Milestone Systems introduces vision language model for intelligent video analysis



January 8, 2026



Product news - Security

Milestone Systems has introduced a new Vision Language Model (VLM) specifically designed to understand traffic scenarios. The solution is based on Nvidia Cosmos Reason and addresses key challenges in modern video security: enormous amounts of data, manual evaluation and time-consuming analysis processes.

The new VLM forms the technological basis for two new products: Video Summarisation for XProtect Video Management Software and VLM as a Service (VLMaaS) for third-party providers and developers.

Video Summarisation for XProtect: Quick insights instead of hours of viewing

Modern video systems capture huge volumes of data every day. Manually reviewing video material ties up resources and delays decisions. With Video Summarisation for XProtect, Milestone uses generative AI to automatically summarise video content, identify relevant events and automate reports.

Users can search visual data in the form of structured summaries, enabling them to gain actionable insights more quickly. The tool can be installed directly in XProtect Smart Client in just a few minutes and is available to download free of charge. Users pay only per VLM query, which gives them a low-threshold entry point into AI-supported video analysis.

VLM as a Service: Production-ready video AI via API

With Hafnia VLM as a Service, Milestone is opening up its AI technology to developers, system integrators and technology partners. An API gives them access to a production-ready vision language model without having to build their own AI infrastructure or train models at great expense.

VLMaaS makes it possible to quickly add generative video intelligence to existing applications, regardless of their existing level of analytics. This significantly accelerates the development of new solutions: according to Milestone, the effort required is up to 70 times lower than for fine-tuning your own VLM.

Key features include:

High-precision, traffic-optimised vision language model based on Nvidia Cosmos Reason
Prompt-based control for traffic-related analysis tasks
API-first approach with easy integration via HTTPS
Fine-tuned models for US and EU markets, with other regions to follow
Flexible use as a standalone solution or integrated into the Milestone portfolio
100% responsibly sourced training data with traceable data origin, compliant with GDPR and EU AI Act

The pricing model is pay-per-use and based on API calls, with no high initial investment or individual training costs.
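To illustrate what prompt-based control over HTTPS and per-call pricing could look like in practice, here is a minimal Python sketch. It is not Milestone's documented API: the endpoint URL, the JSON field names, the bearer-token header and the price per call are all hypothetical placeholders, since the announcement does not publish a request schema or prices. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Hypothetical placeholder: the announcement does not publish an endpoint.
API_URL = "https://vlm.example.com/v1/analyze"


def build_request(video_url: str, prompt: str, region: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS POST carrying a traffic-analysis prompt."""
    # The announcement mentions fine-tuned models for the US and EU markets only.
    if region not in {"US", "EU"}:
        raise ValueError("only US and EU models are mentioned in the announcement")
    # Field names below are assumptions made for illustration.
    payload = {"video_url": video_url, "prompt": prompt, "region": region}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def estimate_cost(calls: int, price_per_call: float) -> float:
    """Pay-per-use: cost grows linearly with the number of API calls."""
    return calls * price_per_call
```

A developer would pass the built request to `urllib.request.urlopen` once a real endpoint and credentials are known. With an assumed price, `estimate_cost` shows how a pay-per-use bill scales with query volume rather than with upfront infrastructure.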

Trustworthy AI for safety-critical applications

A key differentiator of the new VLM is its consistent focus on responsible AI. The training data used is fully auditable and specifically optimised for real-world traffic scenarios. This makes the solution particularly suitable for safety-critical applications such as traffic management, smart cities and public infrastructure.

Andrew Burnett, Acting Chief Technology Officer at Milestone Systems, explains:

‘With Video Summarisation for XProtect and the Vision Language Model as a service, we are addressing two of the biggest bottlenecks in video security: information overload and time-consuming manual work. Operators get instant insights directly in XProtect, developers get production-ready intelligence via API – without complex training or infrastructure projects.’

Conclusion

With the introduction of its Vision Language Model, Milestone Systems is taking an important step in the further development of intelligent video analysis. The combination of generative AI, clear specialisation in traffic applications and regulatory compliance opens up new possibilities for converting video data into actionable decisions faster, more efficiently and more securely – for both operators and developers.
