Methods-Time Measurement (MTM) is a globally recognised system used to analyse manual work by breaking it down into standard motion elements and defining how long each should take. MTM-UAS (Universal Analyzing System), one of its most widely used variants, provides a practical method for standardising manual operations in manufacturing.
However, MTM analysis is still performed manually by industrial engineers, a process that is time-consuming, labour-intensive, difficult to scale, and significantly slower than conventional time studies: analysing a one-minute operation to MTM-UAS standards can take up to forty minutes and requires specialised training. Some MTM parameters, such as distances or product weights, cannot be fully extracted from video data alone, so the goal is a model that automatically generates MTM-UAS codes for the large portion of the task that can be recognised from video. Its output will be easily editable by engineers, enabling them to finalise analyses much faster and with minimal manual effort.
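As a minimal sketch of what such an editable output could look like (the segment structure and the code-to-TMU values below are illustrative placeholders we chose for this example, not official MTM-UAS data; only the conversion factor of 0.036 seconds per TMU is a standard MTM convention):

```python
from dataclasses import dataclass

TMU_TO_SECONDS = 0.036  # standard MTM conversion: 1 TMU = 0.036 s


@dataclass
class Segment:
    """One recognised motion element, editable by the engineer."""
    start_s: float   # segment start in the video, seconds
    end_s: float     # segment end, seconds
    uas_code: str    # proposed MTM-UAS code, e.g. "AA1" (placeholder)
    tmu: int         # time value in TMU for that code (placeholder)


def standard_time_seconds(segments: list[Segment]) -> float:
    """Total standard time implied by the analysed codes, in seconds."""
    return sum(s.tmu for s in segments) * TMU_TO_SECONDS


# Example: two hypothetical segments proposed by the model.
analysis = [
    Segment(0.0, 1.2, "AA1", 20),  # TMU value is illustrative
    Segment(1.2, 2.5, "PA1", 10),  # TMU value is illustrative
]
print(round(standard_time_seconds(analysis), 3))  # 30 TMU * 0.036 = 1.08 s
```

An engineer would correct the proposed `uas_code` and `tmu` fields where the model lacks information (for instance, actual reach distances), rather than building the analysis from scratch.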
There is high demand in the industry for AI-based tools that can automate this analysis, making MTM more accessible and efficient across large-scale operations.
This project aims to develop a domain-specific Vision Language Model (VLM) that can automatically classify operator actions in video data according to MTM-UAS standards. Our technical approach includes building agentic VLM pipelines for long-horizon video understanding, integrating contextual memory, tool-use recognition, and retrieval-based grounding. We will apply parameter-efficient fine-tuning with LoRA and distributed training with DeepSpeed, explore model compression techniques, and enhance robustness to real-world video conditions. Evaluation will be conducted through reproducible benchmarks and collaboration with industry partners.
The result will be a scalable AI system that transforms manual MTM-UAS analysis into a fast, automated, and reliable process, supporting digital transformation and operational excellence in manufacturing.
Cagkan Ekici, Khenda, Türkiye