This project develops cross-API agents that can complete real workflows spanning multiple software systems, where actions must be coordinated end-to-end. The work defines the cross-API setting with an explicit dependency structure and measurable completion criteria, then delivers an open benchmark and simulator to evaluate reliability under realistic failures and constraints.
Using large-scale GPU training, we improve completion robustness via reinforcement learning with automatically checked outcomes, enabling agents to validate intermediate results, recover from errors and consistently finalize tasks. The expected result is a step change from tool-using demos to dependable assistants that reduce operational friction in enterprise coordination scenarios such as travel disruption management, scheduling and stakeholder notification, supporting industrial deployment and productisation.
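As an illustration only, the notions of an explicit dependency structure and an automatically checked outcome can be sketched in a few lines of Python. Everything here is an assumption made for the example and is not taken from the project's benchmark or simulator: the `Step` class, the `run_workflow` and `completion_reward` helpers, the three travel-disruption steps, and the binary reward are all hypothetical names and choices.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, object]  # shared state standing in for several back-end systems


@dataclass
class Step:
    """One hypothetical API call plus the steps it depends on."""
    name: str
    action: Callable[[State], None]
    depends_on: List[str] = field(default_factory=list)


def run_workflow(steps: List[Step], state: State) -> List[str]:
    """Run steps in dependency order; a failed step blocks its dependents."""
    done: List[str] = []
    failed = set()
    pending = {s.name: s for s in steps}
    progressed = True
    while pending and progressed:
        progressed = False
        for name in list(pending):
            step = pending[name]
            if any(dep in failed for dep in step.depends_on):
                failed.add(name)  # an upstream failure propagates downstream
            elif all(dep in done for dep in step.depends_on):
                try:
                    step.action(state)
                    done.append(name)
                except RuntimeError:  # simulated API failure
                    failed.add(name)
            else:
                continue  # dependencies not resolved yet; retry on a later pass
            del pending[name]
            progressed = True
    return done


def completion_reward(state: State, criteria: List[Callable[[State], bool]]) -> float:
    """Binary outcome check: 1.0 only if every completion criterion holds."""
    return 1.0 if all(check(state) for check in criteria) else 0.0


# Hypothetical travel-disruption workflow: rebook, then reschedule, then notify.
def rebook_flight(state: State) -> None:
    state["flight"] = "NEW-123"


def update_calendar(state: State) -> None:
    state["calendar"] = state["flight"]


def notify_stakeholders(state: State) -> None:
    state["notified"] = True


steps = [
    Step("notify_stakeholders", notify_stakeholders, ["update_calendar"]),
    Step("update_calendar", update_calendar, ["rebook_flight"]),
    Step("rebook_flight", rebook_flight),
]
criteria = [
    lambda s: "flight" in s,
    lambda s: s.get("calendar") == s.get("flight"),
    lambda s: s.get("notified") is True,
]

state: State = {}
done = run_workflow(steps, state)
reward = completion_reward(state, criteria)
```

In a reinforcement learning setup of the kind described above, a value such as `reward` could serve as the automatically checked training signal, although the project's actual reward design is not specified here.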
Principal Investigator, Company and Country
Şener Özönder, ArtificaX Technologies, Türkiye