Cogment: Our Open-Source Framework for Human-AI Collaboration
The idea of involving humans directly in the training of AI agents is gaining traction, thanks in part to advances in reinforcement learning and human-in-the-loop training. Combining human expertise and judgment with AI's exploration power and pattern-recognition capabilities in intelligent ecosystems is a stepping stone towards better, fairer, and more robust systems. However, the path from the lab to real-life deployment and operation comes with architectural, functional-design, and engineering complexities.
We present Cogment™, a unifying open-source framework that introduces an actor formalism to support a variety of human-agent collaboration topologies and training approaches, including human-led demonstrations, evaluations, and guidance. Cogment addresses the aforementioned complexities and is scalable out of the box thanks to a distributed microservice architecture. This post offers an overview of how Cogment's open-source framework supports distributed multi-actor training, deployment, and operation. If you would rather dive directly into the details, the complete Cogment White Paper is freely available here.
Why Cogment Matters
Achieving human-AI collaboration requires a shared environment where humans and AI agents can operate and train together; without it, AI agents cannot learn from humans' reactions to what they are doing. This goes far beyond human contributions such as data annotation, which is usually carried out offline and only supports specific kinds of training on examples. Moreover, human feedback of varying complexity, human demonstrations, and live operation have all been shown to improve AI training and results. Until now, however, there was no accessible, unifying technological and design framework to quickly develop, train, and deploy such applications. Cogment was designed to answer these needs.
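The key enabler of this shared environment is treating humans and AI agents uniformly: both are "actors" that receive observations and produce actions, so the environment does not need to know which kind it is talking to. The sketch below illustrates that idea in plain Python; the class and function names are illustrative and are not the actual Cogment SDK API.

```python
from abc import ABC, abstractmethod

class Actor(ABC):
    """Uniform interface for any trial participant, human or AI."""
    @abstractmethod
    def act(self, observation):
        """Return an action given the latest observation."""

class RuleBasedAgent(Actor):
    """A simple AI actor: dim the lights late in the evening."""
    def act(self, observation):
        return "dim_lights" if observation.get("hour") >= 21 else "no_op"

class HumanProxy(Actor):
    """Forwards observations to a UI and returns the human's choice."""
    def __init__(self, get_input):
        self._get_input = get_input

    def act(self, observation):
        return self._get_input(observation)

def step(actors, observation):
    """One environment tick: every actor acts on the same observation."""
    return {name: actor.act(observation) for name, actor in actors.items()}

actions = step(
    {"agent": RuleBasedAgent(), "human": HumanProxy(lambda obs: "no_op")},
    {"hour": 22},
)
```

Because the environment only sees the `Actor` interface, a human can be replaced by an AI agent (or vice versa) without changing the trial logic, which is the property the rest of this post builds on.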
Key Features of Cogment
Cogment also accounts for the inherent lag of some common types of human feedback through its retroactive feedback capability (i.e., attaching rewards to past actions), while retaining the ability to train online. For example, a smart-home AI agent that dims the lights at a specific time can retroactively learn from a negative reward when, a few seconds later, humans stand up and walk to the switch to override the AI agent's change.
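The mechanics of retroactive feedback can be sketched as a trial log that keys each action by its tick, so a reward arriving later can be attached to the tick where the action actually happened. This is a minimal illustration of the concept, not the Cogment reward API.

```python
class TrialLog:
    """Records actions by tick so delayed rewards can be attached later."""
    def __init__(self):
        self.steps = {}  # tick id -> {"action": ..., "rewards": [...]}
        self.tick = 0

    def record_action(self, action):
        self.steps[self.tick] = {"action": action, "rewards": []}
        self.tick += 1

    def add_reward(self, tick_id, value):
        # Attach a (possibly delayed) reward to a past action.
        self.steps[tick_id]["rewards"].append(value)

log = TrialLog()
log.record_action("dim_lights")  # tick 0: the agent dims the lights
log.record_action("no_op")       # tick 1: nothing happens
# A few seconds later, the human walks to the switch and overrides,
# so a negative reward is attached retroactively to tick 0:
log.add_reward(0, -1.0)
```

The training process can then consume the log as usual; the only difference is that a step's reward list may still grow for some time after the action was taken.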
Let's say we are training two AI agents in an environment with two humans. The problem with starting training from scratch in this setup is that the humans will have to interact with untrained AI agents for a while before those agents start doing anything useful, which is not an efficient use of human time. Cogment's implementation swapping enables more interesting training setups that work around this issue. Simple examples of training regimens that can easily be implemented using Cogment include:
Bootstrapping with pseudo-humans: implement simple rule-based AI agents that simulate or mimic human behaviour, and run a large number of fully simulated trials with this setup. Once the AI agents have reached a good performance level, start involving actual humans. For example, a product recommendation agent needs to learn the correlation between a recommendation and a purchase, which can take some time. Implementing an average or stereotypical human buyer behaviour (based on historical statistics, for example) helps the agent learn a baseline, readying it to pick up subtler behaviour from actual humans.
Bootstrapping with business expertise-based AI agents: implement the two agents using, for example, a rule-based system to provide some value to the human, and add some stochasticity to bolster variety. These agents start generating data that can be used to train Machine Learning (ML) based policies; once those are good enough, the ML-based implementations can replace the initial ones. We can imagine, for example, a sensitive use case (such as a 911 dispatcher or air-traffic support agent) in which a default, average-but-safe heuristic policy interacts with the human, even in a real environment. A learning agent can then be trained on this average-but-safe agent's experience and learn to make better decisions. Once the learning agent is deemed safe enough, its implementation can be swapped in.
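The first regimen above, bootstrapping with pseudo-humans, can be sketched as a rule-based buyer whose purchase behaviour follows baseline statistics. The purchase rates and function names below are invented for illustration; in practice the baseline would come from historical data.

```python
import random

# Hypothetical category-level purchase probabilities for an "average" buyer.
BASELINE_BUY_RATE = {"shoes": 0.30, "hats": 0.10, "socks": 0.05}

def pseudo_human_buyer(recommendation, rng):
    """Mimic a stereotypical buyer: purchase with a baseline probability."""
    return rng.random() < BASELINE_BUY_RATE.get(recommendation, 0.0)

def run_simulated_trials(policy, n_trials, seed=0):
    """Run fully simulated trials of a recommendation policy against the
    pseudo-human, returning the observed purchase rate."""
    rng = random.Random(seed)
    purchases = 0
    for _ in range(n_trials):
        recommendation = policy(rng)
        if pseudo_human_buyer(recommendation, rng):
            purchases += 1
    return purchases / n_trials

# A naive policy that always recommends shoes should observe roughly the
# shoes baseline purchase rate over many simulated trials.
rate = run_simulated_trials(lambda rng: "shoes", n_trials=10_000)
```

Once a policy trained against this pseudo-human reaches a good performance level, the pseudo-human implementation is swapped out for a real human interface and training continues on actual behaviour.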
In the examples above, we emphasized human-in-the-loop learning, but Cogment's features naturally extend to AI-only multi-agent systems thanks to the framework's actor formalism. Implementation swapping makes it easy to test implementations, whether in a diverse setting where they all interact with each other, or head-to-head against one another.
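Implementation swapping itself boils down to a level of indirection: the trial refers to actors by name, so an implementation can be replaced (e.g., heuristic to learned) without touching the environment or the other actors. The registry below is a hypothetical sketch of that idea, not the Cogment SDK.

```python
# name -> implementation; the trial only ever looks actors up by name.
registry = {}

def register(name, impl):
    registry[name] = impl

def run_step(actor_name, observation):
    return registry[actor_name](observation)

# Phase 1: a safe heuristic policy for a (hypothetical) dispatcher actor.
register("dispatcher", lambda obs: "escalate" if obs["severity"] > 7 else "log")
phase1 = run_step("dispatcher", {"severity": 9})

def learned_policy(obs):
    # Stand-in for an ML-based policy trained on the heuristic's experience.
    return "dispatch_unit" if obs["severity"] > 5 else "log"

# Phase 2: once the learned policy is deemed safe enough, swap it in.
# Nothing else in the trial changes.
register("dispatcher", learned_policy)
phase2 = run_step("dispatcher", {"severity": 9})
```

The same indirection is what lets two candidate implementations be tested against one another: register each under a different actor name and run them in the same trial.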
Some industries have platforms that unify simulated and real-life environments, such as ROS for robotics, but in many others the tech stacks differ. Simulated environments may rely on video game engines like Unreal or Unity, or on industrial simulation tools such as Simulink. Real-life environments can involve devices like sensors or other IoT equipment, big-data stacks (like Apache Spark), or industrial digital twins (like Azure Digital Twins). Finally, humans need an interface to interact with these systems, usually a GUI or a voice interface. The tools used to build these human-facing clients range from full video game engines, to purpose-built web apps, to lightweight mobile apps.
The Cogment SDKs and underlying technologies accommodate these diverse paradigms without creating conceptual or technological divides between the research, prototyping, production, and deployment phases of such systems.
Use Cases
The full Cogment White Paper details a couple of use cases, including the test-bed project Quack Arena and the more complex Smart Dynamic Assistant agent in the context of 911 first-responder dispatching. We recommend reading about these use cases in the full white paper if you want to dive into the modelling, implementation, and results, or learn more about the Cogment framework's core concepts and architecture.