Reinforcement learning demo version of Copter, which uses a Q-learning algorithm to control the helicopter and assigns negative rewards for hitting the walls and ceiling. The environment is described by four state variables, which define the copter's position relative to the walls and the floor/ceiling.
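As a rough illustration of how a tabular Q-learning update with a negative crash reward could look, here is a minimal sketch. It is not the demo's actual code: the action names, the learning rate `ALPHA`, the discount `GAMMA`, the reward value `CRASH_REWARD`, and the representation of the state as a tuple of four integers are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical action set; the demo's real actions are not specified.
ACTIONS = ("thrust", "fall")
ALPHA = 0.5            # learning rate (assumed value)
GAMMA = 0.9            # discount factor (assumed value)
CRASH_REWARD = -100.0  # negative reward for a collision (assumed value)

# Q-table mapping (state, action) to an estimated return.
# Pairs that have never been visited default to 0.0.
q_table = defaultdict(float)


def q_update(q, state, action, reward, next_state, terminal):
    """Apply one standard Q-learning update and return the new Q-value."""
    # A crash ends the episode, so there is no future value to bootstrap from.
    best_next = 0.0 if terminal else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (target - q[(state, action)])
    return q[(state, action)]


# A state is assumed to be a tuple of four discretized position variables.
state = (1, 2, 3, 4)
q_update(q_table, state, "fall", CRASH_REWARD, state, terminal=True)
```

Each crash pulls the value of the offending state-action pair further toward the crash reward, which is what eventually steers the agent away from the obstacles.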
This version is pre-learning: the state-action values (the Q-values) have not yet been set. You can watch the learning agent explore the environment; after some time, it learns to avoid the walls, ceiling, and floor.
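The description does not say which exploration strategy the agent uses. One common choice is epsilon-greedy selection, sketched below under that assumption; the action names, the function `choose_action`, and the state tuple are hypothetical. With an empty table every action looks equally good, so the untrained agent effectively acts at random.

```python
import random

# Hypothetical action set; the demo's real actions are not specified.
ACTIONS = ("thrust", "fall")


def choose_action(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    # Unvisited state-action pairs are treated as having value 0.0.
    best = max(q.get((state, a), 0.0) for a in ACTIONS)
    best_actions = [a for a in ACTIONS if q.get((state, a), 0.0) == best]
    # Break ties at random so an untrained agent does not favor one action.
    return rng.choice(best_actions)


rng = random.Random(0)
state = (1, 2, 3, 4)  # assumed: four discretized position variables

untrained = {}  # pre-learning: no Q-values set yet
first_action = choose_action(untrained, state, epsilon=0.1, rng=rng)

# After "thrust" has led to a crash in this state, the greedy choice avoids it.
trained = {(state, "thrust"): -50.0}
greedy_action = choose_action(trained, state, epsilon=0.0, rng=rng)
```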