Double DQN navigation project using a Unity environment
An agent is trained to navigate a square world and collect yellow bananas using a discrete action space. The GIF shows an agent in action after training with the Double DQN algorithm for 1000 episodes. The goal is to collect as many yellow bananas as possible while avoiding blue bananas.
Code
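As a rough illustration of what distinguishes Double DQN from plain DQN in the navigation project above, the sketch below computes bootstrapped targets: the online network picks the next action and the target network evaluates it. This is not the project's code. The function name, the four-action Q-value lists, and the discount factor are assumptions chosen for the example, and plain Python lists stand in for network outputs.

```python
def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN targets for a batch of transitions.

    q_online_next[i] and q_target_next[i] are the per-action Q-values for the
    next state of transition i, from the online and target networks
    (hypothetical inputs; a real agent would get them from forward passes).
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        # Action selection uses the online network ...
        best_action = max(range(len(q_on)), key=q_on.__getitem__)
        # ... but action evaluation uses the target network.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


# Illustrative batch of two transitions with four discrete actions (assumed).
rewards = [1.0, 0.0]
dones = [False, True]
q_online_next = [[0.2, 0.9, 0.1, 0.4], [0.5, 0.3, 0.8, 0.1]]
q_target_next = [[0.3, 0.5, 1.2, 0.0], [0.7, 0.2, 0.6, 0.4]]

targets = double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.9)
print(targets)
```

In the first transition the online network prefers action 1, so the target uses the target network's value for action 1 (0.5) rather than its own maximum (1.2). Decoupling selection from evaluation in this way is what reduces the overestimation bias of standard DQN.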
Deep Deterministic Policy Gradients for continuous control
An RL agent is trained to keep the moving ball at the target location. The environment has a continuous action space corresponding to the torques applied to two joints. A DDPG algorithm is implemented to solve the environment.
Code
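Two small pieces that DDPG implementations commonly rely on can be sketched as follows: a soft (Polyak) update that slowly moves the target network towards the local one, and clipping of noisy actions to the valid range. This is a minimal sketch, not the project's implementation. The helper names, the value of `tau`, and the action bounds of -1 to 1 are assumptions, and flat lists of floats stand in for network weights.

```python
def soft_update(target_params, local_params, tau=0.001):
    """Blend local parameters into the target: tau*local + (1 - tau)*target."""
    return [tau * l + (1.0 - tau) * t for t, l in zip(target_params, local_params)]


def clip_action(action, low=-1.0, high=1.0):
    """Clip each action component (e.g. a joint torque) into [low, high]."""
    return [min(high, max(low, a)) for a in action]


# Hypothetical parameter vectors standing in for network weights.
target = [0.0, 1.0]
local = [1.0, 0.0]
updated = soft_update(target, local, tau=0.1)

# A noisy actor output that overshoots the assumed bounds on both sides.
clipped = clip_action([1.7, -0.3, -2.5, 0.9])
print(updated, clipped)
```

A small `tau` keeps the target network changing slowly, which tends to stabilise the critic's bootstrapped targets.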
Multi-agent continuous control for competition & collaboration
The environment has multiple agents that both compete and collaborate in a game of tennis. A multi-agent version of DDPG is implemented to train the agents in a continuous action space.
Code
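The description above does not say how the multi-agent version of DDPG is structured. One common arrangement, assumed here purely for illustration, gives each agent its own actor that sees only its local observation, while the critic receives the observations and actions of all agents. The sketch below only builds that joint critic input. The function name and the observation and action sizes are hypothetical.

```python
def centralized_critic_input(observations, actions):
    """Concatenate every agent's observation and action into one flat vector.

    observations: list of per-agent observation lists
    actions:      list of per-agent action lists
    """
    if len(observations) != len(actions):
        raise ValueError("need one action per agent observation")
    joint = []
    for obs in observations:
        joint.extend(obs)
    for act in actions:
        joint.extend(act)
    return joint


# Two agents with 3-dimensional observations and 2-dimensional actions (assumed sizes).
observations = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
actions = [[0.5, -0.5], [-1.0, 1.0]]

critic_input = centralized_critic_input(observations, actions)
print(len(critic_input))
```

Feeding the critic the joint state and actions makes the environment look stationary from the critic's point of view during training, even though each agent's policy is changing.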