The name of our project, Shallow Blue, is a play on Deep Blue, the famous IBM chess-playing computer that defeated Garry Kasparov in 1997.
Our project uses reinforcement learning to gather resources underwater effectively. Using Minecraft Malmo as our testbed, we dynamically generate underwater environments with resources scattered across the seabed. Our AI agent swims underwater, finds resources, and swims up for air. Our goal is to improve the agent's ability to find resources, stay alive, and navigate new underwater terrain!
Observations:
Our network receives a 3x5x5 observation array describing the blocks and entities surrounding the agent. Each of the 75 entries takes a numerical value that depends on the block or item seen (diamond ore, coal ore, diamonds, coal, redstone blocks, TNT blocks).
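As a rough illustration of the encoding described above, the sketch below maps a flat list of 75 block or item names to numbers and reshapes it into 3 layers of 5x5. The numeric codes, the names `BLOCK_CODES` and `encode_observation`, and the layer-major ordering are all assumptions made for this example; the actual values used by the network are not stated here.

```python
# Hypothetical numeric codes; the real values used by the network may differ.
BLOCK_CODES = {
    "diamond_ore": 1.0,
    "coal_ore": 2.0,
    "diamond": 3.0,
    "coal": 4.0,
    "redstone_block": 5.0,
    "tnt": 6.0,
}
DEFAULT_CODE = 0.0  # water, air, and anything else not listed above

LAYERS, ROWS, COLS = 3, 5, 5  # 3 x 5 x 5 = 75 entries


def encode_observation(names):
    """Turn a flat list of 75 block/item names into a 3x5x5 nested list."""
    expected = LAYERS * ROWS * COLS
    if len(names) != expected:
        raise ValueError(f"expected {expected} names, got {len(names)}")
    codes = [BLOCK_CODES.get(name, DEFAULT_CODE) for name in names]
    grid = []
    for layer in range(LAYERS):
        rows = []
        for row in range(ROWS):
            start = (layer * ROWS + row) * COLS
            rows.append(codes[start:start + COLS])
        grid.append(rows)
    return grid


# Example: all water except one diamond ore block in the center cell.
flat = ["water"] * 75
flat[37] = "diamond_ore"
obs = encode_observation(flat)
```

Unlisted names fall back to `DEFAULT_CODE`, so the agent treats open water and unknown blocks the same way in this sketch.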
Continuous Actions:
Rewards:
Terminal States: 500 steps or drowning
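The two terminal conditions can be expressed as a single check. This is a minimal sketch: `is_terminal`, `step_count`, and `is_alive` are hypothetical names, and how the agent's alive status is read from the game is left out.

```python
MAX_STEPS = 500  # episode step limit from the terminal-state definition


def is_terminal(step_count, is_alive):
    """Return True once the step budget is used up or the agent has drowned."""
    return step_count >= MAX_STEPS or not is_alive
```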
Our environment is a 10x40x40 swimming pool with various resources and obstacles placed on the seabed and at the water's surface.
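One way to scatter resources across the seabed is to emit Malmo `DrawBlock` elements for the mission XML. The sketch below is illustrative only: it assumes the 10 is the pool depth and the 40x40 is the floor area, and the floor level `SEABED_Y`, the `density` parameter, and the function name `seabed_resources` are hypothetical choices, not values taken from the project.

```python
import random

DEPTH, WIDTH, LENGTH = 10, 40, 40  # assumed order: depth (y), then x and z
SEABED_Y = 2                       # hypothetical y-level of the pool floor
RESOURCES = ["diamond_ore", "coal_ore"]


def seabed_resources(density=0.05, seed=None):
    """Return DrawBlock XML lines placing random resources on the pool floor.

    density is the chance that any single floor cell holds a resource.
    """
    rng = random.Random(seed)  # a fixed seed reproduces the same layout
    lines = []
    for x in range(WIDTH):
        for z in range(LENGTH):
            if rng.random() < density:
                block = rng.choice(RESOURCES)
                lines.append(
                    f'<DrawBlock x="{x}" y="{SEABED_Y}" z="{z}" type="{block}"/>'
                )
    return "\n".join(lines)
```

The returned string would be placed inside a `DrawingDecorator` element of the mission XML. Passing a different `seed` per episode gives the agent new terrain each time, while reusing a seed allows repeatable evaluation.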