The agent explores the environment: it takes an action, the environment's state changes and returns a reward, and the agent uses that reward, plus what it has learned about the environment from earlier rewards, to choose its next action. Then the loop repeats. There is no hardcoded algorithm for the task itself; you only give the agent the goal (expressed through the reward) and the set of actions it can take. AI sometimes does a better job than a preprogrammed solution, because preprogrammed logic only ever behaves in the fixed way it was written.
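
The loop above can be sketched in a few lines. This is only a minimal illustration, not a reference implementation: the `Corridor` environment, its reward values, and the hyperparameters are all made up for the example, and tabular Q-learning (a value-based method) is just one possible choice of learning rule.

```python
import random


class Corridor:
    """Hypothetical toy environment: positions 0..4, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 0 = move left, action 1 = move right (walls clamp the position)
        move = 1 if action == 1 else -1
        self.pos = max(0, min(self.length - 1, self.pos + move))
        done = self.pos == self.length - 1
        # Assumed reward scheme: +1 at the goal, a small penalty per step otherwise
        reward = 1.0 if done else -0.01
        return self.pos, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    env = Corridor()
    actions = [0, 1]
    # q[state][action] is the agent's current estimate of long-term reward
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Explore with probability epsilon, otherwise take the best-known action
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                best = max(q[state])
                action = rng.choice([a for a in actions if q[state][a] == best])
            # The environment changes state and hands back a reward
            next_state, reward, done = env.step(action)
            # Q-learning update: move the estimate toward reward + discounted future value
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    # For each state, pick the action with the highest estimated value
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

Note that nothing in the code tells the agent to "go right". It is only given the two actions and the reward, and the preference for moving toward the goal comes out of the learned values.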

Types: 1. #value_based 2. policy_based, and model-based learning