The .Rmd file contains R code demonstrating two reinforcement learning methods for solving the multi-armed bandit problem. The .url file shows what the R Markdown file looks like after being knitted. Click here to view the project.
Methods used include:
- Upper Confidence Bound (UCB)
- Thompson sampling using conjugate priors
- Thompson sampling using Markov chain Monte Carlo (MCMC)
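
For readers who want a quick feel for the first two methods, here is a minimal sketch in Python. It is not the R code from the .Rmd file. It assumes Bernoulli (0/1) rewards, the standard UCB1 index, and a Beta(1, 1) prior for the conjugate Thompson sampler. The names `ucb1_choose`, `thompson_choose`, and `run_bandit` are made up for this illustration. The MCMC variant is not shown; it would replace the closed-form Beta draw with posterior samples from a sampler.

```python
import math
import random


def ucb1_choose(counts, sums, t):
    """Pick an arm by the UCB1 rule; t is the 1-based round number."""
    # Play every arm once before relying on the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus an exploration bonus that shrinks with more pulls.
    scores = [
        sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def thompson_choose(successes, failures, rng):
    """Draw from each arm's Beta posterior (assumed Beta(1, 1) prior)
    and pick the arm with the largest draw."""
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_bandit(true_probs, rounds, policy, seed=0):
    """Simulate a Bernoulli bandit and return the pull count per arm.

    true_probs are hypothetical success probabilities chosen by the caller.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    for t in range(1, rounds + 1):
        if policy == "ucb":
            counts = [s + f for s, f in zip(successes, failures)]
            arm = ucb1_choose(counts, successes, t)
        elif policy == "thompson":
            arm = thompson_choose(successes, failures, rng)
        else:
            raise ValueError(f"unknown policy: {policy}")
        # Observe a 0/1 reward from the chosen arm and update its record.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    for policy in ("ucb", "thompson"):
        print(policy, run_bandit([0.1, 0.9], rounds=2000, policy=policy))
```

With a large gap between the arms, both policies should end up pulling the better arm far more often than the worse one.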