ERWA::ERWA(size_t _K, bool _eg, bool _a)
    , alpha_is_1_over_n(_a)
double r = std::numeric_limits<double>::lowest(); // lowest(), not min(): min() is the smallest positive double
for (size_t i = 0; i < r_hat.size(); i++) {
cerr << "STOP IT!! EXP3 should be receiving reward..." << endl;
cerr << "STOP IT!! EXP3 should be choosing..." << endl;
if (alpha_is_1_over_n) {
    r_hat[choice] = r + ((1 / static_cast<double>(n)) * (reward - r));
ostream& operator<<(ostream& out, const ERWA& erwa) {
    out << "r_hats:" << endl;
    for (size_t i = 0; i < erwa.K; i++)
        out << erwa.r_hat[i] << " ";
Implementation of the ERWA algorithm for multi-armed bandits.
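The update taken on the alpha_is_1_over_n branch has the form of an incremental sample mean: the estimate moves toward the new reward by a step of 1/n. A minimal standalone sketch of that step, assuming a hypothetical free function update_mean that is not part of the ERWA class and assuming n counts the rewards seen so far, including the current one:

```cpp
#include <cstddef>

// Hypothetical helper for illustration only; not part of ERWA.
// r      : current estimate
// n      : number of rewards seen so far, including this one (assumed >= 1)
// reward : the newly observed reward
double update_mean(double r, std::size_t n, double reward) {
    // Move the estimate toward the new reward by a step of size 1/n.
    return r + (1.0 / static_cast<double>(n)) * (reward - r);
}
```

With n = 1 the estimate becomes the first reward; with n = 2 it becomes the average of the first two.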
static boost::random::mt19937 random_generator
Random source.
size_t choice
Store the last choice made.
size_t choose()
Choose using the current state.
bool choose_next
Belt and braces: warn if choose() and reward() are called in the wrong order.
size_t find_max() const
Find the index of the current maximum r_hat.
void reward(double)
Provide reward for the most recent choice.
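The members above imply a strict choose(), reward(), choose(), ... call order, guarded by choose_next. The sketch below shows one way such a guard could fit together with find_max(). It is a hypothetical stand-in named TinyBandit, not the ERWA class itself: the purely greedy choice, the per-arm counter n, and the warning texts are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <iostream>
#include <limits>
#include <vector>

// Hypothetical stand-in for illustration; not the actual ERWA class.
struct TinyBandit {
    std::vector<double> r_hat;   // per-arm reward estimates
    std::vector<std::size_t> n;  // per-arm reward counts (assumption)
    std::size_t choice = 0;      // last arm chosen
    bool choose_next = true;     // true when choose() is the expected next call

    explicit TinyBandit(std::size_t K) : r_hat(K, 0.0), n(K, 0) {}

    // Index of the largest estimate; ties go to the lowest index.
    std::size_t find_max() const {
        double best = std::numeric_limits<double>::lowest();
        std::size_t arg = 0;
        for (std::size_t i = 0; i < r_hat.size(); i++) {
            if (r_hat[i] > best) {
                best = r_hat[i];
                arg = i;
            }
        }
        return arg;
    }

    // Purely greedy choice (assumption: no exploration step in this sketch).
    std::size_t choose() {
        if (!choose_next)
            std::cerr << "choose() called again before reward()" << std::endl;
        choice = find_max();
        choose_next = false;
        return choice;
    }

    // Fold the reward for the most recent choice into its running mean.
    void reward(double x) {
        if (choose_next)
            std::cerr << "reward() called before choose()" << std::endl;
        n[choice]++;
        r_hat[choice] += (x - r_hat[choice]) / static_cast<double>(n[choice]);
        choose_next = true;
    }
};
```

The guard only warns on std::cerr and then carries on, which matches the belt-and-braces intent of catching misuse without changing control flow.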