Prateek Aggarwal

and 2 more

In this part of the two-part paper, simulations of the Probabilistically-Switch-Action-on-Failure learning automaton (PSAFA) are presented for stationary and non-stationary environments. The PSAFA is a novel fixed structure stochastic automaton (FSSA) framework, and its analytical model is presented in detail in Part 1 of this paper set. Its key differentiating feature is that it allows action switching in every state. We anticipate that this feature gives the PSAFA dynamic properties that make certain aspects of its performance superior to those of FSSA that lack it. In this paper, the PSAFA is compared in simulation with other FSSA in two types of environment: a stationary environment with fixed penalty probabilities, and a non-stationary environment in which the penalty probabilities vary periodically in time as a sinusoidal function. In both cases the simulations show a marked difference in performance between these types of learning automata. The adaptability of the PSAFA gives it better performance for simulation lengths of up to 30,000 to 150,000 steps. Only under very long stationary conditions do Tsetlin automata outperform the PSAFA. Under sinusoidal modulation, the PSAFA outperforms the other types of FSSA at all modulation frequencies and for all depths D > 3. The performance of the PSAFA does not deteriorate with increasing modulation frequency, whereas the other FSSA are not resilient to that increase.
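The non-stationary environment described above can be illustrated with a minimal Python sketch. The abstract states only that the penalty probabilities vary sinusoidally in time, so the exact form used here is an assumption: the names `base`, `amplitude`, and `period`, the evenly spaced phase offsets between actions, and the clipping to [0, 1] are hypothetical choices, not the authors' simulation setup.

```python
import math
import random


def penalty_probabilities(t, base, amplitude, period):
    """Penalty probability of every action at time step t.

    Assumed form (not taken from the paper):
        c_i(t) = base[i] + amplitude * sin(2*pi*t/period + 2*pi*i/n)
    clipped to the interval [0, 1].
    """
    n = len(base)
    probs = []
    for i, c in enumerate(base):
        phase = 2 * math.pi * i / n  # hypothetical per-action phase offset
        p = c + amplitude * math.sin(2 * math.pi * t / period + phase)
        probs.append(min(1.0, max(0.0, p)))
    return probs


def respond(action, t, base, amplitude, period, rng=random):
    """Return True if the environment penalizes `action` at step t."""
    p = penalty_probabilities(t, base, amplitude, period)[action]
    return rng.random() < p
```

Setting `amplitude` to 0 recovers the stationary case with fixed penalty probabilities, and a smaller `period` corresponds to a higher modulation frequency.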

Prateek Aggarwal

and 2 more

We present the novel concept of the Probabilistically-Switch-Action-on-Failure learning automaton (PSAFA). The PSAFA is a fixed structure stochastic automaton (FSSA) characterized by a fan-shaped state transition diagram in which each branch of the state space is a chain of D states associated with a particular action. The first states of all chains form a circle of initial states. On each failure, the PSAFA can switch, with some finite probability, from its present state in any chain to the initial state of the next chain in the circle. This probability, which plays the role of an action-switching probability, is a function of the distance of the present state from the initial state of its branch, and this dependence determines the learning behavior of the PSAFA. The probabilistic action-switching capability distinguishes the PSAFA from conventional FSSA, which have deterministic action selection at each state and in which only some states transit to states with a different action. The ability to switch actions at any time is also typical of conventional variable structure stochastic automata (VSSA), but there it comes with added computational complexity. VSSA are more adaptive than FSSA in non-stationary environments because of this action-switching capability. We believe that adding this capability should also make the PSAFA more adaptive in non-stationary environments than classical FSSA while preserving the low computational complexity of FSSA. The effectiveness of the proposed framework is demonstrated through a theoretical analysis of the optimality of the PSAFA in stationary environments in Part 1 of this two-part paper.
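The state structure described above can be sketched in Python as follows. Only the failure-time switch to the initial state of the next chain, with a distance-dependent probability, comes from the abstract. The rest is assumed for illustration: the class name `PSAFASketch`, the caller-supplied `switch_prob` function, the Tsetlin-style move away from the initial state on success, and the one-step move toward it on a failure that does not trigger a switch. The paper's actual transition rules may differ.

```python
import random


class PSAFASketch:
    """Illustrative PSAFA-like automaton (hypothetical, not the authors' code)."""

    def __init__(self, n_actions, depth, switch_prob, rng=None):
        self.n_actions = n_actions      # number of chains, one per action
        self.depth = depth              # D states per chain
        self.switch_prob = switch_prob  # maps distance d -> probability in [0, 1]
        self.rng = rng or random.Random()
        self.action = 0                 # index of the current chain
        self.distance = 0               # 0 is the initial state of the chain

    def update(self, failure):
        """Apply one environment response and return the next action."""
        if not failure:
            # Assumption: a success moves one state deeper, up to depth D.
            self.distance = min(self.distance + 1, self.depth - 1)
        elif self.rng.random() < self.switch_prob(self.distance):
            # From the abstract: jump to the initial state of the next chain.
            self.action = (self.action + 1) % self.n_actions
            self.distance = 0
        else:
            # Assumption: otherwise step back toward the initial state.
            self.distance = max(self.distance - 1, 0)
        return self.action
```

As a usage example, `PSAFASketch(2, 5, lambda d: 0.5 ** d)` uses a hypothetical switching probability that decays geometrically with the distance from the initial state, so deeper states are less likely to abandon their action after a single failure.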