The optimization of the multistage refueling decision process is studied using a heuristic approach for four-region batch refueling with shuffling. The current state of the process is assumed to be well characterized by so-called heuristic features, such as the excess reactivity or the power-peaking factor. The features are combined in a weighted sum that defines a decision evaluation function, and at each refueling the decision that maximizes this function is taken as the best one. The final criterion, i.e., the average discharge burnup at the end of the whole reactor life, can therefore be regarded as a function of the weights and is maximized in the weight space by a hill-climbing algorithm. The approach can also be interpreted as an attempt to determine, through learning, the general importance of rules of thumb in the refueling policy, such as maximization of the excess reactivity or power flattening. A numerical simulation is presented, and the maximum burnup, the refueling scheme, and the optimal weights are discussed in relation to the power-peaking factor constraint. Although the method does not guarantee optimality, it yields reasonable solutions, and the optimum weight of each rule offers an intuitive understanding of the process.
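
The two-level structure described above (a weighted feature sum that selects the decision at each refueling, and an outer hill climb over the weights that maximizes the end-of-life criterion) can be illustrated with a minimal Python sketch. Everything numerical here is an assumption made for illustration: the function names, the two candidate decisions, their feature values, and the toy "reserve" state are hypothetical and stand in for a real core simulation. The power-peaking constraint is not modeled, and the coordinate-wise climb with step halving is one possible variant of hill climbing, not necessarily the one used in the study.

```python
def evaluate(weights, features):
    """Decision evaluation function: weighted sum of heuristic features."""
    return sum(w * f for w, f in zip(weights, features))


def toy_candidates(reserve):
    """Hypothetical decisions at one refueling (illustrative numbers only).

    features = (excess-reactivity proxy, power-flatness proxy).
    A real study would obtain these from a core calculation.
    """
    return [
        # Favors excess reactivity: larger gain now, depletes the reserve.
        {"features": (1.0, 0.2), "gain": 1.0 * reserve, "next": 0.5 * reserve},
        # Favors power flattening: smaller gain now, preserves the reserve.
        {"features": (0.4, 0.9), "gain": 0.7 * reserve, "next": reserve},
    ]


def average_burnup(weights, n_stages=5, reserve=1.0):
    """Final criterion: average gain over the whole (toy) reactor life."""
    total = 0.0
    for _ in range(n_stages):
        # At each refueling, take the decision with the highest evaluation.
        best = max(toy_candidates(reserve),
                   key=lambda c: evaluate(weights, c["features"]))
        total += best["gain"]
        reserve = best["next"]
    return total / n_stages


def hill_climb(objective, start, step=1.0, min_step=1e-3):
    """Coordinate-wise hill climbing; halve the step when nothing improves."""
    w = list(start)
    best = objective(w)
    while step > min_step:
        improved = False
        for i in range(len(w)):
            for delta in (step, -step):
                trial = list(w)
                trial[i] += delta
                value = objective(trial)
                if value > best:  # accept strict improvements only
                    w, best, improved = trial, value, True
        if not improved:
            step /= 2
    return w, best


if __name__ == "__main__":
    start = [1.0, 0.0]  # begin by valuing excess reactivity alone
    weights, burnup = hill_climb(average_burnup, start)
    print("start criterion:", average_burnup(start))
    print("tuned weights:  ", weights)
    print("tuned criterion:", burnup)
```

Because the decisions are discrete, the criterion is piecewise constant in the weights, so a climb of this kind can stall on a plateau if the initial step is too small. That is consistent with the remark that the method carries no optimality guarantee.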