A closer look at length-niching selection and spatial crossover in variable-length evolutionary rule set learning
We explore variable-length metaheuristics for optimizing sets of rules for regression tasks, extending an earlier short paper that performed a preliminary analysis of several variants of a single-objective Genetic Algorithm (GA). We describe the algorithm and operator variants in more depth and document design decisions as well as the rationale behind them. The earlier work identified crossover as detrimental to solution compactness; we take a closer look by analysing the convergence behaviour of the variants tested. We conclude that using one of the investigated crossover operators trades prediction-error outliers for a larger number of small errors, at the expense of solution compactness. The positive effects of length-niching selection (holding off premature convergence to a certain solution length) are undetectable in fitness values in the settings considered.
We further compare against the well-known rule-based algorithms XCSF and CART Decision Trees and conclude that, even without parameter tuning, the best-performing GA variant outperforms XCSF on the tasks considered, comes close to being competitive with respect to test Mean Absolute Error, and creates solutions similarly compact to those of the Decision Tree algorithm. The 54 learning tasks considered are synthetic and, in the limit, learnable by rule-based algorithms.
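The abstract's parenthetical gloss of length-niching selection (holding off premature convergence to a certain solution length) can be illustrated with a minimal sketch. The code below is a hypothetical illustration, not the paper's actual operator: it partitions a population of variable-length rule sets into niches by length and fills the next population round-robin across niches, so no single length dominates early. The function name and round-robin policy are assumptions for illustration only.

```python
def length_niching_select(population, fitness, n_select):
    """Select individuals while preserving diversity of solution lengths.

    Hypothetical sketch of length-niching selection: individuals are
    grouped into niches by genome length (number of rules), and the
    next population is filled round-robin with the best individual of
    each niche, preventing early convergence to one solution length.
    Lower fitness is assumed to mean lower error (better).
    """
    # Group individuals into niches keyed by their length.
    niches = {}
    for ind in population:
        niches.setdefault(len(ind), []).append(ind)
    # Sort each niche best-first.
    for members in niches.values():
        members.sort(key=fitness)
    # Cannot select more individuals than exist.
    n_select = min(n_select, len(population))
    # Round-robin over niches, taking the best remaining individual
    # of each length in turn, until enough are selected.
    selected = []
    while len(selected) < n_select:
        for length in sorted(niches):
            if niches[length]:
                selected.append(niches[length].pop(0))
                if len(selected) == n_select:
                    break
    return selected
```

In this sketch, a purely fitness-based truncation selection would likely fill the whole next population with rule sets of one favoured length; the per-niche round-robin instead guarantees that every length still present in the population contributes at least one survivor per pass.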