Is there one algorithm that outperforms the rest, across diverse input settings?
No single algorithm offers uniformly low error. At small scales, data-dependent algorithms dominate; at large scales data-independent algorithms dominate.
Above we rank algorithms by their average regret
. The regret of an algorithm in one experimental configuration is the error of the algorithm divided by the error of the best algorithm for that configuration. We consider a collection of configurations (varying dimensionality, scale, domain size, and shape) and average the regret values over all configurations. A regret value of 1.0 is optimal, but achieving optimal regret is not practically possible with the current state-of-the-art algorithms because the best algorithm for a configuration
cannot be selected without knowledge of the input data.
By adjusting the settings on the right, one can view regret on specific sets of input configurations. For example, by setting Scale to 1000, we can view the best performing algorithms at a small scale, but still across all domain sizes and shapes.
When comparing algorithms, it is important to remember that this comparison is with respect to a specific class of tasks: answering sets of range queries on 1-dimensional or 2-dimensional data. Some of the algorithms above have been designed specifically for such tasks, and only support these tasks, and are therefore specialized. Other algorithms are more general, supporting these tasks and others. For example, the MWEM algorithm can support a general class of queries (sets of linear queries) in high dimensions, while the DAWA algorithm is limited to the 1 and 2-dimensional case.