Auto-WLM: machine learning enhanced workload management in Amazon Redshift
This paper appeared in SIGMOD'23.
What? Auto-WLM is a machine learning based *automatic workload manager* currently used in production in Amazon Redshift. I thought this would be a machine learning paper, you know, deep learning and stuff. But it turned out to be a practical/applied data systems paper. At its core, the paper is about improving query performance and resource utilization in data warehouses, and it is possibly the first such ML-based system deployed in a production database at scale.
They are not using deep learning, and rightly so! The main takeaway from the paper is that locally trained simple models (using XGBoost, a gradient-boosted decision tree model, trained on features derived from the query plan trees) outperformed globally trained models, likely due to their ability to "instance optimize" to specific databases and workloads. They are using simple ML. And it works.
Why? This is an important problem. If tuning is done prematurely, resources are unnecessarily wasted, and if it is done too late, overall