Paper Title
Data-Driven Urban Energy Simulation for a Mega-City by Integrating Machine Learning Into Urban Building Energy Simulation Modeling: A Case Study of the Guangdong-Hong Kong-Macao Greater Bay Area

Abstract
Understanding regional building energy patterns is a prerequisite for promoting sustainable urban development efficiently and effectively. Previous studies have proposed various data-driven methods to investigate the relationship between building energy consumption and hundreds of potential influencing features. To identify the critical features, this study develops a data-driven random forest (RF) based framework, built on data from 24,764 buildings in 881 city blocks, to model the relationship between city-block-level building-oriented features and building energy consumption. The RF model is found to outperform other machine learning models, including logistic regression, k-nearest neighbors, support vector machine, and decision tree models, in predictive accuracy on the classification problem.

Keywords
Building energy modeling