Comparison of machine learning techniques for predicting energy loads in buildings

Machine learning methods can be used to help design energy-efficient buildings, reducing energy loads while maintaining the desired internal temperature. They work by estimating a response from a set of inputs such as building geometry, material properties, project costs, local weather conditions and environmental impacts. These methods require a training phase which considers a dataset drawn from selected variables in the problem domain. This paper evaluates the performance of four machine learning methods to predict cooling and heating loads of residential buildings. The dataset consists of 768 samples with eight input variables and two output variables derived from building designs. The models were selected based on an exhaustive search with cross validation. Four statistical measures and one synthesis index were used for the performance assessment and comparison. The proposed framework resulted in accurate prediction models with optimized parameters that can potentially avoid modeling and testing various designs, helping to economize in the initial phase of the project.


Introduction
The basic principle of building energy efficiency is to use less energy for operations including heating, cooling, lighting and other appliances, without affecting the health and comfort of its occupants. Improving the energy efficiency of functional buildings brings many environmental and economic benefits such as reduced greenhouse gas emissions and operational cost savings. In many developed and developing countries, energy efficiency has become the main way to meet a rising energy demand (FRIESS; RAKHSHAN, 2017).
In order to reduce the energy demand growth and decrease the amount of energy used by buildings, it is critical to understand how energy is distributed throughout a building, and how building parameters contribute to energy consumption (MUSTAFARAJ et al., 2014). Simulation tools can provide a reliable framework for assessing energy distribution in buildings and can help designers to understand the importance of building and weather parameters. However, when considering the decision-making process during the project cycle, carrying out a set of simulations can lead to complex scenarios and may be time-consuming. In order to avoid these drawbacks, machine learning methods can be used for energy demand prediction. These methods require less time to model the entire building and are becoming commonly used for preliminary estimations (MELO et al., 2016).
Although building orientation and layout have been shown to be highly important in reducing building energy consumption in cold and hot climates, the design can often be constrained by the specific characteristics of the planned building and by the size, shape and orientation of the building plot. Energy-efficient buildings with special design features such as orientation, insulation and windows are being appropriately adapted to withstand severe weather conditions (HOLOPAINEN, 2017). Natural ventilation (MARCONDES et al., 2010) and natural light (FONSECA; DIDONE; PEREIRA, 2012) also play an important role in energy saving. Additionally, one can have buildings with walls composed of different materials (SPECHT et al., 2010) and the consideration of daylight when evaluating buildings regarding energy performance (DIDONE; PEREIRA, 2010).
In a general context, climatic conditions in residential buildings may be determined by using technologies such as air conditioners and heaters. However, using this equipment constantly can generate high energy consumption. An alternative to reduce the use of cooling and heating equipment, maintaining the desired indoor climate conditions, is to design energy-efficient buildings able to produce such conditions. In order to assess the energy efficiency of a building, its heating and cooling loads should be estimated and analysed based on physical characteristics defined during the design process. Moreover, information such as global location, the purpose of the building, occupation and activity level should be taken into consideration. Among the computational tools for this purpose are those that simulate scenarios which often produce accurate results. For instance, Mustafaraj et al. (2014) developed a 3D model related to building architecture, occupancy and heating, ventilation and air conditioning operations. Two calibration stages were considered and the final model identified monthly savings of energy between 20 and 27%. In the Brazilian context, simulation results indicated possible savings in electricity consumption of up to 26% for optimized designs (KRÜGER; MORI, 2012).
Although helpful and interesting in the design cycle, such tools may require advanced knowledge from the user due to their multidisciplinary nature. In addition, simulations may incur considerable financial and computational costs and results may vary depending on the software used. It should be mentioned that accurate cooling load (CL) and heating load (HL) estimations and the correct identification of parameters that significantly affect building energy demand are necessary to determine appropriate equipment specifications, install systems properly and optimize building designs.
An alternative approach to tackle these drawbacks is to develop a predictive surrogate model that can accurately predict energy consumption based on a few common factors. If the predictive model accurately estimates the simulation model results, then this model could be used instead of the simulation software to estimate performance for different conditions while potentially requiring less information. Considering the context of energy performance in buildings, various efforts to build alternative surrogate predictive models can be identified in the literature.
Using extensive parametric thermal simulations, Pessenlehner and Mahdavi (2003) examined the influence of morphological parameters that define residential building shapes for heating loads. Based on experiments carried out by Pessenlehner and Mahdavi (2003), Tsanas and Xifara (2012) provided a meticulous statistical analysis to gain important insight into the underlying properties of input and output variables. Using the same data collected by Pessenlehner and Mahdavi (2003), Cheng and Cao (2014) and Chou and Bui (2014) implemented artificial intelligence techniques to predict the energy performance of buildings. Catalina, Virgone and Blanco (2008) (BRE et al., 2016). Chou and Bui (2014) suggested further studies focusing on the optimization of the model parameters to improve accuracy in predicting heating and cooling loads in buildings. Following their suggestion, the objective of this paper is to use four predictive machine learning techniques which implement a model selection procedure that automatically searches for the best model in a set of user-defined parameters to assess and evaluate the performance of alternative building designs in the early stages of the design process. In addition, this optimized model can help architects to analyze the relative impact of significant parameters of interest while maintaining energy performance standard requirements.
The remainder of this paper is organized as follows: the second section describes the data set, the machine learning methods, the model selection procedure and the performance measures used in this paper. The third section validates and analyses the performance of all models and compares the results of the simulation. In the same section, a discussion is conducted considering the performance of each method, their strengths and limitations. The last section presents the conclusions.

Method
Machine learning methods can be adopted to estimate a response from a set of inputs. These methods require a training phase, called supervised training, which considers a dataset drawn from selected variables in the problem domain. The dataset used in the training phase should represent as closely as possible the context of the problem in which the tool will be used. This choice may influence the accuracy of the methods considerably.

Dataset
The dataset used in this study is available in Tsanas and Xifara (2012). The data were obtained by simulating a set of buildings using software called Ecotect. Ecotect is an environmental analysis tool compatible with building information modeling software, such as Autodesk Revit Architecture, and is used to perform a comprehensive preliminary building energy performance analysis. It includes a wide range of analysis functions with a highly visual and interactive display enabling analytical results to be presented directly in the context of the building model (YANG; HE; YE, 2014). The dataset consists of eight input variables and two output variables, shown in Table 1. A modular geometry system was derived based on an elementary cube (3.5 × 3.5 × 3.5 m). In order to generate different building shapes, eighteen such elements were used according to Figure 1. A subset of twelve shapes with distinct relative compactness values (see Table 1) was selected for the simulations, as shown in Figure 2 (source: Tsanas and Xifara (2012)).
The Relative Compactness (RC) indicator is used to distinguish different building types and it is given by Equation 1:

$$RC = \frac{6\,V^{2/3}}{A} \qquad \text{(Eq. 1)}$$

Where: V is the building volume; and A is the surface area of the building.
The surface area was calculated as the total of the wall area, roof area and floor area. Figure 3 shows the details of the wall area, roof area, floor area and overall building height.
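The relative compactness calculation can be illustrated with a short sketch. It assumes the usual definition RC = 6 V^(2/3) / A, which equals 1 for a cube, and uses a hypothetical 3 × 3 × 2 arrangement of the eighteen elementary cubes; the function name and the chosen shape are illustrative assumptions, not taken from the original study.

```python
def relative_compactness(volume, surface_area):
    """Assumed definition: RC = 6 * V^(2/3) / A (equals 1 for a cube)."""
    return 6.0 * volume ** (2.0 / 3.0) / surface_area


EDGE = 3.5        # edge of the elementary cube in metres (from the text)
N_ELEMENTS = 18   # number of elementary cubes per building (from the text)

# Every shape built from eighteen elements has the same volume.
volume = N_ELEMENTS * EDGE ** 3

# Hypothetical shape for illustration: a 3 x 3 x 2 block of elements.
nx, ny, nz = 3, 3, 2
lx, ly, lz = nx * EDGE, ny * EDGE, nz * EDGE

# Surface area = wall area + roof area + floor area.
wall_area = 2 * (lx * lz + ly * lz)
roof_area = lx * ly
floor_area = lx * ly
surface_area = wall_area + roof_area + floor_area

rc_block = relative_compactness(volume, surface_area)
rc_cube = relative_compactness(EDGE ** 3, 6 * EDGE ** 2)
print(round(rc_block, 3), round(rc_cube, 3))
```

Because the volume is fixed, RC in this setup varies only with the surface area: more elongated or flatter arrangements have a larger surface area and therefore a lower RC.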
Four major orientations were considered in the experiments: north, east, west and south. Three values of the glazing area to floor area ratio were used: 10%, 25% and 40%. Moreover, five different glazing distributions were simulated: (a) uniform: with 25% glazing for each face; (b) north: 55% for the north face and 15% for each of the other faces; (c) east: 55% for the east face and 15% for each of the other faces; (d) south: 55% for the south face and 15% for each of the other faces; and (e) west: 55% for the west face and 15% for each of the other faces.
Additionally, buildings with no glazing areas were simulated in the experiment. Finally, all the buildings were rotated to face the four cardinal directions. Based on this simulation setup, the dataset comprises 12 × 3 × 5 × 4 + 12 × 4 = 768 samples of buildings. Table 1 provides the detailed input and output parameters in this study.
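The sample count can be checked by enumerating the design combinations. The sketch below is only an illustration of the arithmetic; the variable names are assumptions, and the shapes are represented by placeholder indices rather than actual geometries.

```python
from itertools import product

shapes = range(12)                       # twelve building shapes (placeholder indices)
glazing_ratios = [0.10, 0.25, 0.40]      # glazing area to floor area ratios
distributions = ["uniform", "north", "east", "south", "west"]
orientations = ["north", "east", "south", "west"]

# Glazed buildings: shape x glazing ratio x glazing distribution x orientation.
glazed = list(product(shapes, glazing_ratios, distributions, orientations))

# Buildings without glazing: shape x orientation only.
unglazed = list(product(shapes, orientations))

n_samples = len(glazed) + len(unglazed)
print(len(glazed), len(unglazed), n_samples)
```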
The simulation assumes the buildings are in Athens, Greece and each block is occupied by seven people doing sedentary activities, totaling a mean consumption of 70 W. The indoor settings of the blocks were defined as: clothing: 0.6 clo, humidity: 60%, air speed: 0.30 m/s, lighting level: 300 lux (equivalent to five 9 W LED lamps considering the lamp luminous efficacy as 80 lm/W and the given dimensions of the modular cube). The sensible and latent internal heat gains were assumed as 5 W/m² and 2 W/m², respectively. The air infiltration rate was 0.5 and the air change rate with wind sensitivity was 0.25 air changes per hour. Air change rate with wind sensitivity is an Ecotect parameter that modifies the air infiltration rate based on the current wind speed. For the thermal properties, a mixed mode with 95% efficiency was used, with a thermostat range of 19-24 °C and 15-20 h of operation on weekdays and 10-20 h at weekends. It was considered that all buildings were constructed with the same materials, all of which had the lowest U-value. The lower the U-value is, the better the material is as a heat insulator. The characteristics used (U-values in brackets) were: walls (1.780 W/m²K), floor (0.860 W/m²K), roofs (0.500 W/m²K) and windows (2.260 W/m²K). Additional details of the simulation experiments are provided by Tsanas and Xifara (2012).

Machine learning methods
In this study, the algorithms were programmed in the Python 2.7 programming language using the SciPy and NumPy scientific computing libraries. The pandas package was used for data processing and analysis. The regression algorithms and cross validation approaches were implemented using the Scikit-learn machine learning library (PEDREGOSA et al., 2011) and the ffnet package (WOJCIECHOWSKI, 2011). The following paragraphs describe the machine learning methods used in this paper.
Decision trees (DT) build classification or regression models in the form of a tree structure. They break down a dataset into smaller and smaller subsets while at the same time an associated decision tree is incrementally developed. The final result is a tree with decision and leaf nodes (HASTIE; TIBSHIRANI; FRIEDMAN, 2009). They take a set of attributes as input and return a predicted value for the respective input. The decision, associated with the decision node, is made by running a test sequence (DUMONT, 2009): each internal node of the tree corresponds to a test of the value of a property and the branches of this node identify possible test values. Each leaf node specifies the return value if the leaf is reached. In this method, the parameter tuned is the maximum depth of the tree.

Support Vector Machines (SVM) (SHANMUGAMANI; SADIQUE; RAMAMOORTHY, 2015) are machine learning algorithms that perform a linear combination of attributes through functions called kernel functions in order to assign a class to a given sample. Different types of kernel functions can be used and different parameters can be varied according to the selected kernel. The SVM is commonly formulated as an optimization problem (Equation 2), where: $y_i$ are the outputs; $x_i$ are the input samples; $\varphi$ is used to transform the data to a high-dimensional space; $w$ represents the decision function coefficients; the constant $C > 0$ is the penalty on errors with respect to the separating hyperplane; $h$ is the number of support vectors; and $\xi$ is used to penalize the objective function. The dot product $\varphi(z_i)^\top \varphi(z_j)$ is replaced by a kernel $K(z_i, z_j)$ that has some special properties. In this study, we used the linear kernel represented by $K(z_i, z_j) = z_i^\top z_j$ and the radial basis kernel $K(z_i, z_j) = \exp(-\gamma \lVert z_i - z_j \rVert^2)$, where $\gamma$ is a parameter of the radial basis function. The performance of the above methods depends on the appropriate choice of parameter $C$ for the linear kernel and $\gamma$ and $C$ for the RBF kernel.
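The two kernels can be written out in a few lines. This is a minimal sketch in plain Python, assuming the common convention in which the radial basis kernel uses the squared Euclidean distance; the function names are illustrative and are not those of any library.

```python
import math


def linear_kernel(zi, zj):
    """Linear kernel: the dot product of the two samples."""
    return sum(a * b for a, b in zip(zi, zj))


def rbf_kernel(zi, zj, gamma):
    """Radial basis kernel, assuming the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(zi, zj))
    return math.exp(-gamma * sq_dist)


a = [1.0, 2.0]
b = [3.0, 4.0]
print(linear_kernel(a, b))
print(rbf_kernel(a, a, gamma=0.5), rbf_kernel(a, b, gamma=0.5))
```

A larger γ makes the radial basis kernel decay faster with distance, so each training sample influences a smaller neighborhood; this is why γ has to be tuned together with C.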
The Random Forest (RF) (HASTIE; TIBSHIRANI; FRIEDMAN, 2009) is an ensemble learning method that operates by building k decision trees from the training set in k iterations. In each iteration, the training algorithm first randomly selects a set of samples from the training set. To build a decision tree from this subset, the RF randomly chooses a subset of features as the candidate features for each node. Thus, each decision tree in the ensemble is built using random independent subsets of both features and samples. In classification, the prediction for a new sample is performed as follows: each individual classifier votes and the most voted class is elected. In regression, the predictions of the individual trees are averaged. The minimum number of samples in newly created leaves is the parameter of this method.
The Multi-Layer Perceptron (MLP) (HAYKIN, 2008; NISSEN, 2005) has been used in various areas, performing pattern recognition, control and signal processing functions. This architecture has one or more hidden layers, which comprise computational neurons, also called hidden neurons. The hidden neurons act between the external input and the output layer of the network. By including one or more hidden layers, the network is able to capture non-linear relationships between inputs and outputs. This algorithm uses the number of hidden layers and neurons, the training algorithm, the connectivity and the renormalization flag as parameters. If the renormalization flag is set to true, then the data are renormalized. The number of hidden layers is represented as a list of values. For instance, the configuration [5,5] indicates 2 hidden layers of 5 neurons each. The following methods are available to optimize the weights: lbfgs refers to the Quasi-Newton approach, sgd refers to the Stochastic Gradient Descent Method, and tnc refers to the truncated Newton algorithm using gradient information. The connectivity can be simply connected or fully connected. Figure 4 exemplifies the types of connectivity. Note: the simply connected scheme is shown in (A), where the neurons are connected only to the neurons of the previous layer, and the fully connected scheme is shown in (B), where a neuron is connected to all its predecessors.

Grid Search with Cross validation
In order to find the best predictive model and prevent overfitting, an approach based on the grid search and k-fold cross validation was implemented. It is well known that a suitable choice of parameter values for a machine learning method can have a considerable impact on its accuracy. Furthermore, the optimal values for the parameters can vary according to the problem. Grid Search is a strategy for automatic and optimized parameter adjustments of the model. This technique builds a mesh from sets of predefined values for each parameter. For each possible combination of parameters, the predictive model is trained with some of the data, generating a set of outputs. The best parameter values are those that produced the best set of outputs. The number of configurations for the method is given by Equation 3:

$$N_{grid} = \prod_{k=1}^{P} n_k \qquad \text{(Eq. 3)}$$

Where: P is the number of parameters; and $n_k$ is the number of values chosen for the k-th parameter (BERGSTRA; BENGIO, 2012).
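The grid search strategy can be sketched in plain Python as follows. The parameter values and the scoring function below are hypothetical placeholders chosen only to make the example runnable; in the study the score of each combination would come from cross validation of the trained model, and the helper names are assumptions rather than library functions.

```python
import math
from itertools import product


def grid_size(param_grid):
    """Number of configurations: the product of the number of values per parameter."""
    size = 1
    for values in param_grid.values():
        size *= len(values)
    return size


def iter_grid(param_grid):
    """Yield every combination of parameter values as a dictionary."""
    names = list(param_grid)
    for combo in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, combo))


def grid_search(param_grid, error_fn):
    """Return the combination with the lowest error."""
    return min(iter_grid(param_grid), key=error_fn)


# Hypothetical grid for an RBF support vector machine.
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0, 10.0]}


def toy_error(params):
    """Placeholder error surface; stands in for a cross-validated error."""
    return (math.log10(params["C"]) - 1) ** 2 + (math.log10(params["gamma"]) + 1) ** 2


best = grid_search(param_grid, toy_error)
print(grid_size(param_grid), best)
```

The cost of the search grows multiplicatively with each added parameter, which is why the grids are usually kept to a handful of values per parameter.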
In the training step, the strategy known as k-Fold cross validation was adopted, which divides the data set into k sets. The model is trained on k-1 sets and validated with the remaining part. Training and testing steps are repeated k times alternating the training and the testing sets. Figure 5 illustrates the application of k-Fold cross validation. In this study, k = 10 was adopted.
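The splitting scheme described above can be sketched as follows. This is a simplified illustration that assumes contiguous folds without shuffling; the function name is hypothetical, and library implementations typically offer shuffling and other options.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds.

    Folds are contiguous and differ in size by at most one sample.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


# 768 samples and k = 10, as in the text.
folds = list(k_fold_indices(768, 10))
for train, validation in folds:
    print(len(train), len(validation))
```

Each sample is used for validation exactly once and for training k-1 times, so the averaged validation error uses all the data without evaluating a model on samples it was trained on.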

Performance evaluation
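Multiple evaluation criteria were used to compare the performance of the prediction models. Given a data set composed of N observations, the performance measures are the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Coefficient of Determination (R²), which are given by the following Equations (Equations 4-7):

$$MAE = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - z_i \right| \qquad \text{(Eq. 4)}$$

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left( y_i - z_i \right)^2} \qquad \text{(Eq. 5)}$$

$$MAPE = \frac{100}{N}\sum_{i=1}^{N} \left| \frac{y_i - z_i}{y_i} \right| \qquad \text{(Eq. 6)}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - z_i \right)^2}{\sum_{i=1}^{N} \left( y_i - y_M \right)^2} \qquad \text{(Eq. 7)}$$

Where: $y_i$ is the expected value for the output variable (HL or CL) with the input $x_i$; $z_i$ is the predicted value for the same input $x_i$; and $y_M$ is the average of the values of the output variable y.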
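The four measures can be computed with a few lines of plain Python. The sketch below assumes that MAPE is expressed in percent and that the average in R² is taken over the expected values; the sample numbers are invented solely to exercise the functions and are not results from the study.

```python
import math


def mae(y, z):
    """Mean Absolute Error."""
    return sum(abs(yi - zi) for yi, zi in zip(y, z)) / len(y)


def rmse(y, z):
    """Root Mean Square Error."""
    return math.sqrt(sum((yi - zi) ** 2 for yi, zi in zip(y, z)) / len(y))


def mape(y, z):
    """Mean Absolute Percentage Error, in percent (assumed convention)."""
    return 100.0 * sum(abs((yi - zi) / yi) for yi, zi in zip(y, z)) / len(y)


def r2(y, z):
    """Coefficient of determination, using the mean of the expected values."""
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - zi) ** 2 for yi, zi in zip(y, z))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented values for illustration only.
expected = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(mae(expected, predicted), rmse(expected, predicted))
print(mape(expected, predicted), r2(expected, predicted))
```

MAE, RMSE and MAPE are errors (lower is better), while R² is a goodness-of-fit score (higher is better), which is why 1-R² is used when the measures are combined.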
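In order to obtain a comprehensive performance measure, the measures known as RMSE, MAE, MAPE and 1-R² are combined into a Synthesis Index (SI), as follows (CHOU; BUI, 2014) (Equation 8):

$$SI = \frac{1}{M}\sum_{i=1}^{M} \frac{P_i - P_{min,i}}{P_{max,i} - P_{min,i}} \qquad \text{(Eq. 8)}$$

Where: M is the number of performance measures; $P_i$ is the i-th performance measure; and $P_{min,i}$ and $P_{max,i}$ are the minimum and maximum values of the i-th measure among the compared models.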
The SI range is 0-1 and an SI value close to 0 indicates a highly accurate prediction model.
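A possible computation of the Synthesis Index is sketched below. It assumes that each measure is min-max normalized across the compared models before averaging, which is consistent with the 0-1 range stated above; the function name, the model labels and the numbers are invented for illustration and are not results from the study.

```python
def synthesis_index(measures):
    """Compute SI for each model.

    measures maps a model name to a list of M error measures
    (for example MAE, RMSE, MAPE and 1 - R2), where lower is better.
    Each measure is min-max normalized across models, then averaged.
    """
    models = list(measures)
    n_measures = len(measures[models[0]])
    si = {name: 0.0 for name in models}
    for j in range(n_measures):
        column = [measures[name][j] for name in models]
        lowest, highest = min(column), max(column)
        for name in models:
            if highest == lowest:
                normalized = 0.0  # all models tie on this measure
            else:
                normalized = (measures[name][j] - lowest) / (highest - lowest)
            si[name] += normalized / n_measures
    return si


# Invented values: [MAE, RMSE, MAPE, 1 - R2] for three hypothetical models.
example = {
    "model_a": [0.50, 0.70, 2.0, 0.01],
    "model_b": [1.00, 1.40, 4.0, 0.03],
    "model_c": [0.75, 1.05, 3.0, 0.02],
}
si_values = synthesis_index(example)
print(si_values)
```

Under this assumption the index is relative: a model that is best on every measure receives 0 and one that is worst on every measure receives 1, so SI values are comparable only within the same set of models.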

Results and discussion
Each machine learning method was trained and validated in 50 independent runs. Table 2 shows the set of parameters used as input for the grid search procedure, as well as the grid size. The machine learning methods appear in the first column: Decision Trees (DT), Multi-Layer Perceptron Neural Network (MLP), Random Forests (RF) and Support Vector Machines (SVM). The second column describes the parameter name for each method, while the third column shows the corresponding parameter settings. The last column shows the grid size, calculated as in Equation 3. For example, the grid size for MLP is equal to 80: there are ten configurations for hidden layers, four distinct training algorithms and two connectivity schemes. Therefore, the grid has 10 × 4 × 2 = 80 possible arrangements of parameters. Other parameters involved in the methods, not defined for this step, were kept with the default values as set in the implementations in the scikit-learn package (PEDREGOSA et al., 2011).
Figure 6 illustrates the values of the four statistical measures averaged over 50 runs for the predicted heating and cooling loads. In each bar, the vertical black line indicates the standard deviation. For all machine learning methods implemented here, it can be observed that heating loads can be estimated more accurately than cooling loads. This conclusion is in agreement with other studies in the literature. Tsanas and Xifara (2012) conducted an extensive statistical analysis on the same dataset used in this paper. They found that both heating and cooling loads are strongly positively correlated with Relative Compactness and overall height, and strongly negatively correlated with the surface area and roof area. The correlation coefficients and details of the statistical procedure can be found in Tsanas and Xifara (2012). In their study, they concluded that heating loads are estimated with considerably greater accuracy than cooling loads because some variables interact more efficiently to provide an estimate of heating loads.
Taking into consideration the heating load predictions, it can be observed that all methods produced similar results for all the statistical measures. However, random forests produced the best values for all statistical measures. The good performance of random forests can be explained by the internal optimization problem that is solved during the training step, which internally accounts for redundant and interacting variables, leading to better prediction abilities. In contrast, a similar behavior cannot be observed for cooling load predictions. Clearly, multi-layer perceptron neural networks and support vector machines outperformed random forests and decision trees. The underlying relationships for cooling loads are too complicated to be adequately captured by random forests and decision trees. In addition, as nonlinear estimators, MLP and SVM show more flexibility in their model parameters, which leads to better predictions. To compare the performance of the models developed in this paper we used the Synthesis Index (SI). Table 3 lists the summary of averaged statistical measures for the cooling load (CL) and heating load (HL) for each model. Random forests had the best results based on the SI values for the heating load, while the multi-layer perceptron neural network model produced the best SI for cooling loads. In particular, RF performs better for heating loads, as can be seen when comparing the SI values produced by RF and the remaining predictors. The conclusions obtained when analyzing the cooling loads are different: the support vector machine and multi-layer perceptron neural network show similar statistical measures. However, the neural network performed slightly better in all the measures. The previous analyses suggest two different machine learning models to predict heating and cooling loads. Interestingly, MLP and SVM, which produced the best statistical measures for cooling loads, presented the worst performance for heating loads. Tables 5 and 6 present the statistical measures for the best models in this paper for both cooling and heating loads. In order to provide a comparison with other models in the literature, the tables also show the results obtained from other studies. Tsanas and Xifara (2012)

Conclusions
This paper evaluated the application of four machine learning methods to predict energy efficiency in residential buildings: decision trees, random forests, multi-layer perceptron neural networks and support vector machines. Their parameters were adjusted through the grid search and the models were trained with cross validation. The dataset consists of 768 simulated buildings.
From the results obtained, random forests proved to be the best option for predicting heating loads while multi-layer perceptron neural networks produced the most accurate results for cooling loads. Support vector machines obtained accurate predictions for cooling loads, but with a slightly lower performance. After comparing them with the machine learning methods found in the literature, the results obtained in this paper show that the parameter search can generate accurate models, which are an alternative for early prediction of building cooling and heating loads. However, the machine learning methods developed here, even though accurate, are only applicable to the twelve specified building types in the simulated dataset.
The models with optimized parameters developed in this study are able to evaluate different sets of parameters, resulting in simulation settings that can potentially avoid modeling and testing various prototypes, helping to save resources in the initial phase of the design. To improve the results presented here, other machine learning methods can be implemented in further research. In addition, the grid search strategy can be replaced by an evolutionary optimization algorithm to set the parameters of the machine learning methods.

Figure 1 - Generation of shapes based on eighteen cubical elements

Figure 3 - Generic definition of building areas

Table 1 - Representation of the input and output variables
The Synthesis Index (SI) values close to zero indicate a highly accurate prediction model. The performance metrics presented in the table are the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE) and Coefficient of Determination (R²). The machine learning models applied are Decision Trees (DT), Multi-Layer Perceptron Neural Network (MLP), Random Forests (RF) and Support Vector Machines (SVM).

Table 4 - Average computing time to perform the grid search and build the models with optimized parameters
Note: the machine learning models tested for heating (HL) and cooling loads (CL) are Decision Trees (DT), Multi-Layer Perceptron Neural Network (MLP), Random Forests (RF) and Support Vector Machines (SVM). The real time (in seconds) is averaged over 10 runs.

Table 6 - Cooling load - comparison between the results of our study and those in the literature used as a reference
Note: the best results are highlighted in bold. The performance metrics presented in the table are the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE) and Coefficient of Determination (R²).

Although the predictive models proposed here and those found in the literature produced accurate results, they should be used carefully. It should be mentioned that they are only applicable to the twelve specified building types considering the simulated experiment setup. Besides, even in computer simulations, uncertainties in thermal and physical properties of materials can influence thermal performance (SILVA; ALMEIDA; GHISI, 2017) and may be considered. Comprehensive tests using real data are necessary to assess the performance of the methods in real world situations, leading to the development of new and improved models. Some authors using data measured from a wireless sensor network have identified that atmospheric pressure, exterior air temperature and wind speed are important parameters to predict energy loads (CANDANEDO; FELDHEIM; DERAMAIX, 2017).