Abstract
This study presents an improved approach that uses several machine-learning models to estimate the compressive strength of concrete. The goal is to increase the precision of strength predictions based on the age and composition of concrete mixes. The input variables are cement, fly ash, water, superplasticizer, coarse and fine aggregate, and sample age, and compressive strength is measured in megapascals (MPa). A variety of mixes were examined to determine the relationships between mix proportions, age, and strength. The machine learning techniques used were Random Forest, XGBoost, AdaBoost, Bagging, Support Vector Regression, and Linear Regression. Model performance was assessed using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), R-squared (R2), and accuracy. The optimized XGBoost model performed best, with an MAE of 2.2, MSE of 10.5, R2 of 0.94, MAPE of 8.5, RMSE of 3.25, and accuracy of 0.92. It performed noticeably better than the other models, highlighting how machine learning can improve predictions of compressive strength and help optimize concrete composition, thereby supporting advances in materials science and civil engineering.
Keywords:
Concrete materials; RF; Strength; SVR; Construction; Machine learning algorithms
1. INTRODUCTION
Concrete is one of the most extensively used building materials globally, valued for its strength, durability, and versatility [1]. Its mechanical properties, particularly compressive strength, are crucial for the structural integrity of various civil engineering applications [2]. As urbanization accelerates and construction demands increase, optimizing concrete formulations to enhance performance has become paramount [3]. Traditional approaches for assessing the compressive strength of concrete involve extensive experimentation, which can be time-consuming and costly [4]. Therefore, the integration of machine learning (ML) techniques into concrete research presents a promising avenue for improving prediction accuracy and reducing the reliance on empirical testing [5]. The compressive strength of concrete is influenced by various factors, including the proportions of its constituents, the curing process, and the age of the concrete [6].
Common ingredients in concrete mixes include cement, water, coarse aggregate, fine aggregate, and supplementary products such as fly ash and superplasticizers [7]. Each of these components plays a vital role in determining the overall performance of the concrete [8]. For instance, the use of fly ash not only improves workability but can also enhance long-term strength and durability by mitigating shrinkage and reducing permeability [9]. Similarly, superplasticizers are added to improve the flow characteristics of the concrete mix, allowing for better compaction without increasing water content, which is critical for achieving optimal strength [10]. Given the complexity of concrete’s behavior and the numerous variables affecting its properties, machine learning offers a powerful tool for modeling and prediction [11].
By using algorithms that learn from patterns in data, researchers can develop predictive models that accurately estimate compressive strength from input variables [12, 13]. This research investigates the performance of several ML algorithms, namely Linear Regression (LR), Support Vector Regression (SVR), AdaBoost, Bagging, Random Forest, and XGBoost, in predicting the compressive strength of concrete [14]. The motivation behind this investigation is to evaluate the predictive capacity of concrete strength models built with advanced ML techniques [15, 16]. Previous studies have shown that machine learning can yield more reliable predictions than traditional statistical methods, thereby facilitating better decision-making in concrete mix design and optimization [17].
Moreover, as the construction industry increasingly adopts digital technologies and data-driven approaches, the relevance of machine learning applications in concrete technology is more critical than ever [18, 19]. This study’s objectives include collecting and analyzing a comprehensive dataset that encompasses various concrete mix proportions, supplementary materials, and curing times [20, 21]. By leveraging machine learning, the research seeks to identify the most influential factors affecting compressive strength and develop a robust predictive model [22]. The ultimate goal is to provide a framework for optimizing concrete formulations that can contribute to safer, more sustainable construction practices [23,24,25].
The importance of this study extends beyond academic interest; it has practical implications for engineers and construction professionals [26]. As building codes and regulations evolve to emphasize sustainability and resilience, developing concrete with optimized performance characteristics will be essential [27]. The ability of machine learning to process vast datasets and uncover intricate relationships will enable the industry to adopt innovative materials and methods that align with contemporary sustainability goals [28]. This study aims to bridge the gap between traditional concrete science and modern computational techniques, offering insights that can drive advancements in concrete technology and application [29].
Machine learning (ML) models have revolutionized the field of materials science and civil engineering by offering precise and reliable predictions of complex material behaviours [30]. These models excel in analyzing large datasets, identifying hidden patterns, and providing data-driven insights that traditional statistical methods often overlook [31]. ML algorithms like RF, XGBoost, and SVR can adapt to nonlinear relationships and interactions among variables, enhancing the accuracy of predictions [32]. Furthermore, ensemble methods combine the strengths of multiple models to reduce errors and improve robustness, making them invaluable tools for optimizing material compositions and predicting properties such as concrete compressive strength [33].
The aim of this study is to enhance the accuracy of compressive strength predictions for concrete using advanced machine learning techniques [34]. The primary goal is to develop predictive models that account for the nonlinear relationships between mix components—cement, fly ash, water, superplasticizer, aggregates—and the age of concrete [35]. By leveraging models such as XGBoost, RF, and AdaBoost, the study seeks to identify optimal compositions and improve material performance [36]. This research aims to bridge gaps in predictive accuracy compared to traditional methods, ultimately contributing to advancements in materials science and promoting data-driven innovations in civil engineering [37].
This research aims to investigate the potential of ML algorithms in predicting the compressive strength of concrete by analyzing various mix designs and material properties [38, 39]. The specific research objectives are as follows:
2. DATA COLLECTION AND DATASET COMPILATION
To compile a comprehensive dataset encompassing a wide range of concrete mixes, including varying proportions of cement, fly ash, water, superplasticizer, and fine and coarse aggregate, together with sample age [20]. To ensure that the dataset represents diverse environmental conditions and curing methods, enhancing the generalizability of the machine learning models [17].
The data collected for this study includes experimental data on the compressive strength of concrete [40]. The dataset comprises various concrete mix proportions, including the types and quantities of materials such as cement, fly ash, water, superplasticizer, coarse and fine aggregates, and sample age [41]. The data also includes the corresponding compressive strength values, measured in Megapascals (MPa), for each mix. This data is used to train and test machine learning models for predicting concrete strength based on the mix composition and age [42].
Exploratory Data Analysis (EDA): Perform a comprehensive exploratory data analysis to uncover patterns, trends, and relationships between the input variables (mix compositions) and the target variable (compressive strength). Use graphical visualizations, such as scatter plots, histograms, and heatmaps, to gain a clearer insight into the correlations and distributions within the dataset [30].
Exploratory Data Analysis (EDA) was integral in understanding the relationships between concrete material components and compressive strength. Through statistical visualization techniques such as scatter plots, correlation matrices, and box plots, the variations in cement, fly ash, water, superplasticizer, aggregates, and sample age were analyzed. EDA revealed key patterns and nonlinear dependencies in the dataset, such as the influence of water-to-cement ratio on strength. It also helped identify outliers and data distribution trends, which informed the preprocessing and feature engineering stages. This systematic analysis ensured robust model input, enhancing the forecasting accuracy of ML algorithms.
Feature Engineering and Selection: To identify and generate relevant features that can improve the predictive accuracy of the models, such as interaction terms or derived variables. To employ feature selection techniques to determine the most significant variables affecting compressive strength and eliminate redundant or irrelevant features [31].
Implementation of Machine Learning Algorithms: To apply various ML algorithms, including LR, SVR, AdaBoost, Bagging, Random Forest, and XGBoost [32]. To optimize the hyperparameters of each model to enhance its performance in predicting compressive strength [33].
Model Evaluation and Performance Metrics: To evaluate the performance of each ML algorithm using appropriate metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), R-squared (R2), Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE), and accuracy [34]. To conduct cross-validation to ensure the robustness and reliability of the model predictions.
Comparative Assessment of ML Models: To perform a comparative analysis of the results obtained from the different ML algorithms to identify the most effective method for predicting compressive strength [35]. To analyse the strengths and weaknesses of each model based on its performance metrics and predictive accuracy.
Development of a Predictive Framework: To develop a user-friendly predictive framework or software tool that can be utilized by engineers and researchers for estimating compressive strength based on input concrete mix parameters. To provide guidelines for optimizing concrete formulations based on the insights gained from the machine learning models [36].
Recommendations for Practical Applications: To offer recommendations for integrating machine learning techniques into concrete mix design practices in the construction industry. To discuss the implications of the findings for improving construction efficiency, sustainability, and structural performance.
Contribution to Future Research: To contribute to the body of knowledge on the application of ML in civil engineering and materials science, laying the groundwork for future studies exploring more advanced algorithms or additional performance metrics [37]. To suggest avenues for further research, including the exploration of other performance indicators (e.g., tensile strength, durability) using machine learning methodologies. By achieving these objectives, this study aims to enhance the understanding of the parameters influencing compressive strength in concrete and promote the effective use of machine learning for predictive modelling in civil engineering applications [38]. Table 1 lists the ML models adopted to estimate concrete compressive strength.
Background on Machine Learning Algorithms: ML algorithms have revolutionized the field of civil engineering by offering powerful tools for predictive modelling and optimization. In concrete technology, ML is used to model complex relationships between input variables, such as material properties and mix compositions, and output parameters like compressive strength. These algorithms learn from historical data, uncover patterns, and use them to predict future behaviour with higher accuracy than traditional statistical models.
Machine learning algorithms can be broadly classified into two types: non-ensemble models and ensemble models [39]. Non-ensemble models rely on a single algorithm to predict outcomes, while ensemble models combine the predictions of multiple base models to improve overall performance and robustness (Figures 1 and 2).
3. NON-ENSEMBLE MODELS
Non-ensemble models rely on a single model to make predictions. These models, such as SVR and LR, focus on finding the optimal function or decision boundary for prediction. Multiple Linear Regression (MLR): MLR is a widely applied and straightforward machine learning technique that models the relationship between a dependent variable (compressive strength) and several independent variables (such as cement content, water content, etc.). MLR assumes a linear relationship between the input features and the output variable [40].
The equation for MLR is given as:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon$$
Where $Y$ = predicted compressive strength (dependent variable), $\beta_0$ = intercept, $\beta_1, \beta_2, \dots, \beta_n$ = coefficients of the independent variables, $X_1, X_2, \dots, X_n$ = independent variables (input features such as cement, water, etc.), and $\varepsilon$ = error term or residual.
Application in Concrete: MLR can be applied to predict the compressive strength of concrete by fitting the relationship between material proportions (cement, water, aggregates, etc.) and strength.
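As a rough illustration of the MLR fit described above, the following sketch estimates the coefficients by ordinary least squares through the normal equations. The function names (`solve_linear_system`, `fit_mlr`, `predict_mlr`) and the five two-feature mixes are hypothetical and chosen only for illustration; they are not the data or code of this study.

```python
# Minimal sketch: ordinary least squares for MLR via the normal equations.
# The mixes below are made-up values, not samples from the study's dataset.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so that row operations act on both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Use the largest available pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Return [beta_0, beta_1, ..., beta_n] minimizing the squared error."""
    rows = [[1.0] + list(f) for f in features]  # leading 1 models the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, feature_row):
    """Evaluate beta_0 + sum_j beta_j * X_j for one mix."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], feature_row))


# Hypothetical (cement, water) contents in kg/m3 and strengths in MPa.
mixes = [(300.0, 180.0), (350.0, 160.0), (250.0, 200.0),
         (400.0, 190.0), (320.0, 170.0)]
strengths = [26.0, 32.0, 20.0, 35.5, 28.5]

beta = fit_mlr(mixes, strengths)
print(predict_mlr(beta, (330.0, 175.0)))
```

The normal equations are adequate for a handful of well-scaled features; with many correlated inputs, a QR- or SVD-based solver would usually be the more stable choice.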
SVR: It is a supervised learning algorithm derived from Support Vector Machines (SVM). It performs regression analysis by finding the best-fit hyperplane that minimizes the error within a specified threshold. SVR is well-suited for handling non-linear relationships by utilizing different kernel functions.
SVR minimizes the following objective function:
$$\min_{W} \; \frac{1}{2} \lVert W \rVert^2 + C \sum_{i=1}^{n} L_{\varepsilon}\bigl(y_i, f(x_i)\bigr)$$
Where $W$ = weight vector, $C$ = regularization parameter that controls the trade-off between minimizing training error and model complexity, $L_{\varepsilon}$ = epsilon-insensitive loss function, $y_i$ = actual value of the target variable, $f(x_i)$ = value predicted by the model, and $n$ = number of training samples.
Application in Concrete: SVR is effective for predicting compressive strength when the relationship between concrete properties and strength is non-linear. It can be fine-tuned using different kernel functions (e.g., linear, polynomial, RBF).
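The epsilon-insensitive loss named above can be sketched in a few lines: errors smaller than epsilon cost nothing, and larger errors are charged linearly. The function names, the default values of `epsilon` and `c`, and the sample numbers are assumptions made for illustration, and `svr_objective` assumes the standard soft-margin form of the SVR objective for a linear model.

```python
# Minimal sketch of the epsilon-insensitive loss and the SVR objective value.
# All names and numbers are illustrative assumptions, not the study's settings.

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, linear in the error outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_objective(weights, pairs, c=1.0, epsilon=0.1):
    """0.5 * ||w||^2 + C * sum of epsilon-insensitive losses.

    `pairs` holds (actual, predicted) strength values.
    """
    regularization = 0.5 * sum(w * w for w in weights)
    data_loss = sum(epsilon_insensitive_loss(y, f, epsilon) for y, f in pairs)
    return regularization + c * data_loss


# A prediction within 1 MPa of the measured strength is not penalized.
print(epsilon_insensitive_loss(30.0, 30.5, epsilon=1.0))
# A prediction 3 MPa off is penalized by the 2 MPa that exceed the tube.
print(epsilon_insensitive_loss(30.0, 33.0, epsilon=1.0))
```

A larger `c` makes the fit follow the training data more closely, while a larger `epsilon` tolerates bigger deviations before any penalty applies.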
Ensemble Models: Ensemble models combine the predictions of several separate models to improve accuracy and reduce overfitting. These models, such as RF, XGBoost, and AdaBoost, leverage the strengths of different learning algorithms and provide more robust predictions by aggregating the results from multiple decision trees or learners. By combining the outputs of multiple base models, they reduce overfitting and variance, which makes them more robust than individual models.
AdaBoost (Adaptive Boosting): AdaBoost is an ensemble learning technique that sequentially trains multiple weak learners, typically decision trees. Each subsequent learner focuses on the instances that earlier learners predicted poorly, thereby "boosting" the performance of the model. The final estimate is a weighted sum of the predictions of the weak learners.
The AdaBoost prediction function can be expressed as:
$$F(x) = \sum_{m=1}^{M} \alpha_m h_m(x)$$
Where $M$ = number of weak learners, $\alpha_m$ = weight assigned to the $m$-th weak learner, and $h_m(x)$ = prediction of the $m$-th weak learner.
Application in Concrete: AdaBoost can improve the accuracy of compressive strength predictions by focusing more on samples where prior models had higher errors, thus refining the overall model performance.
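The weighted combination of weak learners described above can be illustrated with a short sketch. The three weak learners and their weights are hypothetical, and the weights are assumed to be normalized to sum to 1 so that the result stays on the MPa scale; the procedure that learns these weights from the training errors is not shown.

```python
# Minimal sketch of the AdaBoost combination step only (no weight learning).
# Learners and weights are made-up values for illustration.

def adaboost_predict(x, learners, alphas):
    """Weighted sum of the weak-learner predictions h_m(x)."""
    return sum(alpha * h(x) for alpha, h in zip(alphas, learners))


# Hypothetical weak learners mapping curing age (days) to strength (MPa).
learners = [
    lambda age: 20.0 if age < 7 else 35.0,
    lambda age: 25.0 if age < 28 else 40.0,
    lambda age: 30.0,
]
alphas = [0.5, 0.3, 0.2]  # assumed weights, normalized to sum to 1

print(adaboost_predict(14, learners, alphas))
```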
Random Forest (RF): Random Forest is an ensemble technique that builds numerous decision trees during the training process. Each tree is trained on a randomly chosen subset of the dataset, and in regression tasks, the final prediction is obtained by averaging the outputs of all the trees. This method reduces variance and improves accuracy by aggregating the predictions from multiple trees.
For regression tasks, the prediction is the average of the individual tree predictions:
$$\hat{y} = \frac{1}{N} \sum_{i=1}^{N} h_i(x)$$
Where $N$ = number of trees, and $h_i(x)$ = prediction from the $i$-th tree.
Application in Concrete: Random Forest is useful for predicting compressive strength as it can handle both linear and non-linear relationships, and it is resistant to overfitting even with large datasets.
XGBoost (Extreme Gradient Boosting): XGBoost is an optimized implementation of the gradient boosting algorithm, designed to be efficient and highly accurate. It combines multiple weak learners (decision trees) and minimizes the loss function through gradient descent. XGBoost introduces regularization to prevent overfitting and uses techniques such as tree pruning and parallel computation to improve performance.
The objective function in XGBoost is:
$$\text{Obj} = \sum_{i=1}^{n} L(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
Where $L(y_i, \hat{y}_i)$ = loss function between the actual and predicted values, $\Omega(f_k)$ = regularization term for the complexity of the $k$-th tree $f_k$, $n$ = number of samples, and $K$ = number of trees.
Application in Concrete: XGBoost is highly effective for predicting compressive strength due to its ability to handle complex and non-linear relationships between the input features and the target variable, while also being computationally efficient.
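The boosting idea behind XGBoost can be sketched with plain gradient boosting under squared-error loss, where each new depth-1 tree (stump) is fitted to the residuals of the current ensemble. This is a deliberately simplified stand-in and not XGBoost itself: it omits the regularization term Ω, tree pruning, and parallel computation, and uses a single feature. The function names, hyperparameter values, and age/strength data are assumptions for illustration; the feature must take at least two distinct values.

```python
# Minimal sketch: gradient boosting with regression stumps on one feature.
# Not XGBoost; data and hyperparameters are illustrative assumptions.

def fit_stump(xs, targets):
    """Best single-threshold split under squared error (needs >= 2 distinct xs)."""
    best = None
    for cut in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= cut]
        right = [t for x, t in zip(xs, targets) if x > cut]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, cut, left_mean, right_mean)
    _, threshold, low, high = best
    return lambda x: low if x <= threshold else high


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(ys)
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Hypothetical curing ages (days) and compressive strengths (MPa).
ages = [1, 3, 7, 14, 28, 56, 90]
strengths = [8.0, 15.0, 22.0, 28.0, 35.0, 40.0, 43.0]

model = fit_gradient_boosting(ages, strengths)
print(model(28))
```

The learning rate shrinks each stump's contribution, so more rounds are needed, but the fit tends to be less sensitive to any single split.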
Bootstrap Aggregating: Bagging, or Bootstrap Aggregating, is an ensemble technique that improves model stability and accuracy by training numerous base models (typically decision trees) on different subsets of the training data, sampled with replacement. The predictions of each model are then aggregated to produce the final result. This method reduces variance and overfitting.
The bagging prediction is the average of the predictions from all base models:
$$\hat{y} = \frac{1}{M} \sum_{i=1}^{M} h_i(x)$$
Where $M$ = number of base models, and $h_i(x)$ = prediction of the $i$-th base model.
Application in Concrete: Bagging can be used to improve compressive strength predictions by reducing variance and enhancing model robustness through the aggregation of multiple decision trees. Both non-ensemble and ensemble ML models offer valuable approaches for predicting the compressive strength (CS) of concrete. While non-ensemble models like MLR and SVR provide simple yet effective predictions, ensemble models such as AdaBoost, Random Forest, XGBoost, and Bagging enhance prediction accuracy by combining multiple learners and reducing model errors. These methods, when applied appropriately, can greatly improve the design and optimization of concrete mixes in construction projects.
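The bagging procedure described above (bootstrap sampling with replacement, one base model per sample, averaged predictions) can be sketched as follows. The base learner is a depth-1 regression tree on a single feature, and the function names, the number of models, the seed, and the data are assumptions chosen for illustration rather than the configuration used in this study.

```python
# Minimal sketch of bagging with regression stumps on one feature.
# Names, data, and settings are illustrative assumptions.
import random


def fit_stump(xs, ys):
    """Depth-1 regression tree; falls back to the mean if no split helps."""
    mean_all = sum(ys) / len(ys)
    best = (sum((y - mean_all) ** 2 for y in ys), None, mean_all, mean_all)
    for cut in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= cut]
        right = [y for x, y in zip(xs, ys) if x > cut]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if sse < best[0]:
            best = (sse, cut, left_mean, right_mean)
    _, threshold, low, high = best
    if threshold is None:
        return lambda x: mean_all
    return lambda x: low if x <= threshold else high


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


# Hypothetical curing ages (days) and compressive strengths (MPa).
ages = [1, 3, 7, 14, 28, 56, 90]
strengths = [8.0, 15.0, 22.0, 28.0, 35.0, 40.0, 43.0]

bagged = fit_bagging(ages, strengths)
print(bagged(10))
```

Random Forest follows the same pattern but additionally restricts each split to a random subset of the input features, which this single-feature sketch cannot show.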
Non-ensemble models, such as SVR and LR, rely on individual algorithms for predictions. In contrast, ensemble models, like RF, XGBoost, and AdaBoost, combine predictions from multiple base models to improve accuracy and reduce bias or variance. The two groups were treated separately in order to analyze their distinct capabilities in predicting compressive strength. Ensemble methods generally outperform non-ensemble models because they can capture complex data patterns and are more robust. This distinction allowed us to evaluate their comparative effectiveness, demonstrating the superior performance of optimized ensemble models like XGBoost in this study.
K-Fold Cross-Validation and Statistical Analysis: In predictive modelling, statistical analysis plays a crucial role in evaluating model performance, validating assumptions, and ensuring the reliability of results. In this study, statistical methods were employed to analyse the CS predictions for concrete obtained with the ML algorithms. To ensure robust evaluation and avoid overfitting, K-fold cross-validation was used as the primary validation technique. Below are the details of the statistical analysis and cross-validation process.
The verification and validation of the 13 machine learning models were performed using rigorous evaluation metrics and cross-validation techniques. Models were trained on a dataset comprising concrete mix components and tested on unseen data to assess generalization. Performance was quantified using metrics such as R2, Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). K-fold cross-validation was employed to minimize overfitting and ensure reliability. XGBoost emerged as the best-performing model with superior accuracy and error metrics, confirming the robustness and validity of the approach.
Statistical Analysis: The statistical analysis in this study involves the calculation of key performance metrics, such as:
MAE: MAE measures the average magnitude of the absolute errors between the predicted and actual values. It is a linear score, which means all the individual differences are equally weighted.
$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$$
Where $y_i$ = actual value, $\hat{y}_i$ = predicted value, and $n$ = number of data points.
MSE: MSE evaluates the average squared difference between actual and predicted values, giving more weight to larger errors. It is useful for identifying models that make large prediction errors.
$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
RMSE: RMSE is the square root of the MSE and provides a more interpretable measure of prediction error. It has the same unit as the target variable, making it easier to compare with the actual data.
$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
R2: R2 represents the proportion of the variance in the dependent variable that is explained by the independent variables in the model. A higher R2 value indicates better model performance.
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
Where $\bar{y}$ is the mean of the actual values.
MAPE: MAPE expresses the prediction error as a percentage, making it easy to interpret.
$$\text{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left\lvert \frac{y_i - \hat{y}_i}{y_i} \right\rvert$$
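The five error metrics defined above can be computed together in one pass, as in the sketch below. The function name `regression_metrics` and the four actual/predicted pairs are hypothetical; the formulas are the standard definitions, with MAPE reported in percent and assuming no actual value is zero.

```python
# Minimal sketch of MAE, MSE, RMSE, R2, and MAPE for a set of predictions.
# The sample values are made up for illustration.
import math


def regression_metrics(actual, predicted):
    """Return the five metrics as a dictionary (MAPE in percent)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mape = 100.0 / n * sum(abs(e / a) for e, a in zip(errors, actual))
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2, "mape": mape}


# Hypothetical measured and predicted strengths in MPa.
actual = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 49.0]
print(regression_metrics(actual, predicted))
```

Because MAPE divides by the actual value, it weights errors on low-strength samples more heavily than the same absolute error on high-strength samples.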
Each of these metrics helps assess the performance of the different ML models, identifying strengths and weaknesses in terms of their predictive accuracy. The results of each model after cross-validation were analysed in terms of prediction accuracy, consistency, and generalization ability. Models like XGBoost and Random Forest, which consistently showed lower RMSE and higher R2 scores, performed better because they capture non-linear relationships and reduce overfitting through ensemble methods. Statistical analysis and K-fold cross-validation provided a robust framework for evaluating the performance of the ML models in predicting the CS of concrete.
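The K-fold procedure can be sketched as an index generator: the shuffled sample indices are split into K folds, and each fold serves once as the test set while the others form the training set. The function name, the choice of K = 5, and the seed are assumptions for illustration, since the value of K is not stated here.

```python
# Minimal sketch of K-fold index splitting (K = 5 is an assumed value).
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for train, test in k_fold_indices(10, k=5):
    print(len(train), len(test))
```

In practice, a model would be fitted on `train` and scored on `test` in every iteration, and the K scores would then be averaged.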
In Figure 1, the sigmoid function shows how a neural network maps inputs to output probabilities by compressing values into the range between 0 and 1, which is critical for binary classification tasks. Figure 2 illustrates the machine learning workflow, from data collection to model training and validation. Figure 3 demonstrates the boosting framework, which improves accuracy by training weak learners sequentially so that each one corrects the errors of its predecessors, while Figure 4 depicts the bagging method, where multiple models are trained in parallel to reduce variance. Figure 5 summarizes the overall machine-learning approach adopted for predicting the compressive strength of concrete.
4. RESULTS AND DISCUSSION
The data summarize the frequency distributions of the concrete mix components and of the resulting compressive strength over time. Cement content varies from 50 to 700 kg/m3, and fly ash ranges from 0 to 325 kg/m3. Water content, which is critical for workability and strength, ranges from 40 to 560 kg/m3, while superplasticizer (a chemical admixture used to enhance fluidity) ranges from 0 to 65 kg/m3. Coarse aggregate values lie between 800 and 1500 kg/m3, and fine aggregate values between 600 and 1300 kg/m3. Compressive strength is recorded at ages from 0 to 650 days, with strength values ranging from 6 to 18 MPa. Figure 6 illustrates the relationships between the input variables. This comprehensive dataset aids in understanding the role of each component in determining the strength of concrete. Figure 7 (a-h) presents the frequency distribution of each variable: (a) cement (kg/m3), (b) fly ash (kg/m3), (c) water (kg/m3), (d) superplasticizer (kg/m3), (e) coarse aggregate (kg/m3), (f) fine aggregate (kg/m3), (g) age (days), and (h) concrete compressive strength (MPa).
The graphs show how the variation in each material’s quantity impacts the resulting compressive strength. Cement and water content play a significant role, with higher cement content generally leading to increased strength, while an optimal water-to-cement ratio is critical for achieving maximum compressive strength. The influence of additives like fly ash and superplasticizer, along with aggregate composition, further affects the durability and strength of the concrete mix over time. The descriptive statistics of the concrete mix parameters offer valuable insights into the material distribution and their impact on concrete strength. The dataset consists of 650 samples, with cement content varying between 140 and 570 kg/m3, averaging 295.5 kg/m3 and a standard deviation of 90.25. Fly ash content displays considerable variability, with a mean of 85.1 kg/m3 and a range from 0 to 215 kg/m3. The water content has an average of 175.45 kg/m3, consistent with the ideal water-to-cement ratio necessary for achieving high compressive strength.
The use of superplasticizers varies widely, with a median value of 6.2 kg/m3. Coarse aggregate and fine aggregate have relatively narrow ranges, with averages of 1025.25 kg/m3 and 805.5 kg/m3, respectively. The average age of the samples is 50 days, and strength values span from 6 to 85 MPa. The mean compressive strength is 33.25 MPa, with most samples achieving between 22.5 and 43 MPa, highlighting the variability in concrete performance. The Pearson correlation matrix illustrates the relationships between the concrete mix parameters and the compressive strength (CS) of concrete. The values range between −1 and 1, where 1 represents a perfect positive correlation, −1 a perfect negative correlation, and 0 no correlation. Cement (C) shows a strong positive correlation with compressive strength (0.52), meaning that as cement content increases, the compressive strength tends to increase significantly. Fly ash (FA) has a slight negative correlation with CS (−0.06), implying that increasing fly ash content has a negligible impact on CS.
Water (W) displays a weak negative correlation with compressive strength (−0.21), indicating that excess water tends to weaken the concrete. Superplasticizer (SP) exhibits a weak positive correlation (0.24) with compressive strength, suggesting that the use of this additive slightly enhances concrete strength by improving workability. Coarse aggregate (CA) and fine aggregate (F) show weak negative correlations with compressive strength (−0.14 and −0.23, respectively), indicating that the aggregate composition has a minor inverse relationship with concrete strength. Age (A) of the concrete has a strong positive correlation with compressive strength (0.8), underscoring the well-established fact that concrete gains strength over time. Overall, the matrix demonstrates that cement content, age, and the use of superplasticizer are the most influential factors in determining concrete compressive strength, while water content and aggregate proportions have more limited impacts. Table 2 shows the descriptive statistics of the concrete mix parameters.
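Each entry of the correlation matrix discussed above is a Pearson coefficient between two variables, which can be computed as in the sketch below. The function name `pearson_r` and the age/strength values are hypothetical and are not taken from the dataset of this study; both inputs are assumed to vary (non-zero variance).

```python
# Minimal sketch of the Pearson correlation coefficient between two variables.
# The sample values are made up for illustration.
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical curing ages (days) and compressive strengths (MPa).
ages = [1, 3, 7, 14, 28, 56, 90]
strengths = [8.0, 15.0, 22.0, 28.0, 35.0, 40.0, 43.0]
print(pearson_r(ages, strengths))
```

Because the coefficient measures only linear association, a variable with a strongly non-linear effect on strength can still show a modest value.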
The performance of the ML models in predicting CS is evaluated using actual and predicted values, alongside their respective errors. The Multiple Linear Regression (MLR) model shows moderate predictive capabilities, as evidenced by the discrepancies between actual and predicted values across different sample points. Support Vector Regression (SVR) yielded a slightly improved performance, indicating its suitability for capturing complex relationships within the dataset. The AdaBoost and Bagging models demonstrated comparable effectiveness, as both reduced prediction errors and thus enhanced the robustness of the predictions. Random Forest (RF) emerged as one of the top performers, showing a strong ability to minimize errors through its ensemble learning approach and to handle the variability present in the dataset. XGBoost also exhibited notable performance, leveraging its boosting techniques to further refine predictions and reduce errors. Figure 8 shows the Pearson correlation coefficients between the input parameters used in this study. Figure 9 illustrates the K-fold cross-validation process, a robust procedure used to assess the performance of predictive models.
Figures 10 through 12 present a comprehensive analysis of the performance of the models. Each figure includes two parts: (a) a comparative overview of the actual values, predicted values, and associated errors for the respective models; and (b) the distribution of the errors between tested and predicted values, illustrating the predictive accuracy of each method. Figure 10 presents the MLR and SVR results with their error distributions, Figure 11 the AdaBoost and Bagging results, and Figure 12 the RF and XGBoost results. These findings indicate that advanced machine learning models, particularly RF and XGBoost, significantly enhance predictive accuracy, highlighting their potential for practical applications in concrete strength prediction.
The dataset is divided into K equal subsets, or folds; each fold serves once as the test set while the remaining K-1 folds are used for training, ensuring comprehensive model validation. Table 3 presents the performance metrics of the ML algorithms used for predicting the CS of concrete. The metrics include MAE, MSE, R2, MAPE, RMSE, and accuracy. Among the models evaluated, Linear Regression (LR) exhibited the highest MAE (8.2) and MSE (92.1), indicating less precise predictions compared to more advanced techniques.
In contrast, XGBoost (Optimized) demonstrated the best performance with the lowest MAE (2.2) and MSE (10.5), achieving a high R2 value of 0.94, which indicates a strong correlation between predicted and actual values. This model also reported the lowest MAPE (8.5) and RMSE (3.25), suggesting superior predictive accuracy. The Random Forest and Bagging models also performed well, with R2 values of 0.91 and 0.90, respectively, indicating their effectiveness in capturing the variability in the data. Overall, the results highlight that optimized ensemble approaches, particularly XGBoost, significantly outperform traditional models like LR, confirming their efficacy in predicting complex relationships in concrete compressive strength.
Table 4 presents an evaluation of the ML models for predicting CS across different strength ranges of concrete. The models assessed include LR, SVR, AdaBoost, Bagging, Random Forest, and XGBoost. The performance of each model is characterized by metrics such as MAE, MSE, R2, MAPE, RMSE, and accuracy. For the strength range of 5–30 MPa, XGBoost (Optimized) showed the best performance, with an MAE of 2.52 MPa and R2 of 0.74, categorized as competent.
In contrast, Linear Regression performed inadequately in this range, with an MAE of 4.12 MPa and an R2 of only 0.42. In the medium strength range of 30–55 MPa, XGBoost excelled again with an MAE of 2.19 MPa and was deemed highly effective. The performance of SVR and Random Forest was moderate to reliable, while Linear Regression remained marginal. For the highest strength range of 55–85 MPa, models like AdaBoost and Random Forest delivered adequate predictions, with higher errors than in the lower ranges. Overall, advanced ensemble methods, particularly XGBoost, consistently outperformed traditional methods across the strength categories, as shown in Figure 13.
The analysis of the concrete mix components reveals their influence on strength development during curing, as shown in Figure 14. Curing age, ranging from 1 to 400 days, is critical: prolonged curing typically enhances compressive strength, particularly with a constant cement content of 290.75 and controlled additions of superplasticizer (6.50) and fly ash (85.50). The cement range of 140 to 600 kg/m3 indicates that higher cement content can improve strength but may yield diminishing returns beyond an optimal point. The water-cement ratio remains vital, with water content between 130 and 230 kg/m3 significantly affecting workability and strength; excess water reduces strength. Aggregates also play a crucial role: maintaining coarse aggregate within 800 to 1150 kg/m3 and fine aggregate between 600 and 950 kg/m3 ensures adequate packing and minimizes voids, contributing to the overall durability and performance of the concrete mix.
Traditional statistical techniques, such as multiple linear regression, often rely on predefined assumptions about data distribution and linearity, which may limit their predictive accuracy for complex, nonlinear relationships in concrete mix designs. In contrast, the proposed machine learning methods, including XGBoost, Random Forest, and AdaBoost, effectively handle nonlinearities, interactions between variables, and large datasets without requiring strict assumptions. For instance, XGBoost achieved an R2 of 0.94 and an MAE of 2.2, significantly outperforming traditional methods in predictive accuracy. This demonstrates the superiority of advanced machine learning models in capturing intricate relationships in concrete composition and strength predictions.
The parametric analysis illustrated in Figure 15 shows that age, cement, and fly ash play a crucial role in predicting concrete strength. Variations in curing age, along with the proportions of cement and fly ash, significantly affect the compressive strength (CS) of concrete, so these parameters need careful consideration during mix design to optimize strength effectively. The findings of this study align with the existing literature, which demonstrates the critical roles of curing period, binder, and pozzolan in influencing the CS of concrete. The literature also emphasizes the importance of optimizing these input parameters to enhance the mechanical properties of concrete, noting that even slight variations can significantly affect strength outcomes. It further supports the notion that the interplay between cement and fly ash can lead to improved durability and strength characteristics, particularly in sustainable construction practices. Table 5 shows information gathered through parametric evaluation.
Investigating the role of curing period, binder, and pozzolan through parametric analysis in predicting concrete strength.
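A parametric analysis of this type varies one input at a time while holding the others at fixed levels. The sketch below illustrates the procedure under stated assumptions: `sweep_parameter` is an illustrative helper, the baseline values for cement, superplasticizer, and fly ash are those quoted in the text, the baseline age of 28 days is assumed, and `toy_predict` is a made-up placeholder standing in for a fitted model. It is not the model from this study and its coefficients carry no physical meaning.

```python
def sweep_parameter(predict, baseline, name, values):
    """Vary one input over `values` while holding the others at `baseline`."""
    curve = []
    for value in values:
        mix = dict(baseline)  # copy so the baseline is never mutated
        mix[name] = value
        curve.append((value, predict(mix)))
    return curve


# Baseline levels quoted in the text; the age of 28 days is an assumption.
baseline = {"cement": 290.75, "superplasticizer": 6.50, "fly_ash": 85.50, "age": 28}


def toy_predict(mix):
    # Placeholder only: NOT the fitted model, coefficients are arbitrary.
    return 0.05 * mix["cement"] + 0.02 * mix["fly_ash"] + 0.1 * mix["age"]


age_curve = sweep_parameter(toy_predict, baseline, "age", [1, 28, 90])
```

In practice, `predict` would wrap the trained regressor, and the resulting list of (value, prediction) pairs would be plotted for each parameter in turn.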
Generic Boosting is an ensemble learning technique that combines weak learners by iteratively refining model predictions [41]. It minimizes errors by focusing on misclassified or poorly predicted data points, assigning higher weights to them in subsequent iterations. The method aggregates predictions from multiple models, each correcting its predecessor’s mistakes. Boosting algorithms, such as AdaBoost and XGBoost, employ this principle to achieve higher accuracy [42]. XGBoost further optimizes boosting by using regularization to prevent overfitting and parallel computation for efficiency. In this study, Generic Boosting demonstrated its effectiveness in modeling complex relationships within the concrete mix dataset [43].
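The principle that each learner corrects its predecessor's mistakes can be illustrated with a deliberately simplified sketch: gradient boosting for squared error, in which one-split decision stumps are fitted to the current residuals. This is not XGBoost (it has no regularization or second-order terms) and it does not use AdaBoost's sample reweighting. The function names, the single input feature, and the default settings are all assumptions made for illustration.

```python
def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    candidates = sorted(set(x))[:-1]  # exclude the maximum so the right side is non-empty
    if not candidates:
        raise ValueError("x needs at least two distinct values")
    best = None
    for threshold in candidates:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps sequentially, each one to the residuals left by the ensemble so far."""
    base = sum(y) / len(y)  # start from the mean prediction
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        pred = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, pred)
        ]
    return base, learning_rate, stumps


def boosted_predict(model, xi):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if xi <= threshold else right for threshold, left, right in stumps
    )
```

With a single feature such as curing age, `boost(ages, strengths)` returns a model whose training error falls as rounds are added. The learning rate shrinks each correction, which is one common way to limit overfitting.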
The study demonstrated that machine learning models, especially XGBoost, outperformed traditional statistical methods, achieving an R2 of 0.94 and an MAE of 2.2. These results highlight the ability of machine learning to capture complex, nonlinear relationships between concrete mix components, such as cement, fly ash, water, superplasticizer, aggregates, and sample age. The use of ensemble models like Random Forest and AdaBoost further strengthened the predictive accuracy by reducing overfitting and enhancing generalization. Additionally, the study confirmed that the optimal composition of a concrete mix can be better understood and predicted through machine learning, providing insights for improving material performance and optimizing concrete mix designs in civil engineering applications. These findings suggest that machine learning models can play a pivotal role in advancing materials science and construction practices.
Earlier studies further corroborate these results, indicating that age is a pivotal factor in the development of concrete strength over time and that understanding the curing process is essential for predicting CS accurately. Other work has examined the role of supplementary cementitious constituents, such as fly ash, in mitigating environmental impact while also enhancing concrete performance, which is particularly relevant given the increasing focus on sustainable building materials. Moreover, there is evidence that a well-balanced mix design incorporating fly ash can significantly enhance compressive strength, especially when paired with adequate curing times. These findings reinforce the need for a thorough understanding of how these materials interact to achieve optimal results. Collectively, these studies underline the importance of a parametric approach in analyzing concrete’s properties, providing a foundation for future research to explore innovative mix designs and their practical applications in construction.
5. CONCLUSION
This study investigated the influence of various input parameters—curing age, cement content, fly ash, water, superplasticizer, coarse aggregate, and fine aggregate—on the compressive strength (CS) of concrete. The analysis revealed that curing age and the proportion of cement and fly ash significantly impact CS, highlighting their critical roles in concrete performance. The results indicate that increasing curing age leads to a marked enhancement in CS, supporting findings from previous literature that emphasize the importance of time in strength development. Optimal cement content was found to be crucial, with an increase in strength observed as cement content varied from 140 to 600 kg/m3. The study established a direct correlation between the addition of fly ash and improved CS, particularly when used in conjunction with a well-designed mix of cement and aggregates. Performance metrics of machine learning models applied to predict CS demonstrated varying levels of accuracy and efficiency.
The XGBoost model (optimized) achieved the highest accuracy of 92%, with a mean absolute error (MAE) of 2.2 MPa, while Random Forest and AdaBoost also displayed competitive performances, reinforcing the effectiveness of advanced machine learning techniques in concrete strength prediction. Notably, the models’ performances varied across strength ranges, indicating that specific algorithms may be better suited to different conditions, thus allowing for more tailored applications in practical scenarios. In conclusion, the study emphasizes the significance of a thorough understanding of the interplay between input parameters for predicting the CS of concrete. The combination of ML models offers a promising approach for enhancing predictive accuracy, paving the way for innovative applications in concrete technology. Future research should focus on refining these predictive models and exploring additional variables to further enhance concrete performance and sustainability in construction practices.
6. BIBLIOGRAPHY
[1] ZHENG, W., ZAMAN, A., FAROOQ, F., et al, “Sustainable predictive model of concrete utilizing waste ingredient: individual alogrithms with optimized ensemble approaches”, Materials Today. Communications, v. 35, pp. 105901, 2023. doi: http://doi.org/10.1016/j.mtcomm.2023.105901.
[2] ALADEJARE, A.E., ALOFE, E.D., ONIFADE, M., et al, “Empirical estimation of uniaxial compressive strength of rock: database of simple, multiple, and artificial intelligence-based regressions”, Geotechnical and Geological Engineering, v. 39, n. 6, pp. 4427–4455, 2021. doi: http://doi.org/10.1007/s10706-021-01772-5.
[3] ALADEJARE, A.E., AKEJU, V.O., WANG, Y., “Data-driven characterization of the correlation between uniaxial compressive strength and Youngs’ modulus of rock without regression models”, Transportation Geotechnics, v. 32, pp. 100680, 2022. doi: http://doi.org/10.1016/j.trgeo.2021.100680.
[4] SRINIVASAN, S.S., MUTHUSAMY, N., ANBARASU, N.A., “The structural performance of fiber-reinforced concrete beams with nanosilica”, Matéria (Rio de Janeiro), v. 29, n. 3, pp. e20240194, 2024. doi: http://doi.org/10.1590/1517-7076-rmat-2024-0194.
[5] AHMAD, A., AHMAD, W., CHAIYASARN, K., et al, “Prediction of geopolymer concrete compressive strength using novel machine learning algorithms”, Polymers, v. 13, n. 19, pp. 3389, 2021. doi: http://doi.org/10.3390/polym13193389. PubMed PMID: 34641204.
[6] VARUTHAIYA, M., PALANISAMY, C., SIVAKUMAR, V., et al, “Concrete with sisal fibered geopolymer: a behavioral study”, Journal of Ceramic Processing Research, v. 23, n. 6, pp. 912–919, 2022.
[7] AHMED, A.H.A., JIN, W., MOSAAD, A.H.A., “Artificial intelligence models for predicting mechanical properties of recycled aggregate concrete (RAC): critical review”, Journal of Advanced Concrete Technology, v. 20, n. 6, pp. 404–429, 2022. doi: http://doi.org/10.3151/jact.20.404.
[8] THIKE, P.H., ZHAO, Z., SHI, P., et al, “Significance of artificial neural network analytical models in materials’ performance prediction”, Bulletin of Materials Science, v. 43, n. 1, pp. 211, 2020. doi: http://doi.org/10.1007/s12034-020-02154-y.
[9] ADEBAYO, J., “Towards effective tools for debugging machine learning models”, PhD diss., Massachusetts Institute of Technology, Cambridge, 2022.
[10] ADEBAYO, P., JATHUNGE, C.B., DARBANDI, A., et al, “Development, modeling, and optimization of ground source heat pump systems for cold climates: a comprehensive review”, Energy and Buildings, v. 320, pp. 114646, 2024. doi: https://doi.org/10.1016/j.enbuild.2024.114646.
[11] NAVEEN KUMAR, S., NATARAJAN, M., NAVEEN ARASU, A., “A comprehensive microstructural analysis for enhancing concrete’s longevity and environmental sustainability”, Journal of Environmental Nanotechnology, v. 13, n. 2, pp. 368–376, 2024. doi: http://doi.org/10.13074/jent.2024.06.242584.
[12] KINATTINKARA, S., ARUMUGAM, T., SAMIAPPAN, N., et al, “Deriving an alternative energy using anaerobic co-digestion of water hyacinth, food waste, and cow manure”, Journal of Renewable Energy and Environment, v. 10, n. 1, pp. 19–25, 2023.
[13] ADEWUYI, A.Y., ADEBAYO, K.B., ADEBAYO, D., et al, “Application of big data analytics to forecast future waste trends and inform sustainable planning”, World Journal of Advanced Research and Reviews, v. 23, n. 1, pp. 2469–2479, 2024. doi: http://doi.org/10.30574/wjarr.2024.23.1.2229.
[14] KHAJEHZADEH, M., KEAWSAWASVONG, S., MOTAHARI, M.R., et al, “Effective machine-learning models for rock mass deformation modulus estimation based on rock mass classification systems”, Engineering and Science, v. 29, n. 1120, pp. 1120, 2024. doi: http://doi.org/10.30919/es1120.
[15] CHEN, Y., GUO, J., HUANG, J., et al, “A novel method for financial distress prediction based on sparse neural networks with L 1/2 regularization”, International Journal of Machine Learning and Cybernetics, v. 13, n. 7, pp. 2089–2103, 2022. doi: http://doi.org/10.1007/s13042-022-01566-y. PubMed PMID: 35492262.
[16] WANG, F., WONG, W.-K., WANG, Z., et al, “Emerging pathways to sustainable economic development: An interdisciplinary exploration of resource efficiency, technological innovation, and ecosystem resilience in resource-rich regions”, Resources Policy, v. 85, pp. 103747, 2023. doi: http://doi.org/10.1016/j.resourpol.2023.103747.
[17] OTCHERE, D.A. (ed), Data science and machine learning applications in subsurface engineering, Boca Raton, CRC Press, 2024.
[18] KHAJEHZADEH, M., KEAWSAWASVONG, S., “Predicting slope safety using an optimized machine learning model”, Heliyon, v. 9, n. 12, pp. e23012, 2023. doi: http://doi.org/10.1016/j.heliyon.2023.e23012. PubMed PMID: 38076160.
[19] PARTHASAARATHI, R., BALASUNDARAM, N., NAVEEN ARASU, A., “A stiffness analysis of treated and non-treated meshed coir layer fibre reinforced cement concrete”, AIP Conference Proceedings, v. 2861, pp. 050002, 2023. doi: http://doi.org/10.1063/5.0158672.
[20] MEHRAJ, N., MATEU, C., CABEZA, L.F., “Use of artificial intelligence methods in designing thermal energy storage tanks: a bibliometric analysis”, Journal of Energy Storage, v. 97, pp. 112794, 2024. doi: http://doi.org/10.1016/j.est.2024.112794.
[21] NG, W.L., CHAN, A., ONG, Y.S., et al, “Deep learning for fabrication and maturation of 3D bioprinted tissues and organs”, Virtual and Physical Prototyping, v. 15, n. 3, pp. 340–358, 2020. doi: http://doi.org/10.1080/17452759.2020.1771741.
[22] NAVEEN ARASU, A., MUTHUSAMY, N., NATARAJAN, B., et al, “Optimization of high performance concrete composites by using nano materials”, Research on Engineering Structures and Materials, v. 9, n. 3, pp. 843–859, 2023. doi: http://doi.org/10.17515/resm2022.602ma1213.
[23] SUN, Q., CHEN, H., WANG, Y., et al, “Does environmental carbon pressure lead to low-carbon technology innovation? Empirical evidence from Chinese cities based on satellite remote sensing and machine learning”, Computers & Industrial Engineering, v. 189, pp. 109948, 2024. doi: http://doi.org/10.1016/j.cie.2024.109948.
[24] MAVI, K., NEDA, K.B., FULFORD, R., et al, “Forecasting project success in the construction industry using adaptive neuro-fuzzy inference system”, International Journal of Construction Management, v. 24, n. 14, pp. 1550–1568, 2023. doi: https://doi.org/10.1080/15623599.2023.2266676.
[25] PARTHASAARATHI, R., BALASUNDARAM, N., NAVEEN ARASU, A., et al, “Analysing the Impact and Investigating Coconut Shell Fiber Reinforced Concrete (CSFRC) under Varied Loading Conditions”, Journal of Advanced Research in Applied Sciences and Engineering Technology, v. 35, n. 1, pp. 106–120, 2024. doi: https://doi.org/10.37934/araset.35.1.106120.
[26] HOSSEINI, S., KHATTI, J., TAIWO, B.O., et al, “Assessment of the ground vibration during blasting in mining projects using different computational approaches”, Scientific Reports, v. 13, n. 1, pp. 18582, 2023. doi: http://doi.org/10.1038/s41598-023-46064-5. PubMed PMID: 37903881.
[27] KOMADJA, G.C., PRADHAN, S.P., OLUWASEGUN, A.D., et al, “Geotechnical and geological investigation of slope stability of a section of road cut debris-slopes along NH-7, Uttarakhand, India”, Results in Engineering, v. 10, pp. 100227, 2021. doi: http://doi.org/10.1016/j.rineng.2021.100227.
[28] SAMEK, W., MONTAVON, G., LAPUSCHKIN, S., et al, “Explaining deep neural networks and beyond: a review of methods and applications”, Proceedings of the IEEE, v. 109, n. 3, pp. 247–278, 2021. doi: http://doi.org/10.1109/JPROC.2021.3060483.
[29] NAVEEN ARASU, A., RANJINI, D., PRABHU, R., “Investigation on partial replacement of cement by GGBS”, Journal of Critical Reviews, v. 7, n. 17, pp. 3827–3831, 2020.
[30] SAMEK, W., MONTAVON, G., LAPUSCHKIN, S., et al, “Explaining deep neural networks and beyond: a review of methods and applications”, Proceedings of the IEEE, v. 109, n. 3, pp. 247–278, 2021. doi: http://doi.org/10.1109/JPROC.2021.3060483.
[31] DENG, W.H., NAGIREDDY, M., MICHELLE, S.A.L., et al, “Exploring how machine learning practitioners (try to) use fairness toolkits”, Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, pp. 473–484. 2022. doi: http://doi.org/10.1145/3531146.3533113.
[32] MOHAMMADPOUR, A., KAZEMI, A., BAGHAPOUR, M.A., et al, “Bioengineered FeZn/GA@ Cu nanocomposite utilizing spent coffee ground extract and gum arabic: enhanced nitrate removal via (RSM) and machine learning optimization”, International Journal of Biological Macromolecules, v. 277, n. Pt 2, pp. 134060, 2024. doi: http://doi.org/10.1016/j.ijbiomac.2024.134060. PubMed PMID: 39097464.
[33] ADEWUYI, O.B., FOLLY, K.A., DAVID, T.O., et al, “Power system voltage stability margin estimation using adaptive neuro-fuzzy Inference system enhanced with particle swarm optimization”, Sustainability, v. 14, n. 22, pp. 15448, 2022. doi: http://doi.org/10.3390/su142215448.
[34] KADHAR, S.A., GOPAL, E., SIVAKUMAR, V., et al, “Optimizing flow, strength, and durability in high-strength self-compacting and self-curing concrete utilizing lightweight aggregates”, Matéria (Rio de Janeiro), v. 29, n. 1, pp. e20230336, 2024. doi: http://doi.org/10.1590/1517-7076-rmat-2023-0336.
[35] HAN, S., ZHAO, S., LU, D., et al, “Performance improvement of recycled concrete aggregates and their potential applications in infrastructure: a review”, Buildings, v. 13, n. 6, pp. 1411, 2023. doi: http://doi.org/10.3390/buildings13061411.
[36] SHANKAR, S., NATARAJAN, M., ARASU, A., “Exploring the strength and durability characteristics of high-performance fibre reinforced concrete containing nanosilica”, Journal of the Balkan Tribological Association, v. 30, n. 1, pp. 142–152, 2024.
[37] ASHRAF, U., ZHANG, H., ANEES, A., et al, “An ensemble-based strategy for robust predictive volcanic rock typing efficiency on a global-scale: a novel workflow driven by big data analytics”, The Science of the Total Environment, v. 937, pp. 173425, 2024. doi: http://doi.org/10.1016/j.scitotenv.2024.173425. PubMed PMID: 38795994.
[38] VIVEK, S., PRIYA, V., SUDHARSAN, S.T., et al, “Experimental investigation on bricks by using cow dung, rice husk, egg shell powder as a partial replacement for fly ash”, The Asian Review of Civil Engineering, v. 9, n. 2, pp. 1–7, 2020. doi: http://doi.org/10.51983/tarce-2020.9.2.2556.
[39] SAINI, S.K., MAHATO, S., PANDEY, D.N., et al, “Modeling flood susceptibility zones using hybrid machine learning models of an agricultural dominant landscape of India”, Environmental Science and Pollution Research International, v. 30, n. 43, pp. 97463–97485, 2023. doi: http://doi.org/10.1007/s11356-023-29049-9. PubMed PMID: 37594709.
[40] ESPINO, M.T., TUAZON, B.J., ESPERA JR, A.H., et al, “Statistical methods for design and testing of 3D-printed polymers”, MRS Communications, v. 13, n. 2, pp. 193–211, 2023. doi: http://doi.org/10.1557/s43579-023-00332-7. PubMed PMID: 37153534.
[41] APPADURAI, A.S., SUNDARESAN, A.A., NAMMALVAR, A., “Mechanical characterization and durability studies on concrete developed with M-Sand and River Sand”, Matéria (Rio de Janeiro), v. 29, n. 4, pp. e20240404, 2024. doi: http://doi.org/10.1590/1517-7076-rmat-2024-0404.
[42] SAKTHIVEL, S., PALANIRAJ, S., PARAMASIVAM, R., et al, “Optimizing concrete strength with tapioca peel ash: a central composite design approach”, Matéria (Rio de Janeiro), v. 29, n. 4, pp. e20240422, 2024. doi: http://doi.org/10.1590/1517-7076-rmat-2024-0422.
[43] AGOR, C.D., MBADIKE, E.M., ALANEME, G.U., “Evaluation of sisal fiber and aluminum waste concrete blend for sustainable construction using adaptive neuro-fuzzy inference system”, Scientific Reports, v. 13, n. 1, pp. 2814, 2023. doi: http://doi.org/10.1038/s41598-023-30008-0. PubMed PMID: 36797414.
Publication Dates
Publication in this collection: 21 Mar 2025
Date of issue: 2025
History
Received: 13 Nov 2024
Accepted: 04 Jan 2025