Modeling quality, satisfaction and perceived crowding in public healthcare: a study with low-income Brazilian patients

Abstract This paper is one of the first to propose a research model to assess the quality and satisfaction perceived by low-income Brazilian patients using public healthcare services, under the moderating effect of perceived crowding. The model was evaluated using data obtained from 417 patients, and its proposed relationships were tested through Structural Equation Modeling using a Partial Least Squares approach (PLS-SEM). To explore the moderating effect of perceived crowding, the sample was divided into two groups and tested by employing multi-group analysis (MGA). The results show that Reliability, Assurance, Tangibles, Empathy and Responsiveness have a positive effect on Perceived Healthcare Quality (PHQ), which, in turn, has a marked effect on patient satisfaction (PS); that is, PHQ acts as an antecedent of PS. However, perceived crowding has no moderating effect on any relationship in the model.


Introduction
Globalization has changed the business environment by increasing competitiveness and attributing greater relevance to the service sector. Adequate service management directly impacts the relationship between consumers and organizations, thus converting service quality into a source of differentiation and competitive advantage for companies (Saueressig et al., 2021; Sun & Pang, 2017).
Assessing the quality of healthcare services is essential, as it allows the differences between customer expectations and perceptions to be understood, in addition to developing improvement strategies (Verleye et al., 2017; Choi et al., 2004). When patients perceive a hospital's service to be of good quality, its financial indicators, such as earnings and net value, increase by between 17% and 27% (Naidu, 2009; Romano & Mutter, 2004).
The SERVQUAL and the SERVPERF scales are commonly used for this assessment, as they evaluate the service quality dimensions proposed by Parasuraman et al. (1988): reliability; assurance; tangibles; responsiveness; and empathy. The first scale analyzes the difference between expectations and performance of the services provided, whereas the second focuses on service provider performance (Teshnizi et al., 2018; McFadyen et al., 2001). Although SERVPERF is statistically superior and easier to apply (Rodrigues et al., 2011; Carrillat et al., 2007; Brady et al., 2002), there is still no unanimity among researchers and healthcare professionals about its use (Akdere et al., 2020; Shafei et al., 2019).
Recent literature illustrates the application of this scale in several countries. Shafei et al. (2019) compared quality scales and concluded that the Weighted SERVPERF was the most appropriate one for the Egyptian context. Akdere et al. (2020) collected data in a Turkish hospital to verify the predictive power of SERVPERF to assess the overall quality of service provision by using logistic regression. Also in Turkey, more recently, the SERVPERF model was adapted to hospital service performances during the COVID-19 pandemic, proposing a model with new dimensions (Erdogan & Ayyildiz, 2022). Another advance in the literature on SERVPERF was the joint use of multicriteria methods such as the AHP (Alp et al., 2022). The present research differs methodologically from previous studies, however, by applying Structural Equation Modeling (SEM) with multi-group analysis (MGA) to assess patient satisfaction. Other important differences are its application in the Brazilian context, which has peculiarities, and the assessment of the moderating effect of crowding on the relationships studied.
The Brazilian Unified Health System (BUHS) is one of the largest and most comprehensive in the world, offering services which vary from blood pressure measurement to organ transplantation. It consumes about 45% of the resources allocated to health and serves 80% of the Brazilian population, approximately 150 million people, 51.4% of whom belong to the lower-income segment (Boccolini & Souza, 2016).
Although private health institutions complement its services, the increase in demand for BUHS has not been supported by the necessary investment in infrastructure to supply it, which, in turn, has caused a decrease in service quality (Massuda et al., 2018; Almeida et al., 2000). Therefore, studies that assess the perception low-income patients have of quality and satisfaction are essential to determine which dimensions constitute public healthcare service quality and identify whether they can be improved.
It is important to remark that the effects of crowding, which are widely studied in retailing, have become a concern for researchers and healthcare professionals, as they pose a barrier to providing services in an effective and efficient way (Sawang et al., 2019). However, the study of crowding in healthcare has mostly been restricted to measuring the duration of service providers' internal processes and suggesting ways to minimize them (van der Linden et al., 2019; Wang et al., 2017; Boyle et al., 2015). Thus, we believe that our study is one of the first to combine the perception low-income patients have of public healthcare services with the SERVPERF framework, in order to show how these elements influence patient satisfaction under the moderating influence of perceived crowding.
Moreover, the COVID-19 outbreak has accentuated the dangers of crowded hospital settings, especially in emergency departments (Ferraz et al., 2021; Lin et al., 2020; Whiteside et al., 2020; Woodworth, 2020). Thus, it is now even more important to understand the effects of crowding and how low-income patients perceive them.
Therefore, the main objective of this paper is to answer the following research question: "What are the relationships between service quality and patient satisfaction under the moderating effect of perceived crowding in public healthcare?" In order to answer it, we developed a research model with the aforementioned constructs, which was evaluated using data obtained from 417 low-income patients who answered a structured questionnaire, administered in loco in a Brazilian Basic Healthcare Unit (BHU). We then employed Structural Equation Modeling using a Partial Least Squares approach (PLS-SEM), and multi-group analysis (MGA) to assess the integrity of the model. Structural Equation Modeling is the appropriate technique to investigate relationships between latent variables, that is, those that cannot be measured directly, but through a set of indicators (Hair et al., 2022). Specifically, PLS-SEM was adopted because it is suitable for exploratory research, does not require assumptions about data distribution, and is suitable for smaller sample sizes (Hair et al., 2021).
The remainder of this study is organized as follows. We introduce the research model and hypotheses in section 2, followed by the method in section 3. We present the results in section 4 and discuss them in section 5. The paper concludes by pinpointing both theoretical and managerial implications and important research avenues in section 6.

The low-income markets and healthcare
Since the 1970s, there have been differences over the definition of low-income markets. On the one hand, there are researchers who characterize such markets from a behavioral perspective, defining them as composed of consumers whose behaviors and values are conservative, affected by low self-esteem and aggravated by the prejudice of which they are victims (Barki & Parente, 2010).
On the other hand, there are researchers who characterize these markets from an economic perspective, stating that people in low-income markets subsist on tight budgets, frequently of less than $2.00 per day, which do not allow them to try out new products (Heckman & Hanna, 2015). Due to the plurality of the definitions and the absence of a consensus, we have adopted the characterization from an economic perspective in this study.
In numerous societies, there are individuals and groups placed in prominent economic positions, having better privileges and opportunities to enjoy better healthcare services than their low-income counterparts. However, for individuals in low-income markets, access to healthcare is not a straightforward concept, as their health decisions are often subject to a series of constraints, such as affordability, accessibility, availability of services at the healthcare settings, and a lack of quality healthcare providers (Guimarães et al., 2019; Haenssgen & Ariana, 2016). According to Banerjee et al. (2004), there is a positive correlation between health and wealth. They state that greater health risks are associated with low-income markets, as individuals are less likely to be able to pay for health plans, medicines and treatment for chronic diseases. Bhattacharjee et al. (2017) state that low-income communities have become captives of the healthcare system to which they are allowed access, which is generally stigmatized as of poor quality and unattractive to wealthier patients. This creates a negative perception of these services, leading low-income patients to shy away from receiving any kind of treatment. This perception is important because most people on low incomes have restricted access to private healthcare services and receive their treatment in public establishments, which suffer from a lack of resources and struggle to meet the high demand for healthcare. This situation creates a crowded environment in which long queues are usual and there is little sympathy between doctors and patients (Archibong et al., 2020).

Quality in healthcare
Patients have three levels of expectations regarding service quality in healthcare: basic expectations, in which the service is expected to be reliable, competent and safe; focus on the human being, in which the service is expected to be fast, accessible and individualized; and infrastructure, in which the health professionals are expected to be competent, honest and responsible, providing correct diagnoses and treatments (Babakus & Mangold, 1992; Donabedian et al., 1982).
As with most services, assessing quality in healthcare is hampered by the intrinsic characteristics of a service: intangibility, heterogeneity and inseparability (Parasuraman et al., 1988). Therefore, patients have an important role to play in service evaluation, given that they are active participants in a process which is influenced by their actions and emotions (Kim, 2019; Osei-Frimpong & Owusu-Frimpong, 2017).
However, patients are unable to accurately assess the technical service quality provided by doctors, because this analysis is based almost exclusively on the functional quality. In short, the perception of quality of these services is complex and defined by how well they meet or exceed patient expectation (Kansra & Jha, 2016; Walton & Hume, 2012).
Over the years, health institutions have improved their quality measurement methods and made their employees more aware of the importance of the work they do in order to correct deficiencies and add value and humanization for their patients (Silva & Fernandes, 2019; Choi et al., 2004).
Such improvement is even more relevant in developing countries, whose public healthcare systems serve low-income communities and face a set of barriers related to service quality management, such as: the lack of commitment of senior managers; empowerment of certain employees who consider themselves the only ones capable of assessing the quality of given services; and limited resources to invest in such practices (Bernardo et al., 2022; Verleye et al., 2017; Erickson & Andrews, 2011).
The state of the art includes studies which use the SERVQUAL (Andaleeb, 2008; van Duong et al., 2004; Baltussen et al., 2002; Haddad et al., 1998) or the SERVPERF (Akdere et al., 2020; Narang, 2011) scales to assess the quality of healthcare in low-income communities. They are both based on the five dimensions by Parasuraman et al. (1988): Reliability; Assurance; Tangibles; Empathy and Responsiveness. Narang (2011), for instance, measured patient perception in public healthcare centers in the rural Indian context. The results showed that certain aspects, such as "the availability of medical equipment" and "the availability of medical care", were poorly evaluated. In addition, they pointed out the existence of a strong relationship between quality and demographic aspects, such as education, gender and income. Akdere et al. (2020), in turn, collected data from 972 outpatients in a Turkish hospital, located in one of the poorest regions of the country. The authors concluded that the patients were satisfied with the Assurance and Responsiveness of the hospital, but not with its Tangibles.
Moreover, according to Akdere et al. (2020), although it is easier to apply, the use of the SERVPERF scale is not yet universal among researchers and health professionals.The authors recommended its use be replicated in new studies in other developing countries, especially those whose public healthcare systems serve large low-income communities, such as BUHS.
Hence, this study seeks to assess both the reliability and validity of a SERVPERF scale, adapted for the Brazilian public healthcare context, by measuring how low-income patients perceive healthcare quality. To achieve this, we formulated the first hypothesis (H1), which was developed into five sub-hypotheses:
H1a: Reliability has a positive effect on perceived healthcare quality.
H1b: Assurance has a positive effect on perceived healthcare quality.
H1c: Tangibles have a positive effect on perceived healthcare quality.
H1d: Empathy has a positive effect on perceived healthcare quality.
H1e: Responsiveness has a positive effect on perceived healthcare quality.

Patient satisfaction
In healthcare, the concept of a quality-oriented public administration has turned patients into the main agents in evaluating health services. Regarding the BUHS, the use of satisfaction surveys to evaluate health services has gained prominence since the 1990s, when reforms which aimed to develop a patient-centered culture began (Passero et al., 2018; Hollanda et al., 2012). However, many researchers have questioned the adoption of traditional satisfaction surveys to evaluate public health services, especially in developing countries, stating that the gratitude which certain patients may feel after being treated makes these inquiries biased (Silva et al., 2018; Rathert et al., 2015). Rathert et al. (2015) state that satisfaction surveys tend to favor the respondent's expectations instead of the actual assessment of the health services. According to Silva et al. (2018), a low expectation towards the service tends to result in higher satisfaction whereas a high degree of demand tends to lead to less satisfaction with the service, as it is more difficult to meet high expectations.
In this scenario, Russell et al. (2015) highlight that patients' perceptions of their care, in addition to typical clinical indicators, are important tools for assessing quality in healthcare. Pai et al. (2018) stated that typical satisfaction surveys often do not capture elements of healthcare that patients have said are the most important to them and, thus, these measures have had limited utility for actually improving the quality of healthcare delivery. Moreover, Wang et al. (2019) concluded that patient evaluation measures go beyond traditional satisfaction surveys and capture aspects of patient perception, as the latter may have effects long after the clinical visit.
Thus, we have adopted a patient perception survey in our study in order to distinguish it from traditional satisfaction surveys, which are currently adopted by the BUHS, and to make the results more effective. Furthermore, we have used it to assess not only patient satisfaction, but their perception of quality and crowding as well. The concept of patient satisfaction is presented below.
Patient satisfaction is defined as the assessment of different dimensions of healthcare services, in which the cure is the expected main goal (Badri et al., 2009). This evaluation is intrinsically linked to service quality and improves both the image of a hospital and its earnings (Naidu, 2009; Romano & Mutter, 2004). According to Lien et al. (2014), communication is a key factor for patient satisfaction: if it provides information about the service that will be performed, it helps to reduce uncertainty about patient expectation regarding a given procedure, which, in turn, increases patient satisfaction. Soares & Farhangmehr (2015) stated that care, reliability, empathy, and responsiveness also have a direct effect on patient satisfaction.
Regarding public healthcare settings, Moliner (2009) stated that measuring the satisfaction of a patient is important and even more necessary in developing countries, where low-income patients experience services at a more emotional level. Moreover, Gallani et al. (2020) concluded that the disclosure of patient satisfaction performance has pivotal importance for hospital decision makers, especially those who manage public settings.
Hence, we formulated the second hypothesis (H2):
H2: Perceived healthcare quality has a positive effect on patient satisfaction.

Perceived crowding
Crowding is conceptualized as a multidimensional construct related to the human or spatial density of a given environment, being characterized from two perspectives: the objective and the subjective. The first refers to the number of people and the level of interaction between them, whereas the second is related to perception, which leads one to perceive an environment as crowded (Machleit et al., 2000). Perceived crowding, on the other hand, is defined as the emotional reaction caused by stress in a crowded place that might cause negative emotions (Dion, 2004).
There are many studies in the marketing literature about consumer behavior in crowded spaces, especially in retailing (Metha et al., 2013; Baker & Wakefield, 2012; Hui & Bateson, 1991). Nevertheless, the results of these studies are inconclusive when it comes to people's reactions to certain environments: some more sensitive individuals choose to leave a crowded place, while others, attracted by the crowding, conceive it as a way of social interaction (Wei et al., 2019; Noone & Mattila, 2009).
According to Noone et al. (2009), when one perceives the human or spatial density of a given environment as high, the negative effects of this perception tend to decrease the level of satisfaction. Pons et al. (2016) demonstrated that customers become less satisfied as this perception increases by using perceived crowding as a moderating variable in retail stores.
The study of crowding in healthcare has become a major concern for researchers and health professionals, as it poses a barrier to providing services in an effective and efficient way, diminishing overall quality. Furthermore, this area of study is still in the process of being developed (Sawang et al., 2019). The majority of the works published to date have focused on mitigating the effects of crowding, especially in emergency departments, through time studies and optimal resource allocation (van der Linden et al., 2019; Wang et al., 2017; Boyle et al., 2015).
Therefore, in order to approach crowding in healthcare from a new perspective and assess its moderating effect on both perceived quality and the satisfaction of patients from low-income communities, we formulated the third (H3) and fourth (H4) hypotheses of our study, presented below:
H3: The relationship between the dimensions of quality and perceived healthcare quality is significantly more negative when perceived crowding is high.
H4: The relationship between perceived healthcare quality and patient satisfaction is significantly more negative when perceived crowding is high.
Figure 1 outlines the proposed research model, in addition to the hypotheses this study has developed. It is important to note that Perceived Healthcare Quality (PHQ) is a second-order construct composed of the constructs which represent each SERVPERF dimension of quality. According to Hair et al. (2014a), second-order constructs are those whose covariance is explained by two levels of latent variables. Moreover, the proposed model is formative-reflective: while PHQ has a formative measurement, the other constructs (each dimension of quality and Patient Satisfaction (PS)) have reflective measurements (Duarte & Amaro, 2018).
Figure 2 outlines the previously presented research model, to which we have added Perceived Crowding (CROWD) to test its moderating effect on the relationship between each quality dimension and PHQ and on that between PHQ and PS, in addition to the new hypotheses developed in subsection 2.4.

Method
Our study was carried out in a BHU, a branch of the BUHS, which serves three low-income communities in a small town located in the interior of the state of São Paulo. We adopted a two-part structured questionnaire as the data collection instrument. The first part had three previously validated scales that measured the constructs of the research model: PHQ, PS and CROWD. The second part had sociodemographic questions to help profile the sample collected.
We have used a SERVPERF scale already adapted for the hospital context to measure PHQ (Narang, 2011), which assessed the five dimensions of service quality by means of 22 items.To measure PS, we used a three-item scale (Moliner, 2009) whereas to measure CROWD, an eight-item scale was used (Dion, 2004).
All the items used in this study were subjected to a process of reverse translation to ensure that they would be understood by Brazilian respondents. The first translation, from English to Portuguese, was carried out by the authors themselves while the second, from Portuguese to English, was carried out by a British English speaker, a teacher in a private language teaching unit. According to Wild et al. (2005), this method ensures greater validity and excellence of the translation. Table 1 presents a compilation of the original structures of the questionnaires used in this study, as well as their dimensions, items and authors.

Perceived Healthcare Quality (PHQ) (Narang, 2011)
Reliability
R3: The hospital provides a good clinical examination.
R4: The hospital staff is dependable.
R5: The hospital is organized.
R6: The hospital has accurate payment arrangements.
Assurance
A1: The hospital has adequate availability of doctors.
A2: The hospital provides its services honestly.
A3: The hospital assists in the cure and recovery of patients.
A4: The hospital provides its services by the time it promises to do so.
A5: The hospital has adequate availability of drugs.
A6: The hospital has adequate availability of doctors for women.
Tangibles
T1: The hospital has neat and clean premises.
T2: The hospital has adequate medical equipment.
T3: The hospital has adequate rooms.
T4: The hospital staff has clean appearance.
Empathy
E1: The hospital staff follows-up, monitors patients.
E2: The hospital staff shows sympathy and support for the patients.
E3: The hospital staff respects patients appropriately.
Responsiveness
RS1: The hospital has prompt care.
RS2: The hospital staff is always willing to help patients if they have any kind of doubts.
RS3: When requested for a drug, the hospital provides it.

Patient Satisfaction (PS) (Moliner, 2009)
PS1: I am satisfied.
PS3: Compared to other hospitals, the level of satisfaction has been high.

Perceived Crowding (CROWD) (Dion, 2004)
CROWD1: When the establishment is crowded, waiting time rises.
CROWD2: When the establishment is crowded, I am poorly attended by its employees.
CROWD3: When the establishment is crowded, the circulation in it is difficult.
CROWD4: When the establishment is crowded, I have to wait standing up to be served.
CROWD5: When the establishment is crowded, I feel uncomfortable.
CROWD6: When the establishment is crowded, I feel stuffy.
CROWD7: When the establishment is crowded, I feel unhappy.
CROWD8: When the establishment is crowded, I feel annoyed.

Furthermore, we graded the items using five-point Likert scales which ranged from a score of 1 for "Strongly Disagree", 2 for "Disagree", 3 for "Neither agree or disagree", 4 for "Agree" and 5 for "Strongly Agree".
In order to verify the patients' understanding of the items, we conducted two pre-test rounds in December 2019. The analysis of both rounds showed that the use of the term "Health Center" proved to be pivotal for the questionnaire to be completely understood by the respondents, given that users, when referring to this specific public healthcare setting, do not commonly adopt the term "BHU". Moreover, as none of the respondents answered item R6 ("The hospital has accurate payment arrangements"), claiming not to have knowledge of or access to the BHU's billing, we removed it from the final version of the questionnaire.
We also excluded item A6 ("The hospital has adequate availability of doctors for women") from the final version of the questionnaire because score 3 ("Neither agree or disagree") was always chosen by male respondents and because the content of this question is addressed more comprehensively by item A1 ("The hospital has adequate availability of doctors").
After both rounds of pre-test were carried out, we established a four-week period for the data collection: from the 6th to the 31st of January, 2020. It was done in two shifts: in the morning, from 7 a.m. to 12 p.m., and in the afternoon, from 1 p.m. to 5 p.m. Our intention was to obtain between 200 and 250 valid questionnaires for each of the application shifts, thus resulting in a final sample of between 400 and 500 responses.
Patients who had just been attended at the BHU were approached and asked to complete the data collection instrument in loco. This method, according to Hussain et al. (2019), ensures that the perception of the service provided is more accurately captured, as it is recorded a few moments after patients have been attended.
After the consent form was read to a respondent, who gave their verbal consent to participating in the study, the meaning of the Likert scale was explained. Each participant received a printed and laminated copy of the scale so that they could refer to it while the questionnaire was being conducted. We read the questions and asked each participant to verbalize the answer they wanted us to write down. The responses were recorded using a tablet, on which a digital version of the instrument had been installed, created using the Google Forms tool.
The sample itself is defined as being non-probabilistic, as not every member of the population has a known chance of answering the questionnaire, and the sampling technique adopted was by convenience (Malhotra & Dash, 2009). A total of 417 valid responses were obtained for the final sample. We considered as valid those questionnaires which had been completely answered by patients who had not given the same score for all items.
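The validity criterion described above (a fully answered questionnaire whose scores are not all identical) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name `is_valid_response`, the use of `None` for an unanswered item, and the sample rows are hypothetical and not taken from the study's data.

```python
from typing import Optional, Sequence

def is_valid_response(scores: Sequence[Optional[int]]) -> bool:
    """Return True if every item is answered on the 1-5 scale and
    the respondent did not give the same score to all items."""
    # An unanswered item is represented here as None (assumption).
    if any(s is None for s in scores):
        return False
    # Every answer must be a legal Likert score.
    if any(not 1 <= s <= 5 for s in scores):
        return False
    # Straight-lined questionnaires (one distinct score) are discarded.
    return len(set(scores)) > 1

# Hypothetical responses, for illustration only.
responses = [
    [4, 5, 3, 4, 2],     # complete and varied: kept
    [3, 3, 3, 3, 3],     # same score for all items: discarded
    [4, None, 5, 2, 1],  # incomplete: discarded
]
valid = [r for r in responses if is_valid_response(r)]
```

Applying the filter before any modeling keeps the exclusion rule explicit and reproducible.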
We used the G*Power 3.1 software to verify the sample size adequacy for the statistical analysis techniques that considered the Partial Least Squares method (PLS-SEM). Considering the inputs (moderate effect size (f²) of 0.15, power of the test (1-β) equal to 0.95, and 5 predictors), the results indicated that our final sample well exceeded the minimum required sample size of 107 responses, and was therefore considered adequate.
PLS-SEM is used when data is not normally distributed and the scales used in the research model are adapted from other previously developed and tested models (Hair et al., 2014a), meeting the requirements of the current study. According to the method proposed by Hair et al. (2011), the data analysis process has four stages: Descriptive Analysis, Measurement Model Analysis, Structural Model Analysis, and Moderator Variable Analysis. We used IBM SPSS 21 and SmartPLS 3.0 to conduct the analyses.
After characterizing the sample by means of the Descriptive Analysis, we performed the Measurement Model Analysis through the Confirmatory Factor Analysis (CFA), which verifies whether the model has a good fit. We also conducted a Variance Inflation Factor (VIF) analysis in order to ensure the absence of multicollinearity. VIF can be calculated by Equation 1 below (Hair et al., 2011, 2014a):

$$VIF_i = \frac{1}{1 - R_i^2} \quad (1)$$

where $R_i^2$ represents the unadjusted coefficient of determination for regressing the $i$-th independent variable on the remaining ones.
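As a brief worked illustration of the VIF formula (the numerical values are hypothetical and chosen only to show how the measure behaves), an indicator whose variance is 80% explained by the other indicators sits exactly at a VIF of 5, while one that is 50% explained has a VIF of 2:

```latex
R_i^2 = 0.80 \;\Rightarrow\; VIF_i = \frac{1}{1 - 0.80} = 5,
\qquad
R_i^2 = 0.50 \;\Rightarrow\; VIF_i = \frac{1}{1 - 0.50} = 2 .
```

Under this reading, requiring a VIF below 5 is equivalent to requiring that less than 80% of an indicator's variance be explained by the remaining indicators.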
Then, we assessed the reliability of the constructs using Composite Reliability (CR), which explains the total amount of the true score variance in relation to the total score variance and is calculated by Equation 2 below (Hair et al., 2014a):

$$CR = \frac{\left(\sum \lambda\right)^2}{\left(\sum \lambda\right)^2 + \sum \varepsilon} \quad (2)$$

where $\sum \lambda$ represents the sum of the factor loadings and $\sum \varepsilon$ is the sum of errors of measurements, also known as residual variance. We also assessed the convergent validity of the constructs through the Average Variance Extracted (AVE) by Equation 3 (Hair et al., 2014a):

$$AVE = \frac{\sum \lambda^2}{\sum \lambda^2 + \sum \varepsilon} \quad (3)$$

where $\lambda^2$ represents factor loading squared; $\sum \lambda^2$ indicates the sum of factor loadings squared; and $\sum \varepsilon$ is the sum of errors of measurements. Finally, we used the Fornell-Larcker criterion to assess the discriminant validity of the constructs, which is established if the condition presented by Equation 4 holds (Henseler et al., 2015):

$$\sqrt{AVE_{\xi_j}} > \max_{i \neq j} \left| r_{ij} \right| \quad (4)$$

where $r_{ij}$ is the correlation coefficient between the construct scores of constructs $\xi_i$ and $\xi_j$.
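To make the three measures concrete, the sketch below computes CR and AVE from a list of loadings and checks the Fornell-Larcker condition. It assumes standardized indicators, so that each error variance equals 1 minus the squared loading; the function names, the loadings and the correlation are hypothetical and are not the values reported in this study.

```python
from math import sqrt
from typing import Dict, List, Tuple

def composite_reliability(loadings: List[float]) -> float:
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    # Assumption: standardized indicators, so error variance = 1 - loading^2.
    errors = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + errors)

def average_variance_extracted(loadings: List[float]) -> float:
    """AVE = sum of squared loadings / (sum of squared loadings + sum of error variances)."""
    sum_squares = sum(l ** 2 for l in loadings)
    errors = sum(1 - l ** 2 for l in loadings)
    return sum_squares / (sum_squares + errors)

def fornell_larcker_holds(ave: Dict[str, float],
                          corr: Dict[Tuple[str, str], float]) -> bool:
    """True if sqrt(AVE) of each construct exceeds its correlations with the others."""
    for (a, b), r in corr.items():
        if sqrt(ave[a]) <= abs(r) or sqrt(ave[b]) <= abs(r):
            return False
    return True

# Hypothetical loadings for one three-item construct.
loadings = [0.7, 0.8, 0.6]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)

# Hypothetical AVE values and one inter-construct correlation.
ok = fornell_larcker_holds({"X": 0.50, "Y": 0.60}, {("X", "Y"): 0.55})
```

With standardized loadings the AVE reduces to the mean of the squared loadings, which is why a construct with several loadings below 0.7 can fall slightly under 0.5 while its CR remains comfortably above 0.6.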
In the first step of the Structural Model Analysis, we evaluated the statistical significance of the model relationships through the t-value and p-value. In the second, we assessed the Determination Coefficient (R²) of the structural model, which explains its predictive power. Thirdly, we analyzed the Validated Stone-Geisser Redundancy Measure (Q²) to assess the predictive power of each inner construct in the model (Henseler et al., 2009).
Next, we used the bias-corrected confidence interval technique, which is appropriate when the data collected are non-parametric, in order to analyze CROWD as a moderator variable for the model. The group-specific results of a path coefficient are significantly different if the bias-corrected confidence intervals do not overlap (Sarstedt et al., 2011). To perform the analysis, we used the MGA tool, part of SmartPLS, after splitting the sample into two groups: low and high crowding.
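The logic of comparing bias-corrected bootstrap intervals between groups can be sketched in a few lines. This is a simplified illustration, not the SmartPLS procedure: the statistic is a plain sample mean standing in for a path coefficient (in PLS-SEM the whole model would be re-estimated on every bootstrap sample), and the function names, the simulated group data and the seeds are hypothetical.

```python
import random
from statistics import NormalDist, mean
from typing import Callable, List, Sequence, Tuple

Interval = Tuple[float, float]

def bc_interval(sample: Sequence[float],
                statistic: Callable[[Sequence[float]], float],
                n_boot: int = 2000, alpha: float = 0.05,
                seed: int = 0) -> Interval:
    """Bias-corrected bootstrap confidence interval for `statistic`."""
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = statistic(sample)
    # Bootstrap distribution of the statistic (resampling with replacement).
    boots: List[float] = sorted(
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    # Bias-correction constant z0 from the share of estimates below theta_hat,
    # clamped away from 0 and 1 so the inverse normal CDF is defined.
    share_below = sum(b < theta_hat for b in boots) / n_boot
    share_below = min(max(share_below, 1 / n_boot), 1 - 1 / n_boot)
    nd = NormalDist()
    z0 = nd.inv_cdf(share_below)
    # Adjusted percentile levels for the lower and upper bounds.
    p_lo = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    p_hi = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))

    def pick(p: float) -> float:
        return boots[min(n_boot - 1, max(0, int(p * n_boot)))]

    return pick(p_lo), pick(p_hi)

def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Simulated (hypothetical) estimates for the low and high crowding groups.
data_rng = random.Random(42)
low_group = [data_rng.gauss(0.60, 0.20) for _ in range(80)]
high_group = [data_rng.gauss(0.65, 0.20) for _ in range(80)]

ci_low = bc_interval(low_group, mean, seed=1)
ci_high = bc_interval(high_group, mean, seed=2)
# Groups are treated as different only when the intervals do not overlap.
groups_differ = not intervals_overlap(ci_low, ci_high)
```

The non-overlap rule is conservative: overlapping intervals are read as no evidence of a moderating effect, which matches the decision rule stated above.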
We calculated the item averages related to the CROWD scale for each respondent. As agreement with these items indicates a stronger perception of crowding, according to the method by Li et al. (2017), mean scores above 3 Likert points indicated high perceptions, whereas those equal to or less than 3 Likert points indicated low perceptions. We used this criterion to arrange the respondents into such groups.
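The grouping rule described above can be written as a short function. The function name `crowding_group` and the example score vectors are hypothetical; only the cut-off of 3 Likert points and the treatment of a mean exactly equal to 3 as "low" come from the text.

```python
from statistics import mean
from typing import Sequence

def crowding_group(crowd_items: Sequence[int], cutoff: float = 3.0) -> str:
    """Assign a respondent to the 'high' or 'low' perceived crowding group.

    A mean CROWD score strictly above the cutoff is 'high';
    a mean equal to or below it is 'low'.
    """
    return "high" if mean(crowd_items) > cutoff else "low"

# Hypothetical answers to the eight CROWD items.
respondent_a = [4, 4, 5, 3, 4, 4, 5, 4]  # mean above 3
respondent_b = [2, 3, 3, 4, 3, 3, 2, 4]  # mean exactly 3
group_a = crowding_group(respondent_a)
group_b = crowding_group(respondent_b)
```

Placing the boundary case in the low group follows the stated rule that scores equal to or less than 3 indicate low perceptions.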

Descriptive analysis
Of the 417 respondents, 56.4% were male and 18.5% were aged 25 to 29 years old. Regarding other factors, a significant proportion of respondents had not completed high school (33.6%); nearly half (49.6%) gave their religion as Catholic; and 48.2% stated they were married.
The majority of the respondents' families comprised more than three people (55.2%). Regarding income, 40% said they receive less than 1 minimum wage; 34.3% exactly 1 minimum wage; and 16.3% between 1 and 2 minimum wages per month. It is important to note that 8.4% of the respondents said they had no monthly income and none of the participants reported a personal income above three minimum wages. Such data reiterate the economic definition we have adopted in this study to characterize low-income markets.

Measurement model analysis
Once the path diagram was built, we conducted the CFA in order to assess construct loadings. Loadings between 0.5 and 0.7 are satisfactory (Hair et al., 2014b). Table 2 shows the values obtained for each construct along with their mean and standard deviations (SD). According to the table, all factor loadings are satisfactory. Item C5 (0.581) has the lowest loading. As its value is above the lower limit discussed in the literature, we elected to keep it in the model.
Once this stage was completed, we assessed the multicollinearity level of the indicators by using the VIF, whose values must be less than 5.0 (Hair et al., 2011).It should be noted that the indicators referring to PHQ, which is a second-order construct, appear twice, as recommended by Hair et al. (2014b).All the values are satisfactory, as shown in Table 3.The next step consisted of analyzing the quality measurement indicators: AVE and CR.The AVE values must be equal to 0.5 or higher (Hair et al., 2014b) whereas the CR values must be greater than 0.6 (Sarstedt et al., 2014).Table 4 shows the indicators for each model construct.All the construct CR values are satisfactory.This was also the case for the AVE values, except those referring to the R (0.467) and A (0.463) constructs, which are slightly less than 0.5.However, according to Fornell & Larcker (1981), if the AVE value is slightly less than 0.6, but the CR value is greater than 0.6, the convergent validity of a construct may still be considered adequate.Therefore, we have verified the convergent validity of the model.
The final step of this analysis was to assess the discriminant validity of the model, which was carried out using the Fornell-Larcker criterion. Discriminant validity is established when the square root of the AVE of each construct is greater than its correlations with the other constructs. Table 5 presents the matrix of values for this criterion. According to the table, the square roots of the AVE of all latent constructs exceed the corresponding correlations. Thus, we have verified the discriminant validity of the model. For all the stages of the Measurement Model Analysis, we obtained satisfactory results within the parameters discussed in the literature. Therefore, the Structural Model Analysis can be carried out.
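The Fornell-Larcker check described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the construct labels, AVE values and correlations are hypothetical and do not come from Table 5.

```python
import math


def fornell_larcker_ok(ave_by_construct, correlations):
    """True when sqrt(AVE) of every construct exceeds its correlations with the others."""
    for (first, second), r in correlations.items():
        if math.sqrt(ave_by_construct[first]) <= abs(r):
            return False
        if math.sqrt(ave_by_construct[second]) <= abs(r):
            return False
    return True


# Hypothetical AVE values and inter-construct correlations (illustration only).
ave_by_construct = {"X": 0.55, "Y": 0.50, "Z": 0.60}
correlations = {("X", "Y"): 0.61, ("X", "Z"): 0.48, ("Y", "Z"): 0.52}

passes = fornell_larcker_ok(ave_by_construct, correlations)

# Raising one correlation above sqrt(0.50) makes construct Y fail the criterion.
fails = fornell_larcker_ok(ave_by_construct, {**correlations, ("X", "Y"): 0.72})
print(passes, fails)
```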

Structural model analysis
In the first step of this analysis, we assessed the significance of the relationships between the constructs in the structural model. Hair et al. (2014b) suggest that a hypothesis is accepted as statistically significant when the path coefficient is greater than 0.1, the t-value is greater than 1.96 and the p-value is less than 0.05. Table 6 presents the values of the path coefficients, t-value and p-value for each relationship in the model. Among the hypotheses established, the strongest relationship in the model was found between PHQ and PS, with a path coefficient equal to 0.663. T and R have the greatest influence on PHQ, with path coefficients of 0.345 and 0.319, respectively. The statistical data also show that E has the least influence on PHQ, with a coefficient of 0.215.
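The decision rule above can be illustrated with a short sketch of how a t-value and a two-tailed p-value are commonly derived from bootstrap estimates of a path coefficient. It assumes a normal approximation for the p-value, and the bootstrap distribution is simulated; none of the numbers are results from Table 6, and the function name is hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def bootstrap_significance(original_estimate, bootstrap_estimates):
    """Return (t_value, p_value) using the bootstrap standard error."""
    standard_error = statistics.stdev(bootstrap_estimates)
    t_value = original_estimate / standard_error
    # Two-tailed p-value under a normal approximation (an assumption of this sketch).
    p_value = 2 * (1 - NormalDist().cdf(abs(t_value)))
    return t_value, p_value


random.seed(0)
# Simulated bootstrap estimates for a hypothetical path coefficient of 0.30.
bootstrap_estimates = [random.gauss(0.30, 0.05) for _ in range(5000)]

t_value, p_value = bootstrap_significance(0.30, bootstrap_estimates)

# Thresholds quoted in the text: coefficient > 0.1, t > 1.96, p < 0.05.
significant = 0.30 > 0.1 and abs(t_value) > 1.96 and p_value < 0.05
print(round(t_value, 2), significant)
```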
The second step was the assessment of the R² of the structural model. Values above 0.26 are satisfactory (Sarstedt et al., 2014). We obtained an R² equal to 0.439. Finally, we used the Q² to assess the predictive capacity of each inner construct. Values above zero are acceptable (Henseler et al., 2009). The Q² is 0.274 for PHQ and 0.271 for PS.
Figure 3 illustrates the validated research model with its path coefficients.

Moderator variable analysis
The last stage of this analysis was an assessment of the moderating effect of Perceived Crowding on the relationships between the dimensions of quality and PHQ and between PHQ and PS. We divided the respondents into two groups according to their perception of crowding: low (48.9%) and high (51.1%) perceived crowding. In order to ensure that the measurement model remains valid even when applied to the groups, we analyzed it for each of the groups separately.
In doing so, problems such as negative indicator loadings, low internal consistency and poor discriminant validity were found, mainly for the group with lower crowding levels. Respondents with considerable differences in their responses to indicators of the same construct were examined to eliminate potential outliers, since, in theory, the values of reflective indicators should converge. The literature indicates that removing outliers can improve PLS-SEM results (Leguina, 2015). Thus, respondents who had a standard deviation greater than 1 in any of the constructs were eliminated from the sample (Mashhadlou & Izadpanah, 2021), resulting in a new sample size of n = 284. This new sample was used exclusively for the MGA, and the analysis of the general model was based on the complete sample. The results of the group measurement models improved significantly with this approach.
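The screening rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the sample standard deviation (the text does not say whether the sample or population formula was applied), and the construct names, item identifiers and responses are hypothetical.

```python
import statistics


def is_inconsistent(responses, constructs, threshold=1.0):
    """True when the respondent's answers within any construct have SD above the threshold."""
    for items in constructs.values():
        scores = [responses[item] for item in items]
        # Sample standard deviation; an assumption of this sketch.
        if len(scores) > 1 and statistics.stdev(scores) > threshold:
            return True
    return False


# Hypothetical constructs and item identifiers (illustration only).
constructs = {"R": ["R1", "R2", "R3"], "T": ["T1", "T2", "T3"]}

# Two hypothetical respondents answering on a 1-5 scale.
respondents = {
    "a": {"R1": 4, "R2": 4, "R3": 5, "T1": 3, "T2": 4, "T3": 4},
    "b": {"R1": 1, "R2": 5, "R3": 3, "T1": 4, "T2": 4, "T3": 4},
}

kept = [rid for rid, answers in respondents.items()
        if not is_inconsistent(answers, constructs)]
print(kept)
```

Respondent "b" is dropped because the answers to the hypothetical R items (1, 5, 3) have a standard deviation of 2, which is above the threshold of 1.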
For the construction of groups for the analysis of crowding moderation, we opted for the approach known as Extreme Groups Analysis (EGA), which allows greater levels of power in hypothesis tests (Preacher et al., 2005). This technique has been employed in recent behavioral research (Emerson et al., 2022; Zekan & Mazanec, 2022; Murphy & Creux, 2021). To compose the new sample, the 25% of respondents with the lowest average crowding and the 25% with the highest crowding levels were selected, thus obtaining subgroups of similar size. Consequently, the 50% of the sample with intermediate crowding characteristics was disregarded in order to accentuate potential effects by comparing groups with more pronounced differences.
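The group construction described above can be sketched as a simple quartile split. This is an illustration only: the respondent identifiers and mean crowding scores are hypothetical, and ties at the cut-off are resolved by sort order, which is an assumption of this sketch.

```python
def extreme_groups(mean_scores, fraction=0.25):
    """Return (low_ids, high_ids): the bottom and top fraction of respondents by score."""
    ranked = sorted(mean_scores, key=mean_scores.get)
    group_size = int(len(ranked) * fraction)
    if group_size == 0:
        return [], []
    return ranked[:group_size], ranked[-group_size:]


# Hypothetical mean perceived-crowding scores per respondent (illustration only).
mean_scores = {
    "p1": 1.2, "p2": 4.8, "p3": 3.0, "p4": 2.1,
    "p5": 4.1, "p6": 1.9, "p7": 3.6, "p8": 2.7,
}

low_group, high_group = extreme_groups(mean_scores)
print(low_group, high_group)
```

The middle half of the respondents is left out of both groups, mirroring the 50% of the sample with intermediate crowding that was disregarded.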
The sample size of each group (n = 74) satisfies the 10-times rule, according to which the sample must contain at least 10 times the number of indicators in the formative construct with the largest number of variables. In this case, as the PHQ was formed by 5 other constructs, the minimum size required would be 50 respondents. Measurement models for low and high perceived crowding are found in Appendix A and Appendix B of this study, respectively, and demonstrate acceptable levels in all validity and reliability measures.
Once the model for each group was validated, the MGA was performed. Table 7 presents the bootstrapping results and path coefficients for each group. The results show that all t-values are greater than 1.96 and all p-values are less than 0.05; therefore, the path coefficients for each group in the MGA are statistically significant and satisfactory for the relations. Figure 4 features a comparison between the path coefficients found within each group. Table 8 features the moderator variable analysis for both groups by using the bias-corrected confidence interval.

Discussion
After analyzing both the Measurement and Structural models, along with the Determination Coefficient (R²) and the Validated Stone-Geisser Redundancy Measure (Q²), we can state that the research model as a whole has good predictive power.
Taking into account the statistical data, we can see that the first set of hypotheses (H1) was supported. Thus, each dimension of quality, namely Reliability (H1a), Assurance (H1b), Tangibles (H1c), Empathy (H1d) and Responsiveness (H1e), has a positive effect on Perceived Healthcare Quality (PHQ). Although these results were expected, they demonstrate the validity and reliability of adapting and applying the SERVPERF scale for studies regarding quality in healthcare, especially in the Brazilian context, where development of these studies is still in progress, as discussed by Massuda et al. (2018).
Moreover, all of the constructs have high values for their path coefficients. Kim (2019) states that an assessment of quality in healthcare is a complex process, influenced by patient actions, emotions and expectations. Therefore, studies conducted in similar settings and with similar subjects may present different outcomes. In our model, low-income patients perceived Tangibles to have the greatest influence on PHQ, followed by Reliability, Assurance, Responsiveness and Empathy. In contrast with our results, Tangibles were the least perceived construct of quality in the studies of low-income patients by Narang (2011) and Akdere et al. (2020), in the Indian and the Turkish public healthcare context, respectively.
Furthermore, the fact that Responsiveness and Empathy have the lowest values for the path coefficients indicates that these dimensions must be a priority for BHU management, if the overall quality of healthcare services is to be improved. According to Verleye et al. (2017), public healthcare managers must take a proactive role when it comes to managing quality, especially concerning low-income patients.
According to Badri et al. (2009), patient satisfaction is defined by the assessment of different dimensions of healthcare services, in which the cure is the expected main goal. Moliner (2009) states that in the case of low-income patients this assessment is mainly based on their perceptions, as they experience health services on a more emotional level. The statistics show that the path coefficient between PHQ and PS has the highest value in our model. This result, associated with the t- and p-values, supports the second hypothesis (H2) of our study, according to which perceived healthcare quality has a positive effect on patient satisfaction.
Thus, PHQ is an antecedent of PS, i.e., the better the service quality, the more satisfied a patient will be, and vice versa. Moreover, this relationship must be constantly assessed and improved in order to raise the excellence of the service by creating value for a patient at the moment it is provided.
As discussed by Dion (2004), Perceived Crowding (CROWD) is the emotional reaction caused by stress in a place with a high concentration of people, which might cause negative emotions. Widely studied in the retail sector, crowding has become a common concern among researchers and health professionals who have analyzed it through time studies and optimal resource allocation (van der Linden et al., 2019; Wang et al., 2017; Boyle et al., 2015).
In order to approach this topic from a new perspective, we tested CROWD as a moderating variable between the constructs in our model by using two groups (low and high perceived crowding), formed according to patient perception of crowding. After analysing the statistical data of a subsample of our data, we concluded that this variable has no moderating effect on any relationship in the model. Hence, our third hypothesis (H3), which states that the relationship between the dimensions of quality and perceived healthcare quality is significantly more negative when perceived crowding is high, is not supported, as CROWD moderates none of the relationships between the dimensions of quality and PHQ.
This result can be explained based on the inherent characteristics of the low-income communities, in particular, on their perception of healthcare provision in public establishments.
According to Bhattacharjee et al. (2017), the perception of these services is so negative that it leads low-income patients to shy away from receiving any kind of treatment. Moreover, Archibong et al. (2020) state that low-income people perceive public healthcare services as being of poor quality and as being provided in a crowded environment in which long queues are usual and there is little sympathy between doctors and patients.
Thus, as the patients already expected to receive poor-quality care in an unattractive environment, their perception of crowding, whether high or low, did not influence the way they perceive the different dimensions of healthcare quality.
Similarly, we could also conclude that the fourth hypothesis (H4) of our study, according to which the relationship between perceived healthcare quality and patient satisfaction is significantly more negative when perceived crowding is high, is not supported either. As discussed by Naidu (2009), patients tend to weigh their satisfaction with healthcare services based on their previous experiences. Hence, we can explain this result by the stigma of poor service that surrounds public healthcare and its impact on patient satisfaction. As the low-income patients' previous experience with public healthcare delivery has likely been negative and left them unsatisfied, they already expect not to feel satisfied when coming back to receive treatment, and their perception of crowding has no influence in this regard.
Nevertheless, we would like to stress that, even though neither H3 nor H4 was supported, it is important to assess crowding in healthcare, especially in an environment as complex and dynamic as the Brazilian public healthcare context, and even more so from the perspective of low-income patients.

Final remarks
The main objective of this paper was to assess relationships between the perceived quality and satisfaction of low-income patients towards the Brazilian Unified Health System under the moderating effect of perceived crowding. After analyzing the statistical data, we could see that all quality dimensions have positive and significant relationships with PHQ, which, in turn, has a significant effect on PS.
However, the results did not support the hypothesis that CROWD has a moderating effect on the relationships between the constructs. Thus, the results were unable to show that patients tend to perceive lower service quality, or to be less satisfied with it, when the perception of crowding is high.
Our findings enhance academic understanding of the factors regarding quality in public healthcare from the perception of low-income patients. Moreover, we have demonstrated the validity and reliability of adopting and applying the SERVPERF scale for studies regarding quality in the context of Brazilian healthcare and, thus, we would encourage researchers and health professionals to develop new studies on the subject in the country.
Although the effect of crowding is widely studied in retailing, its study in healthcare is still a 'work in progress' and mainly focuses on measuring the duration of the internal processes of service providers and suggesting ways to reduce it. Our study differs from previous work by developing a research model in which perceived crowding is tested as a moderating variable in a specific population segment, that is, people on low incomes, being one of the first to evaluate the theme.
Even though moderation did not occur, our model is a contribution to the field of MGA, since it serves as a window to understanding how CROWD affects both PHQ and PS in public healthcare and deepens the discussion on how people on low incomes perceive crowded environments outside of retailing.
In terms of managerial contributions, our study highlights the dimensions of quality which have the lowest performance, thus helping managers to understand the factors which directly impact PHQ and, consequently, PS. As public healthcare establishments regularly suffer from a lack of resources, policy makers can use our results to direct efforts and resources, both human and financial, assertively, re-establishing quality dimensions through improvement projects. Also, understanding crowding in healthcare environments can help devise alternatives to minimize it, such as the development of a scheduling system and the adoption of a standardized color scale which identifies and prioritizes each patient's urgency of care, thereby reducing queue waiting times.
Finally, the limitations of our study and avenues for future research should be acknowledged. Although we collected data in loco, our sample is non-probabilistic and was obtained by convenience; thus, one has to be careful when generalizing the results obtained, given that they only reflect the perception of a single low-income population regarding healthcare and not the view of every single person of this segment. The size of the public healthcare establishment and the data collection period are other limitations of our study, as we restricted our analysis to a single BHU over the period of a month.
We recommend, therefore, that future studies include more than one BHU in their scopes and extend the data collection period to a couple of months. Additionally, our study could be replicated in larger public healthcare establishments, such as hospitals located in large urban centers, which serve low-income communities. Further studies should also continue to explore possible differences in the relationships between groups of patients divided by perceived crowding. Furthermore, they should test new moderator variables in the research model, such as age and gender, in order to assess significant differences between them.
Considering that our study was conducted prior to the COVID-19 outbreak, we suggest that our work be repeated within the present context, in order to test whether there are indeed differences in the proposed relationships of the model. The situation experienced may have changed people's level of demand for quality and health services, taking into account other variables that may be affecting this relationship more, such as waiting lists, the type of disease, or even the fact of being attended by a doctor 'face to face'. Finally, the pandemic context has made understanding how crowding affects healthcare settings and how low-income patients perceive it even more important.

Figure 2. Proposed research model with crowding as a moderator variable.

Table 1. Original structures of the questionnaires.

Table 2. Factor loading, mean and standard deviations. As the CROWD items are not part of the measurement model, we have omitted their loadings.

Table 4. AVE and CR for each construct.

Table 6. Path coefficients and their significance.
Note: b Significant at 0.05 level based on 5000 bootstraps. Source: The authors, 2020.

Table 7. Bootstrapping results for MGA.