EMPIRICAL COMPARISON OF THE MULTIDIMENSIONAL MODELS OF ITEM RESPONSE THEORY IN E-COMMERCE

The measurement of latent traits in the organizational field, such as quality, effectiveness and learning, has been conducted in several formats using a wide variety of quantitative methods, including Item Response Theory, whose use in organizational studies has increased consistently. The purpose of this article is to compare the hierarchical and non-hierarchical structures of three multidimensional models of Item Response Theory, based on the measurement of interface quality in e-commerce sites. We compared the multiple unidimensional, compensatory multidimensional and bifactorial models, and developed and applied 75 items to a sample of 441 e-commerce websites. We then discuss the latent construct, quality in e-commerce, and its multidimensional configuration in order to fit and compare the three multidimensional models.


INTRODUCTION
The quality of websites is a complex matter and is sometimes difficult to measure directly. Multiple features should be considered from various angles that go beyond the technical question of an easy-to-use interface, including issues related to aesthetics, reliability, interactivity and other non-technical factors. According to Nielsen (2007), technical issues such as usability continue to be a necessary, but not a sufficient, condition for the quality demanded of a website in all contexts.

Multidimensional Item Response Theory
The mathematical foundation of IRT is a function that relates the probability of a person responding to an item in a specific manner to the standing of that person on the trait that the item is measuring (Ostini & Nering, 2006). One of the underlying assumptions of IRT is that examinees all use the same skill, or the same composite of multiple skills, to respond to each of the items in a test. When item response data do not satisfy the unidimensionality assumption, Multidimensional Item Response Theory (MIRT) should be used to model the item-examinee interaction. MIRT enables modeling the interaction between items that are capable of discriminating between levels of several different abilities and examinees that vary in their proficiencies in these abilities (Ackerman, 1994; Reckase, 2009).
The use of MIRT models to handle measurement problems in large-scale educational evaluation dates from the early 1990s (Ackerman, 1992; Camilli, 1992; Embretson, 1991; Glas, 1992; Oshima & Miller, 1992; Reckase & McKinley, 1991). Nevertheless, according to Adams et al. (1997), Hartig & Höhler (2008) and Rauch & Hartig (2010), the application of models in a practical test outside of the educational (e.g., Levy, 2011) Soares (2005) implemented it in a socio-economic status index, demonstrating the viability of this tool as well as its potential in this field.
Depending on the final objectives and the structure of the data, MIRT can be considered a special case of multivariate statistical analysis, especially factor analysis or structural equation modeling, or an extension of unidimensional IRT (Reckase, 2009).
Before presenting the MIRT models, it is important to understand the multidimensional structures involved in a test: first, the relationship between the latent dimensions and the items, and then the relationship between the latent dimensions and the respondents. As stated by Adams et al. (1997) and Hartig & Hohler (2009), the pattern of relations between the dimensions and the items can be defined by a loading matrix with a simple structure (multidimensionality between items) or by a complex loading structure (multidimensionality within each item), and therefore varies in complexity. Meanwhile, the relationship between the latent dimensions and the ability of the respondent is either compensatory or non-compensatory.
Hartig & Hohler (2009) argue that the advantage of models with multidimensionality between items is that they are less complex than models with multidimensionality within each item, and that the latent trait can be easily interpreted. In these models, the estimated latent trait scores provide a simple performance measure for a specific set of items. In many cases these measures will be highly correlated, because the items measure the same set of abilities. Nevertheless, the latent dimensions in the between-item multidimensional model represent the necessary combination of all the abilities required to solve the respective items, regardless of how these abilities need to be integrated. Any overlap is represented in the latent correlations. Hence, if the main interest of the study is to obtain descriptive measures of performance in given content areas, between-item models are more suitable than more complex models such as those with multidimensionality within each item (Hartig & Hohler, 2009).
The distinction between the multidimensionality within each item and between items is illustrated in Figure 1.

[Figure 1: panels "Multidimensionality between items" and "Multidimensionality within each item"]
Although the type of dimensionality is a basic concept, the distinction between multidimensionality between items and within each item is vital to the correct identification of MIRT models (Babcock, 2009).
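The distinction between the two structures can be read directly off a loading matrix. The following is a minimal sketch, assuming a matrix with items in rows and dimensions in columns; the function name `loading_structure` and the example loadings are hypothetical and are not taken from the study.

```python
def loading_structure(loadings, tol=1e-12):
    """Classify a loading matrix (rows = items, columns = dimensions).

    Returns "between-item" when every item loads on at most one dimension
    (simple structure) and "within-item" when at least one item loads on
    two or more dimensions (complex structure).
    """
    for row in loadings:
        nonzero = sum(1 for a in row if abs(a) > tol)
        if nonzero > 1:
            return "within-item"
    return "between-item"


# Hypothetical loadings for four items and two dimensions.
simple = [[1.2, 0.0], [0.8, 0.0], [0.0, 1.1], [0.0, 0.9]]
complex_ = [[1.2, 0.4], [0.8, 0.0], [0.3, 1.1], [0.0, 0.9]]

print(loading_structure(simple))    # between-item
print(loading_structure(complex_))  # within-item
```

In estimated (rather than constrained) loading matrices, exact zeros are rare, so a larger `tol` would probably be needed in practice.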
From a practical point of view, traditional IRT models assume that the data can be explained by a single latent trait. However, this restriction makes these models inappropriate for data with a multidimensional structure, as shown in Figure 1 (Hambleton & Swaminathan, 1985). Min (2003) summarizes three conditions under which the application of the unidimensional model is suitable: (a) the examinees' ability varies along one dimension, as the model presumes; (b) the examinees' ability varies along only one dimension even if the items measure more than one skill; (c) the examinees' ability differs along multiple dimensions, but all items measure the same composite of skills. According to Lin (2008), there are other conditions that cannot be classified within these three, and in which applying unidimensional models can be problematic. Studies have shown that when multidimensional data are modeled under the unidimensional assumption, measurement error increases and the consequences for the results are problematic (Ansley & Forsyth, 1985; Sireci et al., 1991; Ackerman, 1994; Reckase, 1995).
Before deciding which multidimensional IRT structure is the most appropriate, it is necessary to assess the dimensionality of the data set. Methods available for this purpose include restricted information and full information methods. Soares (2005) states that the restricted information method consists of inspecting the eigenvalues of the tetrachoric correlation matrix. The full information factor analysis method, instead of using the eigenvalues of the tetrachoric correlation matrix, builds a multidimensional model using the normal ogive curve (Bock & Aitkin, 1981; Bock et al., 1988). It is therefore an adaptation of the traditional factor analysis model that takes the dimensional structure associated with a set of continuous variables and applies it to dichotomous data, as in the following equation: and $e \sim N(0, \psi)$ with $\psi$ diagonal, it is then that: where $\sigma_{e_i}$ is the standard deviation of $e_i$.
The structure of the model can be seen as parameterizing (6) in the following: where $b_i$ is the difficulty of the item and $a_{ij}$ is the specific discrimination on each dimension, serving as one of the bases for the estimation methods of multidimensional IRT models. Reckase (1985) describes the compensatory multidimensional two-parameter model in the following manner:

$$P(U_{ij} = 1 \mid \boldsymbol{\theta}_j, \mathbf{a}_i, d_i) = \frac{e^{\mathbf{a}_i \boldsymbol{\theta}_j' + d_i}}{1 + e^{\mathbf{a}_i \boldsymbol{\theta}_j' + d_i}} \qquad (1)$$

where $U_{ij}$ is the response of individual (or website) $j$ to item $i$ (0 or 1); $\mathbf{a}_i$ is the vector of discrimination parameters of item $i$, with element $a_{ik}$ for dimension $k$; $\boldsymbol{\theta}_j$ is the vector of latent traits of individual (or website) $j$, with element $\theta_{jk}$ for dimension $k$; and $d_i$ is a scalar parameter related to the difficulty of item $i$.
The exponent of $e$ in model (1) can be expressed in the following manner:

$$\mathbf{a}_i \boldsymbol{\theta}_j' + d_i = \sum_{k=1}^{m} a_{ik}\theta_{jk} + d_i \qquad (2)$$

where $m$ is the number of dimensions. Equation (2) shows that the exponent is a linear function of the elements of $\boldsymbol{\theta}$, with the parameter $d$ as the intercept and the elements of the vector $\mathbf{a}$ as the slope (discrimination) parameters. One property of this model is that the expression in the exponent defines lines of equiprobability in the $m$-dimensional latent space: infinitely many linear combinations of the elements of $\boldsymbol{\theta}$ result in the same exponent and therefore in the same probability of a positive response. This property is what gives the model its compensatory character.
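The compensatory property can be illustrated numerically. The sketch below assumes the logistic form of Reckase's model as described in the text; the function name `m2pl_probability` and all parameter values are hypothetical, chosen only so that two different trait profiles give the same exponent.

```python
import math


def m2pl_probability(a, theta, d):
    """Probability of a positive response under the compensatory
    two-parameter multidimensional logistic model.

    a     -- discrimination parameters of the item, one per dimension
    theta -- latent traits of the respondent (or website), one per dimension
    d     -- scalar intercept related to item difficulty
    """
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))  # same as exp(z) / (1 + exp(z))


# Hypothetical item with two dimensions.
a = [1.0, 0.5]
d = -0.25

# Two different trait profiles that give the same exponent (z = 0.25):
# a high level on dimension 1 compensates for a low level on dimension 2.
p_high_low = m2pl_probability(a, [1.0, -1.0], d)
p_low_high = m2pl_probability(a, [0.0, 1.0], d)

print(abs(p_high_low - p_low_high) < 1e-12)  # True
```

Every profile on the line $1.0\,\theta_1 + 0.5\,\theta_2 = 0.5$ would give this same probability for this hypothetical item.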
Gibbons & Hedeker (1992) built on the classic work of Holzinger & Swineford (1937) to propose the Full-information Bifactorial (FI Bifactorial) model for dichotomous data (Li & Rupp, 2011). This model consists of a general factor and group factors, or independent dimensions. The FI Bifactorial model assumes a general factor that involves all the items and two or more group factors (or dimensions) corresponding to specific subgroups of items (Gibbons et al., 2007).
Mathematically, the FI Bifactorial model considers cases in which, for $n$ items, there is a solution with $s$ factors, of which one is a general factor and the remaining $s - 1$ are group factors. The bifactorial solution restricts each item $i$ to a nonzero loading $a_{i1}$ on the primary dimension and a second loading ($a_{ik}$, $k = 2, \ldots, s$) on not more than one of the $s - 1$ group factors. For four hypothetical items, the bifactorial loading pattern can be represented as follows (Gibbons & Hedeker, 1992):

$$\begin{bmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & 0 & a_{33} \\ a_{41} & 0 & a_{43} \end{bmatrix} \qquad (3)$$

where the first column of the matrix represents the main factor and the second and third columns represent the specific group factors.
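The bifactorial restriction described in the text can be checked mechanically. This is a minimal sketch, assuming the general factor is stored in the first column of the loading matrix; the function name `is_bifactor_pattern` and the numeric loadings are hypothetical.

```python
def is_bifactor_pattern(loadings, tol=1e-12):
    """Check the bifactorial restriction on a loading matrix.

    Each row (item) must have a nonzero loading in the first column
    (general factor) and a nonzero loading in at most one of the
    remaining columns (group factors).
    """
    for row in loadings:
        general, groups = row[0], row[1:]
        if abs(general) <= tol:
            return False
        if sum(1 for a in groups if abs(a) > tol) > 1:
            return False
    return True


# Four hypothetical items, one general factor and two group factors.
bifactor = [
    [0.9, 0.5, 0.0],
    [0.7, 0.4, 0.0],
    [0.8, 0.0, 0.6],
    [0.6, 0.0, 0.3],
]
# The first item loads on both group factors, which violates the restriction.
not_bifactor = [
    [0.9, 0.5, 0.2],
    [0.7, 0.4, 0.0],
]

print(is_bifactor_pattern(bifactor))      # True
print(is_bifactor_pattern(not_bifactor))  # False
```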
As specified by Seo (2011), the dimensional structure in a bifactorial model is predetermined from prior information. Therefore, the bifactorial model is a confirmatory model. In a confirmatory approach, the model allows each item to load on a single general factor and on only one specific group factor. This feature reduces the number of parameters to be estimated and gives the model more degrees of freedom. In addition, the bifactorial model avoids the problem of estimating inter-factor correlations, because the general factor contributes directly to all items, and the secondary factors, which account for the residual information remaining after the general factor is extracted, are independent of each other. A particular feature of this model is that the secondary factors are necessarily orthogonal to each other and to the general factor (Gibbons & Hedeker, 1992).
For binary data, the bifactorial model can be defined as a special case of the compensatory multidimensional model presented in equation (1). In the bifactorial model, the restriction on the loading of the discrimination parameters is inserted as can be seen in equation (4):

$$P(U_{ij} = 1 \mid \theta_{jg}, \theta_{j\,es_k}) = \frac{e^{a_{ig}\theta_{jg} + a_{i\,es_k}\theta_{j\,es_k} + d_i}}{1 + e^{a_{ig}\theta_{jg} + a_{i\,es_k}\theta_{j\,es_k} + d_i}} \qquad (4)$$

where $\theta_{jg}$ is the ability of individual $j$ on the general factor, $\theta_{j\,es_k}$ is the ability of individual $j$ on the specific factor $k$, $a_{ig}$ is the discrimination parameter of item $i$ on the general factor, $a_{i\,es_k}$ is the discrimination parameter of item $i$ on the specific factor $k$ and, finally, $d_i$ is the scalar parameter related to the difficulty of item $i$, referring to the general dimension and to the specific dimension $k$. In this model, as in the compensatory multidimensional model represented in equation (1), the responses are assumed to be statistically independent.
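The statement that the bifactorial model is a special case of the compensatory model can be verified with a small numeric sketch. It assumes the logistic form of both models; the function names and parameter values below are hypothetical and serve only to show that zeroing the loadings on the non-relevant group factors reproduces the bifactorial probability.

```python
import math


def compensatory_probability(a, theta, d):
    """Compensatory multidimensional two-parameter logistic probability."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def bifactor_probability(a_general, a_specific, theta_general, theta_specific, d):
    """Bifactorial probability: one general and one specific factor per item."""
    z = a_general * theta_general + a_specific * theta_specific + d
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical item that belongs to the second of three group factors.
a_g, a_s, d = 1.1, 0.7, -0.4
theta_g = 0.5
theta_groups = [0.9, -0.3, 1.4]

p_bifactor = bifactor_probability(a_g, a_s, theta_g, theta_groups[1], d)

# Same item written as a compensatory model with zero loadings
# on the group factors it does not belong to.
a_full = [a_g, 0.0, a_s, 0.0]
theta_full = [theta_g] + theta_groups
p_compensatory = compensatory_probability(a_full, theta_full, d)

print(abs(p_bifactor - p_compensatory) < 1e-12)  # True
```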
Gibbons et al. (2007) argue that the FI Bifactorial model is relevant whenever the items share a common characteristic. The presence of subgroups of items typically introduces associations relevant to the test that cannot be captured when the loadings are attributed entirely to the general factor. In addition, according to the authors, this separation of factors reduces the error of the estimates.
Gibbons & Hedeker (1992) and Gibbons et al. (2009) argue that the restrictions of the bifactorial model presented in matrix (3) greatly simplify the likelihood equations, because they require the evaluation of only two-dimensional integrals. This (a) allows models with a larger number of factors (or dimensions) to be analyzed, (b) allows for conditional dependence within identified subgroups of items, and (c) in many cases provides a more parsimonious solution than unrestricted full information item factor analysis. Gibbons et al. (2007) extended the bifactorial model to polytomous item response models.
Figure 2 contextualizes the FI Bifactorial model within some multidimensional structures.
Model A is the standard unidimensional model, in which the covariance between the responses to the items is explained by a common factor. In model B, the data matrix contains more than one common dimension, although the dimensions are not correlated. This is a trivial case of multidimensionality that is easy to handle by forming subscales and then fitting a separate unidimensional IRT model to each subscale. This is essentially equivalent to assuming that the dimensions are not correlated. Model C also has more than one common factor among the items, but the factors are correlated. This representation is characterized as a non-hierarchical multidimensional model.
Finally, model D is a bifactorial model: there is a general factor, which explains the correlations between items, and in addition there are so-called "group" factors (on the right side of the figure), which capture the covariance among items that is independent of the covariance due to the general factor. In terms of measuring the quality of websites, one can suppose that the latent trait quality, given its conceptual breadth, represents a general factor explained by other factors (for example, usability, aesthetics, information architecture, etc.), which would suit model D (hierarchical multidimensional). Alternatively, the breadth of this concept can lead to its dissolution into correlated subfactors, as in model C (non-hierarchical multidimensional).

Quality on the web
Quality is not a new concept in the management of information systems. Researchers and professionals show that they are aware of the need to improve information systems in order to react to external and internal pressure and to face the critical challenges to their growth and survival (Aladwani, 1999; Aladwani & Palvia, 2002).
From the early 1980s until the late 1990s, various studies tried to conceptualize quality in information systems, demonstrating the concern among professionals and academics to understand and improve these systems. Based on this literature, the concept of website quality adopted here is that of a set of technical and non-technical characteristics that allow users to accomplish their objectives on a website in an accessible, efficient and pleasant manner. Technical characteristics are understood as usability/navigability, presentation of information, and the accessibility and interactivity of the system (the focus of this study). Non-technical characteristics are understood as design, aesthetics, visual and commercial appeal, reliability, hedonism and empathy.

METHOD
The methodological procedure used in this work involved the characterization of the study, the preparation of the items, data collection and the statistical methodology.
In terms of characterization, the study is predominantly quantitative, although it has a qualitative exploratory base whose objective was to understand the field of study of quality in commercial websites and to serve as a basis for the elaboration of the items.

Instrument testing
The questionnaire (checklist) has 75 items linked to the quality of websites. The items were constructed by associating concepts drawn from the analysis of 191 articles collected in a systematic literature review. For example, some of the recurring concepts were "information content" associated with "credibility", "accuracy", "completeness" and "utility", as reinforced by Kim et al. (2005). These concept associations support the following construction: "Is there basic product information?" (information content + utility + credibility). The Appendix shows each item with its reference. The content of the items was validated by three expert judges in the area. Although these are objective-response items, independent of user perception, they were based on previous studies that used tests with users and/or satisfaction studies.

Data collection
The sample was defined by intentional sampling, in order to draw a sample of low-, medium- and high-quality commercial sites used by the Brazilian population. Accordingly, in addition to sites covering the most diverse types of products, there was a variety of design styles, usability, aesthetics and layout, ranging from relatively basic to highly elaborate. This does not necessarily imply high or low quality; it only guarantees diversity, a precondition of Item Response Theory. There is no consensus on the optimal sample size for the use of Item Response Theory (Downing, 2003; Wongtada & Rice, 2008). The data collection was conducted on 441 Brazilian commercial websites. Of the 75 items, 56 were collected manually and the other 19 were collected automatically using the Achecker tool (http://achecker.ca/).

Analysis and discussion
The statistical analysis first evaluated the number of dimensions of the set of items, then verified the quality of each item, validated the dimensions, and verified the suitability of three Item Response Theory models: multiple unidimensional, compensatory multidimensional and bifactorial. The dimensional analysis of the original data set (75 items) was carried out with the restricted information factor analysis method and the full information factor analysis method. In the first method, the number of dimensions was assessed from a tetrachoric correlation matrix and parallel analysis, using the psych package (Revelle, 2012) implemented in the R software (R Core Team, 2012). Because the responses are dichotomous, the dimensionality of the total set was also verified through the full information method. The approach used is described by Bock & Aitkin (1981) and Bock et al. (1988), in which the treatment of dichotomous items and the estimation of the factor loadings are achieved by the technique called full information factor analysis, implemented in the mirt package (Chalmers, 2012) for the R software (R Core Team, 2012) and in the flexMIRT software (Cai, 2012). In this method, the number of dimensions was evaluated based on two information criteria, the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Akaike Information Criterion (AIC) (Akaike, 1973). The use of this method for determining the number of dimensions is discussed by Bartolucci et al. [...] 1 shows the flowchart of the analysis and targets.

RESULTS AND DISCUSSIONS
The dimensionality analysis phase revealed the complexity of an analysis of this nature. Depending on the statistical technique used, the results can diverge in terms of the number of dimensions. The analysis by the restricted information method suggested the existence of 9 dimensions (Table 2). Meanwhile, the full information method suggested the existence of 3 dimensions (Table 3), while the parallel analysis technique indicated the existence of more than three dimensions (Fig. 1). Table 2 indicates that the first eigenvalue is 11.0, which in a set of 75 items represents 14.6% of the total variance explained by the first factor or first dimension. This result is evidence that the construct should not be assumed to be unidimensional. In addition, if we use the criterion of a proportion of explained variance > 50%, we identify 9 dimensions. However, in the IRT context, the percentage of variance accounted for exceeds Reckase's (1979) rule of 20% for item parameters to be considered stable. Taken together, it is reasonable to conclude that there are at least two dominant factors; this is sufficient to satisfy the IRT assumption (Bortolotti et al., 2013). Ventura et al. (2011) note that, although gold-standard rules of thumb for deciding when a response matrix is "unidimensional enough" or multidimensional for IRT modeling do not exist (see Embretson & Reise, 2000), researchers generally seek a large ratio of the first to the next eigenvalues (e.g., > 3 to 1). Here, the ratio between the first and fifth eigenvalues is > 3. The important criterion is whether or not a dominant general factor runs through the items. One way of exploring this issue, as discussed by Reise et al. (2007) and others (e.g., Immekus & Imbrie, 2008), is to fit a bi-factor model and compare the results with those of the unidimensional or multidimensional models (Ventura et al., 2011).
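The eigenvalue arithmetic in this paragraph can be reproduced with a short sketch. It relies on the fact that the eigenvalues of a correlation matrix of n items sum to n, so the share of variance for one eigenvalue is that eigenvalue divided by n. The function names are hypothetical, and apart from the first eigenvalue (11.0) and the number of items (75), the eigenvalues below are invented for illustration and are not the values in Table 2. Note that 11.0 / 75 is about 14.67%, which the text reports as 14.6%.

```python
def proportion_of_variance(eigenvalue, n_items):
    """Share of total variance for one eigenvalue of a correlation matrix.

    The trace of an n x n correlation matrix is n, so the eigenvalues
    sum to the number of items.
    """
    return eigenvalue / n_items


def eigenvalue_ratio(eigenvalues, k):
    """Ratio of the first eigenvalue to the k-th one (k is 1-indexed)."""
    return eigenvalues[0] / eigenvalues[k - 1]


# Values reported in the text: first eigenvalue 11.0, 75 items.
share = proportion_of_variance(11.0, 75)
print(f"{100 * share:.2f}%")  # 14.67%

# Hypothetical eigenvalues (only the first one comes from the text).
hypothetical = [11.0, 5.2, 4.1, 3.8, 3.3]
print(eigenvalue_ratio(hypothetical, 5) > 3)  # True
```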
According to Chalmers (2012), the number of dimensions that yields the best fit to the data can be verified by comparing models with the generic analysis of variance (ANOVA) function implemented in the R software (R Core Team, 2013). The result returns the chi-square (χ²) statistic based on the likelihood ratio test, as well as the AIC and BIC values for the compared models. Nine models were compared: the first assumed one dimension (Mod1); the second, two (Mod2); the third, three (Mod3); the fourth, four (Mod4); the fifth, five (Mod5); the sixth, six (Mod6); the seventh, seven (Mod7); the eighth, eight (Mod8); and the ninth, nine (Mod9). Table 3 presents the results. The parallel analysis method indicates the existence of 24 dimensions. This conclusion can be verified in Figure 3, where the dotted line refers to the simulated data and the solid line represents the real data. Note that there are 24 eigenvalues above the dotted line. The reconciliation of these diverging results was therefore based on the empirical analysis of the dimensions and of the concepts of the items associated with each one. Moreover, the definition of the number of dimensions of the construct was based on the theoretical interpretation of each dimension in relation to the items associated with it, resulting in a four-dimensional structure. In this analysis, 31 items were identified with a communality lower than 0.40 or a factor loading lower than 0.30 in all dimensions; these were assumed to be uninformative for the construct and were therefore excluded from the analysis, leaving 44 items.
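The item screening rule at the end of this paragraph can be sketched as follows. Two assumptions are made that the text does not state: communality is computed as the sum of squared loadings (which holds for orthogonal factors), and "factor loading less than 0.30 in all dimensions" is read as the largest absolute loading being below 0.30. The function names and the loadings are hypothetical.

```python
def communality(loadings):
    """Sum of squared loadings of one item (assumes orthogonal factors)."""
    return sum(a * a for a in loadings)


def is_uninformative(loadings, min_communality=0.40, min_loading=0.30):
    """Flag an item whose communality is below min_communality, or whose
    absolute loading is below min_loading on every dimension."""
    weak_everywhere = max(abs(a) for a in loadings) < min_loading
    return communality(loadings) < min_communality or weak_everywhere


# Hypothetical loadings of three items on four dimensions.
items = {
    "item_A": [0.70, 0.10, 0.05, 0.00],  # communality about 0.50: kept
    "item_B": [0.25, 0.20, 0.10, 0.15],  # weak on every dimension: dropped
    "item_C": [0.45, 0.35, 0.10, 0.00],  # communality about 0.34: dropped
}

kept = [name for name, row in items.items() if not is_uninformative(row)]
print(kept)  # ['item_A']
```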
From a statistical point of view, these items are not correlated with the other items, implying that, if the goal is to measure the quality of websites, these items are theoretically not associated with this goal. From a practical point of view, item 01, which Tezza et al. (2011) also evaluated and discarded in a unidimensional construct, may not behave as a cumulative item. That is, a pop-up window opening for interaction with the website user is seen in the literature as detrimental to quality and confusing for the user (Storey et al., 2002; Petre et al., 2006; Nielsen, 2006). However, this feature is complex because it involves issues of commercial and technological maturity, which may indicate that a feature evaluated purely as whether or not a pop-up window opens may or may not relate linearly or cumulatively to the quality of a website.
Theoretically, the four dimensions proved to be associated with the concepts of navigability or user conduction-orientation, accessibility and reliability of the system, interactivity, and presentation of information. The dimensions found in this work correspond to those mentioned most often in the literature and are directly related to the definition of website quality, which is a set of technical and non-technical characteristics of a web system that allow users to accomplish their objectives on a website in an accessible, efficient and pleasant manner. Technical characteristics are understood to be usability-navigability, presentation of information, accessibility and interactivity of the system.
Based on the four-dimensional structure, the multiple unidimensional, compensatory multidimensional and bifactorial IRT models were then analyzed. The multiple unidimensional analysis, which subdivides the general set of items into unidimensional subsets based on the dimensions defined in the factor analysis, proves to be more suitable than simply treating the construct as a whole as unidimensional.
The unidimensional approach has advantages and disadvantages. One advantage of a unidimensional or multiple unidimensional approach is the ease of analysis and representation of the resulting scale. Nevertheless, one disadvantage of assuming unidimensionality in a multidimensional construct is that the result will be a linear combination of the dimensions, which may not represent reality. In addition, Ackerman (1991) shows that estimating the parameters of an IRT model under the unidimensional assumption when the data are multidimensional tends to filter the dimensionality; that is, measuring a multidimensional ability on a unidimensional scale tends to generate larger values of unidimensional discrimination. A multiple unidimensional structure has the inconvenience of generating k different scales that are theoretically not comparable in terms of parameters such as the mean and standard deviation. This makes the joint analysis of all the items more difficult. The results of the parameter estimation in the unidimensional models ModU1 (first dimension of the factor analysis), ModU2 (second dimension), ModU3 (third dimension) and ModU4 (fourth dimension) can be seen in Tables 4, 5, 6 and 7, respectively. The item parameter estimates in model ModU4, unlike those of the other 3 models, are more unstable than the estimates of the same items under the general unidimensional model.
In a unidimensional analysis it is common to eliminate the items with estimation problems and re-estimate the remaining items to verify whether the estimates for the good-quality items change, despite the assumed independence between items. This analysis was conducted in the four models and little variation was found in the estimates.
Meanwhile, the compensatory multidimensional model, in addition to proving statistically more suitable than the unidimensional model for this work, according to the analysis of Table 10, offers greater possibilities for joint analysis, because it considers the construct as a whole, particularized in dimensions. This joint analysis allows a series of measures related to the items to be generated, such as multidimensional discrimination, multidimensional difficulty and multidimensional location. It also allows the proficiencies (degree of quality) of the websites in each particular dimension to be measured on the same scale.
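The item-level measures mentioned here can be computed from the estimated parameters. The text does not give formulas, so the sketch below assumes the standard definitions associated with Reckase's compensatory model: multidimensional discrimination MDISC = sqrt(sum of a_k squared), multidimensional difficulty MDIFF = -d / MDISC, and direction cosines a_k / MDISC for the direction of best measurement. The function names and parameter values are hypothetical.

```python
import math


def mdisc(a):
    """Multidimensional discrimination: length of the discrimination vector."""
    return math.sqrt(sum(a_k * a_k for a_k in a))


def mdiff(a, d):
    """Multidimensional difficulty: signed distance from the origin to the
    point of steepest slope, along the item's direction of best measurement."""
    return -d / mdisc(a)


def direction_cosines(a):
    """Cosines of the angles between the item direction and each axis."""
    length = mdisc(a)
    return [a_k / length for a_k in a]


# Hypothetical two-dimensional item.
a = [3.0, 4.0]
d = -2.5

print(mdisc(a))              # 5.0
print(mdiff(a, d))           # 0.5
print(direction_cosines(a))  # [0.6, 0.8]
```

Because these summaries live on a common scale, items from different dimensions can be compared directly, which is not possible with separate unidimensional calibrations.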
Table 8 shows the parameter estimates of the 44 items under the compensatory multidimensional IRT model, obtained with the flexMIRT software (Cai, 2012).
From this point of view, MIRT goes beyond unidimensional IRT because it allows the joint analysis of each item and of each respondent in each dimension. As a consequence, it is possible to identify the probability that each website possesses a certain characteristic, based on its estimated quality and on the item parameters. The multidimensional IRT model offers great opportunities for analysis and interpretation; nevertheless, these advantages come with increased complexity, particularly because the analysis takes place in a vector space, with multiple geometric associations that are difficult to visualize and interpret in traditional analytical forms.
The basic difference between the bifactorial model and the compensatory multidimensional IRT model is that the former has a general factor on which all the items load, together with other specific orthogonal factors.
The bifactorial model most widely used in the literature is the confirmatory one. Specific cases of exploratory analysis have been developed (Jennrich & Bentler, 2011), although their practical application is still limited. Thus, the approach adopted in this work was the confirmatory one. To do so, the number of dimensions and the groupings of items defined in the factor analysis were adopted, assuming the quality of commercial websites as the general dimension.
Table 9 shows the parameter estimates of the bifactorial model, assuming the confirmatory structure based on the dimensions found in the factor analysis. In the bifactorial structure it is possible to verify to what degree the items are associated with the general factor, which in this case represents the quality of a website. Analyzing the loadings on the general factor, we find that most of the items identified in the previous model as system requirements, such as accessibility, have a low loading on the general factor, and some have a negative loading, namely items 57, 58, 60, 69, 74 and 75. This mathematically reflects the negative loadings seen both in the secondary factors and in the compensatory multidimensional model. Nevertheless, the secondary loadings of these items are uniform in intensity and direction. It is thus possible to indicate that the inherent characteristics of the system have a different orientation from the characteristics associated with the organization of information or direct navigation; they may therefore represent orthogonal or non-compensatory characteristics that cannot be treated as part of a general factor. In particular, the fact that the bifactorial model assumes orthogonality among the secondary dimensions and between them and the general factor limits the suitability of this model to constructs that clearly possess a general factor orthogonal to the subdimensions, which is not the case for the construct in question. The quality of commercial websites is therefore not a characteristic that can be represented by a general dimension orthogonal to the other subdimensions, at least not for the construct developed in this study. Thus, the comparison of these three models suggests that the quality of commercial websites is a non-unidimensional characteristic that can be divided into four compensatory dimensions.
The fit of the bifactorial IRT model relative to the compensatory MIRT model, both assuming four dimensions, was evaluated based on the BIC and AIC information criteria, the Root Mean Square Error of Approximation (RMSEA) and the M2 statistic. The results can be seen in Table 10. On both the AIC and the BIC criteria, the compensatory multidimensional model in four dimensions (MIRT) had lower values than the bifactorial model and the unidimensional model, which indicates that it fits the data better than those two models. This finding is confirmed by the RMSEA, which indicates a lower error for the MIRT model. The M2 statistic, available in the flexMIRT software and discussed by Joe & Maydeu-Olivares (2010) and Liu & Maydeu-Olivares (2012), is similar to the chi-squared statistic and has been widely used in the evaluation of IRT models: the lower its value, the better the model fits in comparison with others. Accordingly, M2 indicates that the compensatory MIRT model is the most suitable of the three models studied.
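The information criteria used in this comparison follow the usual definitions, AIC = -2 log L + 2p and BIC = -2 log L + p ln(N), where log L is the maximized log-likelihood, p the number of estimated parameters and N the sample size. The sketch below applies them with N = 441 (the number of websites in the study); the log-likelihoods and parameter counts are invented for illustration and are not the values in Table 10.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def best_model(models, n_obs, criterion="bic"):
    """Return the name of the model with the lowest criterion value.

    models maps a model name to a (log_likelihood, n_params) pair.
    """
    def score(item):
        log_l, p = item[1]
        return aic(log_l, p) if criterion == "aic" else bic(log_l, p, n_obs)

    return min(models.items(), key=score)[0]


N_WEBSITES = 441  # sample size reported in the study

# Hypothetical fit results: name -> (log-likelihood, number of parameters).
hypothetical_fits = {
    "unidimensional": (-9800.0, 88),
    "bifactorial": (-9650.0, 132),
    "compensatory_mirt": (-9500.0, 138),
}

print(best_model(hypothetical_fits, N_WEBSITES, "aic"))
print(best_model(hypothetical_fits, N_WEBSITES, "bic"))
```

BIC penalizes each additional parameter by ln(441), roughly 6.09, compared with 2 for AIC, so the two criteria can disagree when a richer model improves the likelihood only slightly.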
As a practical interpretation of the four-dimensional compensatory MIRT model, Table 11 shows the estimated abilities of the first 4 websites analyzed on the standard normal scale N(0; 1), that is, with a mean of zero and a variance of one.
The first website performs best on the items related to information presentation and needs to improve its quality mainly in terms of user control or interactivity, where its estimate was below average. Website 002 performs well in system accessibility-reliability but needs improvement in user control-interactivity and in information presentation.
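To make the compensatory reading of such profiles concrete, the sketch below evaluates the standard compensatory two-parameter MIRT response function for two hypothetical websites. The item parameters and trait vectors are invented for illustration and are not taken from Tables 8 or 11; the function name is likewise an assumption.

```python
import math


def compensatory_probability(a, theta, d):
    """Probability that a website exhibits an item under the compensatory
    two-parameter MIRT model: logistic(sum_k a_k * theta_k + d)."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical item that discriminates on dimensions 1 and 3 only.
a = [1.2, 0.0, 0.8, 0.0]  # discrimination per dimension
d = -0.5                  # scale difficulty parameter

# Two hypothetical websites on the N(0; 1) scale.
site_a = [1.0, 0.0, -1.0, 0.0]  # strong on dimension 1, weak on dimension 3
site_b = [0.0, 0.0, 0.5, 0.0]   # average on dimension 1, above average on 3

p_a = compensatory_probability(a, site_a, d)
p_b = compensatory_probability(a, site_b, d)
print(f"site_a: {p_a:.3f}")
print(f"site_b: {p_b:.3f}")
```

Both profiles yield the same weighted sum of traits (0.4), so the model assigns them the same probability: a strength in one dimension offsets a weakness in another, which is what distinguishes the compensatory structure from a non-compensatory one.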

FINAL CONSIDERATIONS
In general, this work verified the viability of using Item Response Theory in the organizational context. The main contribution of this article is its consideration of multidimensional structures, which are common in organizational evaluations. This application showed that the unidimensional model (a non-hierarchical structure) is not always the best choice; this depends mainly on the nature of the items and on the characteristics of the respondents. A deeper analysis of each element is thus necessary.
This study also verified the unsuitability of a general unidimensional model and of a multiple unidimensional model (both non-hierarchical structures), the latter using separate unidimensional models to try to represent the general construct in question, the quality of e-commerce websites.
The compensatory two-parameter multidimensional model (a non-hierarchical structure) was then found to be suitable. As an additional analysis, the fit of the data to the confirmatory bifactorial model (a hierarchical structure) was compared with that of the compensatory multidimensional model. This analysis showed that, statistically, the non-hierarchical multidimensional model adds more information to the construct than either the hierarchical bifactorial model or the unidimensional model. The bifactorial model therefore does not contribute more information to the construct, which may require a different approach from the one considered in this work.
and psychological (e.g. Reise et al., 2011; Wilson, 2013) fields are relatively rare. Within the organizational context, the application of IRT is recent. Tezza et al. (2011) applied the unidimensional two-parameter logistic model (Birnbaum, 1968) to measure usability in commercial websites; Trierweiller et al. (2013) applied the same model to propose a scale for measuring the disclosure of information on environmental management practices in Brazilian industries; and Bernini et al. (2014) used a multidimensional IRT approach to investigate the heterogeneity in residents' reactions to the tourism industry. Tay et al. (2011) applied a mixed model that considers latent variables of variables observed to measure union citizenship behavior, with years of work experience and gender as covariates. Rivers et al. (2009) applied IRT to measure employees' attitudes toward the directorate of an organization and its overall communication with them. Other studies, such as Carter et al. (2011), LaHuis et al. (2011), Nye et al. (2010) and Tay & Drasgow (2012), also applied IRT in the organizational context and

Figure 1 - Distinction between the multidimensionality within each item and between items.

Figure 3 - Result of the parallel analysis for the 75 items.
[...]s and researchers have strived to define quality in the context of the Internet (for example, Barnes & Vidgen, 2000; Day, 1997; Lindroos, 1997; Xie et al., 1998; Loiacono et al., 2002). Lindroos (1997) uses the perspective of software quality to discuss differences between web-based information systems and conventional information systems. Olsina et al. (1999) proposed a quality model for university sites, called Website QEM, based on user opinions. Barnes & Vidgen (2000), Loiacono et al. (2002), Parasuraman et al. (2005) and Ding et al. (2011) also developed similar models, more focused on commercial sites. These and various other studies break the quality of websites into several attributes. The creation of these models is based mainly on many years of experience in the development and maintenance of information and web systems. The models are validated mainly by empirical studies, such as the analysis of data collected in tests with users, satisfaction surveys and interviews. Nevertheless, different types of information systems can have different quality requirements (Worwa & Stanik, 2010). For example, commercial and personal websites are both web-based information systems, but their quality requirements differ, mainly in terms of information security and information searching. Thus, given the large scope of the theme, any study of quality issues on the web must clearly delimit its analysis. This study fits into the classification of Cristobal et al. (2007) as a study of website quality and design. Within this scope, website quality is understood as the quality of an information system, which, according to Loiacono et al. (2002), focuses on information storage, processing, presentation and transfer.
al. (2012), Nylund et al. (2007) and Rost (1997), the suitability of the bi-factor model and compensatory model of MIRT was evaluated based on AIC and BIC information criteria. Table

Table 1 - Flowchart of the analysis and targets.

Table 2 - Values specific to the tetrachoric correlation matrix.

Table 3 - Comparison of models of one, two, three and four dimensions.

Table 4 - Estimates of the difficulty and discrimination parameters assuming the two-parameter unidimensional model ModU1.

Table 5 - Estimates of the difficulty and discrimination parameters assuming the two-parameter unidimensional model ModU2.

Table 6 - Estimates of the difficulty and discrimination parameters assuming the two-parameter unidimensional model ModU3.

Table 7 - Estimates of the difficulty and discrimination parameters assuming the two-parameter unidimensional model ModU4.

Table 8 - Estimates of the discrimination parameters for each dimension, the multidimensional discrimination parameter (MDISC), the scale difficulty parameter (d), the multidimensional difficulty parameter (MDIFF) and their respective standard errors (se) for the 44 items, assuming the compensatory two-parameter multidimensional model.

Table 9 - Estimates of the discrimination parameters (a) for each dimension and for the general dimension, and of the scale difficulty parameter (d), assuming the bifactorial model.

Table 10 - Comparison of the unidimensional model, the compensatory two-parameter multidimensional model and the bifactorial model based on the AIC and BIC information criteria, RMSEA and M2.

Table 11 - Estimated multidimensional quality of the first 4 websites of the sample based on the compensatory two-parameter multidimensional model.