Convergent and Discriminant Validity of the Perceived Risk Scale in Business-to-Business Context Using the Multitrait-Multimethod Approach

Consumer marketing literature abounds with research on perceived risk. However, little research has investigated the perceived risk measure in business-to-business contexts, especially regarding the validity of the various measurement methodologies employed. A basic goal of marketing as a science is to provide theoretical explanations for buying-selling behaviour. Whoever seeks such explanations normally borrows and develops constructs and theoretical propositions that cannot be readily generalized. Thus, this research is primarily concerned with testing and discussing two perceived risk measurement scales across two business-to-business buying situations using different types of validation techniques. It tests some assumptions and tenets in models of perceived risk by submitting these measures to convergent and discriminant validation using the multitrait-multimethod approach. All of the firms from two industrial sectors (pharmaceutical and clothing) in the largest States of the Brazilian Federation were consulted. The results indicate that both scales and their variations are valid to assess risk perception. The certainty/seriousness approach proved to be slightly better than the riskiness approach.


RAC, v. 5, n. 3, Set./Dez. 2001

INTRODUCTION
"A basic goal of social science is to provide theoretical explanations for behaviour. In marketing, this goal includes attempts to explain the behaviour of consumers, salespersons, and others involved in discipline related activities. Marketing scholars who seek such explanations frequently borrow and develop constructs and theoretical propositions relating to them. Although marketing has little in the way of fully developed, formally stated scientific theories, such theories cannot develop unless there is a high degree of correspondence between abstract constructs and the procedures used to operationalise them" (Peter, 1981, p. 133). In accordance with Peter's (1981) point, this research tests the construct validity of some perceived risk scales. It tests some assumptions and tenets in models of perceived risk by submitting these measures to convergent and discriminant validation using the multitrait-multimethod approach.

PERCEIVED RISK
In the 40 years since Bauer's (1960) conceptualisation of buyer behaviour as a risk-taking activity, many researchers have attempted to operationalise the perceived risk concept. Cunningham (1967) was one of the first to suggest a two-dimensional model designed to measure the perceived certainty of a given event happening and the consequences/losses involved if the event actually happens. Besides the two-dimensional model, another type of perceived risk measure has been used, which Dowling (1986) described as a unidimensional measure. In this measure individuals are asked to rate the riskiness of a product/brand on a single scale, with a question such as "How risky is brand A?" and answers ranging from not at all risky to very risky (Spence, Engel and Blackwell, 1970; Hampton, 1977; De Chernatony, 1988).
Uncertainty has been conceptualised as the overall likelihood that losses would occur due to a poor choice, and consequence as the seriousness attributed to the occurrence of each loss. There are several reasons for conceptualising the measure in this way. The major reason for using the two-component model is that such a measure has been used successfully (e.g., Cunningham, 1967; Stone and Winter, 1987; Mitchell, 1991) over a long period of time and has proved trustworthy.

The second reason is that a great number of studies (e.g., Cunningham, 1967; Dash, Schiffman and Berenson, 1976; Boze, 1987; Mitchell, 1991; De Mello, 1997) have employed the two-component model in the past, which allows for future comparisons. The third reason is that, after an extensive meta-analysis (over 100 studies compared), Gemunden (1985) concluded that separate measures of uncertainty and consequences are better predictors of information search. Finally, there has been a previous report (Lumpkin and Massey, 1983) of convergent and discriminant validity of the two-component model.
After Cunningham (1967), many others (e.g., Hisrich, Dornoff and Kernan, 1972; Peter and Ryan, 1976; Hoover, Green and Saegert, 1978; Guseman, 1981; Carrol, Siridhara and Fincham, 1986; Greatorex and Mitchell, 1991) attempted to develop models to measure perceived risk. These models gave rise to persistent arguments, especially over whether the risk model should be multiplicative or additive; however, little empirical work has tested the functional forms and benefits of these models. Dowling (1986) has argued that there are theoretical and practical grounds for selecting a multiplicative relationship between uncertainty and consequences/losses. Among his arguments are that (1) the absence of either variable would eliminate the perception of risk and (2) the influence of a non-salient loss on the overall perceived risk is reduced.
On the other hand, some researchers (e.g., Bettman, 1973; Peter and Ryan, 1976; Horton, 1976, 1979; Mitchell, 1991; De Mello, 1997) have presented some evidence that an additive model fits slightly better than the multiplicative model. Joag (1985) tested a range of different models in an organisational buying situation and concluded that there is only a slight difference between them. Until recent years, the prevailing view remained the same: that uncertainty and significance of consequences (importance of losses) should combine in either a multiplicative or an additive form. Yates and Stone (1992) advanced this knowledge by saying that the two dimensions of perceived risk are actually combined by an operator that behaves essentially, though not completely, like multiplication. In other words, their view is that uncertainty and consequence should combine interactively. Thus, after reviewing the literature, we found no strong and convincing evidence indicating which of these models (i.e., multiplicative versus additive) is the most adequate to measure perceived risk. Initially, we decided to operationalise perceived risk in both ways and test which model fits better.
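To see why the choice of functional form can matter, consider a purely illustrative case. The ratings below are hypothetical 7-point values for a single type of loss and are not drawn from the study's data; U and C are shorthand, introduced here, for uncertainty and consequences.

```latex
% Hypothetical ratings: U = uncertainty, C = consequences, one type of loss.
\begin{align*}
\text{Option A: } & U = 6,\ C = 2 & \text{Option B: } & U = 4,\ C = 4 \\
\text{Additive: } & 6 + 2 = 8 & & 4 + 4 = 8 \\
\text{Multiplicative: } & 6 \times 2 = 12 & & 4 \times 4 = 16
\end{align*}
```

Under the additive form the two options appear equally risky, whereas the multiplicative form ranks option B as riskier, so the two forms need not order alternatives identically.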
Another idea has to do with the conceptualisation of an overall perceived risk. It is believed that in situations in which more than one loss might occur, the effect of those losses is independently cumulative (Dowling, 1986; Yates and Stone, 1992).
This assumption can be challenged based on the findings of some research (e.g., Jacoby and Kaplan, 1972; Kaplan, Szybillo and Jacoby, 1974) that found high positive correlations among some types of losses. However, the contribution to overall risk made by one potential loss tends to be the same, regardless of the other potential losses that might accompany it (Yates and Stone, 1992). Consequently, the overall risk implied by a collection of potential losses is an accumulation of the contributions made by each of them. Dowling (1986) has postulated that the majority of measures of perceived risk are positioned at a low level of abstraction and have used one of the indices given in equations (1) to (5), where n = the number of types of loss i.
After a thorough investigation, no compelling evidence could be found indicating the best overall perceived risk model; therefore it was decided to use the following conceptual equation as a safe approach:

$$\text{Overall Perceived Risk} = \sum_{i=1}^{n} \left( \text{Uncertainty}_i \oplus \text{Adverse Consequences}_i \right)$$

where n = the number of types of loss i, and ⊕ can be a multiplication-like or addition-like operator. The final decision on which model to use was taken after construct validation tests were done using both multiplicative and additive models.
Cunningham (1967) has also pointed out a weakness of this model, which rests on the assumption that both factors in the equation are equally weighted. He has suggested that the consequence dimension may be taken into consideration more seriously by buyers than the uncertainty dimension. However, despite some effort to determine the weighting relationship between these two factors, the appropriate weight could not be determined.
Most recent authors who have investigated perceived risk in buying situations have operationalised risk scales based on Cunningham's (1967) formulation. His formulation was a composite of two indirect questions on certainty and danger. Both questions were rated initially on a 4-point scale that he later converted to a 3-point one. These two scales were combined in a multiplicative form to develop risk categories.
Generally, studies subsequent to Cunningham's (1967) have used Likert-type and semantic differential scales with questions like: "Consider yourself involved in a buying situation and, moreover, consider the possibility that this acquisition does not satisfy the acceptance level of the firm you work for. What is the likelihood of the following losses occurring (at least one feasible loss situation must be described per risk type; e.g., you will feel personal dissatisfaction) and how serious would it be if these losses actually did occur?" Answers vary along a continuum ranging from very certain to not at all certain and from very serious to not at all serious. Guseman (1981) used a 4-point scale like the one initially proposed by Cunningham (1967); however, other researchers have measured the components of risk on a wide variety of ranges. Hisrich, Dornoff and Kernan (1972) used a 5-point scale, while Peter and Ryan (1976) preferred a 7-point one. Brooker (1984) extended this to a 9-point scale, while Choffray and Johnston (1979) developed a 10-point scale for their study. Mitchell (1991, 1994), after analysing 120 different studies on risk, found that the overall risk measure is usually operationalised on a 7-point scale. This scaling technique has proved its validity in many studies (e.g., Henthorne, Latour and Williams, 1993; Stone and Gronhaug, 1993).
Even if a measure can be considered highly reliable, showing little effect of randomly varying measurements, there is no guarantee that the scale is actually measuring the theoretical constructs under investigation. According to Malhotra (1996), the validity of a scale may be defined as the extent to which differences in observed scale scores reflect true differences among objects on the characteristic being measured rather than systematic or random error. Testing validity is not a straightforward procedure. As Singleton, Straits and Straits (1993, p. 122) point out, "if we knew a case's true value on a variable independent of a given measure - then there would be no need for the measure".
To assess validity one must either: (1) subjectively evaluate whether an operational definition measures what it is intended to or (2) compare the results of an operational definition with the results of other measures with which it should or should not be related (Singleton, Straits and Straits, 1993). As can be seen, the sort of subjective judgements and objective evidence that result depend on the purpose of the measurement.
Operational definitions include components which are not supposed to be included in calculations, yet at the same time exclude important facets of the construct under consideration. One can never be sure what portion of the construct is being tapped and what is being missed by the operational definition. Figure 1 depicts what happens to operational definitions in most cases. The validity of a scale can be assessed in several ways. Several categorisation systems are in use (e.g., face, content, predictive, concurrent, criterion, construct, convergent, discriminant, and nomological); however, most researchers assess the validity of their scale through subjective (i.e., face and content), criterion, or construct validation procedures (Singh and Rhoads, 1991). Ultimately no one of them alone is entirely satisfactory; however, it is possible to have a fair idea of

METHODOLOGY
To operationalise and test the perceived risk measure proposed above, we used a structured postal self-completion questionnaire as the data collection medium. The questions used to build the questionnaire for this research were basically attitudinal questions. The measurement of these information areas is now considered.
Many different researchers have attempted to operationalise the perceived risk concept. The prevailing view is still the same: that uncertainty and significance of consequences (importance of losses) should combine interactively to compose perceived risk. After an extensive review of the literature, we decided to use the perceived risk scale in its traditional 7-point semantic measure (e.g., Lumpkin and Massey, 1983).
In order to validate the perceived risk construct, some product fields first had to be selected. To reduce the set of products to a manageable number that could encourage respondents to participate, some qualitative interviews were performed. After in-depth interviews with a selected sample it was possible to select the IT industry as the focus of our attention. The results of the interviews indicated workgroup servers and mid-range laser printers as the two most prominent product fields of the industry. These products were also recognised as common nowadays in all medium and large businesses.
Large and medium-sized Brazilian firms operating in industrial activities were the basis of this research. To reduce the overall sample, some criteria were established to cover the most prominent industrial sectors of the Brazilian economy. After an extensive analysis, pharmaceuticals and clothing were chosen. An analysis of the data published from the latest industrial census available led to the selection of the geographical areas to be covered and the number of firms in each geographical area. The total number of these firms is 162.
To determine the relevant individuals to be approached, an extensive snowballing procedure was used to scan all possible decision-makers. After the population of individuals was identified, one individual per firm was selected at random. A response rate of 48% was achieved and the data showed no significant difference between respondents and non-respondents. Two different reliability assessments (test-retest and internal consistency) were performed in this study, both showing the data to be reliable. The scale was validated in two different ways (convergent and discriminant).
Construct validity is an approach to evaluating a measure based upon how well the measure conforms with theoretical expectations (De Vaus, 1996). According to Singleton, Straits and Straits (1993), the meaning of any scientific construct is implied by statements of its theoretical relations to other constructs. Thus, the validation process begins with an examination of the underlying theory of the concept being measured. This type of validity is the main form of validation upon which the trait-related approach to psychometrics is based (Rust and Golombock, 1989). The entity measured by the test is normally not directly measurable, and thus most of the time researchers are limited to evaluating its usefulness by making inferences from the relationships (i.e., correlations) between the test and the various phenomena predicted by the theory. For example, if a particular Hierarchy of Effects Model theorises that in order for someone to buy a product he or she must first like the product better than others, then this preference is expected to be the case. If, on the other hand, it is discovered that people do not know whether they liked the purchased brand better than others before buying the product, then either the theory is wrong, or the results of the study are invalid according to this theory (Block and Block, 1995). Rust and Golombock (1989) propose that construct validation is never complete, but is cumulative over the number of studies available, and in many respects is similar to Popper's (1969) idea of verification in science. It is thus a reflection of a particular view of the scientific process, and is integrated within the positivist and hypothetico-deductive view of science.
This study conforms with this view of cumulative validation, building on two previous studies which tested construct validity empirically, using scales similar to the ones utilised in this research. Lumpkin and Massey (1983) examined alternative perceived risk scales for convergent and discriminant validity (i.e., construct validity) amongst a consumer sample. Nevertheless, some doubts remain as to the best method to assess risk perception, especially in a business-to-business context, and further investigation into construct validation was carried out to seek clarification on this issue. These further tests on the perceived risk scale will be discussed in detail later in this section.
Construct validity is the most sophisticated and difficult type of validity to establish (Malhotra, 1996). According to Cohen (1979), when testing for construct validity, one should examine the scale being used by means of convergent, discriminant and nomological tests of validity. Convergent validity involves measuring a construct with independent measurement techniques and demonstrating a high correlation among the measures. Discriminant validity is exactly the opposite; it involves demonstrating a lack of correlation, or a very low correlation, among different constructs (Kinnear and Taylor, 1996). Last, nomological validity is tested by relating measurements to a theoretical model that leads to further deductions, interpretations, and tests (Spiro and Weitz, 1990).
Even if one decides to examine one's scale by assessing all three types of construct validation, in the end there is no ideal way of determining the validity of a scale. The most appropriate method will depend on the situation. As De Vaus (1996, p. 57) suggests: "if a good criterion exists use it; if the definition of the concept is well defined or well accepted use this approach; if there are well-established theories which use the concept which we wish to validate, use this approach. If all else fails we have to say this is how the concept is defined and these measures, on the face of it, seem to cover the concept, and to give the measure to other people (referred to as panel of judges) to see what they think".
In the last four decades, many researchers (e.g., Bauer [1960], Cox [1967], Cunningham [1967], Ross [1974], Lumpkin and Massey [1983], Dowling [1986], Yates [1992], Greatorex and Mitchell [1993], to list only a few) have been engaged in discussions about the perceived risk concept. Neither total agreement nor total disagreement can be discerned in this literature; however, there is a consensus among some (e.g., Peter and Tarpey, 1975; Peter and Ryan, 1976) that perceived risk is a multidimensional, multifaceted construct. Nevertheless, what researchers fail to agree on is a sound operational definition for perceived risk. Another point of disagreement concerns the precise nature of the construct (e.g., Bettman, 1973; Horton, 1979).
In view of such conceptual fuzziness, construct validation should be an important part of any attempt to advance knowledge in perceived risk (Dowling, 1986). The perceived risk literature provides evidence of many attempts to validate this concept. Bettman's (1973) original study and its validation (Bettman, 1975) proposed a dichotomous perceived risk concept, in which two types of risk (i.e., inherent and handled) were operationalised. Jacoby and Kaplan (1972), and later the validation study by Kaplan, Szybillo and Jacoby (1974), designed and tested the perceived risk concept as a multifaceted (e.g., performance, financial, physical, social, and psychological) construct. In one study among many, Hoover, Green and Saegert (1978) measured the association between an outside criterion (e.g., information search) and risk perception.
Some research has attempted a type of scale validation. For example, Bettman (1973, 1975) analysed the relationship between his dichotomous scale and Cunningham's (1967) certainty/seriousness scale, and found similarities supported by a significant analysis of variance. Woodside (1972) correlated his risk-taking scale with the choice dilemma scale of Kogan and Wallach (1964) and also found support for his scale.
Several investigators (e.g., Hisrich, Dornoff and Kernan, 1972; Zickmund and Scott, 1973) have explored the discriminant relationship of perceived risk with other related constructs (e.g., self-confidence). These studies have been successful in providing evidence that the observed relationships are in the predicted direction. However, a search of the literature for this research uncovered only two studies (Lumpkin and Massey, 1983; Mitchell, 1991) that pursued an extensive construct validation of the perceived risk concept. Lumpkin and Massey (1983) employed the multitrait-multimethod (MTMM) matrix approach proposed by Campbell and Fiske (1959), which provides evidence for both convergent and discriminant validity. They explored two different perceived risk scales (the riskiness scale, as used by Spence, Engel and Blackwell [1970], and Cunningham's [1967] certainty/seriousness scale) and found both methods to be valid.
The multitrait-multimethod matrix is a table of correlations that enables simultaneous assessment of both the convergent and discriminant validity of a construct. Dowling (1986) and Churchill (1979, 1995) have recommended this procedure as a convenient way of establishing the convergent and discriminant validity of a measure. The matrix is based on the principle that the more features two measurements have in common, the higher their correlation will be. According to Judd, Smith and Kidder (1991), measurements can share two types of features: traits and methods. Traits can be understood as the underlying construct the measurement is supposed to tap, and methods are the form of the measurement. Ideally, scores should reflect only the intended trait and not be influenced by the method.
In this study a form of the MTMM was used to check whether the traits under consideration, here risk perception of network servers and risk perception of mid-range laser printers, could be measured by three different measuring methods.
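The structure of such a matrix can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names (`pearson`, `classify`, `mtmm`), the two-trait, two-method layout and the five respondents' scores are all hypothetical, and the sketch assumes every respondent has a score on every trait-method combination.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def classify(measure_a, measure_b):
    """Label a pair of distinct (trait, method) measures by what they share."""
    (trait_a, method_a), (trait_b, method_b) = measure_a, measure_b
    if trait_a == trait_b and method_a != method_b:
        return "heteromethod-monotrait"   # convergent validity entries
    if trait_a != trait_b and method_a == method_b:
        return "monomethod-heterotrait"   # shared method, different trait
    return "heteromethod-heterotrait"     # nothing shared


def mtmm(scores):
    """Map each pair of measures to (cell type, correlation)."""
    return {
        (a, b): (classify(a, b), pearson(scores[a], scores[b]))
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical scores for five respondents (illustration only).
scores = {
    ("server", "riskiness"): [2, 3, 5, 6, 4],
    ("server", "additive"): [5, 7, 10, 12, 8],
    ("printer", "riskiness"): [1, 4, 2, 5, 3],
    ("printer", "additive"): [3, 8, 5, 9, 6],
}

for pair, (cell_type, r) in mtmm(scores).items():
    print(pair, cell_type, round(r, 2))
```

Convergent validity is then read from the heteromethod-monotrait cells, and discriminant validity from how those cells compare with the other two cell types.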
The first method was the riskiness scale presented in a 7-point semantic differential form ascending from not at all risky (1) to very risky (7).
Respondents were asked to consider the overall risk in each of the brands presented to them. The scores for all brands in each product field were averaged, creating a new score.
The second method was an additive form of Peter and Tarpey's (1975) likelihood of loss/seriousness of consequences scale presented in a 7-point semantic differential form, where the likelihood of loss dimension was measured by a scale ascending from unlikely (1) to likely (7) and the seriousness of consequence dimension by a scale ascending from not serious (1) to very serious (7).
The last method was a complete replication of the second method, except that this time the two dimensions were combined in a multiplicative manner. In methods two and three, one group of respondents was asked to consider themselves responsible for buying a workgroup server and the other group a mid-range laser printer (for a fuller discussion on scaling, readers are referred to chapter 5, Methodological Issues). In this study, scores generated by methods two and three should be very similar because both models posit a relationship between the components and perceived risk. However, when both methods are submitted to an MTMM matrix, one method should correlate more highly on both of the trait conditions, and thus provide evidence to substantiate the choice of a method to be used in further analyses in this study.
Table 1 presents the mean perceived risk score for the three perceived risk scales. It indicates a very similar pattern of risk perception between methods two and three, as anticipated, and a differing pattern of risk perception for method one in comparison to the two other methods.
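The three scoring methods can be sketched as follows. This is a hedged illustration: the function names, the example ratings and, in particular, the summation across types of loss are assumptions made here, since the text does not spell out the exact aggregation used.

```python
from operator import add, mul

SCALE_MIN, SCALE_MAX = 1, 7  # 7-point semantic differential scale


def _check(ratings):
    """Reject ratings that fall outside the 7-point scale."""
    for rating in ratings:
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating {rating} is outside the 1-7 scale")


def riskiness_score(brand_ratings):
    """Method one: average the overall riskiness ratings of all brands."""
    _check(brand_ratings)
    return sum(brand_ratings) / len(brand_ratings)


def two_component_score(likelihoods, seriousness, combine):
    """Methods two and three: combine likelihood and seriousness for each
    type of loss, then sum over loss types (summation is an assumption)."""
    _check(likelihoods)
    _check(seriousness)
    return sum(combine(l, s) for l, s in zip(likelihoods, seriousness))


# Hypothetical ratings from one respondent (illustration only).
brand_ratings = [3, 5, 4]   # riskiness of three brands
likelihoods = [2, 6]        # likelihood of two types of loss
seriousness = [5, 3]        # seriousness of the same two losses

method_one = riskiness_score(brand_ratings)
method_two = two_component_score(likelihoods, seriousness, add)
method_three = two_component_score(likelihoods, seriousness, mul)
print(method_one, method_two, method_three)
```

Passing the combining operator as an argument keeps methods two and three identical in every respect except the functional form, which mirrors the way the third method is described as a replication of the second.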

Table 1: Mean Perceived Risk Across Traits and Methods
The multitrait-multimethod correlations were calculated and are shown in Table 2. To be consistent with Campbell and Fiske's (1959) analytical requirements for validating a scale using their proposed matrix, the steps taken were as follows:
• Heteromethod-monotrait correlations (all values in the diagonal of Table 2) should be high and statistically significant. All but one correlation (0.22, between the riskiness and multiplicative methods for the printer trait) were high and significant at the level of p < 0.10. What can be said at this point in the analysis is that the requirement for convergent validity was met for the riskiness/additive comparison, the additive/multiplicative comparison, and partially for the riskiness/multiplicative comparison (in this last case only for the computer trait).
• Heteromethod-monotrait correlations should have higher scores than all others in the same row or column in the matrix, excluding all monomethod-monotrait scores. This is because any association in a mono situation is based on more shared elements than if it were from a hetero situation. The required condition for discriminant validation was met in all comparisons between the additive and the multiplicative methods, although it failed in one comparison between the additive and the riskiness scale methods (0.29 > 0.17, 0.16, and 0.32) and in one comparison between the multiplicative and the riskiness scale methods (0.22 > 0.12, 0.17, and 0.30).
• Another requirement for discriminant validation is that the correlations from the heteromethod-monotrait comparisons should be higher than the correlations from monomethod-heterotrait comparisons. The expectation is that this should show that the trait, not the method, is creating the variation. According to this requirement, the following comparisons were found in the matrix: 0.33 > 0.24 and 0.32 (true); 0.29 > 0.24 and 0.32 (inconsistent in one comparison); 0.35 > 0.24 and 0.30 (true); 0.22 > 0.24 and 0.30 (false); 0.97 > 0.32 and 0.30 (true); and 0.90 > 0.32 and 0.30 (true).
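The third requirement above can be checked mechanically. The sketch below uses only the correlations quoted in the text; the assignment of each monomethod-heterotrait value to a method is inferred from the comparisons listed, and the names (`MONOMETHOD_HETEROTRAIT`, `VALIDITY_DIAGONAL`, `check_requirement`) are introduced here for illustration.

```python
# Monomethod-heterotrait correlations, one per method, as inferred from
# the comparisons quoted in the text.
MONOMETHOD_HETEROTRAIT = {
    "riskiness": 0.24,
    "additive": 0.32,
    "multiplicative": 0.30,
}

# Heteromethod-monotrait correlations, in the order the text lists them.
VALIDITY_DIAGONAL = [
    (("riskiness", "additive"), 0.33),
    (("riskiness", "additive"), 0.29),
    (("riskiness", "multiplicative"), 0.35),
    (("riskiness", "multiplicative"), 0.22),
    (("additive", "multiplicative"), 0.97),
    (("additive", "multiplicative"), 0.90),
]


def check_requirement(validity, monomethod):
    """Compare each validity value with the monomethod-heterotrait
    correlations of the two methods it involves."""
    verdicts = []
    for (method_a, method_b), r in validity:
        rivals = (monomethod[method_a], monomethod[method_b])
        passed = sum(r > rival for rival in rivals)
        if passed == len(rivals):
            verdicts.append("true")
        elif passed == 0:
            verdicts.append("false")
        else:
            verdicts.append("inconsistent in one comparison")
    return verdicts


verdicts = check_requirement(VALIDITY_DIAGONAL, MONOMETHOD_HETEROTRAIT)
for ((method_a, method_b), r), verdict in zip(VALIDITY_DIAGONAL, verdicts):
    print(f"{method_a}/{method_b}: {r:.2f} -> {verdict}")
```

The verdicts reproduce the pattern reported in the text: both shortfalls involve the riskiness scale, while the additive/multiplicative comparisons pass without inconsistency.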
The inconsistencies found in these comparisons presented an interesting pattern. Each belongs to a different method, which makes it difficult to speculate about which method distinguishes more clearly between traits. In the last two comparisons, involving the additive/multiplicative matrix, there were no inconsistencies. All three inconsistencies occurred in comparisons involving the riskiness scale. However, only one of these inconsistencies originated within the riskiness matrix. When it did occur, it was the same in the multiplicative scale. The results from the MTMM analysis indicate that all three methods are valid to assess risk perception. Nevertheless, the analysis failed to provide strong evidence about which of these methods is better at distinguishing among traits. Thus, several facts were considered. These were:
• The two-component model of certainty/consequences has been used for over thirty years by many perceived risk investigators (e.g., Cunningham [1967], Peter and Ryan [1976], Greatorex and Mitchell [1991], Yates [1992], to list several). The long history and tradition of this conceptualisation is evidence in itself that this form of measure has proved to be of some worth.
• The riskiness scale also has a long history and many advocates (e.g., Spence, Engel and Blackwell, 1970; De Chernatony, 1988, 1989). Nevertheless, the number of enthusiasts of this method is smaller than that of the certainty/consequences model. When faced with choosing between the two methods, one should look first at the fact that studies using the certainty/consequences method constitute a rich data set for possible comparisons, while this is not the case for the other method (riskiness). The number of studies in the latter case is smaller, limiting any comparability of measures. Thus, it seems logical to choose the two-component model.
• Another long-standing discussion, about the advantages and disadvantages of each of the two major component models (i.e., additive or multiplicative), has been dividing researchers. Most of the work in perceived risk (mainly in psychology and decision-making) has proposed some sort of multiplicative formulation (Sieber and Lanzetta, 1964); nevertheless, such a mathematical representation of consumer decision processes may be overly complicated (Wright, 1973). Finally, Lanzetta and Driscoll (1968), after empirical testing, supported the linear model. They found a positive correlation between certainty and consequences and inferred that this relationship might lead to an additive model being better by hindering the performance of multiplicative models. Peter and Ryan's (1976) study concluded that the additive model was correlated more highly with brand preference than the multiplicative form.
The findings indicate that all three types of scales had construct validity, but the riskiness scale seemed to provoke some fuzzy effects when compared to the other methods. Moreover, it did not perform as well as the other methods in distinguishing among traits. However, it did provide the smallest risk perception across traits and methods. These results may have several explanations, such as: (1) the riskiness scale is simplified and provides a clarity of presentation which facilitates the ability of respondents to associate the measure with risk; (2) the traits are part of one product field and very closely associated along the risk dimension, which could in some way favour the more complex and accurate methods.

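\begin{align}
\text{Perceived Risk} &= \text{Uncertainty} \tag{1}\\
\text{Perceived Risk} &= \text{Uncertainty} \times \text{Adverse Consequences} \tag{2}\\
\text{Overall Perceived Risk} &= \sum_{i=1}^{n} \text{Uncertainty}_i \times \text{Adverse Consequences}_i \tag{3}\\
\text{Overall Perceived Risk} &= \sum_{i=1}^{n} \text{Probability of Loss}_i \tag{4}\\
\text{Overall Perceived Risk} &= \sum_{i=1}^{n} \text{Probability of Loss}_i \times \text{Importance of Loss}_i \tag{5}
\end{align}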

Figure 1: Operational Definitions Include Irrelevant Components and Fail to Include All Relevant Portions of the Underlying Construct
[Diagram labels: Operational Definition 1, Operational Definition 2, Operational Definition 3, Underlying Construct, Error]
the validity of a scale through an assessment of it using two or more different means.