The Friendship Paradox in the formation of academic committees

Abstract The Friendship Paradox is a phenomenon which states that most people have fewer friends than their own friends have, on average, and its generalization has been proposed in the last three decades by several scientific papers. Our study is focused on the academic environment, and seeks to determine whether or not the impression that individuals may have concerning invitations to take part in oral defenses is justifiable. This involved testing two hypotheses with regard to academic committee members: “The Invitee Paradox” (in terms of the person who is invited); and “The Inviter Paradox” (in terms of the person who extends the invitation). The paradoxes were assessed by designing invitation networks, both weighted and unweighted, which represent a dual relationship in which an invitation originates from an “inviter” and is extended to an “invitee”. We then tested the hypotheses with the aid of two real-world open access datasets from online academic repositories: (1) American (Brazilian Capes Catalog); and (2) European (French STAR Deposit). Our results showed that only “The Invitee Paradox” held. We also explored possible relations between our proposed measurement of the invitation paradoxes and the PageRank metric, in order to evaluate the relative importance of members in the invitation networks.


Introduction
The Friendship Paradox is a phenomenon illustrated by Feld (1991), who stated that "most people have fewer friends than their friends have, on average". This sociological idiosyncrasy was observed by the author in a network of high school students. If the structure of a network of friends is considered in its entirety, the statement cannot make sense for everyone at the same time, and thus implies that the phenomenon is a contradiction. Since Feld's seminal article, several scientific papers have been published proposing the generalization of the Friendship Paradox, exploring this perception and awareness of "statistical inferiority" from various empirical findings in social settings. These entail not only assessing connections at individual levels, but also correlating the patterns that emerge from these same connections, together with individuals' attributes, properties and features (Fotouhi; Momeni; Rabbat, 2015).
In this paper, we present a generalized Friendship Paradox that is applied to an academic context. Our aim is to determine whether or not the impression that someone may have with regard to how often they take part in academic committees is justifiable. To the best of our knowledge, our study is unprecedented, as it includes an analysis of the underlying dynamics involved in the formation of these assessment boards, which, as affiliation networks, constitute a modern practice of scientific collaboration in which the actors are connected by a common membership (Newman, 2001).
To provide a better analysis of how connections occur, we designed invitation networks, which represent a dual relationship in which an invitation originates from a member who extends it (inviter) to a member who receives it (invitee). Among other applications, these networks can be used to model the formation of academic committees, as proposed in this research, in which an invitation was treated as an offer effectively accepted by a person, making them an official board member.
Because reciprocity may be absent between members, contrary to what happens in friendship relations (not every committee member who receives an invitation sends one back to the member who invited them), some members may come to believe that those within their circle of collaborators have received or sent more invitations than they have.
Our study tested two hypotheses that apply to members who receive and/or send invitations for the formation of academic committees: "The Invitee Paradox" (Friendship Paradox in terms of who is invited); and "The Inviter Paradox" (Friendship Paradox in terms of who is extending the invitation).
H1. The Invitee Paradox: I receive fewer invitations than the average number of invitations received by the members who invite me.
H2. The Inviter Paradox: I send fewer invitations than the average number of invitations sent by the members that I invite.
When our hypotheses were applied to real-world open access datasets from online academic repositories, we were able to confirm the truth of "The Invitee Paradox" and to refute "The Inviter Paradox". Additionally, applying measures created from our invitation paradoxes, we found evidence of moderate to strong correlations between some of these proposed measures and the PageRank metric, which assesses a website's importance (Brin; Page, 1998).
In the following subsections, we examine the traditional way in which graduate students' research projects are assessed, and provide some examples of how the Friendship Paradox has been generalized over the past three decades.

Oral defenses and Academic Committees
An oral defense is the culmination of a research inquiry into a particular subject or project. It provides an opportunity for a final improvement to be made to a dissertation/thesis, and allows candidates to demonstrate their understanding of the area of research. A dissertation or thesis defense is, thus, a formality but also a valuable occasion for the researcher (provided all the expected academic requirements are met) (Kemoli; Ogara, 2016). Black (2012) observes that the candidate should clearly indicate how the conducted study is significant, substantial, and contributory to the related body of knowledge. Moreover, when defending their dissertation/thesis, candidates must demonstrate competence in describing, discussing, and supporting all aspects of the study to the committee and, potentially, to a broad academic audience.
Before the dissertation is granted formal approval, it is often recommended that changes should be made to the final document, but the oral defense itself is usually completed with the understanding that, when the recommended changes have been finalized, the dissertation/thesis will be accepted (Brause, 1999).
With regard to the geographical location, the oral defense traditionally occurs on-site, at the researcher's academic institution. However, as a result of the widespread dissemination of communication technologies, the remote version of oral defenses has become increasingly popular in the form of a virtual (online) meeting system. This enables members from distant institutions to participate without having to incur travel and accommodation expenses, with the convenience of being able to hold meetings at short notice (Li; Xue, 2011).
A committee comprises a group of people (members, which may include the advisor) who are responsible for assessing a graduate student's dissertation or thesis, and appraising its worth and quality during candidates' oral defense. The members also have the responsibility to contribute with new ideas, suggestions and insights (Roberts, 2010). After the presentation, the decisions by the examiners should be made based on the work submitted and the way the candidates defended it (Kemoli; Ogara, 2016). Roberts (2010) points out that, although there is no established rule about the maximum number of members in a committee, it usually contains three to five persons, depending on students' desired degree (master or doctorate). Educational institutions around the world define their own policies with regard to the composition of academic committees for oral defenses, including the expected number of examiners, their qualifications and the responsibilities assigned to each of the constituent members.
Committees can be multidisciplinary, which means examiners come from scientific fields other than that of the research project being assessed. Members may likewise belong to the advisor's own institution or to an external one (even from another country). Incidentally, this diversity in the composition of committees often forms a bridge across different subjects and fosters a greater dissemination of knowledge in the scientific community as a whole.
The advisor and/or the researcher must follow the criteria laid down in the regulations and guidelines of their own academic institution. These include making formal invitations to allow other members to supervise candidates while the research project is being undertaken (in some cases, the members are only invited on two separate occasions: the dates for qualification/pre-assessment and oral defense). Finally, on the scheduled day of the oral defense, the committee members summon the researcher to appear before them. They question the student after the presentation and make a joint decision about whether to approve or reject the work submitted for appraisal.
As observed by Shen-Miller and Shen-Miller (2012), apart from underlying policies, in the formation of an academic committee, a number of common factors must be taken into account as the advisor and/or the researcher decide whom to invite. Among others, the following points should determine whether or not a research project is successful: the eligibility of the faculty/university members, time and commitment, expertise, compliance with one's personal needs, and prior relationships with the advisor.
In order to illustrate how the structure of academic committees can be designed, Figure 1 displays a model of an invitation dynamics network. The table on the left shows a set of five hypothetical thesis assessment committees, each with its advisor and constituent members (identified by M1 to M6). In the network (graph), on the right, each vertex represents a committee member, and invitations are indicated by arcs, which always connect the inviter to the invitee. The label on each arc gives the number of invitations sent by the inviter (and received by the invitee).

The Friendship Paradox
The Friendship Paradox has been studied, explored and applied in various domains by the scientific community in the last three decades, and attempts have been made to measure its impact on social relations and determine what implications this has for the dynamics of information flow in networks. Most of the current studies, however, are concerned with understanding the role of the Paradox in the configuration and evolution of online social networks (OSN) as a means of opening up new perspectives for its applications and analysis. Feld (1991) applied his study to the frontiers of sociology, but, ever since, the Friendship Paradox has been explored on a multidisciplinary basis. Later work, for instance, ventured into fields like scientometrics (Benevenuto; Laender; Alves, 2015), epidemiology (Bagrow; Danforth; Mitchell, 2017), and psychology (in the domain of social media platforms like Facebook, and their influence on behavioral changes (Bollen et al., 2017)).
Adopting a more theoretical approach, Lee et al. (2019) studied the impact of perception models on the Friendship Paradox and opinion formation, concerning an ego's neighborhood (or surroundings) and peer pressure. The authors concluded that these perception models can have a crucial impact on people as well as the network structure.
In a scientometric study on scientific collaboration, Eom and Jo (2014) also generalized the Friendship Paradox. The authors analyzed complex co-authoring networks by extracting indicators, in order to measure the probability of maintaining the Paradox, both on an individual and collective basis. In addition to existing relationships, they also took account of scientific research outputs, such as the numbers of citations and publications. Using the Friendship Paradox as a groundwork for their study, Benevenuto, Laender and Alves (2015) corroborated the existence of the Paradox applied to the H-index metric, which measures a researcher's productivity and citation impact. Also concerning academic performance, Kong et al. (2019) proposed to validate the generalized Friendship Paradox from researchers' academic level, as an aspect of researchers' influence. Bagrow, Danforth and Mitchell (2017) investigated how the Friendship Paradox's strength is related to the volume of contacts between individuals on OSN like Twitter, by highlighting the role played by the Paradox in the dissemination of information. More recently, in a paper centered on Instagram, Serafimov, Mirchev and Mishkovski (2019) endeavored to demonstrate the application of the Friendship Paradox in several of its intrinsic network properties such as followers, likes, posts, hashtags and comments. Also centered on OSN and regarding social network participation, Medhat and Iyer (2022) explored the social comparison that occurs among users who, upon realizing that their content was less appreciated than that of their friends, modify their sharing behavior. Although focused on synthetic graphs, the authors discussed the practical implications of their work.
An analysis of the generalized Friendship Paradox in clustered networks was provided by Jo, Lee and Eom (2022). Employing a vine copula method, the authors obtained the analytical solution for generalization in clustered networks to find that the peer pressure of individuals with high (low) attributes is increased (reduced) by connections between their neighbors.
All of the previously mentioned scientific studies used the Friendship Paradox to broaden the understanding of relationship dynamics in specific domains, with many of them measuring the weight of participants' attributes and properties, and their ensuing and enduring impact on the topology of networks as well as on their number of interconnections. However, although some studies have included indicators of academic productivity in their analysis (Eom; Jo, 2014), to the best of our knowledge, no research has been conducted so far addressing issues of the Paradox applied to the formation of academic committees.
In a similar way to Pal et al. (2019), our goal is to assess a new version of the Friendship Paradox by adopting a quantitative (statistical) approach: analyzing its occurrence in terms of incoming/outgoing connections between the members who form academic committee networks.
To achieve this goal, we made use of empirical data from registered assessments of Brazilian and French theses.

Methodological Procedures
In this section, we outline the experimental data used and provide a brief explanation of how they were collected, in addition to a summary table that highlights some key features.
Following the Datasets Subsection, we describe the method that provided us with the means to measure the Friendship Paradox in both of our modeled invitation networks.

Datasets
We tested our hypotheses in two distinct real-world open access datasets from online academic repositories: one American and one European. A comparative analysis provided a refined interpretation and led to a discussion about whether it would be possible to generalize the results.
The availability of the periods in which the committees were registered differs between the two repositories, even though the extracted data submitted for processing are derived from the same fields: date of defense, name(s) of advisor(s), and names of the assessing members. Therefore, in order to find a common means of assessing the paradoxes, we only took account of (i) the period of intersection on their registered years (from 2013 to 2018), and (ii) the doctoral research projects (i.e., PhD thesis assessment).
The Brazilian Coordination for the Improvement of Higher Education Personnel (Capes, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) Catalog of Theses and Dissertations, hosted on the Sucupira Platform, houses the concluding works of graduate programs from higher education institutions in Brazil (https://sucupira.capes.gov.br/sucupira/public/consultas/coleta/trabalhoConclusao/listaTrabalhoConclusao.xhtml). Data were collected over a period of 6 years, parsing HTML tags directly from detail pages, which also included information regarding the composition of the committees of the registered defenses.
The French STAR Deposit (Dépôt national des thèses électroniques françaises, STAR) is part of HAL, an open archive where authors can deposit scholarly documents from all academic fields (https://hal.inria.fr/STAR). Raw data on the oral defenses (vivas) were collected by means of the provided research API.
It is worth mentioning that, in the overall data collection from the STAR Deposit committees, we found that approximately 46% of them comprise multiple advisors. In the Capes Catalog, however, when the advisory board consists of two or more members, only the first is available. For this reason, and to avoid discrepancies in the final analysis, in the STAR generated networks we only considered the first registered advisor as an inviter.
After our data collection had been carried out, an additional stage of data normalization and cleaning was deemed necessary so that the data would have the integrity and formatting required before being submitted to the final extraction and processing. For our purposes, however, we believe that the number of corrupted or incomplete records was negligible in both datasets.
Figure 2 [table above] provides a quantitative summary of the extracted data used in our network graph modeling, including metrics and statistics. A Connected Component is a set of vertices in a graph that are linked to each other by paths (French committees are more fragmented, with fewer connections between members). A Giant Component is a connected subset of vertices whose size scales extensively (Newman, 2001). The diameter of a graph (or subgraph) is the longest distance among the shortest distances between any two vertices. The mean value corresponds to the average number of members per committee in our time window.
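As an illustration of the component statistics defined above, the following minimal sketch groups vertices into connected components with a breadth-first search. It is not the procedure used in this study: it assumes that arcs are treated as undirected when grouping vertices (the text does not state whether weak or strong connectivity was used), and the function name connected_components is hypothetical.

```python
from collections import defaultdict, deque


def connected_components(arcs):
    """Return the vertex sets of the connected components, largest first.

    arcs: iterable of (inviter, invitee) pairs.
    Assumption: direction is ignored (weak connectivity).
    """
    adjacency = defaultdict(set)
    for inviter, invitee in arcs:
        adjacency[inviter].add(invitee)
        adjacency[invitee].add(inviter)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from an unvisited vertex collects one component.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

Under these assumptions, the first element of the returned list would play the role of the giant component when one exists.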

Assessing the invitation paradoxes
Our generalized paradoxes were assessed after data were submitted to pre-processing consisting of text (NFD) normalization, removal of graphic accents, conversion to lowercase and whitespace trimming. Finally, a disambiguation treatment considered all homonymous cases as a single member, thus ensuring the uniqueness in the global networks, i.e., each vertex that was created only represented a single person.
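The pre-processing steps listed above can be sketched with the Python standard library. This is an illustration under stated assumptions rather than the code used in the study: the function name normalize_name is hypothetical, and collapsing inner runs of whitespace (in addition to trimming the ends) is an assumption.

```python
import unicodedata


def normalize_name(raw: str) -> str:
    """Normalize a member name: NFD, accent removal, lowercase, trimming."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", raw)
    # Dropping the combining marks (category "Mn") removes the accents.
    without_accents = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Lowercase, trim, and collapse inner whitespace runs (assumption).
    return " ".join(without_accents.lower().split())
```

Two spellings that normalize to the same string would then be merged into a single vertex, which matches the treatment of homonymous cases described above.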
Relations were established between members in accordance with the invitations received and/or sent between them regarding the composition of the committees. A named tuple was created for each occurrence of an invitation and the direction of the invitation was determined by the member's role as an invitee or as an inviter.
A graph was then constructed in which each vertex represented a committee member, with arcs representing invitations from the inviter to the invitees and labeled with the number of occurrences. Finally, two invitation networks were designed for each dataset. The first one was unweighted, i.e., it disregarded the possible multiplicity of invitations between any two members (where a relationship indicates a single offer of an invitation). The second was a weighted network (that included the number of all recorded invitations). It should be noted that the sum of invitations received over a period does not imply a symmetrical correspondence with the number of members that form academic committees (or its size).
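The construction of the two networks can be sketched as follows. The sketch assumes that each committee is given as an advisor plus a list of members and that the advisor is the inviter of every other member; the names Invitation and build_networks are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Invitation(NamedTuple):
    """One occurrence of an invitation, directed from inviter to invitee."""
    inviter: str
    invitee: str


def build_networks(committees):
    """Build the unweighted (UIN) and weighted (WIN) invitation networks.

    committees: iterable of (advisor, members) pairs.
    Assumption: the advisor invites every other member of the committee.
    """
    invitations = [
        Invitation(advisor, member)
        for advisor, members in committees
        for member in members
        if member != advisor
    ]
    win = Counter(invitations)  # arc -> number of recorded invitations
    uin = set(win)              # arcs only, multiplicity disregarded
    return uin, win
```

For example, an advisor who invites the same member to two committees contributes one arc to the UIN and an arc of weight 2 to the WIN.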
Table 1 displays the assessments of the invitation paradoxes for each vertex (member) in the graph of Figure 1, both in the Unweighted Invitation Network (UIN) and the Weighted Invitation Network (WIN): "True" indicates a valid paradox, and "False" indicates an invalid paradox. In the inequalities, the term on the left is linked to the vertex in evidence, and the term on the right corresponds to the average of the in-degree or the out-degree of the vertex that represents the inviter or invitee (depending on the role played in the modeled networks).
In the example of Table 1, the Inviter Paradox is only valid for the M6 member, in both UIN and WIN. The Invitee Paradox is only valid for M1 in UIN, and M1 and M3 in WIN. We can state that the number of invitations between any two members did not substantially influence the assessment of the paradoxes in either of the networks, except for the Invitee Paradox for the M3 member. By calculating the number of invitations associated with M2, M4 and M5, both offered and received, these members would not be able to justify a perception that they are in an inferior situation in relation to the other members, since the Paradox was not detected in any of the four columns. M3, however, could argue that the members who invite them receive more invitations than they do.
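The per-member test behind H1 and H2 can be sketched as a pair of inequalities on degrees. Several choices below are assumptions, since the text does not fix them: the mean is taken over distinct inviters (or invitees), a member with no inviters (or invitees) is reported as "False", and the helper names are hypothetical. Passing arcs with weight 1 gives the UIN case; passing invitation counts gives the WIN case.

```python
from collections import defaultdict
from statistics import mean


def summarize(arcs):
    """arcs: dict mapping (inviter, invitee) to a weight (1 for the UIN)."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    inviters, invitees = defaultdict(set), defaultdict(set)
    for (inviter, invitee), weight in arcs.items():
        out_deg[inviter] += weight
        in_deg[invitee] += weight
        inviters[invitee].add(inviter)
        invitees[inviter].add(invitee)
    return in_deg, out_deg, inviters, invitees


def invitee_paradox(member, in_deg, inviters):
    """H1: member receives fewer invitations than their inviters do on average."""
    peers = inviters.get(member)
    if not peers:  # assumption: no inviters means the paradox does not hold
        return False
    return in_deg.get(member, 0) < mean(in_deg.get(p, 0) for p in peers)


def inviter_paradox(member, out_deg, invitees):
    """H2: member sends fewer invitations than their invitees do on average."""
    peers = invitees.get(member)
    if not peers:  # assumption: no invitees means the paradox does not hold
        return False
    return out_deg.get(member, 0) < mean(out_deg.get(p, 0) for p in peers)
```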

Results
Table 2 shows the final results of the assessment of the invitation paradoxes for our two datasets, applied to both networks (UIN and WIN). Just as the Friendship Paradox is a phenomenon that determines a measurement in a network on an individual basis, our paradoxes also correspond to each member of our academic committees (returning true or false, based on the member's measurement). In this way, when testing our hypotheses at the scale of a whole network, we adopted the majority principle to find out if a network displays the corresponding paradox. This meant that, as the percentage on the first two lines of Table 2 was above 50%, we confirmed The Invitee Paradox (H1) for both datasets. Likewise, as the percentage in the last two lines was below 50%, we refuted The Inviter Paradox (H2) for both datasets.
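The majority principle amounts to a simple threshold on the share of members for whom a paradox is true. The sketch below assumes the percentage is computed over all members passed in (the text does not say whether members without neighbors are excluded); the function names are hypothetical.

```python
def paradox_share(flags):
    """Percentage of members for whom the paradox was assessed as True."""
    flags = list(flags)
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


def holds_by_majority(flags):
    """Majority principle: the network displays the paradox above 50%."""
    return paradox_share(flags) > 50.0
```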

Dual-Role Criteria
Alternatively, we modeled a scenario where we tested a dual-role criterion, applied to the computation of a member's received invitations. This criterion is based on the notion that members should only receive invitations as long as they can also offer them.
In this scenario, we assumed that members should be considered eligible to be inviters depending on the year when they played their first role as advisors (i.e., the year of the assembled academic committee for their first advisee). In light of this, invitations received prior to the year of this event were disregarded when designing the invitation networks for a member, and thus were not reflected in the corresponding assessment of the invitation paradox.
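The dual-role filter can be sketched as a pass over dated invitation records. Two points are assumptions: records are (year, inviter, invitee) triples, and a member who never appears as an advisor in the data has all received invitations discarded (one reading of "only receive invitations as long as they can also offer them"). The function name is hypothetical.

```python
def apply_dual_role(invitations):
    """Keep only invitations received from the invitee's first advising year on.

    invitations: list of (year, inviter, invitee) triples.
    """
    # First year in which each member acted as an inviter (advisor).
    first_advising_year = {}
    for year, inviter, _ in invitations:
        previous = first_advising_year.get(inviter, year)
        first_advising_year[inviter] = min(year, previous)

    return [
        (year, inviter, invitee)
        for year, inviter, invitee in invitations
        # Assumption: members who never advised keep no received invitations.
        if invitee in first_advising_year
        and year >= first_advising_year[invitee]
    ]
```

The filtered records would then be used to rebuild the UIN and WIN before reassessing the Invitee Paradox.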
The results for our dual-role criteria applied to the Invitee Paradox (UIN) are as follows: Capes Dataset fell from 77.39 to 54.29%; STAR Dataset fell from 58.45 to 35.72%.
The results for our dual-role criteria applied to the Invitee Paradox (WIN) are as follows: Capes Dataset fell from 77.84 to 55.97%; STAR Dataset fell from 59.24 to 36.73%.
We were thus able to refute The Invitee Paradox (H1) for the STAR Dataset in this particular scenario, for both UIN and WIN networks, again on the basis of the majority principle.

The Invitation Paradoxes Measurement and The PageRank Metric
In order to find explicit relations between the number of invitations received or offered, and the relative importance of the members in a network, we analyzed the available data versus the ranking obtained with the application of the PageRank algorithm. Applied to the web, this metric gives some approximation of a page's importance or quality, with an intuitive justification that a page can have a high PageRank if there are many pages pointing to it (Brin; Page, 1998).
The algorithm output is a probability distribution that represents the likelihood that a person clicking on links at random will ultimately arrive at a particular page. According to Brin and Page (1998), the metric can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the Web.
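The iterative calculation mentioned above can be sketched as a power iteration over the invitation arcs. This is a generic illustration, not the implementation used in the study: the damping factor of 0.85 is a conventional choice and an assumption here, as are the uniform redistribution of rank from vertices without outgoing arcs and the stopping tolerance.

```python
def pagerank(arcs, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank.

    arcs: dict mapping (source, target) to a positive weight.
    Returns a dict vertex -> rank; the ranks sum to 1.
    """
    nodes = sorted({node for arc in arcs for node in arc})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (source, _), weight in arcs.items():
        out_weight[source] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by vertices with no outgoing arcs is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each vertex passes its rank along its arcs, in proportion to weight.
        for (source, target), weight in arcs.items():
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        if sum(abs(new_rank[node] - rank[node]) for node in nodes) < tol:
            return new_rank
        rank = new_rank
    return rank
```

In an invitation network, arcs point from inviter to invitee, so a member accumulates rank by being invited, and more so when invited by members who are themselves highly ranked.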
Applying the PageRank algorithm on citation networks, Massucci and Docampo (2019) obtained academic rankings, resulting from the assessment of the prestige of academic institutions.
Similarly, in our study, we aimed at verifying if a frequent academic invitee (with a higher in-degree) or inviter (with a higher out-degree) also has a proportional figure in this ranking of reputation or prestige.
Since the invitation paradoxes are not measures per se, first we had to find a way to calculate their values.
The proposed measurements are described below, one for each invitation paradox:
a) The Invitee Paradox Measure: the number of invitations received by a member, divided by the mean number of invitations received by all their neighbors. This measure equals 0 when there are no neighbors, and equals the number of invitations received by the member when their neighbors received no invitations at all.
b) The Inviter Paradox Measure: the number of invitations sent by a member, divided by the mean number of invitations sent by all their neighbors. This measure equals 0 when there are no neighbors, and equals the number of invitations sent by the member when their neighbors sent no invitations at all.
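Both measures share the same form, so they can be sketched as one function that takes a member's own count and the counts of their neighbors, with the two special cases stated above. The function name is hypothetical; which vertices count as "neighbors" is left to the caller, since the text does not restrict it further.

```python
def paradox_measure(own_count, neighbor_counts):
    """Own invitation count divided by the neighbors' mean count.

    Returns 0 when there are no neighbors, and the member's own count
    when the neighbors' mean is zero.
    """
    neighbor_counts = list(neighbor_counts)
    if not neighbor_counts:
        return 0.0
    neighbor_mean = sum(neighbor_counts) / len(neighbor_counts)
    if neighbor_mean == 0:
        return float(own_count)
    return own_count / neighbor_mean
```

A value below 1 corresponds to a member for whom the respective paradox holds, because their count is below the mean of their neighbors.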
The charts in Figure 3 plot the values obtained from our measurements against the values returned by the PageRank algorithm, for each member in our datasets.

Discussion
In order to graphically analyze an occurrence of the invitation paradoxes, Figure 4 compares subgraphs from the datasets' graph models, where the largest proportion of the average for the paradoxes was found:
a) The Invitee Paradox: weighted STAR subgraph containing 51 vertices. The dark gray vertex received only 1 invitation from the only member on its network (the light gray vertex). The light gray vertex received a total of 76 invitations from members of its own network (white vertices).
b) The Inviter Paradox: weighted Capes subgraph containing 29 vertices. The dark gray vertex made only 1 invitation to the only member on its network (the light gray vertex). The light gray vertex made a total of 60 invitations to the members of its own network (white vertices).
The charts in Figure 5 display the in-degree and out-degree values of both datasets by means of the Empirical Cumulative Distribution (ECD) function (Katchanov; Markova, 2017). We can clearly see that most members have low degree values, while only a few of them have high degree values (as should be expected), with similar curve patterns between the two datasets.
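For reference, an empirical cumulative distribution of degree values can be computed as in the sketch below, which returns one point per distinct value giving the fraction of members at or below it. The function name ecdf is hypothetical and the plotting step is omitted.

```python
def ecdf(values):
    """Return (value, cumulative fraction) pairs, one per distinct value."""
    ordered = sorted(values)
    total = len(ordered)
    points = []
    for position, value in enumerate(ordered, start=1):
        if points and points[-1][0] == value:
            # Repeated value: keep only the highest cumulative fraction.
            points[-1] = (value, position / total)
        else:
            points.append((value, position / total))
    return points
```

A curve that rises steeply at low degrees and flattens afterwards reflects the pattern described above, with many low-degree members and few high-degree ones.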
In Figure 3, it is possible to note a trend in the increase of the ranking as the measure of the Invitee Paradox also progresses (subplots (a) and (b)). The results suggest that the more invitations members receive in relation to their peers, the greater their relative importance in the network. This evidence is corroborated by Pearson's correlation coefficient, for both the Capes chart (strong correlation with 0.68) and the STAR chart (moderate correlation with 0.54). As for the correlation between the measure of the Inviter Paradox and the PageRank metric (subplots (c) and (d)), we found that it is weak for the Capes chart (0.30) and very weak for the STAR chart (0.19). The p-value was very close to zero in all cases, meaning that the correlations were statistically significant.
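Pearson's correlation coefficient between two paired series, such as a paradox measure and the PageRank value of each member, can be computed as in this minimal sketch. It omits the p-value and assumes both series have the same length and non-zero variance; the function name is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Assumption: neither series is constant, so the denominator is non-zero.
    return covariance / (spread_x * spread_y)
```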

Conclusions
The Friendship Paradox continues to arouse the interest of researchers who seek to explain, among other factors, how connections are established, topologies are formed and interactive frequency occurs in complex networks. Another important research topic is the role played by individual properties in these environments to determine which actors prevail or feature prominently in generalized settings.
Like many scientific papers in recent years, our study has put forward a new generalization of the Friendship Paradox, which remains a significant tool in the analysis of social phenomena where people regard themselves as being at a numerical advantage or disadvantage with regard to their peers. It is a phenomenon of unequal distribution of opportunities in a field of power (in our case, academic authority).
Using the PageRank metric, we also found evidence that, in connection with our generalized paradoxes, members who receive more invitations to participate in committees enjoy relatively greater prestige. We hope we have made a contribution in this work by providing a more in-depth understanding of the dynamics involved in the composition of academic committee networks.
Although our research included elements that provide evidence of the occurrence of the Invitee Paradox in academic datasets from two countries, we should mention as a caveat that generalizations will only be possible after a further quantitative exploration of diversified data, preferably from numerous sampled international sources. Furthermore, as we considered the pre-pandemic period of 2013-2018, we recognize that the board members' geographical location was an important factor for the formation of academic committees, as people tend to invite those who belong to their same institution, or are from institutions in their own region.
With regard to future directions that could broaden the scope of our research, we suggest carrying out work that could give an insight into how the Friendship Paradox in the academic community is affecting the way researchers relate with each other, collaborate in paper authoring and research projects and, most importantly, new studies that seek to address the following question: if the impression that researchers once had of inferiority (or even superiority) to their peers is in fact justified, can this discovery influence their careers?
Finally, it should be mentioned that, in the field of human sciences, psychological studies have found that the Friendship Paradox has serious online implications, particularly in social media websites: changes in patterns of behavior and attitudes that may be aroused when people, especially teenagers, find that they are "numerically impaired" in comparison with others (e.g., as to the number of likes, followers, viewers and subscribers).
As we are constantly increasing our connections, notably in online environments, it is likely that the Friendship Paradox will continue to substantially affect our lives and social relationships.

Figure 1 - Composition of five academic committees and the corresponding invitation network representation. Source: Elaborated by authors (2022).

Figure 2 [below] displays histograms of the most frequent committee sizes.

Figure 3 - Scatter plots showing the correspondence between the invitation paradoxes measurement and the PageRank metric, for both datasets. Source: Elaborated by authors (2022).

Figure 5 - Distribution of the in-degree and out-degree values of the invitation networks from both datasets. Source: Elaborated by authors (2022).

Table 1 - Tests for the Friendship Paradox on unweighted (UIN) and weighted (WIN) networks of the invitation graph from Figure 1.

Table 2 - Percentages of the invitation paradoxes for each dataset, in unweighted (UIN) and weighted (WIN) networks.