A classification and summarization method for analysis of research activities in an academic faculty

Scientific research is increasingly carried out in laboratories and universities, where it not only plays an important role in the development of science and technology but also has a significant influence on education. Improving the research capability of an academic faculty can directly affect the quality of education, bring innovations to Industrial Engineering curriculum proposals, and ensure that the subjects taught are up to date. Investigating the existing issues in current research activities is usually considered the primary, and a challenging, step. As the output of research activities, academic articles are often treated as an evidence-based resource for this investigation. Although existing article review methods have made some methodological efforts, less attention has been paid to discovering the implicit academic relationships among academic staff and to investigating their research expertise. The objective of this study is to address this drawback by proposing an Academic Information Classification and Summarization method. A case study is carried out in the Industrial and System Engineering Graduate Program (PPGEPS), PUCPR, Brazil. The results not only highlight the advantages of this proposition from the education perspective related to Industrial Engineering, but can also be used as evidence to compare and balance an academic staff member's research expertise against his or her teaching disciplines.


Introduction
In recent decades, governments worldwide have viewed investment in scientific capacity as a critical precursor of economic growth in the knowledge economy (Feldman et al., 2014). However, because of limited public budgets and increased competition for research funding, numerous academic faculties have focused on finding ways to improve their research capability (Smeby & Try, 2005). To support the development of an academic faculty, investigating the existing issues in its current research activities is usually considered the primary, and a challenging, step. These research activities are influenced by many factors, such as the personal characteristics and specific knowledge background of the academic staff, and the diverse resources that are available for research (Adler et al., 2009). As the output of research activities, academic articles are often considered an approved and evidence-based resource for this kind of investigation (Arundel & Constantelou, 2006).
Along with the fast growth in the number of publications, an increasing variety of article review and evaluation methods has been proposed. In general, these methods can be divided into two kinds: (1) systematic review methods that focus on investigating the contents of the articles, such as the rapid review (Khangura et al., 2012), the scoping study (Arksey & O'Malley, 2005), and the meta-analysis (Mullen, 1986); (2) research productivity evaluation methods that are based on pre-defined criteria, such as volume, quality, impact, and utility (King, 1987; Geuna & Martin, 2003; Higher Education Funding Council for England, 2009). The first kind is able to analyze the "inside view", namely the summary of the known and unknown knowledge that relates to a specific research topic. The second kind, although still debated (Smith et al., 2011), focuses more on the "outside view", namely the assessment of the research production of one or more academic staff.
As pointed out by Bercovitz & Feldman (2011), research activities are generally carried out in teams, which bring together numerous academic staff from different knowledge domains and organizations. We found that, despite the considerable methodological efforts made by the above-mentioned article review methods, less attention has been paid to discovering the implicit academic relationships among the academic staff or to investigating their research expertise. Several researchers (He et al., 2009; Hoekman et al., 2010; Li et al., 2014) have begun to recognize this issue and to look for corresponding solutions. However, these existing studies were mainly designed to analyze collaboration issues in specific cases, and there is still a lack of a general method that can be easily adopted.
The objective of this paper is to address this drawback by proposing an Academic Information Classification and Summarization method. The rest of this paper is organized as follows: Section 2 presents an academic information classification table and a formalized procedure for the summarization. Based on the proposed method, Section 3 presents a case study that shows the advantages of the proposition. Section 4 concludes the paper and highlights future work.

Academic information classification and summarization method
This section presents a stepwise strategy. It first classifies the academic information from the collected academic articles, and then uses this result to generate summaries from different perspectives within an academic faculty. An overview of the procedure is given in Figure 1. The workflow is divided into three main phases: the collection phase, the classification phase, and the summarization phase.

The collection phase
During this phase, the academic articles needed by the next phase (the classification phase) are collected by defining and applying the article selection parameters. The six main parameters used in this paper are as follows:
• "The Name of the Academic Faculty": the academic faculty that intends to analyze its own research activities;
• "The Selected Article Database": the commonly approved database that covers the main research output (academic articles) of that academic faculty;
• "The Range of Years": the publication period of those academic articles;
• "The Names of Academic Staffs": the academic staff in that academic faculty;
• "The Publication Languages": the languages of those academic articles;
• "The Publication Types": the types of those academic articles, such as journal articles and conference papers.
The article selection parameters are not limited to these preliminary ones; they can be extended in terms of both completeness and specificity. For completeness, new parameters can be added, for example a "The Subject Areas" parameter that limits the domains of the academic articles. For specificity, the existing parameters can be further restricted, for example by refining "The Publication Types" so that the selection is limited to one or more appointed journals.
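As an illustration of how the selection parameters could be applied in practice, the following minimal sketch encodes them as a small data structure with an acceptance test. The `Article` record, the field names, and the `allowed_journals` restriction are assumptions made for this example and are not part of the proposed method; the candidate articles are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Article:
    """Hypothetical record describing one candidate article."""
    title: str
    year: int
    authors: Tuple[str, ...]
    language: str
    publication_type: str
    journal: Optional[str] = None


@dataclass(frozen=True)
class SelectionParameters:
    """The six selection parameters plus one optional restriction."""
    faculty_name: str            # descriptive only, not used for filtering
    database: str                # descriptive only, not used for filtering
    year_range: Tuple[int, int]  # inclusive (first year, last year)
    staff_names: FrozenSet[str]
    languages: FrozenSet[str]
    publication_types: FrozenSet[str]
    # Assumed "specificity" extension: restrict to appointed journals.
    allowed_journals: Optional[FrozenSet[str]] = None

    def accepts(self, article: Article) -> bool:
        first, last = self.year_range
        if not first <= article.year <= last:
            return False
        # At least one author must belong to the faculty.
        if not self.staff_names.intersection(article.authors):
            return False
        if article.language not in self.languages:
            return False
        if article.publication_type not in self.publication_types:
            return False
        if (self.allowed_journals is not None
                and article.journal not in self.allowed_journals):
            return False
        return True


params = SelectionParameters(
    faculty_name="Example faculty",
    database="Example database",
    year_range=(2012, 2014),
    staff_names=frozenset({"A", "B", "F"}),
    languages=frozenset({"English"}),
    publication_types=frozenset({"journal article", "conference paper"}),
)

candidates = [
    Article("Hypothetical article 1", 2013, ("A", "F"), "English", "journal article"),
    Article("Hypothetical article 2", 2011, ("A",), "English", "journal article"),
    Article("Hypothetical article 3", 2014, ("Z",), "English", "conference paper"),
]
selected = [a for a in candidates if params.accepts(a)]
```

In this sketch the second candidate is rejected because it falls outside the range of years, and the third because none of its authors belongs to the faculty.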

The classification phase
A full-text review of academic articles is time-consuming work, even when the analysis is bounded to a single research group. Because keywords can usually cover most of the important concepts in an academic article (Chen et al., 2008), we propose to first classify the keywords, and then use them to construct a topic sentence that represents the essence of the article.
As can be seen from Table 1, each column in the table shows one aspect of the academic information that is extracted and classified from an article. More specifically, the semantics of the columns are as follows:
• "No.": the identifier of an academic article;
• "When": the publication year of that academic article;
• "Who": the author names of that academic article;
• "What": the objects of study in that academic article;
• "Why": the issues that the academic article tries to address;
• "How": the methods that are applied on the "What" to deal with issues about the "Why";
• "Where": the application contexts of the "How", "What" and "Why".
In order to capture the essence of an academic article and also to reconfirm the correctness of the preliminary classification, a topic sentence construction formula is proposed as follows: "In [When], [Who] applied the [How] on the [What] to deal with issues about the [Why] in the context of the [Where]".
The cells in each row are first filled with the existing or extracted keywords of an academic article. The formula then takes the classification results from the table and constructs a corresponding topic sentence for each collected article. In some cases, a publication might not provide an appropriate list of keywords to express its topic. In this paper, keywords are considered insufficient when they cannot fill all the cells in a row of the classification table. In this case, the keywords for the empty cells need to be extracted from the title, the abstract, or even the full text of the article before the classification. Because a variety of keyword extraction methods have already been proposed, such as Hulth (2003), Matsuo & Ishizuka (2004) and Yih et al. (2006), the development of keyword extraction technology is not the focus of this paper.
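The topic sentence formula and the check for insufficient keywords can be sketched as follows. This is a minimal illustration, assuming each row of the classification table is stored as a dictionary from column name to a list of keywords; the function names and the example row are hypothetical.

```python
from typing import Dict, List

# Columns that the topic sentence formula needs, in table order.
COLUMNS = ("When", "Who", "What", "Why", "How", "Where")

TEMPLATE = ("In {When}, {Who} applied the {How} on the {What} to deal with "
            "issues about the {Why} in the context of the {Where}.")


def empty_cells(row: Dict[str, List[str]]) -> List[str]:
    """Return the columns of a row that still have no keyword."""
    return [col for col in COLUMNS if not row.get(col)]


def topic_sentence(row: Dict[str, List[str]]) -> str:
    """Fill the formula, refusing rows whose keywords are insufficient."""
    missing = empty_cells(row)
    if missing:
        raise ValueError(f"keywords must be extracted first for: {missing}")
    return TEMPLATE.format(**{col: ", ".join(row[col]) for col in COLUMNS})


# Hypothetical row, for illustration only.
example_row = {
    "No.": ["1"],
    "When": ["2014"],
    "Who": ["A", "F"],
    "What": ["process model"],
    "Why": ["interoperability"],
    "How": ["ontology"],
    "Where": ["production stage"],
}
sentence = topic_sentence(example_row)
```

Reading the generated sentence back is what allows the preliminary classification to be reconfirmed: a keyword placed in the wrong column tends to produce a sentence that does not make sense.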

The summarization phase
During this phase, the scientific research activities in an academic faculty are investigated on the basis of the academic information classification table (the output of the classification phase). The contents in each row of the table show the five main perspectives ("When", "What", "Why", "How", and "Where") of an article ("No.") that is contributed by one or more academic staff ("Who"). By combining these seven elements, various kinds of summaries can be produced. As examples in this paper, we list five combinations as follows:
• "No." & "Who" & "When". This combination shows the collaboration frequency in different years ("When") among the academic staff ("Who") in an academic faculty;
• "No." & "Who" & "What". This combination shows the main research objects ("What") that the academic staff ("Who") worked on;
• "No." & "Who" & "Why". This combination shows the main research problems ("Why") that the academic staff ("Who") dealt with;
• "No." & "Who" & "How". This combination shows the main methods or technologies ("How") that the academic staff ("Who") can handle;
• "No." & "Who" & "Where". This combination shows the main research domains ("Where") that the academic staff ("Who") took part in.
Let $N$ be the set of article identifiers described by "No." in the classification table, and $n \in N$ be the article identifier in row $n$. Let $P$ be the set of academic staff in an academic faculty, and $P_n \subseteq P$ be the set of academic staff described by "Who" in row $n$. In the remainder of this section, we present two formal summarization methods: (1) the academic relationship summarization, and (2) the research expertise summarization.

The academic relationship summarization
The summaries of the academic relationships among the academic staff in an academic faculty can be based on the combination of "No." & "Who" & "When" (the time and frequency perspective) and the combination of "No." & "Who" & "Where" (the domain and frequency perspective). Both are presented as follows.
For the first combination, let $Y$ be the set of years described by "When" in the classification table, and $y_n \in Y$ be the year in row $n$. In order to summarize the collaboration among the academic staff from the time and frequency perspective, the figure should satisfy the following conditions:
• Each academic staff member in $P$ is represented as an entity in the figure;
• A non-repeating relation $(p_i, p_j) \in$ […];
• For each element in $Y_{ij}$, a frequency number $f_{ij}$, which records the number of repetitions of $p_i$ […] that all appear in the same row $n$, is attached to it.
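One possible reading of the frequency numbers in this summary is a count, per pair of academic staff and per year, of the rows in which both appear. The sketch below implements that reading only; the row layout, the function name, and the example rows are assumptions made for illustration, and authors outside the faculty set $P$ are ignored.

```python
from collections import Counter
from itertools import combinations


def collaboration_frequencies(rows, faculty):
    """Count, per unordered staff pair and year, the rows where both appear."""
    freq = Counter()
    for row in rows:
        # Keep only authors that belong to the faculty (the set P).
        staff = sorted(set(row["Who"]) & faculty)
        for p_i, p_j in combinations(staff, 2):
            freq[(p_i, p_j, row["When"])] += 1
    return freq


# Hypothetical rows of the classification table.
rows = [
    {"No.": 1, "When": 2012, "Who": ["A", "F"]},
    {"No.": 2, "When": 2012, "Who": ["F", "A", "X"]},  # "X" is not in the faculty
    {"No.": 3, "When": 2013, "Who": ["A", "B"]},
    {"No.": 4, "When": 2013, "Who": ["E"]},            # single author, no pair
]
freq = collaboration_frequencies(rows, faculty={"A", "B", "E", "F"})
```

Each key of `freq` corresponds to one link between two entities for one year, and a staff member who never appears in any key has no recorded collaboration inside the faculty.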
For the second combination, a generally accepted domain framework should first be employed as the background. In that framework, different research domains are placed at different positions, together with a certain form of relationship between them. Let $RD$ be the set of research domains and $DP$ be the set of positions in the employed framework. Let $D$ be the set of research domains described by "Where" in the classification table, and $D_n \subseteq D$ be the set of research domains in row $n$. In order to provide an overview of the collaboration among the academic staff from the domain and frequency perspective, the figure should satisfy the following conditions:
• […]
• […] $f$, which records the number of repetitions of the academic staff member $p_x^l$ and the research domain $rd_x$ that both appear in the same row, is attached to it.
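The frequency attached to an academic staff member within a research domain can be sketched as a co-occurrence count over the rows of the classification table. This is a minimal illustration under assumed names: the row layout, the function, and the example domain labels are hypothetical, and the placement of domains at positions in the framework is left out.

```python
from collections import Counter


def domain_frequencies(rows, faculty, domains):
    """Count the rows in which a staff member and a research domain co-occur."""
    freq = Counter()
    for row in rows:
        for p in set(row["Who"]) & faculty:
            # Only domains that exist in the employed framework are counted.
            for rd in set(row["Where"]) & domains:
                freq[(p, rd)] += 1
    return freq


# Hypothetical framework domains and table rows.
framework = {"Concept", "Development", "Production"}
rows = [
    {"No.": 1, "Who": ["A", "F"], "Where": ["Production"]},
    {"No.": 2, "Who": ["A"], "Where": ["Production", "Development"]},
    {"No.": 3, "Who": ["D"], "Where": ["Development"]},
    {"No.": 4, "Who": ["D"], "Where": ["Unlisted domain"]},
]
freq = domain_frequencies(rows, faculty={"A", "D", "F"}, domains=framework)
```

A domain of the framework that never appears in any key of `freq` is one to which the faculty has not contributed in the collected articles.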

The research expertise summarization
The research expertise summarization of the academic staff in an academic faculty can be based on the combination of "No." & "Who" & "What" (the research object perspective), the combination of "No." & "Who" & "Why" (the research challenge perspective), and the combination of "No." & "Who" & "How" (the research method perspective). In this paper, we represent each of these combinations through a set of pie charts. Each summary represents an academic staff member $p_i \in P$ as an entity with its corresponding pie chart $pc_i$ attached. Let $S_i$ be the set of sectors in the pie chart $pc_i$.
For the first combination, let $O$ be the set of research objects described by "What" in the classification table, and $O_n \subseteq O$ be the set of research objects in row $n$. In order to show an overview of the research expertise from the research object perspective, the figure should satisfy the following conditions:
• Let $O_i \subseteq O$ be the set of research objects worked on by an academic staff member $p_i$. If $p_i$ and a research object $o_{nv} \in O_n$ both appear in row $n$, then $o_{nv} \in O_i$;
• Each research object in $O_i$ is represented as a sector in $S_i$;
• The numerical proportions of the sectors in $S_i$ are based on the number of repetitions of the academic staff member $p_i$ and the corresponding research objects in $O_i$ that both appear in the same row.
For the second combination, let $C$ be the set of research challenges described by "Why" in the classification table, and $C_n \subseteq C$ be the subset of research challenges in row $n$. In order to show an overview of the research expertise from the research challenge perspective, the figure should satisfy the following conditions:
• Let $C_i \subseteq C$ be the set of research challenges dealt with by an academic staff member $p_i$. If $p_i$ and a research challenge $c_{nu} \in C_n$ both appear in row $n$, then $c_{nu} \in C_i$;
• Each research challenge in $C_i$ is represented as a sector in $S_i$;
• The numerical proportions of the sectors in $S_i$ are based on the number of repetitions of the academic staff member $p_i$ and the corresponding research challenges in $C_i$ that both appear in the same row.
For the third combination, let $M$ be the set of research methods described by "How" in the classification table, and $M_n \subseteq M$ be the set of research methods in row $n$. In order to show an overview of the research expertise from the research method perspective, the figure should satisfy the following conditions:
• Let $M_i \subseteq M$ be the set of research methods that can be handled by an academic staff member $p_i$. If $p_i$ and a research method $m_{nk} \in M_n$ both appear in row $n$, then $m_{nk} \in M_i$;
• Each research method in $M_i$ is represented as a sector in $S_i$;
• The numerical proportions of the sectors in $S_i$ are based on the number of repetitions of the academic staff member $p_i$ and the corresponding research methods in $M_i$ that both appear in the same row.
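The three pie-chart summaries share the same computation and differ only in the column that is read ("What", "Why" or "How"). The following minimal sketch computes the sector proportions for one academic staff member; the row layout, the function name, and the example keywords are assumptions made for illustration.

```python
from collections import Counter


def expertise_shares(rows, staff_member, column):
    """Sector proportions for one staff member over one table column.

    `column` is "What", "Why" or "How". Each sector is a keyword that
    co-occurs with the staff member in at least one row, and its share is
    the co-occurrence count divided by the total of all counts.
    """
    counts = Counter()
    for row in rows:
        if staff_member in row["Who"]:
            counts.update(set(row[column]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {keyword: count / total for keyword, count in counts.items()}


# Hypothetical rows of the classification table.
rows = [
    {"No.": 1, "Who": ["G"], "What": ["Process Model"]},
    {"No.": 2, "Who": ["G", "H"], "What": ["Process Model", "Ontology"]},
    {"No.": 3, "Who": ["H"], "What": ["Product"]},
]
shares_g = expertise_shares(rows, "G", "What")
```

The keys of the returned dictionary play the role of the set $O_i$ (or $C_i$, $M_i$), and the values can be passed directly to any charting tool as sector sizes.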
After the summarization phase, an evidence-based result is generated that outlines the scientific research activities of a research group from five different perspectives. Because different academic faculties might have different evaluation or assessment criteria, how to draw conclusions from the summarized results remains a subject of continuing exploration.

The case study
In this section, a case study carried out in the academic faculty named PPGEPS (Graduate Program in Production Engineering and Systems, Pontifical Catholic University of Paraná, Brazil) is used to show the advantages that can be obtained from the proposed method. Following the workflow of the method, this section is divided into three corresponding subsections.

Case study: collection phase
In this academic faculty, we analyze the research activities of 14 academic staff over the last three years. A well-accepted platform in Brazil, named "Cnpq Plataforma Lattes", is chosen as the literature source. On this platform, all of the academic staff in this faculty have already updated the publications in their resumes. The details of the article selection parameters are shown in Table 2.
At the end of the collection, a total of 76 journal articles and 86 conference papers that satisfy these conditions were found. They act as the basic resource for the classification phase.

| Article Selection Parameters | Contents |
|---|---|
| The Name of the Academic Faculty | PPGEPS |
| The Selected Article Database | Cnpq Plataforma Lattes |
| The Range of Years | From 2012 to 2014 |
| The Names of Academic Staffs | 14 academic staff in total (the letters A to N are used in place of their real names) |
| The Publication Languages | English |
| The Publication Types | Journal articles and conference papers |

Case study: classification phase
According to the proposed method, the keywords of the collected articles are studied, extracted and classified. Figure 2 shows the classification result of one example article (Liao et al., 2014). Among the collected academic articles, we found that 75 do not contain sufficient keywords and 11 do not contain any keywords, and these therefore require a keyword extraction process. After the preliminary classification, the keywords in each row of the table are used to create a corresponding topic sentence for each collected academic article and to reconfirm the correctness of the classification. In the end, a classification table that contains 162 rows is generated. Together with the six parameters from the previous phase, it is used as the input for the summarization phase.

Case study: summarization phase
Based on the five combinations and their corresponding summarization procedures, the summaries of the academic relationships and of the research expertise are generated.

The Summaries of the Academic Relationships
Figure 3 shows the summary of the academic relationships among the academic staff in PPGEPS from the time and frequency perspective. This summary is based on the contents of the columns "No.", "Who" and "When", from which we can identify two extremes: (1) there is a strong collaboration between academic staff "F" and "A", who always carry out scientific research activities together; (2) there is a lack of collaboration between academic staff "E" (or "I") and the others in this academic faculty.
Figure 4 shows the summary of the academic relationships among the academic staff in PPGEPS from the domain and frequency perspective. Considering the research domain of this group, a domain framework that describes the life cycle of a product (Institute of Electrical and Electronics Engineers, 2008) is employed as the background. This summary is based on the contents of the columns "No.", "Who" and "Where", from which we can identify one issue: most of the academic staff contribute only to the Concept stage, the Development stage and the Production stage, and there is a lack of contributions to the Utilization/Support stage and the Retirement stage. Furthermore, academic staff "D" and "J" have most of their publications in the Development stage, while academic staff "A" and "F" show many scientific research activities in the Production stage.
Figure 5 shows the summary of the research expertise in PPGEPS from the research object perspective. This summary is based on the contents of the columns "No.", "Who" and "What", from which we can identify the different research objects that the academic staff worked on. For example, academic staff "G" and "H" mainly worked on the Process Model, which accounts for 38% and 42% of their total research objects respectively.
Figure 6 shows the summary of the research expertise in PPGEPS from the research challenge perspective. This summary is based on the contents of the columns "No.", "Who" and "Why", from which we can identify the different research challenges that the academic staff intended to address. For example, academic staff "D" mainly worked on research challenges in the subjects of Product Modeling (28%) and Sustainability (28%).
Figure 7 shows the summary of the research expertise in PPGEPS from the research method perspective. This summary is based on the contents of the columns "No.", "Who" and "How", from which we can identify the different research methods or technologies that the academic staff can handle. For example, academic staff "B" used many different kinds of algorithms in his publications, among which methods related to the Evolutionary Algorithm (21%) are the most used.
After discussion and confirmation with the research manager and several academic staff in PPGEPS, these results were found not only to match reality to a great extent but also to provide evidence-based summaries rather than impressions drawn from experience alone. Moreover, based on these five summaries, the research manager gave two suggestions to improve the research capability of this academic faculty: (1) advising academic staff "E" and "I" to look for potential collaborations with others in this faculty. For example, according to the summaries of their research expertise from the research method and challenge perspectives, the manager suggested that academic staff "E" explore a potential collaboration with academic staff "B"; (2) advising academic staff "H" and "M" to explore new research activities in the Support stage and the Retirement stage of a product, according to the summaries of their research expertise from the research object and challenge perspectives.

Conclusion and future works
In the course of improving the research capability of an academic faculty, analyzing its research activities on the basis of their output (academic articles) is usually considered the most practical and evidence-based way. Although various kinds of article review methods have already been proposed, little work has been done on discovering the implicit academic relationships among the academic staff and on investigating their research expertise. The objective of this paper is to propose an Academic Information Classification and Summarization method that deals with this issue. It provides a classification table to categorize the academic information inside the articles based on existing or extracted keywords. The method also formalizes a procedure to summarize the research activities within an academic faculty from five different perspectives. A case study based on the proposed method was carried out in a real academic faculty (PPGEPS), and its results were used by the undergraduate course coordinators to identify new advances in the industrial engineering field and its knowledge areas, in order to revisit curriculum proposals while better matching staff expertise to course subjects. To our knowledge, this method is capable of providing reliable and evidence-based summaries for the development of an academic faculty, rather than impressions drawn from experience alone. Moreover, some future work is worth noting from the completeness aspect: (1) To employ or develop a domain ontology that can be used to distinguish and classify the existing or extracted keywords, for example a knowledge base that deals with synonymous and subsumable keywords. (2) To propose more summarization combinations (other than the existing five kinds) for a more precise investigation of scientific research activities. For example, the combination of "No.", "Who", "When" and "Why" can be used to show the trends of the research challenges that the academic staff intended to address over several years. (3) To analyze feedback from students and professors about the newly proposed courses.

Figure 1. The workflow of the academic information classification and summarization method.

• For each element in $Y_{ij}$, a frequency number $f_{ij}$, which records the number of repetitions of $p_i$ […]

[…] be the set of research domains in row $n$. In order to provide an overview of the collaboration among the academic staff from the domain and frequency perspective, the figure is suggested to satisfy the following conditions: […] the set of research objects that is worked on by an academic staff member $p_i$. If one $p_i$ […]

Production, 27(spe), e20162163, 2017 | DOI: 10.1590/0103-6513.216316

Figure 2. The classified academic information from an example article.

Figure 3. Academic relationships among the academic staffs in PPGEPS from the time and frequency perspective.

Figure 4. Academic relationships among the academic staffs in PPGEPS from the domain and frequency perspective.

Figure 5. Research expertise of the academic staffs in PPGEPS from the research object perspective.

Figure 6. The research expertise of the academic staffs in PPGEPS from the research challenge perspective.

Figure 7. The research expertise of the academic staffs in PPGEPS from the research method perspective.

Table 1. The academic information classification table.

Table 2. Article selection parameters in the case study.