Expert Systems and knowledge management for failure prediction to onshore pipelines: issue to Industry

The paper proposes an Expert System to predict the failure of onshore pipelines. Knowledge Management supports expertise sharing throughout the organization. The proposed Expert System Prototype Model is classified as Empirical Descriptive Research and may support maintenance management. The findings indicate that the proposed Expert System, grounded in employee knowledge, may be considered a promising solution to support Industry 4.0 implementation. The Expert System facilitates the decision-making of the experts so that the employees' expertise can be better used in the implementation of Industry 4.0 and in facing the new challenges of daily work in the organization. In this context, the Expert System can be considered an innovative approach to managing maintenance processes and supporting reliable, consistent decisions during Industry 4.0 implementation.


Introduction
The paper proposes an Expert System (ES) to predict the failure of onshore pipelines. The ES development is grounded on the work of Castellanos et al. (2011a, b), who also built a system to predict failure of onshore pipelines, but based on Artificial Neural Networks (ANN) instead. Unlike ANN, ES are systems based on semantics and knowledge representation. The premise of the present paper is that such systems may be well suited to Industry 4.0. Lee et al. (2018) analyze the role of Artificial Intelligence in Industry 4.0-based manufacturing systems, providing an insight into the current state of AI technologies (expert systems included) and the ecosystem required to harness the power of AI in industrial applications. The importance of semantic analysis for Industry 4.0 is emphasized by Rivas et al. (2018), who propose the use of case-based reasoning to extract information from the reports written by operators about the faults they resolved in machines.
Industry 4.0 is a new paradigm to increase productivity (Rüßmann et al., 2015). On the other hand, Dalenogare et al. (2018) have pointed out potential barriers to the implementation of new technologies in emerging economies. In Brazil, for example, success in implementing Industry 4.0 is subject to cultural factors, worker skill, and IT infrastructure. Moreover, tasks become more complex, shifting the employee's focus toward complex problem solving and the ability to deal with new technologies. Sié & Yakhlef (2009) indicate that the sharing and dissemination of knowledge and learning depend on people. As such, sharing should be carried out through dialogue in order to understand why things happen in the organization, which includes the judgment and experiences of each employee, as observed, for example, in maintenance teams.
Failure analysis of onshore pipelines is a complex activity, which involves physical, chemical, mechanical and human factors. This analysis requires multidisciplinary skills to support field inspections and laboratory tests. Collaboration among experts in different areas is important to integrate information (Castellanos et al., 2011a, b). Knowledge Management (KM) allows the experiences and knowledge generated during organizational processes to be shared throughout the entire organization, so that these processes are continuously improved (Moradi et al., 2013).
Knowledge management is the set of systematic, formal and deliberate actions to capture, preserve, share and reuse tacit and explicit knowledge created and used by people during routine and improvement productive processes, generating measurable results for the organization and for the individuals (Muniz et al., 2009).
KM deals closely with people, so knowledge creation and socialization are developed from interaction among the different staff of the organization to solve organizational problems, which allows tacit knowledge, know-how and expertise to be kept. Recognizing that knowledge belongs to the organization's experts, KM should manage: i) the expert as part of the organization; ii) where the expert applies her/his knowledge; iii) expert availability (busy, traveling, sickness, etc.); iv) retention of the expertise/experience within the organization; v) learning and knowledge sharing (Castellanos et al., 2011a, b; Giarratano & Riley, 1998). In this context, knowledge can be dealt with under two paradigms: i) the organic paradigm (management), which relies on implicit knowledge and communication among people; ii) the computational paradigm (information systems), which relies on information technology and databases (Nakano et al., 2013). This paper is grounded on the latter and discusses the following research questions:
• Which knowledge management strategies should support the implementation of I4.0 in the light of new technologies and employee competencies (Erol et al., 2016; Burzlaff, 2017)?
• How will managers apply employees' expertise through the implementation of complex, flexible and efficient processes?
• Which innovative approaches may support operational processes to yield competitiveness (Burzlaff, 2017)?
In order to respond to these questions, the authors discuss decision-making processes and KM in the oil and gas industry context (Grant, 2013; Van Hinte et al., 2007). The system of Castellanos et al. (2011a, b) has a knowledge base regarding failures, an ANN to manipulate knowledge, and a user interface. Such a system assures reliable and permanent availability of expert knowledge regarding failure analysis; furthermore, it is a powerful training tool for new failure analysts (Castellanos et al., 2011a, b).
ANN is based on the premise that intelligent behavior is obtained through massively parallel processing among simple individual processing structures (neurons) connected as a network, like the human brain to a certain extent. ANN belongs to a branch of artificial intelligence known as connectionism and can be briefly described as a layer-organized network in which each node is an artificial neuron. The artificial neuron is reduced to a mathematical model that emulates the behavior of a natural neuron. Neurons in the input layer receive a set of data, which are processed in several intermediate layers, each widely connected to its adjacent layers. This processing results in output data, which are returned in the output layer as a result of the overall ANN processing. ANN's main characteristic is the ability to learn automatically from training based on a large amount of data. On the other hand, ANN does not manipulate explicitly represented knowledge. The knowledge manipulated by an ANN is implicit in the weights of the connection arcs between the network nodes of all layers, so numbers with little semantic meaning are manipulated. Therefore, it is quite difficult for an ANN to explain its solving process.
Another branch of artificial intelligence comprises the methods based on a logical and symbolic representation of problems, such as Expert Systems (ES). ES development requires expert knowledge to be elicited, represented, implemented and validated, so ESs are very useful in a KM context. Since it manipulates knowledge that is formally represented and semantically validated, an ES is capable of explaining why a solution is proposed, making its intelligence explicit and increasing its credibility (Giarratano & Riley, 1998). An explanation mechanism that can interpret and justify the results produced is a key function in ES. It allows the evaluation of the results issued and of their validity and applicability (Diederich, 1992; Lopez-Suarez & Kamel, 1994). Furthermore, ES can reliably retain the experience of the organization's experts in order to provide a favorable context for the appropriate handling of experts' skills, knowledge and lessons learned. The knowledge base must be designed to allow efficient access to the required knowledge and to maximize knowledge storage. When large amounts of data and knowledge are required, a shared database should be considered to avoid losses of information and minimize search difficulties (Brahan et al., 1998; Kiriş et al., 2010; Viliam & Michal, 2010).
Section 2 presents the theoretical background. Section 3 presents the data sources and methods, focusing on the development of the expert system model. Section 4 presents selected applications of the developed model and discusses its relevance to technological changes and their impact on work organization. The major conclusions are drawn in Section 5.
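As a minimal illustration of the layer-organized processing described above, the following Python sketch propagates a binary input vector through one intermediate layer. The layer sizes, the random weights and the sigmoid activation are arbitrary assumptions chosen for illustration; this is not the network of Castellanos et al. (2011a, b).

```python
import math
import random

def sigmoid(x):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each neuron outputs the activation of a weighted sum of all inputs."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, network):
    """Propagate the inputs through every layer in turn."""
    for weights, biases in network:
        inputs = layer(inputs, weights, biases)
    return inputs

def random_network(sizes, seed=0):
    """Hypothetical untrained network: the weights are random placeholders."""
    rng = random.Random(seed)
    return [([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
             [rng.uniform(-1, 1) for _ in range(n_out)])
            for n_in, n_out in zip(sizes, sizes[1:])]

network = random_network([4, 3, 2])  # 4 inputs, 3 hidden neurons, 2 outputs
outputs = forward([1, 0, 0, 1], network)
print(outputs)
```

Everything the sketch "knows" is held in the lists of floating-point weights inside `network`, which is why the outputs cannot be traced back to a readable justification.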

Theoretical background
The fourth industrial revolution appears to be largely recognized by practitioners before its widespread implementation. It appears that Germany, Japan, China, and the United States are pioneering the infrastructure to support this revolution, including government programs.
One key technological development with a likely profound impact on manufacturing is cyber-physical systems (CPS). These entail the convergence between the physical and virtual industrial environments by means of business networks that integrate equipment, production, and inventories. They can also be defined as collaboration systems between entities with an online and intensive connection with the physical environment and ongoing processes, in order to provide simultaneous access to data processing dedicated to decision-making. CPSs consist of microcomponents capable of controlling sensors and actuators to collect data and allow data exchange between computers, over cable or wireless networks, or even with cloud-based servers. The complexity, dynamic behavior, and integration promoted by CPSs can support the planning, analysis, simulation, implementation and maintenance of high-performance manufacturing systems (Lu, 2017).
This dedicated machine-to-machine (M2M) network is known as the "Internet of Things" (IoT). It allows data to be collected from all embedded connected equipment, making interaction, data analysis and decision making feasible among "the Things", for example between machines in a production assembly line. IoT is the base of Industry 4.0 and defines a new value horizon across the complete supply chain, creating new services through the distribution channels.
In this context, besides process integration, it is also possible to use brand new technologies such as big data, cloud-based servers (focused on an industrial application for equipment data), augmented reality and simulation. Of course, this also has implications for data security, which are discussed elsewhere (Lu, 2017). Regarding employee development programs, these factors should be taken into consideration in human resources training strategies in order to prepare employees for this new set of technologies (Prinz et al., 2016; Shamim et al., 2016; Baena et al., 2017).
The papers reviewed revealed opportunities for new research at the frontier of the field, as follows: integration of the JavaBean and Jess tools to obtain the inference engine more efficiently (Shiue et al., 2008); development of new validation methods to measure the performance of the medical recommendation proposed and also in other domains, e.g., clinical care treatment and healthy lifestyle choices to lower disease occurrence probability in public health (Chi et al., 2008); tacit knowledge transfer and the success factors affecting ES applications from the knowledge users' point of view (Chang & Chuang, 2011; Chen et al., 2009; Feng et al., 2009; Muniz et al., 2011; Nakano et al., 2013; Patil & Kant, 2014); extension of reasoning modes and automatic provision of analyses to create automatic deductions on the basis of the knowledge expressed by the expert (Ruiz-Mezcua et al., 2011); improvement of practices that enhance and promote KM in the problem-solving process (Vacik et al., 2013); combination of knowledge expressed by fuzzy cognitive maps and sets, and neural networks, with traditional knowledge (Kim, 2014); development of different methods for tacit knowledge acquisition under complex and uncertain environments (Li et al., 2018).

Expert systems
Expert systems (ES) are a field within Artificial Intelligence that has now been around for several decades. Since its inception in the 1970s and its rise to popularity in the 1980s and 1990s, many case studies have been published containing a wealth of knowledge about what worked and what did not work for each particular application (Wagner, 2017). An ES is an intelligent computational program that emulates the decision-making ability of a human expert. It is an artificial intelligence technique that extensively uses specialized knowledge to solve complex problems, with intellectual performance comparable to that of a human expert in a specific domain (Giarratano & Riley, 1998). Knowledge manipulated by an ES comes mainly from acknowledged and experienced experts in their domain, and it can be complemented by technical literature of the same domain. The knowledge of a human expert is heuristic in nature, in contrast to the algorithmic nature of problems that can be solved through a systematic and finite sequence of steps. Transferring heuristic knowledge and experience from human experts to the computer is not only a key factor in the development of any ES, but also a valuable resource in a KM context. The main elements of an ES are depicted in Figure 1. Although a module for explanation (ME) is optional, it makes the ES intelligence explicit to the user, which strongly contributes to increasing its credibility and acceptance. It also enables great progress in terms of ES development, especially during system verification and validation. Although the explanation ability is highly desirable, it is not a feature easily found in expert systems. García et al. (2013) provided a new perspective on how to formalize dialectical explanation support for argument-based reasoning and showed how it can be fleshed out in an implemented rule-based argumentation system. Horridge et al. (2013) described an exploratory study that forms the basis for a cognitive complexity model that predicts the complexity of OWL (Web Ontology Language) justifications, and presented the results of validating that model via experiments involving OWL users.
Lacave, Oniśko, and Díez (2006) show how the explanation capabilities provided by Elvira, a software tool for editing and evaluating probabilistic graphical models, have helped the debugging of two medical Bayesian networks: one for the diagnosis of prostate cancer and the other for the diagnosis of liver disorders.
Abu-Hakima & Oppacher (1990) built an ES for diagnostics systems based on the knowledge that explains reasoning explicitly. They found it helpful during the knowledge representation stage, on which they performed self-explanation in every step of the stage, forcing the system designer to represent knowledge explicitly.
For Dhaliwal & Tung (2000), building an ME from the knowledge of several domain experts significantly reduces knowledge acquisition time and yields more complete information, synergy, simulation and learning. Moreover, the fact that the domain experts worked together instead of separately reduced production blocking and cognitive inertia, since they could build upon each other's comments. Goud et al. (2008) emphasize the careful, multidisciplinary effort required for ME construction, discussing the technical issues concerning the delivery of active decision support and the provision of advice rationales to users, while taking into account dynamic contexts and changing guidelines.

Methodology and ES Prototype Model Development
The Expert System Prototype Model is classified as Empirical Descriptive Research (Bertrand & Fransoo, 2002), which is primarily interested in creating a model that adequately describes the causal relationships that may exist in reality, leading to an understanding of the ongoing processes. It was built following a cycle of conceptualization, modeling and validation. This type of research assumes that objective models can be established either to explain (part of) the behavior of real-life operational processes or to capture (part of) the decision-making problems that are faced by managers in real-life operational processes (Bertrand & Fransoo, 2002).
The literature review evidenced that the process for solving problems is influenced by the tacit knowledge of experts in a given domain, their decision-making performance and the application context. Among the various activities in KM, the discovery, capture, storage, sharing, and reuse of knowledge add value to the organization and are a critical issue in the modern management field (Shiue et al., 2008).
In terms of KM in decision-making, the ES has proved capable of emulating the reasoning abilities of human experts, so the ES excels at capturing and sharing the organizational decision-making experience in difficult problems and at allowing users to reuse it. From this point of view, the tracking of the problem-solving process becomes an important connection between knowledge and decision-making performance.
The ES prototype model proposed here for failure prediction in onshore oil and gas pipelines is developed based on the knowledge elicited by Castellanos et al. (2011a, b) for their Artificial Neural Networks (ANN). The main distinction between the ANN of Castellanos et al. (2011a, b) and the ES is the explicit intelligent character of the prototype, provided by a module for explanation. Development stages regarding knowledge elicitation, representation and implementation are detailed, and validation is performed by comparing the prototype solutions to those obtained by the ANN of Castellanos et al. (2011a, b). The ES development cycle goes through four stages: knowledge elicitation, knowledge representation, computational implementation, and validation. Each stage is detailed as follows.

Knowledge elicitation
The knowledge required for the ES prototype is essentially the same that Castellanos et al. (2011a, b) used to build their ANN, and it is fully reused here. According to the authors, knowledge was obtained from 854 reports regarding actual failures observed in onshore oil and gas pipelines. A multidisciplinary panel of experts analyzed the reports and also considered attributes identified in typical visual inspection of failures and definitions of variables found in the literature. The panel identified 65 input variables (divided into 11 categories) and 22 output variables divided into two categories: failure mechanism and recommended tests to confirm the diagnosis of the failure. The information is organized according to the two categories of variables previously described, and those variables are coded as presented in Table 2. Except for those categorized as general information, input variables represent questions that only admit yes or no as an answer. The compilation presented in Figure 2 shows that the conclusion for a failure mechanism (column A) and the recommendation of tests (columns XX, YY,…, JJJ) are obtained through the combination of different yes/no answers related to the input variables (columns K, L,…, QQ).

Knowledge representation
Since the compilation presented in Table 2 is a series of questions that only admit yes or no as an answer, a binary decision tree is a proper way to represent the elicited knowledge. Because there are many possibilities, it is impractical to present the complete tree here. Figure 2 presents the tree (in the form of a graph) that leads to the failure mechanism third parties accident. Similar trees are established for each failure mechanism and each recommended test. The trees are then written as IF-THEN rules; for the tree in Figure 2, the rule concludes with THEN (mechanism = A4). Thus, a set of rules representing decision trees composes the knowledge base. Rule representation is convenient because it makes it possible to track back the reasoning path from the solution (failure mechanism) to the causes, generating explanations. The possibility to explain the reasons why a failure occurs is the major advantage over the ANN of Castellanos et al. (2011a, b).
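The rule representation can be sketched in Python as follows. The conditions of the single rule shown are the yes/no answers quoted in the text for the tree of Figure 2; the data layout and the function name `infer` are hypothetical, and collapsing a whole root-to-leaf path into one rule is a simplifying assumption. The prototype itself was built in CLIPS, not Python.

```python
# Rule for mechanism A4 (third parties accident). The conditions are the
# yes/no answers listed in the text for the tree of Figure 2.
RULES = [
    {
        "conclusion": ("mechanism", "A4"),
        "conditions": {"K": 1, "M": 0, "N": 0, "O": 0, "Q": 0,
                       "U": 0, "W": 1, "X": 0, "Y": 0, "PP": 0},
    },
]

def infer(facts, rules=RULES):
    """Return the conclusion of every rule whose conditions all hold."""
    conclusions = []
    for rule in rules:
        if all(facts.get(var) == val
               for var, val in rule["conditions"].items()):
            conclusions.append(rule["conclusion"])
    return conclusions

case = {"K": 1, "M": 0, "N": 0, "O": 0, "Q": 0,
        "U": 0, "W": 1, "X": 0, "Y": 0, "PP": 0}
print(infer(case))              # [('mechanism', 'A4')]
print(infer({**case, "K": 0}))  # [] (no dent, so the rule does not fire)
```

Because each rule keeps its conditions as named variables, the path that led to a conclusion remains available for inspection, which is what an explanation module relies on.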
The module for explanation backtracks the decision trees from the solution, justifying every question asked by the prototype. The ME implemented in the prototype backtracks each category of input variable separately. For the tree depicted in Figure 2, backtracking results as follows:
The failure mechanism is THIRD PARTIES ACCIDENT (A4) because:
1. Regarding cathodic protection: The pipeline does not present CATHODIC PROTECTION (PP = 0).
2. Regarding failure surface conditions: The failure surface presents METAL LOSS (W = 1); The failure surface is not SLOTTED (X = 0); The failure surface does not present CRACK PATTERNS (Y = 0).
3. Regarding coating state: The pipeline surface is not COATED (U = 0).
4. Regarding general surface conditions: The pipeline surface is DENTED (K = 1); The pipeline surface is not CORRODED (M = 0); The pipeline surface is not PITTED (N = 0); The pipeline surface is not FRACTURED (O = 0); The pipeline surface is not PERFORATED (Q = 0).
It is interesting to note that the explanation would be different if the pipeline surface were perforated (Q = 1) because, in this case, questions from the fracture characteristics category should be asked (see Figure 2). This is the reason why each input variable category is backtracked separately.
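A minimal Python sketch of this per-category backtracking is given below. The variable codes, categories and wording follow the explanation above; the opposite-polarity wordings, the data layout, the function name `explain` and the assumed order in which the questions were asked are illustrative assumptions, not the CLIPS implementation.

```python
# code -> (category, statement if answer is 1, statement if answer is 0).
# Only the ten variables quoted in the text appear; the opposite-polarity
# wordings are assumed by simple negation.
STATEMENTS = {
    "K": ("general surface conditions",
          "The pipeline surface is DENTED",
          "The pipeline surface is not DENTED"),
    "M": ("general surface conditions",
          "The pipeline surface is CORRODED",
          "The pipeline surface is not CORRODED"),
    "N": ("general surface conditions",
          "The pipeline surface is PITTED",
          "The pipeline surface is not PITTED"),
    "O": ("general surface conditions",
          "The pipeline surface is FRACTURED",
          "The pipeline surface is not FRACTURED"),
    "Q": ("general surface conditions",
          "The pipeline surface is PERFORATED",
          "The pipeline surface is not PERFORATED"),
    "U": ("coating state",
          "The pipeline surface is COATED",
          "The pipeline surface is not COATED"),
    "W": ("failure surface conditions",
          "The failure surface presents METAL LOSS",
          "The failure surface does not present METAL LOSS"),
    "X": ("failure surface conditions",
          "The failure surface is SLOTTED",
          "The failure surface is not SLOTTED"),
    "Y": ("failure surface conditions",
          "The failure surface presents CRACK PATTERNS",
          "The failure surface does not present CRACK PATTERNS"),
    "PP": ("cathodic protection",
           "The pipeline presents CATHODIC PROTECTION",
           "The pipeline does not present CATHODIC PROTECTION"),
}

def explain(mechanism, asked):
    """Justify a conclusion from the (code, answer) pairs in question order."""
    by_category = {}
    for code, value in asked:
        category, if_yes, if_no = STATEMENTS[code]
        text = if_yes if value == 1 else if_no
        by_category.setdefault(category, []).append(f"{text} ({code} = {value})")
    lines = [f"The failure mechanism is {mechanism} because:"]
    # Backtrack: the most recently visited category is justified first.
    for number, category in enumerate(reversed(list(by_category)), start=1):
        items = "; ".join(by_category[category])
        lines.append(f"{number}. Regarding {category}: {items}.")
    return "\n".join(lines)

# Assumed question order: general surface, coating, failure surface, cathodic.
ASKED = [("K", 1), ("M", 0), ("N", 0), ("O", 0), ("Q", 0),
         ("U", 0), ("W", 1), ("X", 0), ("Y", 0), ("PP", 0)]
explanation = explain("THIRD PARTIES ACCIDENT (A4)", ASKED)
print(explanation)
```

Grouping the answers by category before printing keeps the explanation valid when a different branch is taken, since only the categories actually visited appear in the output.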

Computational implementation and validation
The ES prototype is developed in a shell. A shell contains the inference engine, so the development effort is concentrated on building the knowledge base and the programming of the inference engine is avoided, which is quite convenient. In this work, the shell used is CLIPS (Clips Rules, 2018). CLIPS was originally developed by NASA and is now public domain software maintained by an independent community.
The user communicates with the ES prototype through the dialog window, a basic text-oriented interface executed in the CLIPS environment. The prototype presents questions to the user about the pipeline's condition. Depending on the answers, new questions are asked until a conclusion regarding the failure mechanism and recommended tests is reached. The explanation for the proposed solution is also available.
Verification consists of several tests to assure that every possibility presented in Table 2 is covered by the ES prototype. Validation is performed by comparing the results provided by the ES prototype to those presented by Castellanos et al. (2011a, b).
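The question-and-answer loop just described can be sketched in Python as a walk over a binary decision tree. The two-question tree below is a hypothetical fragment built from the variables K and Q mentioned in the text, and its leaf texts are placeholders; the real prototype runs in the CLIPS dialog window with a far larger tree.

```python
# Hypothetical fragment of a binary decision tree: inner nodes hold a
# question, leaves hold a conclusion. Keys 1 and 0 are the yes/no branches.
TREE = {
    "question": ("K", "Is the pipeline surface dented?"),
    1: {
        "question": ("Q", "Is the pipeline surface perforated?"),
        1: {"conclusion": "ask fracture characteristics questions"},
        0: {"conclusion": "continue with coating state questions"},
    },
    0: {"conclusion": "continue with other general surface questions"},
}

def consult(tree, ask):
    """Ask one yes/no question per node until a leaf is reached."""
    answers = []
    node = tree
    while "conclusion" not in node:
        code, text = node["question"]
        reply = ask(f"{text} [{code}] (yes/no) ")
        value = 1 if reply.strip().lower() in ("y", "yes") else 0
        answers.append((code, value))
        node = node[value]
    return node["conclusion"], answers

# Scripted answers stand in for the user; pass `input` for a live session.
scripted = iter(["yes", "no"])
conclusion, answers = consult(TREE, lambda prompt: next(scripted))
print(conclusion)  # continue with coating state questions
print(answers)     # [('K', 1), ('Q', 0)]
```

Recording the answers in the order they were given is a deliberate choice: the same list can later be handed to an explanation routine.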
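The comparison step can be pictured with the short Python sketch below. The `diagnose` function is a stand-in that knows only the A4 conditions quoted in the text, and the two reference cases are placeholders built from those conditions; they are not the failure reports or results of Castellanos et al. (2011a, b).

```python
# Conditions quoted in the text for mechanism A4 (third parties accident).
A4_CONDITIONS = {"K": 1, "M": 0, "N": 0, "O": 0, "Q": 0,
                 "U": 0, "W": 1, "X": 0, "Y": 0, "PP": 0}

def diagnose(facts):
    """Stand-in for the prototype: it knows only the A4 rule."""
    if all(facts.get(var) == val for var, val in A4_CONDITIONS.items()):
        return "A4"
    return None

def validate(diagnose_fn, reference_cases):
    """Return the cases whose diagnosis differs from the reference one."""
    mismatches = []
    for name, facts, expected in reference_cases:
        obtained = diagnose_fn(facts)
        if obtained != expected:
            mismatches.append((name, expected, obtained))
    return mismatches

# Placeholder reference cases (name, answers, expected mechanism).
REFERENCE_CASES = [
    ("A4 path", dict(A4_CONDITIONS), "A4"),
    ("dent absent", {**A4_CONDITIONS, "K": 0}, None),
]

mismatches = validate(diagnose, REFERENCE_CASES)
print(mismatches)  # [] (every placeholder case is reproduced)
```

An empty list means every reference solution was reproduced; any entry names the case together with the expected and obtained conclusions, which points directly at the rule to inspect.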

Expert system model application
The prototype is applied to evaluate real-world applications reported by Castellanos et al. (2011a, b). The results are then discussed in order to show the suitability of the ES for supporting KM in an Industry 4.0 context. Among the research opportunities previously presented, the contribution of this paper is mainly in tacit knowledge transfer (Feng et al., 2009) and in practices that enhance and promote KM in problem-solving processes (Vacik et al., 2013).
• Application 1 corresponds to a report describing a failure in a 14" pipeline, presenting a corroded and slotted surface. The surface also presents crack patterns with multiple origins and tree texture.
• Application 2 corresponds to a report describing failure in a 16" pipeline, presenting a holed surface. Pipe coating is degraded and there is metal loss in the failure surface. There is cathodic protection and welding is not in a good state.
• Application 3 corresponds to a report describing a failure in a 20" pipeline, presenting corroded surface and degraded coating. There is metal loss in the failure surface, which presents beach marks crack patterns.

Applications
Application 1: The following dialog is obtained from the interaction with the ES prototype when solving this case (in each question, the code within brackets corresponds to the variable name described in Table 2). From the dialog, the ES prototype infers the following solution:
Failure mechanism: External corrosion (A2)
Recommended tests to corroborate the diagnosis: Micrography (XX); Thickness measurement (DDD).

Applications discussion
Results obtained by the ES prototype and by the ANN from Castellanos et al. (2011a, b) are equal in all three cases presented before. Since the same results are also found in the failure reports, both the ES and ANN techniques are able to propose high-quality solutions for a complex problem. The ES prototype is considered validated because it reproduces the solutions of the three cases reported in Castellanos et al. (2011a, b). Therefore, both techniques are useful tools for knowledge management in an organizational environment, because the ability to diagnose pipeline failures does not rely solely on human experts but is permanently available in the computers of the organization.
Besides the solution ability, the ES prototype also provides the reasoning process that leads to a solution and, as a module for explanation is implemented, the diagnostic rationale is available as well. These features make the ES prototype a very powerful and reliable way to compose a permanent corporate memory. This is a major advantage of ES in this context, which is possible because, unlike ANN, ES manipulates explicitly represented knowledge. By making the human expertise explicit, knowledge belongs to the entire organization, not only to the experts. On the other hand, ES is not capable of automatic learning, which is a major advantage of ANN: the more an ANN is used, the more it learns and the better the solutions it provides. Generally, learning is not automatic in expert systems, so new knowledge should be represented in a way that facilitates further knowledge base expansion and validation (Silva et al., 2014; Matelli et al., 2009; Matelli, 2016).
It is also apparent that the ME proposed here is quite simple, and an ANN like the one developed by Castellanos et al. (2011a, b) could also generate explanations at this level. However, expert systems have a higher potential to generate more sophisticated explanations because the cause-effect relations encompassed by the heuristics can be deepened if further knowledge is acquired. For instance, human experts know why fracture characteristics should be investigated if the surface is dented but not perforated. This knowledge could be elicited, represented and implemented in the knowledge base, and could be manipulated in the ME in order to generate more complete explanations. In this work, knowledge acquisition was limited to what was presented in the papers of Castellanos et al. (2011a, b); there was no direct contact with the human experts or failure reports referred to in those papers, so it was not possible to acquire further knowledge to generate deeper explanations. Depending on how deep the explanations can be, an ES capable of explaining its reasoning is certainly a valuable resource for training new personnel, assuring the evolution of learning in a KM context.
It is important to emphasize that the ES prototype presented here has characteristics that make it useful in a KM context. According to Giarratano & Riley (1998), these characteristics are: a) experience and knowledge are no longer concentrated in the experts, but are available to the entire organization; b) the cost to access knowledge is significantly reduced because knowledge availability is higher, which financially justifies the investment to develop an ES (in particular, the cost to develop the ES presented here was quite low because the effort to acquire and elicit knowledge from human experts was not necessary); c) knowledge is indefinitely preserved, which implies a corporate memory that is not dependent on human experts and their limitations, such as health problems, retirement or resignation; d) multiple points of view, since knowledge from several experts composes the knowledge base, which makes the ES intellectual performance higher than that of a single expert; e) higher reliability, because human experts are subject to factors that compromise their best judgment, such as tiredness, health problems, and stress; f) explanation capacity, because a human expert may be tired or may not be able (or simply may not want) to provide explanations about a proposed solution; g) fast solutions, because the ES prototype provides practically instantaneous failure diagnosis and recommended tests, which is faster than any human expert; h) consistency and impartiality are provided by the ES under any circumstance, whereas human expert judgment may be compromised by prejudices or by the pressure of an emergency situation.
Research results are aligned with research opportunities identified in recent literature, such as identifying factors that influence flexibility and industrial operating performance in oil and gas production systems, and have practical implications for improving the industrial competitiveness of the oil and gas industry.
Finally, the research questions posed in the introduction are answered as follows:
• Applications evidence that ES may be considered a promising alternative to support the implementation of new technologies grounded in employees' knowledge. This is especially aligned with the current Industry 4.0 implementation, which answers the research questions proposed by Erol et al., 2016; Radziwon et al., 2013; Burzlaff and Bartelt, 2017.
• The ES facilitates the decision-making of the expert employees so that the employees' expertise can be better used in the implementation of Industry 4.0 and in facing the new challenges of daily work in the organization. In this context, the ES can be considered an innovative approach to managing maintenance processes and supporting reliable, consistent decisions during I4.0 implementation, which answers the questions raised by Burzlaff (2017).

Conclusion
The paper proposed the application of an expert system as a tool to support knowledge management in organizational environments, which may support Industry 4.0 implementation. In order to demonstrate the suitability of this approach, an expert system prototype to predict failures in onshore pipelines was developed. The knowledge required was elicited from a paper found in the literature, in which several actual failure reports were critically analyzed by a multidisciplinary panel of experts in failure analysis. The panel identified the main failure mechanisms and the recommended tests to corroborate the failure diagnosis. All this knowledge was represented through a decision tree and each tree node was represented as a rule. These rules composed the knowledge base of the prototype implemented in CLIPS. Case studies executed in the prototype showed both the ability to quickly provide reliable solutions and the usefulness of expert systems in a knowledge management context, especially regarding the creation of a corporate memory independent of the availability of human experts and the possibility of using it as an intelligent tutor for new failure analysts. Thus, knowledge is permanently available to the entire organization at relatively low cost, and the ES technique is a useful tool to support knowledge management in organizations.