The effects of big data over the analytical activities of strategic intelligence professionals in Brazil

The strategic use of information is at the core of the intelligence activity, whose focus has been largely directed to the use of technology for data analysis. However, the cognitive aspects are the ones that most seem to challenge the transformation of information into intelligence, as they are associated with the psychological dimensions. This paper aims to examine how the big data phenomenon is perceived by strategic intelligence (SI) professionals in Brazil and how it influences their activities. The research was conducted as a qualitative study. Semi-structured interviews were conducted with 18 leading individuals in the field of strategic intelligence in Brazil, and the resulting data were examined with a content analysis technique. The results show that the effects of big data are less influential over the cognitive dimension of the analyst, particularly in the skills to keep their open-mindedness and in the self-confidence to admit and learn from analytic errors.


Introduction
This paper examines how the big data phenomenon affects the analysis process in the strategic intelligence activity. Historically, half a century after computers became part of the routine of people and businesses, information rapidly began to accumulate. Modern information technology systems enabled the current deluge of data, named big data. However, the evolution of analysis started during the analog era, when data analysis was expensive and time consuming, and continued through digitalization, when analog information became meaningful to computers. Eventually, it reached the origin of what is now called datafication, which consists of collecting everything that exists and processing it into quantifiable data (MAYER-SCHÖNBERGER; CUKIER, 2013). The definition taken as the basis for our study is that big data is a data set whose size goes beyond the capacity of typical conventional database software tools to capture, store, manage and analyze (MANYIKA et al., 2011).
Big data has the potential to differ from earlier periods because it makes it possible to analyze data in its original, unstructured form, and not only to analyze what happened in the past but also to predict in great detail what will happen around the world (MINELI; CHAMBERS; DHIRAJ, 2013). This change involves not only the amount and frequency with which data arrives but also the diversity of sources and the availability of different types of data (BETSER; BELANGER, 2013).
The technologies that support big data can be analyzed from two points of view: the infrastructure technologies, such as NoSQL databases, which store and process petabytes of data, and the ones involved in analysis, such as Hadoop and MapReduce. Nevertheless, to provide the company with the expected return, investment in technology needs to be aligned with the existing business model and accompanied by a synchronized transformation process involving people, processes and technology. This is because it is necessary to explore in great detail scientific aspects such as testing, processing and storage of vast volumes of data.
Even if all companies, large or small, have the same access to information, an analysis process is required to convert the available information into intelligence. The analysis is supported by processes for rating and charting data and documents in order to produce maps, graphs and other communication tools (FLEISHER; BENSOUSSAN, 2007).
The analysis also takes into consideration methodological and cognitive aspects to perform the interpretation of information (FACHINELLI; ALBERDI, 2014). The cognitive aspects are the ones that most seem to challenge the transformation of information into intelligence, as they are less instrumentalized than the methodological ones and are associated with the psychological dimensions (HEUER, 1999). This work focuses on the cognitive aspects of analysis in SI.
One of the biggest challenges in the coming years will be the lack of professionals with analytical skills adequate to the big data context. According to Manyika et al. (2011), the United States alone will face, by 2018, a shortage of 140,000 to 190,000 people with analytical skills, which represents a gap of 50% to 60% relative to demand. A study conducted by Leeflang et al. (2014) portrays the gap in analytical prowess as a major improvement opportunity for companies in all sectors.
While analyzing the Scopus and Web of Science (WoS) indexing databases to identify the number and types of publications on big data and analysis between 2010 and 2015, we observed that the number of publications in these areas grew on average by more than 250% per year. However, few of the published papers on big data and analysis were related to strategic intelligence. These results are confirmed by Lim (2016), who states that the open source literature on big data is extensive in relation to commercial applications, but scant with regard to strategic intelligence. However, the value that big data analytics brings to intelligence work, a realm whose primary stock-in-trade is accurate information, is immense (LIM, 2016).
This research contributes to the big data and SI literature by providing insight into the analysis practices of professionals in the field. A better understanding of the influences of big data on SI could also help improve analytical skills and, consequently, the results of intelligence products for the organization. The following sections provide a literature overview with the most relevant concepts, a discussion of the research methodology, the results (i.e., the main themes that emerged from the study reported here) and conclusions with recommendations for future research. Lim (2016) discusses the relationship between big data and strategic intelligence, arguing that the hard core of intelligence analysis involves a human specialist working with qualitative subject-matter content. Some intelligence assets have, over the years, come to leverage the increasingly massive collection and machine analysis of quantifiable, if not necessarily quantitative, data. Thus, big data applies across the diverse asset classes in the intelligence toolbox and is distinguished by an unprecedented order of magnitude when applied to collection and analysis.

Literature review
According to Akerkar (2014), the emergence of big data is due to the rapid expansion of the universe of machines and highly skilled users. As a result, the traditional composition of durable and structured data, serving a specific purpose, is being modified. This fact, coupled with the use of advanced tools for exploratory data analysis, data mining, machine learning and data visualization, offers a new way of understanding the world. Added to this are volume, velocity and variety (the 3 V's), considered the main factors that differentiate this phenomenon from earlier periods (MCAFEE; BRYNJOLFSSON, 2012).
McAfee; Brynjolfsson (2012) explain that volume refers to the large number of transactions, events or stories. The main problem related to the large volume of data is scalability, that is, the ability of a system to adapt and meet increased demand. Velocity refers to the speed at which data is created, accumulated, internalized and processed. Stream processing is a major challenge associated with this dimension, because selective storage is required for volume management and real-time response. Variety refers to the complexity of the data sources that feed the analysis processes. The main challenge posed by the great variety of data is to achieve an effective mechanism for linking several different classes of data within the internal structure.
Thus, data that were once considered static and banal become the raw material of business. Data are an essential element of a firm's daily performance as well as of its advantage. A firm will be able to make more accurate decisions with data of higher quality than that of its competitors (PRESCOTT, 2014). However, the author clarifies that data by itself may provide information about what is being watched, but is incomplete with regard to the context being observed and to sense making.
Thus, through the active search for SI and learning how to use it, the organization can transform information into a key element that will contribute to the achievement of competitive advantages. According to Fleisher and Bensoussan (2003), intelligence is information that is analyzed, interpreted and infused with developed implications. It is the product refined by an analyst to meet the unique needs of a decision maker who must understand a competitive aspect of the internal or external environment. Thus, it is SI, not information, that helps managers to correctly answer specific questions and make long-term decisions (FULD, 1995). The standard intelligence cycle consists of the process to plan, collect and process data, analyze, disseminate and evaluate intelligence, and verify that customer needs have been met (FLEISHER; BENSOUSSAN, 2003). Analysis is the decisive part of the intelligence cycle, because it produces insight for policymaking (BRUCE; GEORGE, 2008). The analytic process demands more than just a well-educated individual who can write concisely. The complete intelligence analyst must combine the skills of a historian, a journalist, a research methodologist, a collection manager, and a professional skeptic.
In this context, according to Heuer (1999) and Bruce; George (2008), the attributes required of the analyst are: mastery of the subject matter as well as related policies, understanding of research methods to organize and evaluate data, imagination and scientific rigor to generate as well as test hypotheses, understanding of unique intelligence collection methods, self-awareness of cognitive biases and other cognitive influences on analysis, open-mindedness to contrary views or alternative models that fit the data, and self-confidence to admit and learn from analytic errors.
The mastery of the subject matter as well as related policies relates to the ability of the analyst to know the various dimensions that may involve the subject or theme that is the object of analysis in its relationship with the organization. The understanding of research methods to organize and evaluate data refers to the ability of the analyst to grasp and handle the distinct research methods suitable for the focus of observation and analysis. The imagination and scientific rigor to generate, as well as test, hypotheses concerns the ability of the analyst to imagine hypotheses on the theme to direct the focus of observation. The understanding of unique intelligence collection methods concerns the ability of the analyst to comprehend the possibilities that the collection method provides and evaluate their suitability in relation to the focus and purpose of strategic intelligence actions. The self-awareness of cognitive biases and other cognitive influences on analysis concerns the ability of the analyst to identify his own mindset and the cognitive aspects that can influence his analysis. The open-mindedness to contrary views or alternative models that fit the data, and the self-confidence to admit and learn from analytic errors, are other elements that compose an analyst's cognitive aspects.
These characteristics, together with the volume, velocity and variety of big data, constitute the dimensions of analysis used in this study.

Method
This research was conducted as a qualitative study to examine how the big data phenomenon affects the analysis process in the activity of strategic intelligence. We adopted the social constructionist approach for its association with the postmodern era in qualitative research and its concern with the nature of knowledge and how it is created (ANDREWS, 2012). Semi-structured interviews were conducted with 18 leading individuals in the field of strategic intelligence in Brazil, and the content analysis technique was used for the data analysis. Per the content analysis method, the categories of analysis were defined a priori based on theoretical assumptions about the topic of the study. For strategic intelligence, the categories were the attributes required of the analyst (HEUER, 1999; BRUCE; GEORGE, 2008). The dimensions adopted for the analysis of these categories were the 3 V's of big data (BETSER; BELANGER, 2013).
The 18 interviewees were selected through the following organizations: Brazilian Association of Competitive Intelligence Analysts (ABRAIC), IBM Brazil and Brazilian Society of Knowledge Management (SBGC). Respondents work as professionals, professors or researchers in the field of strategic intelligence and were selected via the snowball sampling technique (GOODMAN, 1961), in accordance with the social approach adopted in our method. The interviews generated more than 18 hours of dialogue on the influences of the big data phenomenon on analysis for strategic intelligence. [Table I] Each interview lasted between 45 minutes and three hours. Open questions were loosely based on the dimensions of analysis, with a focus on the reality of each analyst (ROWLEY, 2012).

Results and implications
The interviewees were questioned about the effects of the 3 V's on each of the attributes of the SI analyst used as categories in this study. [Figure I] The quotes for the conceptual categories were grouped per the interviewees' perspective. The total number of analyzed quotes provided the basis for the construction of the scale on the graph. Thus, the conceptual categories with the highest number of quotes were "understanding of research methods to organize and evaluate data", "understanding of unique intelligence collection methods" and "self-awareness of cognitive biases and other cognitive influences on analysis". The conceptual category with the lowest number of quotes was "open-mindedness to contrary views or alternative models that fit the data".
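The tallying step described above, counting coded quotes per conceptual category and per big data dimension, could be sketched as follows. This is only an illustration under stated assumptions: the quote list is invented placeholder data, not the study's data, the category labels are shortened, and the function name `tally_quotes` is hypothetical.

```python
from collections import Counter

# Hypothetical coded quotes, invented for illustration only (not study data).
# Each entry pairs an analyst attribute (category) with a big data dimension.
CODED_QUOTES = [
    ("research methods", "volume"),
    ("research methods", "volume"),
    ("research methods", "variety"),
    ("collection methods", "variety"),
    ("collection methods", "velocity"),
    ("open-mindedness", "volume"),
]


def tally_quotes(coded_quotes):
    """Count quotes per category and per (category, dimension) cell."""
    per_category = Counter(category for category, _ in coded_quotes)
    per_cell = Counter(coded_quotes)
    return per_category, per_cell


per_category, per_cell = tally_quotes(CODED_QUOTES)

# Categories ordered from most to least quoted, as they would appear on a chart.
for category, count in per_category.most_common():
    print(f"{category}: {count}")
```

The per-category totals would set the scale of a chart such as the one referred to in the text, while the per-cell counts support the dimension-by-dimension reading that follows.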
Having identified the perception of respondents about the effects of the 3 V's of big data on the SI analysis process, we proceed to the individual analysis of the proposed categories in relation to the dimensions: volume, variety and velocity.

Dimension 1: Volume
According to the respondents' view, the volume dimension benefits the SI analysis process by enabling decision making based more on data than on intuition. Wheelan (2013) describes three ways in which data can serve analysts: (1) they request a representative sample of data about some group or larger population; (2) they ask for data that provides a source of comparison; or, in some cases, (3) they have nothing specific in mind to do with the information, but they have the intuition that it might be useful at some point. In this sense, statistics is the art of making numerical conjectures about puzzling questions (FREEDMAN; PISANI; PURVES, 2014). Statistical methods help considerably when people want to think about difficult issues.
The benefits cited for volume include the possibility of validating a hypothesis with a larger set of data, reducing the statistical error. Furthermore, volume enables the construction of more refined views and therefore supports better decisions. These factors contribute to the analyst attributes in general, but especially to the imagination and scientific rigor to generate as well as test hypotheses.
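The point that a larger data set reduces statistical error can be illustrated with the standard error of a sample mean, which shrinks in proportion to the square root of the sample size. The sketch below is a minimal illustration, not part of the study: the population is simulated, and its parameters (mean 100, standard deviation 15) and the sample sizes are arbitrary assumptions.

```python
import math
import random
import statistics


def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


random.seed(0)  # fixed seed so repeated runs draw the same samples

# Simulated population (assumption for illustration, not study data).
population = [random.gauss(100, 15) for _ in range(100_000)]

small_sample = random.sample(population, 100)
large_sample = random.sample(population, 10_000)

print("n = 100    standard error:", round(standard_error(small_sample), 3))
print("n = 10000  standard error:", round(standard_error(large_sample), 3))
```

With one hundred times more observations, the standard error is roughly ten times smaller. This reduction concerns sampling error only; it does not correct for inaccurate or misleading data.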
On the other hand, volume affects SI in terms of quality when the information content is inaccurate, irrelevant or misleading, causing confusion to the analyst in certain areas of specific knowledge. The result can be even more compromised if intelligence products are generated based on low quality information. According to Wheelan (2013), no amount of sophisticated analysis can compensate for fundamentally flawed data. Hence, volume demands greater attention from the analyst in verifying the quality of the information.
When it is more complex, it is more difficult. The whole point is to have tools. Quantity, everyone will say, is good, but if you do not have the means to treat the amount, a so-called overload will occur, and you will get lost in pieces of information (case no. 7).
The interviewees also mentioned that the pressure for productivity and for the achievement of organizational goals is currently great. Thus, analysts obtain good results from large volumes of data only if they are well prepared in methodological and cognitive terms and have the appropriate technological tools. This is because the information is not organized. Big data is confusing, varies in quantity and is distributed across countless servers around the world (MAYER-SCHÖNBERGER; CUKIER, 2013).
To mitigate the impact of big data, analysts apply filters and restrict the focus of the analysis so that they are not lost in a flood of data. Per Mayer-Schönberger; Cukier (2013), this kind of reasoning is an influence of the "small data" environment. The authors explain that, before big data, analysis was generally limited to a small number of hypotheses defined prior to data collection. Analysts can now let the data speak for themselves. But this requires a change of mentality and, per evidence from the interviews, is not widely practiced by professionals: I could have more chances to test and think. But in practice we have a specific time to do it. So, what I can do now, in the time that I have and with the information I have, is what I will recommend (case no. 8).
For some interviewees, the volume of big data is stimulating the development of new research methods to organize and evaluate data. However, these new methods are not yet present in the day-to-day of strategic intelligence activity. Thus, it is possible to distinguish two groups of analysts: those focused on standard intelligence activities and those who explore projects and unique intelligence products. In some cases, consultants can be hired to work on specific projects. In general, the use of technology is more related to data collection methods than to the other analyst attributes. The following quotes address this methodological issue involving SI: I believe that, when it comes to methods of investigation, analysts still do not know how to work with big data. Analysts still rely heavily on investigative methods that start from hypotheses. We are not building these hypotheses from the large volume of data (case no. 6).
What we face is precisely a difficulty in identifying and bringing into our teams analysts who have a deeper, conceptual knowledge of data, so that we can use data and turn it into information (case no. 1).
Thus, although volume is the most evident dimension of big data, it is not adding value to analysis in SI. This occurs because the focus of the analysis is restricted to limited data and specific searches. At this point the analyst as a person becomes indispensable in SI activity, because only he can manipulate large volumes of data using the complete set of attributes proposed by Heuer (1999) and Bruce; George (2008).

Dimension 2: Variety
Although less cited than volume, variety is a recognized dimension of big data. Variety contributes to the activity of strategic intelligence, since it brings different dimensions of analysis to the issue being investigated. Moreover, the internet has given professionals easier access to data and information globally. They can also run their own cross-analyses of data on a firmer foundation, generating more accurate intelligence products. This makes professionals more confident in their analysis.
The variety of big data, along with its volume, makes inferences drawn from the study of many phenomena more effective. In this case, the professional does not start from a priori assumptions, allowing data from different sources to help identify a taxonomy for what they want to study (MAYER-SCHÖNBERGER; CUKIER, 2013).
Additionally, Lohr (2012) clarifies that the predictive power of big data is being explored and shows promise. Thus, big data is changing the way analyses are made, by means of predictive and prescriptive techniques. While descriptive statistics, initially popularized by SAS and SPSS, report past events, predictive analysis uses past information to predict future outcomes with some degree of probability, with a view to the best results. Each of these techniques has been used for decades; however, large changes are in progress when they are combined with big data (MINELI; CHAMBERS; DHIRAJ, 2013). Some of these major changes are: using all or more of the data to create a predictive model; combining various analytical models and techniques to improve results; creating a loop in which new learnings are used to adapt the production models; using prediction models as close as possible to real time; and focusing on the application of predictive modeling techniques (algorithms) instead of inventing new techniques.
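The contrast between descriptive and predictive analysis can be made concrete with a small sketch. The example below is an assumption-laden illustration rather than a method from the study or from the cited authors: the sales series is invented, the helper names (`describe`, `fit_line`, `predict`) are hypothetical, and an ordinary least-squares line stands in for whatever predictive model an analyst might actually use.

```python
import statistics


def describe(series):
    """Descriptive view: summarize what has already happened."""
    return {"mean": statistics.mean(series), "min": min(series), "max": max(series)}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predictive view: extrapolate the fitted line to a new point."""
    return slope * x + intercept


# Hypothetical yearly sales figures, invented for illustration only.
years = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

print("descriptive:", describe(sales))
slope, intercept = fit_line(years, sales)
print("predicted sales for year 6:", predict(slope, intercept, 6))
```

A single-variable trend line like this one uses only the length of the historical series. Adding further explanatory variables from other sources would correspond to the variety dimension discussed in this section.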
However, for the interviewees this is a dimension that is still little explored in the activity of strategic intelligence: They (talking about analysts) cannot have a predictive model, a model that tells them what the size of the market will be and what the company's sales projection will be, because they are not considering the variety of information that they have today. They think the problem is volume: three years of historical series is bad, so let me get more information. Let's take five years, now six years. They did not see that the problem is not volume, it is variety. There are other informational dimensions that greatly improve the forecasting model, but they are not seeing it (case no. 6).
Some of the problems identified involve the loss of focus in data collection. Variety requires the professional to have the ability to distinguish a weak signal from a strong signal, and reliable information from unreliable information, which makes the activity of strategic intelligence a cognitive task. In some cases, practitioners ignore the diversity of data available. Another point raised in the interviews relates to the cognitive limitations not only of analysts, in selecting data sources for analysis, but also of decision makers. In the absence of technological tools appropriate to the analysis process, interviewees claim that it is not feasible to use all available sources. On the other hand, even when these tools are available, respondents claim that human labor in some cases cannot be replaced.
One way found to address these issues is, again, to define the focus of analysis. Lacking prepared analysts, as well as systems that can support the process and make the analysis of data more flexible and dynamic, companies develop information governance policies and rules for the use of different data sources. These filters play a key role in allowing the intelligence cycle to be completed in organizations. They also keep the professional from getting lost among different data sources.
This diversity of data sources and information can also become a lever for the development of the SI professional. With greater access to industry knowledge, the professional can develop the attribute of mastery of the subject matter more quickly than in the past. However, the analyst must be "curious", "proactive" and "questing" for this to be achieved. In addition, an interviewee claims that variety will require professionals to develop technical and interpersonal skills to capture the value of the data: Each individual today is a source of content in the world. So, I think variety does have an impact, on relationships, on the generation of channels and on the processing of information. It will require technical, relational and emotional skills for an analyst to deal with it (case no. 11).
Thus, analysts recognize that some of the inputs currently held by organizations meet their basic demands. Professionals still need to develop and learn to operate using the full potential that the diversity of data sources offers in order to improve the intelligence products generated.

Dimension 3: Velocity
Velocity was the dimension least cited by respondents. Some reflections on this category reveal positive aspects regarding the ease of access to updated information on an industry, which would be a factor contributing to the activity of strategic intelligence. However, this depends on the kind of industry being analyzed: some segments generate and disseminate information faster than others. The most current information is thus taken as raw material for better intelligence, which in turn affects the quality of decision making.
On the other hand, this category poses a difficulty for the intelligence products that are generated. This occurs when analyses are prepared from a given set of data: before the information is disseminated, changes resulting from velocity can occur, making the analysis obsolete even before it is used to generate intelligence.
Another view reflects doubts about this dimension. The question is whether velocity is a real characteristic of the big data phenomenon, related to the faster generation of data and of the latest information, or only the repetition of data or information previously cited by other sources: I do not know how velocity impacts analysis. I have a very particular view of velocity, that it has no relation with big data. The world is producing the same volume of information, but as the growth of producers was exponential, the velocity of production is exponential too. Now, are you talking about information or useful information? There is a tremendous difference (case no. 6). Surprisingly, velocity does not seem to be experienced as a big data dimension, but rather as a fact resulting from the variety of sources of content generation. This research highlights the adoption of big data in the practices of SI, through the analyst attributes. The ideal analyst behavior for the use of big data involves the right combination of methodological and cognitive aspects. In turn, this leads to a more refined sense of how the analysis process is being carried out.
While some analysts are focusing on the daily activities of SI, using the same methods and mentality, other analysts are developing a new approach that uses new volumes of data, drawn from a variety of searches and updated at high velocity. These practices, when conducted with the support of the correct technological resources, are turning the analysis process into a competitive advantage for organizations. This is the big data promise, which will only be achieved by the hand of the analyst.

Conclusions and recommendations
In this study, we analyzed how the big data phenomenon is perceived by Brazilian strategic intelligence professionals in their analytical activities. This was based on 18 in-depth interviews conducted with experts in the field. The results show that the challenges of handling large volumes of data are not only technological, but also cognitive, and can be observed in the analytical activity. Other results indicate that the big data phenomenon is not yet a reality for professionals in traditional SI organizations. The results show that the effects of big data are less influential over the cognitive dimension of the analyst, particularly in the skills to keep their open-mindedness to contrary views or alternative models that fit the data and in the self-confidence to admit and learn from analytic errors. [Table II] On the other hand, we identified that volume contributes to the understanding of research methods to organize and evaluate data, even if, for some respondents, it hampers the mastery of the subject matter as well as related policies. To better deal with the amount of information associated with the big data phenomenon, we identified that many professionals resort to adjusting the focus and adopting structured and robust analytical methods supported by information technology. The research also identified that, even in the big data context, the intelligence products produced by professional analysts compete strongly with analysis based on intuition. This shows that, even without an artificial technological limit, our mindset is not geared to work with big data. Hence, endowed with the same mentality and methods, intelligence professionals have not yet realized the full potential promised by the phenomenon.
If we consider that the big data phenomenon is still not a reality for SI professionals in traditional organizations, it is possible to suppose that this deficiency tends to increase at the same velocity as the complexity of the data itself. This would make SI professionals dependent on the few companies that have data scientists capable of dealing with big data.
The fact is that fundamental aspects previously pursued by analysts, such as access to rare and still undisclosed information, allied to information confidentiality and experience in the focus of observation, no longer represent the only ways to generate intelligence products.
In the digital environment with big data, access to primary information is already shared among competitors, and the analyst's command of knowledge that comes from experience tends to become obsolete if it is not applied or updated in a short time. Currently, with analysis support and appropriate technological tools, professionals can identify unknown patterns amid large volumes of data. So, it is not just about identifying rare information before the competition, but about trying to identify, from the information available, relevant patterns that others have not yet perceived. So, besides the attributes required of the analyst, here referred to as categories, the ability to handle the set of technological tools and multidisciplinary training emerge as increasingly key issues for the scope of SI activity in relation to big data. They are a fundamental complement to analytical activities, not a replacement for the existing ones.

Limitations and future studies
The main limitation of the study refers to the interviewees' understanding of big data as a phenomenon. For a study of qualitative nature, this aspect is a barrier, as it makes data interpretation harder, and answers often must be disregarded because their focus falls outside the study. On the other hand, this limitation is characteristic of studies dedicated to emerging phenomena not yet circumscribed by a visible theoretical framework.
In terms of future studies, we identified that the analysis process currently permeates the organizational structure and reaches a range of roles in the organization. Another area becomes prominent in this context, although it was only observed theoretically in this study: data science. Under this view, persons or areas within the organization would begin to create and implement the policies needed for the processing of data and information and for its dissemination. Future studies would not need to be restricted to SI, but could extend to the perception of analysts from various areas. Academically, for example, the capacity of the researcher to monitor publications could be examined. The issue of time availability to perform the analysis, mentioned in the interviews, could also be better explored in future studies.
Finally, it was possible to observe that amid the current complexity of big data, the traditional methods of analysis and treatment of information are no longer appropriate and require updating.
Considering this, as a future study, the construction of a new approach can be proposed, allowing the analyst to further explore the cognitive question for the analysis and treatment of large volumes of data.