Policy learning and policy change: exploring possibilities on the Advocacy Coalition Framework

This article aims to advance the discussion about the influence of knowledge and policy learning on policy change, taking the Advocacy Coalition Framework as reference. We propose decoupling the understanding of learning from that of change, from two perspectives. First, we suggest apprehending the relation between knowledge and policy learning through the use of knowledge, assuming that different forms of learning are possible, depending on the context of decision-making. Second, relying on the contributions of the theory of gradual institutional change, we suggest using the notion of institutional dynamics in order to capture the explanatory power of knowledge and policy learning in situations of both stasis and change. We aim to help diminish the skepticism found in the literature about the influence of knowledge and policy learning in the policy process.


INTRODUCTION
Although many theories and frameworks have tried to explain the causal mechanisms that lead to policy change (Baumgartner, Jones, & Mortensen, 2014; Kingdon, 2014; Mahoney & Thelen, 2010), for more than three decades the Advocacy Coalition Framework (ACF) authors have shown a special interest in the role of knowledge and policy learning in the policy process (Jenkins-Smith, Nohrstedt, Weible, & Ingold, 2018; Sabatier, 1988; Sabatier & Jenkins-Smith, 1999). Despite the hundreds of empirical studies using the framework, however, ACF researchers still face the challenges of studying learning in policy contexts. The difficulties include the need to advance both theoretically and methodologically in order to better understand the learning process, how to measure it, and how it influences policy change alongside all the other variables considered in the framework (Jenkins-Smith et al., 2018; Jenkins-Smith, Nohrstedt, Weible, & Sabatier, 2014). Additionally, researchers point to the need to better understand which personal characteristics, policy contexts and institutional arrangements hinder or trigger policy learning, and how they do so (Moyson, 2017; Weible, 2008).
This essay aims to offer theoretical and methodological contributions to the ACF, assuming that policy change is not the only variable through which policy learning can be apprehended and that the concept of policy learning as commonly used by ACF researchers must be expanded in order to capture other possibilities, which go beyond alteration in the belief system. For this purpose, we seek support in theories that discuss the use of knowledge by policy actors (Weiss, 1979, 1998) and the different forms of learning according to the context of decision-making (Dunlop & Radaelli, 2013, 2018). Moreover, we draw explicit attention to the relationship between policy learning and stasis, by exploring the advantages of considering policy dynamics with a more refined view of policy change and stability (Mahoney & Thelen, 2010). We do not intend to develop a new framework, but to advance from a consolidated one, capitalizing on the common set of premises and concepts already established, to provide new insights for ACF researchers interested in policy learning.
The remainder of the text is organized as follows. Section 2 discusses the notion of policy learning in the ACF and its impasses, based on recent ACF applications. Section 3 presents an expanded concept of learning and discusses its relationship with knowledge, based on different types of learning and the notion of uses. Section 4 explores how the notion of institutional dynamics helps to map the relationship between policy learning and change proposed by the ACF. In addition, we present a conceptual model based on the dialogue between the theories analyzed that should serve as support for future empirical works. The last section summarizes the contributions of the proposed model and indicates some propositions for empirical testing.

POLICY LEARNING AND POLICY CHANGE IN THE ADVOCACY COALITION FRAMEWORK: IMPASSES AND ADVANCES
The ACF is a framework for policy analysis that seeks to understand the policy process, with a central focus on policy change and policy learning processes. The model adopts the premise that the policy process is complex and that, for this reason, those who wish to influence it must specialize in policy subsystems and form coalitions. Thus, the ACF offers an alternative vision to the policy cycle model, which makes it possible to analyze the policy process more dynamically and to consider the processes of agenda setting, formulation, implementation and evaluation of policies in a non-linear perspective (Jenkins-Smith et al., 2014; Sabatier & Jenkins-Smith, 1999, 2007). From the ACF perspective, individuals are boundedly rational and tend to oversimplify the world through a belief system. The ACF uses a hierarchical tripartite structure, which groups beliefs according to their degree of resistance to change, including deep core beliefs (most fundamental values), policy core beliefs (normative and empirical aspects of the policy) and secondary beliefs (policy instruments) (Sabatier & Jenkins-Smith, 1999; Sabatier & Weible, 2007).
The ACF establishes four pathways for policy change: significant external or internal changes to the subsystem, policy-oriented learning and negotiated agreements between rival coalitions. External shocks refer to changes in socioeconomic conditions, public opinion, the governing coalition, and other subsystems, while internal shocks include disasters or major failures occurring in the policy subsystem itself. The effect of these shocks is a change in the distribution of critical political resources, such as financial or public opinion support, which can shift agendas and lead to the replacement of the dominant coalition. Policy-oriented learning is conceptualized as "enduring alterations of thought or behavioral intentions that result from experience and which are concerned with the attainment or revision of the precepts of the belief systems of individuals or of collectives" (Jenkins-Smith, 2014, p. 199). It is more likely to occur in the secondary beliefs, leading to minor changes. Negotiated agreements cover situations in which rival coalitions seek an agreement, especially when there is a "hurting stalemate" that requires a solution. This pathway includes the notion of learning itself and presupposes the existence of an institutional arrangement that favors regulated discussion between the coalitions towards an agreement (Sabatier & Jenkins-Smith, 2007).
Theoretically, the ACF emphasizes four categories of explanatory factors for learning: the attributes of the forums, the level of conflict between coalitions, the attributes of the stimuli and the attributes of actors.
More specifically, the ACF assumes that policy-oriented learning is more likely to occur when 1) there is a forum with enough prestige to force the participation of professionals from different coalitions and which is governed by professional standards; 2) the level of conflict between coalitions is intermediate; 3) the problem is more tractable, because there are supporting theories and data; and 4) policy actors have moderate beliefs (Jenkins-Smith et al., 2014).
Although the ACF does not establish a concept of policy change, it associates change with alterations in government policy or program rules, resource allocation, decisions to continue or terminate a program, and changes in policy outputs and impacts. The ACF also establishes levels of change, which may be minor or major, depending on the extent of deviation from previous policy. As changes require alterations in the belief system, the ACF holds that it is extremely difficult for them to occur voluntarily, so technical information can lead to learning regarding secondary beliefs, leading to minor changes. Sabatier & Jenkins-Smith (2007, p. 198) affirm that "because major change from within the subsystem is impossible, it must come from an external source". Figure 1 presents a diagram of the set of possible relations between knowledge, learning and change established by the framework.
The description of the pathways that lead to change in the ACF reveals some issues that must be addressed in order to deepen the understanding of the relationship between learning and policy change. One of the criticisms made of the ACF is that, although it gives prominence to values and beliefs, they work much more as the glue that holds the members of a coalition together than as factors of change (Capano, 2009). The model favors shocks as the main agents of major change because they allow a reconfiguration of the power distribution; that is, power, more than ideas, explains major changes in the ACF. Similarly, negotiated agreements after years of conflict between coalitions only occur in response to stalemates that affect external factors and create pressures for change. In both cases, changes are the result of the weakening of the dominant coalition or the replacement of the governing coalition. In fact, the ACF assumes that individuals' deep core and policy core beliefs are very resistant to change, so it is the redistribution of resources that allows a different belief system to gain more power to influence policy toward the desired changes. Therefore, alteration in individuals' beliefs occurs only as a result of learning. The members of the coalitions learn when they change their ideas about a policy instrument, for example, but as this learning process involves mainly secondary beliefs, it leads to minor changes.
By assuming that learning occurs when knowledge alters the belief system, the ACF narrows the notion of learning, suggesting that when knowledge reinforces pre-existing beliefs there is no learning. Empirical applications of the ACF, however, do not always support the theory. A study conducted by Leach, Weible, Vince, Siddiki and Calanni (2014) in ten partnerships related to marine aquaculture policy in the United States, for example, indicates that the relationship between knowledge acquisition and change in policy beliefs is not automatic. They suggest that although there is a relationship between them, the acquisition of knowledge was more common among the research participants than alteration in their policy beliefs.
Moreover, even when there is an alteration in the belief system, it does not always lead to policy change. Two reasons can be offered to explain this: the inconsistency of the learning process and the fact that both change in power distribution and change in beliefs resulting from the learning process are necessary but not sufficient conditions to trigger policy change. Policy change also depends on the mobilization of resources and strategies to achieve the desired objectives.
The inconsistency of the learning process is evidenced by Moyson (2017). He argues that even when actors revise their policy beliefs, they do not always revise their policy preferences consistently with their new beliefs. For example, individuals can believe that new evidence proves the environmental inefficiency of a policy (change in policy beliefs), but also believe that the policy must remain because the economic gains outweigh the environmental damage (maintenance of policy preferences) (Moyson, 2017).
Studies also show that no matter how accepted and consolidated scientific information is, it may be insufficient to promote the desired changes when coalitions are not able to organize themselves properly to influence the policy debate (Barnes et al., 2016; Khayatzadeh-Mahani, Breton, Ruckert, & Labonte, 2017; Smith et al., 2015; Ulmanen, Swartling, & Wallgren, 2015). Khayatzadeh-Mahani et al. (2017), for instance, show that despite the amount of scientific information about the health problems caused by tobacco, pro-tobacco coalitions were able to avoid the ban on shisha smoking in public places in Iran. They conclude that the pro-ban group was not able to exploit opportunities, like stronger public support, for enforcement of the shisha smoking ban. But they also argue that there was a lack of policy learning due to a lack of agreement over evidence. We wonder whether, had they considered Moyson's notion of inconsistency between policy beliefs and policy preferences, they might have come to different conclusions. What if the anti-ban coalition in fact learned from the evidence but did not change its preferences because of the economic and social consequences of implementing the ban? Also, how much did the pro-ban coalition learn, even though the desired changes were not achieved?
On the other hand, even when the stable parameters favor stasis, the proper mobilization of resources and strategies can lead to policy change. Mosley & Gibson (2017) show that different types of information combined with the right narratives were able to promote policy change in California's law to extend foster care. Studies tend to show that technical information is a crucial resource, but one that depends on the coalition's capacity to act strategically to influence the policy process.
As observed, studies that use the ACF as the main framework of analysis apply different research designs and variables to apprehend the learning process, arriving at different results and interpretations. From this, some questions become relevant: does learning only occur when there is change in policy beliefs? When knowledge serves to reinforce pre-existing beliefs, is there no learning? Does learning only occur when there are policy changes? When policies remain unchanged, is there no learning? These questions point to the possibility of distinguishing between two sets of relationships: first, the relationship between knowledge and learning, and then the relationship between learning and policy change. These possibilities will be addressed in the next sections.

KNOWLEDGE AND LEARNING: REVIEWING THE USES OF KNOWLEDGE
Policy actors are in constant contact with different kinds of data, information, and knowledge, but certainly not all of it can be taken into consideration in the policy-decision process. It is not simple to know whether something was really learned, and that is probably why policy theorists tend to use policy change as the main evidence of learning. Although policy change is good evidence of learning, it does not follow that no lessons are learned when nothing changes. In the absence of change, then, what could be considered evidence of learning? In this section, we explore the possibility of considering the use of knowledge by policy actors as evidence of policy learning.

Broadening the concept of learning
The concept proposed by the ACF refers to the enduring alteration of thought or behavior, affecting belief systems. Hall (1993) points out that learning, as conventionally understood, "occurs when individuals assimilate new information, including that based on past experience, and apply it to their subsequent actions. [...] Learning is indicated when policy changes as the result of such a process" (Hall, 1993, p. 278). Although he considers the assimilation of new information, the concept ties the meaning of learning to policy change. Dunlop and Radaelli (2013), in an effort to systematize the literature on policy learning, adopt a simpler but more encompassing concept, defining learning as an "update of beliefs" (Dunlop & Radaelli, 2013, p. 600). The term update, although including change, is more comprehensive because it covers other meanings such as correcting, revising, improving, or renewing, based on more recent information on a given subject. In this conception, we understand that learning can include the assimilation of information that either changes or reinforces existing beliefs, which broadens the concept. Learning can also be thought of in terms of what is learned, or simply the lessons of learning. In Rose's (1991) words, "lessons constitute what is learned […] and do not require a change in behavior as a condition of learning" (Rose, 1991, p. 7).
Based on the previous discussion, we propose that learning consists of the assimilation of knowledge that promotes an update in the beliefs of the individual. We consider the general notion of knowledge used by the ACF, which includes any knowledge that contributes to understanding the policy issue, its problems, and its possible solutions, including technical information, scientific studies, or past experience. In our analysis, however, for methodological reasons, we will only consider knowledge that is organized in a cohesive and coherent manner, on a scientific basis, such as a technical report, a scientific paper, a book, or a newspaper article, which can be used by the policy actor. We understand that, especially when it comes to experience, unsystematic knowledge is also part of the learning and decision-making processes, as when legislators talk to their electors directly, but these forms of knowledge are not going to be considered in the discussion.

Evidencing learning through the uses of knowledge
Based on the work of Weiss (1986, 1988), the ACF assumes that, to understand the role of technical information in processes of change, it is necessary to consider a period of one decade or more in order to capture its enlightenment function (Sabatier & Jenkins-Smith, 1999). Weiss' notions of use will be reconsidered here.
The countless experiences of policy evaluation during the 1960s and 1970s showed that the results of the evaluation did not exercise the expected influence on decisions on governmental programs, such as resource reallocation or the selection of programs for expansion or reduction (Shadish, Cook, & Leviton, 1991). In this context, Weiss began to reconsider how the results of the evaluation in public policy could be used, a question that became central to her work. For this, she expanded the notion of "use" of social research in the policy process, offering other meanings that go beyond the more linear perspectives of the influence of science in the decision-making process, as a provider of an absolute truth (knowledge-driven use) or solutions to existing problems (problem-solving use) (Weiss, 1979).
Weiss understands that knowledge produced by social research can also be used politically to support predetermined positions (political use) or in a tactical manner, when research is commissioned by bureaucrats to convey the idea of responsibility or to signal that the problem is already being studied, thereby responding to pressures and delaying action (tactical use) (Weiss, 1979). Certainly, the form of use that marks the author's work is the so-called enlightenment, which occurs when the concepts and theoretical perspectives developed by the sciences permeate the political decision-making process in a non-linear, disconnected way and over the long term, generating knowledge that influences decision-making by influencing the way people think (Shadish et al., 1991; Weiss, 1979). This contribution does not occur in an instrumental way, because it does not need to be compatible with the values and objectives of decision makers and can challenge accepted truths and overcome consolidated values and patterns of thought (Weiss, 1979).
In this sense, although the ACF assumes that policy learning tends to motivate changes in secondary aspects of belief systems, Weiss' notion of the enlightenment use of knowledge suggests that deeper ideas, values, and objectives of policymakers can also be modified as a result of learning processes. This type of learning, however, is difficult to measure and trace because it occurs in a non-linear way, permeating the political subsystem through a kind of bricolage that results from a dispersed process of diffusion and assimilation of knowledge.
Based on the work of Weiss, Weible (2008) incorporates the uses of scientific knowledge in coalition studies, proposing a three-type simplification: 1) political use, when policymakers use knowledge to legitimize previously established positions; 2) instrumental use, when knowledge is used in a rational perspective, to solve a problem based on the results of the research; and 3) learning use, based on the idea of enlightenment, when the accumulation of scientific knowledge gradually alters the beliefs of political actors. The association between learning use and the notion of enlightenment, however, brings conceptual limitations, as it relates learning to only one of the types of knowledge use (learning use). As it stands, learning that may occur in the other types of use (political or instrumental) is not contemplated.
Although Weiss works with scientific knowledge, as she is mainly concerned with the use of external policy evaluation reports, policy implementation itself also produces knowledge. There is an inherent ambiguity in any policy, because of the gaps between the rules and what is interpreted and implemented. As the rules do not cover all the complexity of the real world, the actors who implement them must make decisions based on their implicit assumptions and on the resources available (Mahoney & Thelen, 2010). So, during implementation, they map processes and develop methodologies, protocols, and information systems; in short, they develop routines in which they become the greatest specialists (Lupia & McCubbins, 1994; Rose, 1991). This knowledge is institutionalized through capacity building and training of new generations of bureaucrats, who become important information providers. When policy actors use this kind of knowledge derived from implementation processes, we are going to assume that there is a "procedural use" of knowledge. Thus far, we propose that these four categories of use can be considered evidence of learning: enlightenment, political, instrumental, and procedural.

Considering different types of learning according to the decision-making contexts
Dunlop and Radaelli (2013, 2018) provide theoretical support for the expanded concept of learning we propose and also for the relationship between knowledge and learning through use. The authors argue that the context in which decision-making occurs allows different types of learning, considering two dimensions of analysis: the level of problem tractability and the authority/legitimacy of key actors associated with the problem, as shown in Figure 2. Reflexive learning occurs when the level of tractability of the problem and the certification of the actors are low. It means that the degree of uncertainty about the problem is high, that is, knowledge is not yet consolidated in the community. In this context, knowledge is used in order to deepen the discussion and to facilitate argumentation. Learning in this quadrant is generally profound and complex, as it should allow actors to explore their policy preferences, identities, and strategies (Dunlop & Radaelli, 2013).
For learning to happen, actors must be predisposed to hear what others have to say and reconfigure their preferences, and there must be institutional arrangements that favor socialization, deliberation, and co-production, among others. In this case, the open dialogue allows learning about social norms, the reconfiguration of the actors' identity and the definition of what is appropriate or not (Dunlop & Radaelli, 2018).
We understand that in contexts of reflexivity, learning can be evidenced by the enlightenment use of knowledge, since in this quadrant knowledge allows a deeper comprehension of issues that are not yet so clear and for which the specialists do not yet have answers. As it favors in-depth discussions about the issues, policy actors have an opportunity to reflect on their own beliefs and preferences along the policy process. This helps to explain, for example, why policy learning does not necessarily change values in the short term but exerts crucial pressure on those values in the long run.
Alternatively, epistemic learning occurs when the level of tractability of the problem is low, but the certification of the actors is high, that is, there are specialists with authority on the issue and legitimacy in the community. In this case, knowledge is disseminated by a group of experts with the aim of narrowing the discussion and achieving solutions to the problem (Dunlop & Radaelli, 2013). This is the context of evidence-based decision making, allowing learning about causal relations between evidence and outcomes (Dunlop & Radaelli, 2018).
Epistemic learning contexts favor the instrumental use of knowledge, since the specialists act as teachers for the decision-makers, offering knowledge to solve existing problems. Naturally, as the ACF assumes, specialists are not neutral, as they also have political judgements about policies, and knowledge can be used by coalition members in a biased manner, only as ammunition for the debate. We encourage researchers, however, to explore the instrumental use of knowledge as evidence of learning, especially amongst the members of minority coalitions.
Fenger and Quaglia (2016), for example, in a comparative analysis of learning in the regulatory sector of the European Union, United Kingdom and the Netherlands, show that learning was limited, because it resulted in changes in the policy instruments, but not in the policy objectives. They admit, however, that many stakeholders in the financial sector had pointed out the risks of a light-touch approach to regulation, but structural factors may have prevented the translation of these insights into more restrictive regulation. In the end, they suggest that "decoupling learning and policy change seems to be an important next step in understanding policy learning" (Fenger & Quaglia, 2016, p. 14).
Epistemic learning can also be facilitated in policy subsystems where scientists are the most interested actors in a specific policy and therefore the most active coalition members. Souza and Secchi (2014), for example, analyzed the role of the local scientific community in the formulation of Science and Technology policy in the State of Santa Catarina, in Brazil, and concluded that the scientific community played a prominent role in the formulation of the S&T policy, by occupying political and administrative positions in the government.
Learning through bargaining occurs when the level of tractability of the problem is high, but the certification of the actors is low. In this case, the public administration has a repertoire of solutions and ways of doing things, so it does not look for truths to solve problems, but seeks to understand the environment and the limits of negotiation. In this context, individuals learn about the extent to which policy preferences can be negotiated, which boundaries cannot be crossed (red lines) and the costs of cooperation, in terms of resource redistribution. For this to occur, it is necessary that the barriers to contract (agreements) and to the aggregation of preferences are low and that there are conditions that favor mutual partisan adjustment, transparently to those involved. It is important that the negotiation is permanent, and that winners and losers are not always the same, so as not to discourage interaction (Dunlop & Radaelli, 2013, 2018). We argue that learning through bargaining and social interaction favors the political use of knowledge, not in a pejorative sense of manipulation of knowledge to reinforce or refute arguments, but as a form of mutual understanding of the coalitions' preferences, the limits of negotiation and the possibilities of consensus. Therefore, the political use of knowledge is more likely to occur in this quadrant, precisely because there is room for competing visions and for negotiation, including among non-adversarial coalitions. Vieira (2020), analyzing the case of the Belo Monte hydroelectric power plant in Brazil, shows how the coalitions favorable to the power plant tried to build a common identity, sharing discourses on controversial issues and aligning interests and expectations.
Additionally, not only beliefs but also functional interdependence and resource dependence may influence the ways coalitions interact and cooperate (Weible & Sabatier, 2005). Elgin (2015), for instance, examining the Colorado climate and energy policy subsystem, confirmed that belief extremity had a strong positive effect on whether an individual interacted primarily with friends rather than with foes, but also concluded that individuals with lower levels of organizational resources were more likely to interact with foes (Elgin, 2015). A deeper look inside these interactions can be an opportunity to find evidence of learning about the policy subsystem itself.
Finally, learning in hierarchy occurs when the level of tractability of the problem and the certification of the actors are high, that is, the problems have a low level of uncertainty and risk and there are recognized experts on the issue. In this case, there is little room for reflection and bargaining, and knowledge is institutionalized in formal and informal processes, leading to the compliance of the actors. In this context, individuals learn about the scope of the rules, their flexibility, and the costs of non-compliance. For this to occur, the actors must be predisposed to compliance, whether by the logic of interest, normative or cultural (Dunlop & Radaelli, 2013, 2018). We suggest that learning in contexts of hierarchy favors a procedural use of knowledge, related to the maintenance of rules, procedures and roles already institutionalized. This is the kind of knowledge developed and used by public bureaucrats and technicians to maintain the complex routines involved in policy implementation, which requires constant training. Knowledge used as a way to reinforce existing processes contributes to the institutionalization of the rules and can be exploited both to avoid and to support the desired changes. Principal-agent problems can interfere in this process, as the bureaucrats' expertise can make them less affected by the contracts legislators design and by other institutional features (Lupia & McCubbins, 1994).
Box 1 presents a summary of the types of learning and their main characteristics, as discussed above.

BOX 1 CHARACTERISTICS OF THE TYPES OF LEARNING ACCORDING TO THE DECISION-MAKING CONTEXTS

| | Reflexive learning | Epistemic learning | Learning through bargaining | Learning in hierarchy |
| --- | --- | --- | --- | --- |
| What is learned | Social norms, the actors' identity, and what is appropriate or not. | Cause-and-effect relations. Connection between evidence and desired outcomes. | Policy preferences (limits and possibilities) and costs of cooperation (resource allocation). | The scope of the rules, their flexibility, and the costs of noncompliance. |
| Decision-making | Based on communication | Knowledge-based | Based on negotiation | Based on institutions and their ambiguities |
| Institutional and personal triggers for learning | Institutional arrangements that favor socialization, deliberation, and coproduction. Norms that privilege moral and not political or hierarchical competence. Predisposition of the actors. | Statutory rights of consultation. Pluralistic approach to the use of expertise. Normative entrepreneurs. Informal collaborative institutions. | Low barriers to contract and to the aggregation of preferences. Transparency that favors mutual partisan adjustment. Predisposition of actors to renegotiate subjects and swap arena. Permanence of negotiation and alternation between winners and losers. | Actors' predisposition to compliance, whether by the logic of interest, normative or cultural. |
| Hindrances | Cultures without deliberative tradition, in which agreement is regarded as loss of honor or reputation. | Fragmentation of the epistemic communities. | Consolidation of winners and losers. Low cost of defection, leaving the environment unstable. | Many veto players. |
| Knowledge use (a) | Enlightenment | Instrumental | Political | Procedural |

(a) Adapted from Weiss (1998).
Source: Adapted from Dunlop and Radaelli (2018).

In terms of institutional settings, the main ACF concerns remain. It is important to look at the coalitions' long-term opportunity structures. Studies are far from conclusive about the influence of the openness of the political system in facilitating learning. A lack of arenas for learning is evidenced both in authoritarian (Khayatzadeh-Mahani, Breton, Ruckert, & Labonte, 2017) and in democratic systems (Anderson & MacLean, 2015; Ulmanen, Swartling, & Wallgren, 2015). On the other hand, the experience of Guangzhou, China, is considered a landmark achievement in urban management and governance because of the creation of consultative committees, which made it possible to resolve social disputes and enhance communication between citizens and local government (Wong, 2016).
In terms of collaborative and adversarial subsystems, it could be said that learning would benefit from a collaborative subsystem, but this is not yet clear. Rietig (2018) showed how a collaborative policy subsystem shifted to an adversarial one after new scientific evidence emerged on the negative environmental and climate change impacts of crop-based biofuels. Finally, there may be some relation between contexts of learning and the maturity of the subsystems, but this remains to be explored. Theoretically, at least, it is reasonable to say that reflexive contexts and the enlightenment use of knowledge are only possible in mature subsystems. Further empirical studies that test the ACF learning hypotheses, especially those related to the level of conflict and the existence of forums, will certainly allow a better understanding of these issues.

POLICY LEARNING AND POLICY DYNAMICS: SEARCHING THE MISSING LINK
After uncoupling the apprehension of learning from change, it is necessary to understand how learning influences the policy process. To do so, we propose to observe the institutional dynamics, including the processes of stability and change as interrelated parts of the same development process (Capano, 2009; Real-Dato, 2009). More recently, the understanding of change as a rupture with stability has been reformulated. Mahoney and Thelen (2010) have an important role in this discussion, as they assume that changes also occur in a continuous and gradual manner. In a sense, the linear view of equilibrium punctuated by moments of rupture gives way to a more dialectical process of institutional change, in which the action of the actors in dispute is constant. Thus, both stability and change result from tensions in a permanent dispute over spaces of power. Dominant interpretations of the rules need to be constantly legitimized. They do not remain stable in a peaceful manner, but because they continually legitimate themselves in the face of alternative interpretations. This expands the possibilities of understanding the role of learning in the political process.
The first possibility is to observe learning as one of the explanatory factors of stasis. ACF studies tend to offer explanations for change, but not for stasis, which is usually understood as the result of the dominance of a certain belief system. As the dependent variable is change, when it does not happen, the same explanatory factors are considered absent: lack of shocks, lack of learning, or lack of resources. As discussed, however, since learning does not necessarily imply change, there is space to consider policy learning as one of the explanatory factors for stasis. This happens when the policy actors use what is learned and properly mobilize resources to keep the status quo.
The second possibility is to consider learning as one of the mechanisms that lead to major changes, something that has been little explored by ACF researchers. One of the criticisms made of the ACF is that it gives little space to internal movements through which coalitions manipulate external factors or generate internal crises that result in major changes (Capano, 2009). The theory of gradual change has shown, however, that internal movements of coalitions are important factors of power redistribution and, consequently, of institutional change (Mahoney & Thelen, 2010).
The shift in Brazil's health care system, from centralized and narrow to decentralized and universalistic, provides an example of this (Falleti, 2010). Although most previous analyses considered only critical-juncture explanations for institutional changes, like the nation's transition from authoritarianism to democracy, Falleti (2010) demonstrates that they were also the result of gradual changes, promoted by the sanitarista movement, which began "infiltrating" the State long before democratization. This study does not explicitly explore policy learning as a causal mechanism for change, but it offers considerable implicit evidence of the importance of learning in this process. As Falleti argues, when the constitutional convention discussed the new health care rules, the sanitaristas had already established both the networks and the expertise that put them in a position to exercise strong influence, so that the codification and institutionalization of the practices they had perfected over the years of military rule were politically feasible (Falleti, 2010, p. 58). This expertise was notably the result of long and intense learning processes, derived from the creation of preventive medicine departments in Brazilian universities, the implementation of health and sanitation programs, involving scientists and public officials, and the existence of forums for debate, like national health conferences.
Based on the previous discussion, Figure 3 presents a proposal to review the set of possible relationships between knowledge, learning and change evidenced by the ACF (Figure 1).

FIGURE 3 EXPLANATORY FACTORS OF INSTITUTIONAL DYNAMICS
This diagram includes all ACF relationships, but it opens space for others. Starting from the end, institutional dynamics include change and stasis as possible dependent variables to be explained. Considering stasis as a dependent variable requires paying attention to the coalitions' movements to keep the status quo. This has been done, for example, by researchers who combine the ACF with the Narrative Analysis approach, as narratives are as important in explaining policy change as they are in explaining policy stagnation (Menahem & Gilad, 2016; Mosley & Gibson, 2017). We suggest that learning also has this explanatory power, but its comprehension must be expanded to consider any kind of belief update.
The belief update keeps the dual meaning that the alteration in the belief system has in the ACF. It can be the result of a) a power redistribution, which can bring a minority coalition's belief system to a strategic position (individuals do not update their beliefs), or b) a learning process, when members of a coalition change their understanding of a policy problem and/or solutions (individuals update their beliefs). The difference lies in the latter, because the concept of learning as belief update also covers the situations in which the members of a coalition reinforce their policy understandings.
External and internal shocks continue to be important explanatory factors of policy change. Knowledge, however, gains power to promote learning processes related to any level of beliefs. Therefore, the proposal does not differentiate the explanatory power of external and internal shocks from that of knowledge, as all of these factors can lead to the update of the three levels of beliefs and influence the institutional dynamics. This update, as said before, can come about through changes in power distribution and/or learning, which can influence each other or occur simultaneously, as different pathways leading to the same outcome (equifinality).
Many different combinations of pathways that lead to a specific policy configuration are possible. In terms of learning, which is the focus of this paper, the general pathway considered in the ACF is still contemplated (knowledge → policy learning → minor policy change), but four other possibilities are added: the hierarchical learning that can inform policymakers; the epistemic learning that reinforces or reinvigorates existing beliefs; the bargaining learning that helps to reach consensus; and the reflexive learning that allows a reconfiguration of the actors' identity and of their way of seeing the policy problems and solutions. These possibilities only open up if we consider learning as belief update and look at stasis and change dynamically.
In sum, this new set of relationships is an attempt to open space for learning as a plausible and important pathway to change or stability, with and beyond the external and internal shocks. The explanatory factor 'negotiated agreements' is not explicit in the diagram, because we understand that it is a consequence of other factors incorporated in the process. It can be a result of an external shock, in which the power reconfiguration leads to new agreements or a result of learning, especially in bargaining contexts, when the coalitions' members, after learning about the political preferences of each other and the costs and limits of cooperation, are able to reach some consensus.

CONCLUSION
This essay aimed to offer some theoretical and methodological contributions to the discussion about the influence of knowledge and learning in policy change, a research gap pointed out by the scholars of the ACF. The presented proposal considered contributions from the literature on policy learning and policy change, taking the ACF as a baseline reference, on which the arguments were constructed.
Although policy change is a powerful and traditional variable to evidence policy learning, we tried to explore which other possibilities could help to capture learning in political contexts. For this, based on the works of Weiss (1998) and Dunlop and Radaelli (2013, 2018), we proposed to apprehend the relationship between knowledge and learning through the uses of knowledge, and to assume the possibility of four different forms of learning according to the decision-making context (reflexive, epistemic, bargaining, or hierarchical).
The notion of reflexive learning offered the possibility of understanding the power of learning to reconfigure deep core beliefs. The notion of epistemic learning helped to comprehend that learning may also happen when policy actors reinforce or reinvigorate their own beliefs. For this, it was crucial to widen the concept of learning as commonly understood by ACF researchers (belief alterations) to an update of beliefs. The notion of bargaining learning helped to clarify that, during the negotiation process, the members of coalitions can also learn and produce consensus, even if this does not result in policy change. The notion of hierarchical learning gave space to all the knowledge produced during policy implementation, which makes public bureaucrats important information providers for policy decision makers. Finally, the notion of institutional dynamics helped to highlight the explanatory power of learning in situations of both stasis and institutional change.
Therefore, in order to guide the application of the model in future studies, we present the following propositions, which summarize the preceding discussion:
1. Contexts of reflexivity are more prone to learning by the enlightening use of knowledge, while in bargaining contexts, learning is more likely to be observed in the political use of knowledge.
2. Epistemic contexts enhance evidence-based learning, revealing an instrumental use of knowledge, while in hierarchical contexts, learning is more likely to be observed in the procedural use of knowledge.
3. Learning involves an update of the beliefs of individuals and influences the policy dynamics, both to promote changes and to avoid them.
Methodologically, when we propose to consider the uses of knowledge in different decision-making contexts as evidence of learning, we understand that this kind of evidence can be more or less difficult to find, depending on the data available in the policy subsystem investigated. This is why we recommend focusing on organized or systematized knowledge, embodied in some kind of product that can be used as evidence of knowledge. We also understand that the four decision-making contexts can occur simultaneously in the same subsystem, which may require attention to different and dispersed kinds of data and to different methods of analysis.
Finally, we hope that these theoretical adjustments make it possible to capture elements of the policy dynamics that may be overlooked in the analysis of how coalitions use knowledge, learn, and mobilize resources to achieve their goals. Without disregarding the explanatory power of the shocks and the distributive character of the conflicts between the coalitions, we hope that these contributions may help to diminish the skepticism present in the literature about the influence of learning in the policy process.