A GENERIC TACTICAL PLANNING MODEL TO SUPPLY A BIOREFINERY WITH BIOMASS*

The supply chains which bring biomass to biorefineries play a critical role in biofuel production. Optimization models can help decision makers to design more efficient chains and minimize the cost of biomass delivered to the refineries. This article based on a French national research project on biomass logistics considers one refinery, able to process several crops and several parts of the same crop, over a one-year horizon divided into days or weeks. A network model and a data model are first developed to let the decision maker describe the supply chain structure and its data, without affecting the underlying mathematical model. The latter is a mixed integer linear program which combines for the first time various features, either original or tackled separately in the literature. Knowing the refinery demands, it determines the activity levels in the network (amounts harvested, baled, transported, stored, etc.) and the required equipment, in order to minimize a total cost including harvesting costs, transport costs and storage costs. Numerical evaluations based on real data show that the proposed model can optimize large supply chains in reasonable running times.


INTRODUCTION
The three last decades have seen a growing interest in the potential of biofuels as a means of reducing dependence on fossil fuels and in the development of clean and renewable energy.In this context, efficient logistic chains are essential to supply biorefineries regularly, reliably, and with sufficient quantities of quality biomass at a reasonable price.This latter aspect is particularly important since a significant percentage of biomass cost at the refinery gate lies in logistic costs.
Optimization models and algorithms constitute precious tools to design a biomass supply chain, evaluate possible structures a priori, and quantify the resources required, the associated costs, the energy consumptions and the environmental impacts.Beyond a simple simulation of a future reality, design choices can be rationalized and optimized.Moreover, a good model allows to anticipate the impact of important investments and thus reduces the risk of wrong decisions.However, such an optimization raises many theoretical and practical issues, such as the relevance of the mathematical model and the need for reliable data.In particular, the supply chain must be carefully analyzed to come to a model as close as possible to reality.Indeed, a biomass supply chain is a complex network, controlled by a huge number of parameters and decision variables.For instance, it is necessary to determine the types of biomass to mobilize, the production areas, the locations of refineries and their capacities, the associated storage sites, the required harvesting and transportation equipment, etc.This kind of chain often covers a vast territory and involves a time horizon of several months, to take into account seasonal fluctuations in supply and demand.Moreover, numerous constraints must be satisfied, such as harvesting windows, biomass degradations, resource capacities, and refinery demands.This article presents a mixed integer linear program (MILP), developed in the frame of a French national project to optimize the biomass supply chain of one biorefinery with a typical supply radius of 50 km.This multi-biomass and multi-period tactical planning model combines several features, either original or considered separately in the literature.It determines for each period the amount processed by each activity (harvesting, packing, transportation, preprocessing...), the flows in the logistics network, the required equipment and the stock levels.The goal is to satisfy the needs of the refinery while minimizing the total cost of the system.The modeling approach is generic and flexible enough to be extended to most biomass-to-bioenergy supply chains at the tactical decision level.This remainder of the paper is organized in four sections.Section 2 briefly recalls the principles of biomass supply chains (structures, activities), reviews the literature on tactical optimization models for biomass supply, and position our problem vis-à-vis the existing works.Section 3 introduces the network model used to describe biomass supply chains.The data model is quickly presented in Section 4. Its aim is not only to encode the network model, but also the other required information, such that refinery demands and equipment characteristics.The mathematical model is defined in Section 5. Section 6 is devoted to case studies while concluding remarks are brought in Section 7. generating electricity, heat, and/or cooling.The distribution stage from conversion facilities to end-users is sometimes added but the actors involved in the upstream and downstream parts are very different, which makes a global optimization extremely difficult (for instance, reliable data are hard to find).Compared to industrial supply chains, several differences must be underlined: • Biomass supply chains cover a vast collection territory, with many scattered cultivation areas.
• Long planning horizons must be considered, because most crops have a one-year cultivation cycle.
• Because of degradations, the crops cannot wait and must be harvested quickly when ready.
Both industrial and biomass supply chains can be modeled as networks (graphs) where the nodes correspond to the locations of interest while the arcs represent product flows.However, biomass supply chains involve specific activities that require various resources: • Harvesting activities are possible in a limited time window at the input nodes devoted to biomass production (farms), when the crop is ready, and they compete for a limited fleet of machines like combine harvesters.
• Storage is required in practice to synchronize the biomass production calendar with the production plan of conversion plants.It can take place in the fields or forests as simple stacks, in the farms, in centralized storage sites, or before the processes in conversion facilities.
• Pre-processing is useful to improve preservation (drying) and handling (baling, pelletization) and to reduce transportation costs by increasing density.The simplest treatments like baling can be done on the field.Stronger compressions and other transformations like torrefaction are possible, but using heavier equipment and/or dedicated sites.
• Transport.Like in industrial logistics, several transport modes can be used, the fleet of vehicles is often limited and the number of travels per period is restricted by vehicle range and driving time regulations.Road transport is often the only solution for biomass production areas with limited accessibility (forests), and truckload operations are systematic due to the large amounts handled.
The designers of such chains need modeling tools to cope with this complexity.Before coming to a total cost, they must understand the dynamics of the chain and determine many variables, like the amounts harvested (which crop, where, when, in which amount), the network flows (quantities transported), the advisable stock levels, and the resources consumed (harvesting equipment, vehicles, energy, manpower).Subtle tradeoffs must be found.For instance, densification can be done on the field, using balers, or in a more efficient densification plant.The second option adds a transport step to the plant, but then transport costs are reduced, due to higher density.

Literature review
The literature on biomass supply chains is exploding but covers very different topics, such as the design of new harvesting machines, densification techniques and biomass degradation issues.The papers devoted to quantitative modeling represent only a subset and involve various approaches which include simple cost calculations (spreadsheet-based), geographical information systems (GIS), performance evaluation (e.g., simulation), and mathematical optimization.
As even a review restricted to optimization models would take too much space, we prefer to recommend two recent and up-to-date surveys, and then cite only a few representative examples of research works which address, like ours, tactical planning problems.The reader will find in the two surveys other examples of optimization models dealing with strategic or operational decisions.
The first recommended review (De Meyer et al., 2014) addresses deterministic optimization models and solution methods for biomass supply chains.99 publications from 1997 to 2012 are analyzed and classified according to the decision level at hand, the objective to optimize, and the kind of mathematical model used.A general description of biomass supply chains for bioenergy is presented, with the decisions related to their design and management.The authors underline the fact that most publications optimize an economic objective and use a commercial solver to solve the models.Very few dedicated algorithms like metaheuristics (e.g., genetic algorithms) are developed up to now.
We recently published in Renewable Energy a complementary survey (Ba et al., 2016) which comments 124 references, including 72 after 2010.In addition to the deterministic optimization models of the previous review, it covers simple cost calculation models, GIS-based techniques, simulation models, stochastic optimization, and multi-objective approaches.
Tactical models involve medium-term decisions, typically over a one-year horizon divided into elementary periods (days, weeks or months).In biomass supply chains, the decisions in each period concern for instance the amount of each type of biomass to harvest, the stock levels, the product flows in the network and the required number of harvesting machines and transportation vehicles.The published models are designed to supply biorefineries, but also heating and power plants.Some of them include strategic decisions, like the optimal location of refineries, but the exact schedule of activities in each period is never addressed since it concerns the operational decision level.2010) extend the previous study to different transport modes.The objective is to identify locations for refineries, transport modes to use, transport planning and biofuel production scheduling, to minimize the total cost for delivering the fuels to end-customers.
Shabani & Sowlati (2013) focus on the supply of a power plant with forest residues (bark, sawdust, pruning products), purchased from different producers.They propose a mixed integer nonlinear program, in which nonlinearities are induced by two complex constraints that bind the types of biomass consumed, their quantities, their moisture content and the amount of electricity that can be obtained.For 6 suppliers over 12 months, the model has 260 variables and 333 constraints and can be solved using the Outer Approximation (OA) algorithm.

Position of our problem and closest works
Our mathematical model, a MILP, is described in Section 5 but we can already underline its features: • The chain supplies a single refinery, already located and whose biomass demands are given.
• It ranges from the fields to refinery gates but ignores cultivation systems and refinery processes.
• The activities in the chain are planned over one year divided into periods of one day.
• The biomass producers can be selected in a given set of farms.
• The model can switch to larger periods (e.g., weeks) and larger production areas (e.g., counties).
• Several parts of the same crops can be employed, e.g., seeds, straw and chaff for rape.
• Several harvesting chains are possible for the same crop and the equipment can be selected for each task of these chains (for instance, one tractor among several commercial models).
• Shared stocks (containing several products) and storages with opening periods can be dealt with.
• A network model and a data model make the MILP independent from the supply chain structure.
No published model includes all these features simultaneously but we comment below the three closest papers, which share several characteristics with our contribution:

NETWORK MODEL
The network model is a graphical description of the supply chain, from which the equations of the mathematical model can be automatically generated.Its aim is to enable the user to describe a wide range of supply chains, without having to modify the underlying mathematical model.We selected as a basis a modeling framework called state-task network, which is more precise than a classical graph.A state represents a feed, an intermediate, or a final product.Symbolized by a circle, it can correspond to a true storage, a work in process (WIP) or even a product in rapid transit between two manufacturing units.A task, represented by a rectangle, is a processing operation which transforms one or more input states, in fixed proportions, into one or more output states, also in fixed proportions.An arc corresponds to a flow of a single product, always from a state to a task or vice versa.A state can be produced and consumed by one or more tasks.In each period, each task pulls simultaneously the required materials from its input states and sends the resulting products to its output states.A processing time in number of periods can be specified on each outgoing arc.

State
Figure 1, adapted from Kondili et al. (1993), illustrates the flexibility of STN on a petrochemical process with 9 states, 5 tasks and 1-hour discrete time slots.A storage capacity, sometimes unlimited, is indicated in each state.The percentages before a task define the proportions of the recipe, while the ones after a task specify the output yields.These percentages are implicitly equal to 100% when omitted, for instance for the tasks with a single input state.We can notice that the STN can model tasks with multiple inputs and outputs (Reaction 1), loops (Reaction 3 → Intermediate Y → Separation → Intermediate Z → Reaction 3), and processing times.For instance, the input products consumed by Reaction 2 in period t yield a new product available on output in period t + 2. Note that changing a single characteristic of a product (state) makes it another product, e.g., "Feed A" becomes "Hot A" after heating.
Last but not least, the same authors show that a MILP can be generated from the STN, to determine the timing of operations for each resource and the product flows through the network, so as to optimize a given objective function.All these features make the STN a powerful modeling tool for multi-period planning problems over discretized time horizons.Only simulation models can be more precise, but it is difficult to use them for optimization purposes.

Application to biomass supply chains and example
As mentioned in subsection 2.3, the STN has been already applied to biomass supply chains in two works, Dunnett et al. ( 2007) and Samsatli et al. (2015).The first one brought the following extensions to the original STN, designed by Kondili et al. (1993) for process industries: • Transportation tasks are added, the product before and after such tasks giving two distinct states.
• The final product (heat) is also considered as a state.
• Humidity is an additional attribute associated with the states.
• Tasks can be period-specific, e.g., the ambient drying rate is higher in August than in January.
• Utilities (manpower, gasoil, etc.) are defined for each processing unit to deduce CO 2 emissions.
Samsatli et al. ( 2015) added another feature in their Resource-Technology Network (RTN).The RTN is in fact similar to the STN (the resources correspond to the STN states and the technologies to the STN tasks) but it becomes possible to select different modes for each technology.
Figure 2 shows how a more complex supply chain like the one considered in our study can be represented as an STN.We add to the basic STN an upper level, the geographical locations, to model the set of states and tasks of each important site (gray rectangles).Three locations are considered: one farm producing rape (colza), one centralized storage with one silo for seeds plus one platform for bales, and one refinery with buffer stocks.The sites are composed of states S e (stocks in a broad sense, including the rape field S 1 ) and tasks T i .Each state contains a single product P k and is either a cultivated field, a work in process (like the straw on swath) or a true stock (silo for instance).
Only one crop is considered to keep the figure readable, but it is easy to add other crops.The whole rape (product P1) is cut by a combine harvester (task T 1 ) which yields three products: seeds (product P2), chaff (small particles, P3) and straw (stalks, P5).Each product is defined by a crop of origin (here, rape only), a crop part (seeds, chaff, straw), a presentation (bulk or square bales) and a moisture content.
The seeds are discharged in a trailer drawn by a tractor and brought immediately to the silo.The straw, available after 3 days of drying on swath, is packed in square bales using a tractor pulling a baler.These bales temporarily placed on the ground are either kept in a shed on the farm or sent to the refinery by a semi-trailer truck.The chaff is collected by the harvester in a special compartment which is emptied on the ground, giving a loaf which is then taken over by a square baler equipped with a special feeder.Chaff bales are carried by a tractor with a flatbed trailer, to be stored with the straw in the same shed.Then the same transport mode transfers the straw and chaff bales to the platform of the storage site.Finally, semi-trailers transport the products from the on-farm shed and the centralized storage to the buffer stocks of the refinery.
The STN indicates the storage capacity in each state, the distances (often negligible inside the farm) for each transport task, and the required resources (over each task).The only separation process is here the harvesting task, while the only blending process is the refine-task.Percentages are only indicated on the outgoing arcs of the harvesting task.They are implicitly equal to 100% on the other arcs.The refine-task is particular because its inflows (biomass demands) are given period per period: the percentages are not mentioned since they depend on the period.
The only notable wait is the passive drying of the straw on the ground (3 days) on exit from the harvesting task.This drying could also be modeled by a dedicated task.For the other tasks, there is no waiting time, which means that the inputs consumed in period t are transformed into output products which can be used in the same period.
This example shows that the STN describes well the structure of the supply chain and its different activities.However, this graphical representation cannot be directly interpreted by a computer and many auxiliary data must be added, for instance the number of periods of the planning horizon, the length of a period in days, the costs and throughputs of equipment, the density of the different products to compute their transported volumes, etc.The role of the data model presented briefly in the next section is to encode the STN and all other required data as database tables.

DATA MODEL
The purpose of this section is to explain, without entering into details, how the STN and all auxiliary data can be structured efficiently as a set of database tables.Indeed, the data management layer is a critical component in optimization software, although it is often under-estimated in OR papers.We first present the list of tables used, the special way in which resources (equipment) are handled, and the list of data symbols from which the mathematical model of Section 5 is automatically generated.

Tables used
We selected the entity-relationship approach to analyze data.An entity is a set of things or persons sharing common properties, called attributes or fields.A relationship describes the way the entities are linked.For instance, states and tasks represent two entities while state-to-task arcs constitute a relationship, more precisely a subset of the cross product of states and arcs.A relationship may have its own attributes, for instance an arc may have a length and a transported product.
Entities and relationships can be implemented as a set of tables stored in a database.We selected Excel, which is simpler and lighter than a true database management system.Each entity gives one Excel worksheet with one column per attribute and one row for each member (record) of the entity.The identifier (ID) or key is a mandatory attribute to identify each member of an entity unambiguously.All tables for one run of the model are stored in one Excel workbook.The user can evaluate different scenarios by modifying existing workbook or creating new ones.
The proposed worksheets are listed in Table 1.They include those describing the state-task graph: STATES, TASKS, ST ARCS (state-task arcs) and TS ARCS (task-state arcs).The biomass is described by three worksheets: CROPS, CROP PARTS and PRODUCTS.What we call a product is defined by one plant of origin, one part of this plant, and one form (e.g., bulk, bales, or pellets).
The other worksheets describe the resource management system, described in Section 4.2.Such a data model is very flexible and new attributes can be easily added in each entity.

Resource management system
Our resource management system deserves some explanations.We call resources the motorized equipment (harvesters, tractors, trucks...) and their accessories (rakes, balers, trailers...) used by the tasks.The table RESOURCES contains various commercial models with their characteristics, e.g., the productivity in hectares per hour (for a harvester), or the average speed (for a truck).
The resource families (FAMILIES worksheet) are categories of equipment with similar functions but which do not refer to specific models.Examples of families are mowers, combine harvesters, balers, trucks, farm trailers (drawn by a tractor), truck trailers, etc. Families serve in the LOCA FAM worksheet to limit the number of equipment which can be employed on each site, for example no more than two tractors on a farm.Each tractor can be any commercial model defined in RESOURCES.
A resource combination (worksheet COMBINATIONS) is a collection of resources working together for a task.For instance, the transport task T3 in the state-task network of Figure 2 involves one tractor and one flatbed trailer.Combinations are necessary to properly calculate resource working times, because the productivity of a resource depends on the combination that uses it.In France for instance, the maximum speed allowed for a tractor on the road depends on Pesquisa Operacional, Vol.38(1), 2018 the size of its trailer.The combinations have the same attributes as the resources but, because of special cases like our tractor example, only an expert in agricultural machinery can specify which combinations are possible and compute their attribute values, in order to fill the worksheet COMBINATIONS.
Finally, the combination groups or generic combinations (worksheet GROUPS) play for the combinations the same role as the families for the resources: the exact characteristics are not specified.For example, a 200 hp tractor with a 15-ton flatbed trailer is a resource combination belonging to the combination group "tractor + trailer".
In fact, the state-task graph specifies one group for each task, so the optimization module is free to choose any combination of real equipment in COMBINATIONS.The resource level is mainly used to cumulate the working hours of each selected equipment (since the same machine can be used by several tasks and in several combinations) and estimate fleet size.

Data and variables used by the mathematical model
The data contained in the Excel workbook are listed below.The database contains other fields not described here, e.g., to format the results (full site names) or prepare the distance matrix (GPS coordinates).The general parameters listed in Table 2, like the number of periods, are loaded from worksheet PARAMETERS.As refinery demands are often specified per week or half-month, we introduce a longer period Re f Per for the refinery.Its use is explained in subsection 5.3.Large positive constant to initialize some fields to infinity K gCO2 Kg of CO 2 per liter of gasoil, to compute CO 2 emissions Table 3 gives the indexing sets.MILP solvers can read Excel files with one row per record and one column per attribute, including one for the identifier (key) of each record.These keys which may be integers or character strings are gathered in a set used to allocate dynamic arrays.We used character strings to improve model readability.For instance, if the solver finds 8 products in worksheet PRODUCTS, it stores their keys in a set P I D and creates arrays of 8 elements, indexed over P I D, for product attributes like density.A model generated in this way is said data-driven: data and model equations are completely separated, and even set cardinalities are discovered by the solver when loading the worksheets.Figure 2 contains examples of location, product, state and task identifiers.Note that a few indexing sets do not correspond to worksheets listed in Table 1.RC R Pairs is required to define the different resources involved in each resource combination, while the last sets as from Z are introduced to improve model readability.

MATHEMATICAL MODEL
This section introduces the equations of our mixed integer linear program in this order: declarations of variables, inventory and flow constraints, satisfaction of refinery demands, resource constraints, objective function, computation of energy consumptions and CO 2 emissions.Ratio between the flow on the arc and the total flow consumed by the task STY ield e,i Yield in metric tons per hectare (used only for the outgoing arcs of Field states)

For each task-state arc (i, e) ∈ T S A read from worksheet TS ARCS T S Frac i,e
Ratio between the flow on the arc and the total flow consumed by the task  Number of units of resource r (fleet size) required on site l

Declaration of variables
A general principle is to restrict the indexing of variables listed in For each task i and each period t , the variables Amp (amount processed) are defined only for the resource combinations c which may be used by the task, see constraints (4).The Re f ine task of the refinery is not concerned since its amounts processed are fixed and equal to the demands.
In constraints (5), the integer variables for the number of vehicle rotations are generated only for transport tasks and resource combinations usable by each task.
Pesquisa Operacional, Vol.38(1), 2018 Finally, the integer number of units of resource r to purchase for geographical site (location) l is defined only for the resource families allowed on the site.

Inventory and flow constraints
Storage capacity.We added an attribute S Rep e to model a group of states as one shared storage, for instance a platform storing bales of different products.In the data, one representative r is defined for the group and all members e of the group (even r) are such that Definition of flows on task incoming arcs.Constraints (12) specify the fraction of the total input flow taken on each incoming arc.For the Re f ine task, they are replaced by demand constraints, see 5.3.
∀ (e, i) ∈ ST A | T T ype i = "Re f ine , ∀ t ∈ [S Beg e , S End e ] : X e,i,t = ST Frac e,i • (e, j )∈ST A| j =i X e, j,t (12) Flow conservation for each task.In constraints (13), the left sum is the total flow consumed by task i, taken in parallel on all incoming arcs (e, i).It is multiplied by the sum of T S Frac, which can be less than 1 if the yield is not perfect.The resulting flow appears on outgoing arcs (i, u), T S Dura i,u periods later.For the refinery, these constraints have a different form explained in 5.3.
Final inventory constraints.They are generated for the last period of the opening period of each stock e if a final stock is specified.Constraints (14) for minimum final inventories are mainly used for the input stocks of the refinery, to chain successive years without having to stop activities.Constraints (15) deal with maximum final inventories.For instance, S Q End Max e = 0 is used to empty a stock after the last period S End e of its window.

Demand satisfaction constraints
The supply chain is analyzed over base-periods of 1 or 7 days.Refinery demands use a longer period, in general 1 or 2 weeks, equal to Re f Per base-periods.Moreover, they are expressed in dry tons of some crop parts.The exact products are not imposed, e.g., a demand of 185 dry tons of rape straw can be satisfied using 100 tons of bales with 10% humidity and 100 tons of pellets with 5%.
As refinery production variations are unknown, we assume that the need in dry tons for crop part k in refinery period w is consumed at a constant rate in each base-period.For instance, a demand of 70 tons in week w is consumed at a rate of 10 tons/day.Note that this does not imply to receive the same amounts daily, since the refinery has input stocks for 2 weeks of production, on average.
• X e,i,t = D N eed w,k /Re f Per (17) In constraints (17), the demand D N eed w,k for refinery period w and crop part k covers baseperiods (w − 1) • Re f Per + 1 to w • Re f Per.It is defined only for the pairs (w, k) contained in set D Pairs.This set was introduced to reduce memory requirements: indeed, most products are requested a few weeks or months per year.The sum cumulates the flows of products X e,i,t (converted in dry tons) for crop part k, from the input stocks in each base-period of the refinery period.It must be equal to the uniform consumption D N eed w,k /Re f Per in the refinery period.

Resource management constraints
The principles of our resource management system were introduced in subsection 4.2.Recall that in worksheet TASKS a group of resource combinations is specified for each task i, e.g., "one tractor + one trailer".The task may use several combinations in the same group, e.g., one 200-hp tractor with a 6-meter trailer, plus one 240-hp tractor with a 8-meter trailer.
Constraints (18) mean that the total input flow of each task i in each period t must be processed by the resource combinations c allowed for this task.

∀ i ∈ T I D| T T ype
The next step is to convert the amounts processed in resource combination working times.For farm tasks H arvest , Rake and Bale (set of task types Z H RB), constraints (19) compute the variables T C H RB i,c,t (working time of combination c for farm task i en période t ), which are necessary to count the operational costs of farm equipment (Farm OpCosts) in the objective function.We have two cases because the productivities RT hru are in hectares per hour for H arvest and Rake (set Z H R) tasks but in bales per hour for Bale tasks.We get metric tons per hour using the crop yield for H arvest and Rake tasks, and the volume of a bale and product density for Bale tasks.
In constraints (20), the working time T CT i,c,t for a transport task is based on the number of vehicle rotations RT i,c,t , the distance travelled Dist i and the average speed of the combination RC S peed c .
In constraints (21), the working time T R l,r,t of resource r on site l in period t is deduced from the working times of combinations in which the resource intervenes.Here we do separate constraints for H arvest , Rake and Bale tasks from the ones for T ransport tasks, because a tractor (for instance) can be used by all these tasks.
To help the solver, simple bounds are applied to the integer variables RT i,c,t , the number of rotations in period t for the resource combination c used by task i, e.g., a semi-truck with a road trailer.In the lower bounds of constraints ( 22), Amp i,c,t is divided by the minimum of vehicle capacity and maximum weight (resulting from vehicle volume and product density).As the product is defined for states but not for tasks, the sum is used to get the state e of origin, which is unique for a transport task.The quotient cannot be rounded up to the closest integer if we wish to keep the model linear.
RT i,c,t cannot exceed the previous lower bound plus 1.For instance, if the quotient in ( 22) is 6.3, the integer variable RT i,c,t must be greater than 6.3 and less than 7.3: the solver has no choice and sets it to 7.3.If the quotient is integer, say 6, the solver has the choice, but only among two values, 6 and 7.
In constraints (24)-( 25), the number of units of resource r required on site l, F S l,r , can be bounded using the maximum resource working time and the number of opening hours of the site (per period).
Finally, the total number of resource units per family is limited via constraints (26).These constraints were introduced after preliminary tests where all fields were harvested in too few periods, implying an unrealistic number of combine harvesters.

Objective function
Over the planning horizon, the objective function defined by Equation ( 27) includes operational costs of farm equipment (FarmCosts), transport costs (T ranCost s), fixed costs (capital costs) of resources (FixedCosts), inventory costs (I nvCost s), and handling costs (H andCost s).
In Equation ( 28 T ransCost s is deduced in Equation ( 29) from working times T CT i,c,t computed in constraints (20).
A storage cost is applied to the amount contained in each state e at the end of each period t .In general, Field and W I P states have no storage cost.Some true storages like silos can be used only a few weeks or months per year for the biorefinery.The rest of the time, they are reserved to other activities like cereal production.Hence, the storage costs in equation ( 31

Derivation of energy consumptions and CO 2 emissions
Environmental criteria are important to appraise a biomass supply chain.The following equations are used to derive the energy consumption and the CO 2 emissions from the optimal solution to the mathematical model.They can also be included with adequate weights in the objective function (27).
For each resource combination is given a power RC Power in kW.In general it corresponds to the motorized component of the combination, e.g., the tractor for a "tractor + baler" combination.It can be also the power of electric equipment such as an oven or a press in a pretreatment/densification site.In Equation (33), the total energy consumption in kW×h is simply deduced from the powers RC Power of resource combinations and their working times, i.e., T C H RB i,c,t for harvesting equipment and T CT i,c,t for transport vehicles.

NUMERICAL EVALUATIONS ON REAL DATA
The effectiveness of our MILP model based on extensions and adaptations of state-task networks is demonstrated on a few numerical examples based on real data.Only the first example can be described completely, due to the large amount of data involved in the other tests.

First example
This case involves 10 production zones ("farms" in a broad sense) which cultivate rape (colza) and/or miscanthus and 3 centralized storage sites, at 50 km maximum from one local refinery close to the city of Compiègne (60 km at the north of Paris).These sites are listed in Table 6.
The three first columns respectively give the location (geographical sites) identifiers, the site types and the municipalities.We selected a total demand of 50,730 metric tons/year of dry matter, a typical value for a local biorefinery.Rape yields three products (bulk seeds, straw bales, chaff bales) while miscanthus provides straw bales only.
The planning horizon is one year divided into 52 weeks of seven days each, giving 364 elementary time periods.The annual demands in metric tons for each product are given in Table 7, in reality they are specified for each refinery-period of one week.Each farm grows rape and/or miscanthus.The harvesting chain and harvesting windows applied to each farm are given in the two last columns of Table 6.Chain 1, used by farms 5 and 10 which produce rape only, is identical to the chain already depicted in Figure 2. Chain 2 (see Fig. 3 for an example) is employed in farms 6 and 11 which cultivate miscanthus only.The other farms produce both crops using chain 3, a combination of chains 1 and 2 with a shared on-farm storage for all baled products (rape straw, rape chaff and miscanthus).The earliest windows, ending at period 144 or before, are related with miscanthus, harvested in March-April.Those beginning as from period 189 (beginning of July) concern rape.The windows vary slightly depending on soils but all end beg-October.The refinery and centralized storages (sites L1-L4) are open all over the year.
It is assumed that each farm may ship biomass to the closest storage site and to the refinery.Real distances by road were calculated using the MapPoint software from Microsoft.Data for the equipment were found in agricultural databases, see for instance [18].
The mathematical model has been implemented under the OPL-STUDIO version 12.6 modeling environment from IBM (which calls the CPLEX solver in version 12.6 too) and solved on a Figure 4 shows the breakdown of the expenses.The fixed costs of equipment are included in the harvesting part.Storage costs represent 59%, while they are often under-estimated in the literature.In fact, this percentage is normal because the harvest periods are short (about 15 days for each of the two crops), while the refinery consumes biomass throughout the year: it is therefore necessary to build up important stocks.In particular, initial stocks must be sufficient to supply the refinery before the harvesting periods for miscanthus and (later) rape, to avoid infeasibility.Pesquisa Operacional, Vol.38(1), 2018

Simplifications for large instances
The model can be solved up to 80 farms and 20 centralized storage sites, over one year divided into 52 weeks.Such a size is already respectable but we found one example with 100 farms/25 storages which cannot be solved over 52 weeks: the branch-and-bound of CPLEX reports an "out of memory" error.The problem clearly comes from the large number of integer variables, F S l,r and RT i,c,t .
The model computes the number of machines F S l,r to buy for each site.This leads to a total cost where equipment investments are predominant.In reality, a farmer uses also his machines for other crops not asked by the refinery and may help other farmers or be helped.So, it is difficult to isolate the equipment cost fraction related to the refinery supply chain.The French Chambers of Agriculture regularly publish tariffs to invoice such exchanges [18].The main interest is to provide costs which include everything, even maintenance and equipment amortization.It becomes possible to optimize a total cost, composed of the resource working times of constraints ( 19)-( 20), multiplied by the hourly tariffs.The modification of the objective function is straightforward and variables F S l,r can be suppressed.However, if the user wishes so, it is still possible to derive the equipment required on each site from the optimal solution, using constraints (24)-( 26).
The numbers of vehicle rotations RT i,c,t used to compute transport costs can also be avoided using a simpler system in tons × km, often applied when a refinery subcontracts its transports.The French Federation of Road Transport (FNTR) (http://www.fntr.fr)gives formulas for various trucks.
We have solved again the 10-farm case with 364 days to evaluate the impact of these simplifications.The numbers of variables and constraints are roughly divided by 2 and, obviously, there are no more integer variables.Before the pre-solve, the model has 265,245 variables and 212,760 constraints.The pre-solver goes down to 26,580 variables and 9,791 constraints.The running time drops from 164 seconds to less than one second.The solution obtained is very similar to the one of the complete model, with an important cost difference equal to the equipment fixed costs since they are no longer counted.Looking at the other costs, a marginal difference of only 0.53% can be observed.
These simplifications also allowed to solve the case with 100 farms, 25 storages and 52 weeks which raised an "out of memory" error for the complete model.The simplified MILP has 162,074 variables and 203,721 constraints.Once pre-solved, only 8,339 variables and 5,839 constraints remain.The running time reaches 8,158 seconds (2 hours 1/4) but this duration is acceptable for strategic studies.By dichotomy, we were able to solve instances with 150 production zones, but not more.
To go further, it is important to notice that model size mainly results from the states and tasks of harvesting chains, which are applied to hundreds of farms.In comparison, the states and tasks for the refinery and centralized storage represent around 10% of the STN nodes.Hence, we tried a formulation where all the tasks and states of a harvesting chain, between the Field state and the farm storage, are aggregated to give a "macro-task" H arvest .An example is given in Figure 5.The stock-states before leaving the farm must be kept, otherwise the macro-task would be connected to transport tasks (remember that two tasks cannot be directly connected in an STN).
Macro-tasks are similar to the tasks considered before, with three important differences.
• A classical task may use one resource combination only, while a macro-task may involve several combinations, in fact, all resource combinations required by the aggregated tasks.
• In a classical task, the resource combination used processes 100% of the total input flow, while in a macro-task it can be applied to sub-products.For instance, Figure 2 shows that rape yields 18% of chaff, which explains that the combination "tractor + square baler" used to bale chaff in Figure 5 treats only 18% of the total amount of rape harvested on input.
• Waiting times are still possible for some outputs, see the straw released after 3 days of passive drying in Figure 5, but feasible solutions with pauses inside the harvesting process are now excluded, e.g., when the farmer harvests during one day, stops the second day, and terminates the job the third day.However, we have never seen such useless breaks in the optimal solutions of the complete model and, anyway, they do not correspond to agricultural practices: indeed, due to meteorological uncertainties, farmers prefer to work continuously once they start harvesting.
The condensation of harvesting chains into macro-tasks must be done manually and carefully, but then the database and model modifications are relatively easy.The model encompassing all these simplifications, called "compact model" is applied to a large-scale case in the next subsection.

Large-scale example
To appraise the performance of the compact model, a large case was built after a long work to collect data and format them for our database.Although many detailed data at the farm level are protected in France by the statistical secret, the main results per municipality of the 2010 Agricultural Census are publicly available, including the number of farms, the cultivated area, and the dominating profile, e.g., "cereals and oilseed crops", "viticulture", etc.
The web site http://agreste.agriculture.gouv.fr/chiffres-cles-4/commune/provides for instance one file for the 2,292 cities of the Picardie Region (north-west of France).To build a realistic scenario, we selected in this file the cities with a profile compatible with rape, i.e., with codes 15 ("cereals and oilseed/protein crops") and 16 ("field crops of general type").Among them, we kept the 460 cities at 50 km maximum from Compiègne, where a local biorefinery is foreseen.
The "farms" correspond to the resulting cities, as the data for the underlying 2,146 real farms are not accessible.To estimate the harvestable amount of rape, the cultivated area of each selected city was divided by 3, assuming a classical crop rotation rape-wheat-barley.The resulting area was finally multiplied by the average rape yield in the region, i.e., 6.95 tons/ha.Rape is obtained via the harvesting chain of Figure 2.
Pesquisa Operacional, Vol.38(1), 2018 A partner of the research project has sent questionnaires to the operators of centralized storages, again within a maximum distance of 50 km to Compiègne.The answers were typed in an Excel file giving the following data for 46 storage sites: kind of storage (silo or platform), capacities in tons for silos and square meters for platforms, storage costs, input and output costs, availability window in the year.Storage capacities vary from 200 to 60,000 tons.Each farm may send its products (seeds, straw bales and chaff bales from rape) to the nearest centralized storage, but not directly to the refinery.The planning horizon is one year subdivided into 52 weeks.The total refinery demand is 50,000 metric tons per year and road distances were computed again using MapPoint.
The resulting model contains 588,249 constraints and 574,795 variables.After a reduction to 4,801 constraints and 19,558 variables by the pre-solver, it is solved to optimality in less than one minute (47 seconds).
Our model and its results can be compared BioFeed (Shastri et al., 2011a).This closest model in the literature is more restrictive than ours, since it considers a single product from a single crop and a unique harvesting chain in three steps (harvest, rake, and bale).In Shastri et al. (2011b), BioFeed is applied to a switchgrass supply chain involving 31 farms, 1 centralized storage, and 1 biorefinery.The harvesting period of 120 days is divided into 15-day time steps.In the rest of the year, divided into 60-day time steps, flow variables for farm activities are not generated.The model requires 5,706 seconds of CPU time on a Dell PowerEdge 1900 server with 8 processors, a more powerful computer than our portable PC.So, although the instances compared differ substantially, it seems that our model can tackle larger supply chains, in a more flexible way, and in shorter CPU times.

CONCLUSION
A mixed integer linear programming (MILP) model was formulated to optimize biomass supply chains.The state-task network used to describe a chain is very flexible, since its allows to model in details farm activities, equipment selection, pretreatments, storages, and transports, for different crops and multiple products per crop, over a one-year planning horizon divided into days or weeks.The state-task network and other data such as resource characteristics are described in a database from which the mathematical model is automatically generated.
Although this MILP can already solve instances of respectable size, simplifications were introduced to reduce the number of variables, in particular integer variables.One of these simplifications, the macro-task system which aggregates all states and tasks related to harvesting activities in a farm, has made possible the exact resolution in less than one minute of a large instance with 460 biomass production zones, 46 centralized storages and 1 refinery, over 52 weeks.Instances of that size have never been solved so fast in the literature, especially for such complex supply chains.
The database and the model have been designed carefully to evolve easily, for instance new crops or harvesting chains and multiple refineries can be already handled.They are already integrated in a software prototype from which the development of a commercial version is foreseen.
The next step of our work will be to study dedicated algorithms, such as decomposition techniques, Lagrangean relaxation, and metaheuristics, for large-scale problems beyond 500 farms.
We wish also to add to the model not only the possibility of using multi-modal transportation and tackling uncertainties in biomass availability, but also multi-objective optimization in the Pareto sense, to minimize both cost criteria and environmental impacts (up to now these criteria can be coped with, but only in a weighted sum).
-task networks (STN) were introduced by Kondili et al. (1993) and Shah et al. (1993) to model petrochemical processes over a time horizon divided into discrete periods.They use only three entities: two types of nodes (states and tasks), and arcs.

Figure 2 -
Figure 2 -Example of state-task network (STN) for a simple supply chain with one farm, one centralized storage and oe biorefinery.

H
Set of periods of planning horizon H = {1, 2, . . ., N Per} H R Set of refinery-periods for the demands of the refinery H R = {1, 2, . . ., N Per/Re f P er} L I D Set of location (geographical sites) identifiers L1, L2, ... (key-set of worksheet LO-CATIONS) C I D Set of crop identifiers C1, C2, ... (key-set of worksheet CROPS) C P I D Set of crop part identifiers CP1, CP2, ... (key-set of worksheet CROP PARTS) P I D Set of product identifiers P1, P2, ... (key-set of worksheet PRODUCTS) S I D Set of state identifiers S1, S2, ... (keys-set of worksheet STATES) T I D Set of task identifiers T1, T2, ... (key-set of worksheet TASKS) RI D Set of resource identifiers R1, R2, ... (key-set of worksheet RESOURCES) RF I D Set of resource family identifiers RF1, RF2, ... (key-set of worksheet FAMILIES) RC I D Set of resource combination identifiers RC1, RC2, ... (key-set of COMBINATIONS) CG I D Set of combination group identifiers (key-set of worksheet GROUPS) T S A Set of task-state arcs (i, e), T S A ⊆ T I D × S I D (key-set of worksheet TS ARCS) ST A Set of state-task arcs (e, i), ST A ⊆ S I D × T I D (key-set of worksheet ST ARCS) DPairs Set of pairs (refinery period, product) ⊆ H R × C P I D (key-set of worksheet DE-MANDS) L F Pairs Set of pairs (location, resource family) ⊆ L I D × RF I D (key-set of worksheet LOCA FAM) RC RPairs Set of pairs (resource combination, resource) ⊆ RC I D × RI D Z Set of current task types {H arvest, Dry, Rake, Bale, Pretreat, T ransport, Re f ine} Z H R Subset {H arvest, Rake} of Z Z H RB Subset {H arvest, Rake, Bale} of Z

Figure 5 -
Figure 5 -State-task network of Figure 2 after the introduction of the macro-task.
(Bakken et al., 2007)introduced the first remarkable tactical model of biomass supply chain.The goal is to supply one biorefinery with switchgrass (an herbaceous plant) over a 12-month horizon.The model is a pure linear program (LP), i.e., without integer variables.It is tested on 20 producers, having 3 to 10 fields each, and 4 to 7 storage sites (open air or covered).It determines the optimal amounts harvested, stored and transported each month and the capacity of each storage.An extension with meteorological scenarios weighted by different probabilities is also described.Van Dyken et al. (2010) model as a MILP a forest supply chain for one heating plant.The changes in density, humidity and heating power of each product after each operation are tracked along the chain.These changes depend on the drying mode and vary nonlinearly with drying time but they can be linearized.The model is appraised on a simple case with 3 products, 1 dryer, 1 chipper, 1 pelletizer, 2 storages and 2 demand points, over 12 weeks.It extends an earlier model, eTransport, designed for the planning of energy systems(Bakken et al., 2007).Haque & Epplin (2012) consider one year with a monthly time-step, 6 possible sites for several biorefineries processing switchgrass, and 57 counties in Oklahoma.They develop a MILP in which binary variables specify the locations and capacities of refineries and storage centers, while integer variables are defined to select the number of equipment for harvesting and handling.Eks ¸ioglu et al.(2009)propose one of the rare models including the distribution phase beyond the refinery.The chain collects agricultural and woody biomass to produce ethanol.The MILP model prescribes strategic decisions such as the location, number and size of refineries and col- lection sites, and tactical decisions like material flows.However, the different harvesting activities and their resources are not handled.The objective is to minimize over one year a sum of costs concerning biomass (harvest, storage, transport, conversion) and the distribution of ethanol.Eks ¸ioglu et al. (

Table 1 -
List of worksheets (type E: implementing one entity, type R: one relationship).

Table 2 -
Data from worksheet PARAMETERS.Re f P er Duration of one refinery-period in periods, e.g., 7 if demands given per week Per Dur Duration of one period in days, e.g., 1 or 7N PerNumber of periods of the planning horizon, e.g., 365 for one year if PerDur = 1

Table 4 -
Data arrays indexed over the sets of Table3.
e Input cost (handling cost) in euros per metric ton and per period S OutCost e Output cost (handling cost) in euros per metric ton and per period SThru e Maximum throughput in metric tons per hour S Deg e Product degradation (storage loss) per period, e.g., 0.99 if 1% is lost S Rep e For the Stock state type, ID of the representative in case of shared stock (see 5.2) For each task i ∈ T I D read from table TASKS T T ype i Type of task, allowed values in {Harvest, Dry, Rake, Bale, Pretreat, Transport

each state-task arc (e, i ) ∈ ST A read from worksheet ST ARCS ST
Frac e,i

Table 4 -
(Continuation).Demand for crop part k in refinery-period w, in dry metric tons For For each pair (w, k) ∈ D Pai r s read from worksheet DEMANDS DNeed w,k

each pair (l, f ) ∈ L F Pai r s read from worksheet LOCA FAM
L F N Max l, fMaximum number of resources from family f that can be employed on site l

Table 5 -
List of variables.Inventory level of state e at the end of period t in metric tons Amp i,c,t Total amount processed by task i with resource combination c in period t, in metric tons T C H RB i,c, t Total working time of resource combination c in period t for a task i ∈ Z H RB, in hours T CT i,c,t Total working time of resource combination c in period t for a transport task i, in hours T R l,r,t Total working time of resource r on site l in period t, in hours Integer variables RT i,c,t Number of rotations done by combination c for a transport task i in period t F S l,r

Table 5
, to reduce model size.Inventory levels (S e,t ), flows on state-task arcs (X e,i,t ), and task-state arcs (Y i,e,t ) are defined only in the time windows when state e is open, see constraints (1)-(3).∀ (e, i) ∈ ST A, ∀ t ∈ [S Beg e , S End e ] : X e,i,t ≥ 0 ( 1 ) ∀ (i, e) ∈ T S A, ∀ t ∈ [S Beg e , S End e ] : Y i,e,t ≥ 0 ( 2 ) ∀ e ∈ S I D, ∀ t ∈ [S Beg e , S End e ] : S e,t ≥ 0 S Rep e = r.A nonshared storage e has no representative: S Rep e = N A. Storage capacity constraints are generated only if a limited capacity is specified.Constraints (7) concern non-shared stocks.Constraints (8) for shared stocks are generated only for representatives, i.e., states e such that S Rep e = e.
(11)∈ S I D | SCapa e < H uge and S Rep e = N A, ∀ t ∈ [S Beg e , S End e ] : S e,t ≤ SCapa e (7) ∀ e ∈ S I D | SCapa e < H uge and S Rep e = e, ∀ t ∈ [S Beg e , S End e ] : u∈S I D | S Rep u =e S u,t ≤ SCapa e (8) Inventory balance equations.They take degradations into account.Constraints (9) concern the first period of the opening window and the initial inventory.Constraints (10) are for the other periods.∀e∈SI D : S e,S Beg e = S Q Beg e • S Deg e + (i,e)∈T S A Y i,e,S Beg e − (e,i)∈ST A X e,i,S Beg e (9) ∀ e ∈ S I D, ∀ t ∈ ]S Beg e , S End e ] : S e,t = S e,t −1 • S Deg e + (i,e)∈T S A Y i,e,t − (e, j )∈ST A X e, j,t(10)Definition of flows on task outgoing arcs.Constraints(11)translate one property of STN networks: the flow on a task outgoing arc (i, e) is equal in each period t to a fraction T S Frac i,e of the total input flow consumed T S Dura ie periods before.
Min e > 0 : S e,S E nd e ≥ S Q End Min e (14) ∀ e ∈ S I D | S Q End Max e < H uge : S e,S E nd e ≤ S Q End Max e (15) Maximum reception capacity of storage states.Constraints (16) deal with storage states like silos.They have a maximum reception capacity ST hru e , in metric tons per hour, to be multiplied by the number of opening hours per period of the geographical site containing the state.∀ e ∈ S I D | ST ype e = St ock, ∀ t ∈ [S Beg e , S End e ] : (i,e)∈T S A Y i,e,t ≤ ST hru e • L H ours[SLoca e ] ), FarmCosts concerns resource combinations for H arvest , Rake and Bale tasks (set Z H RB), the transports in the farms being counted with other transports in T rans Op-Cost s.It is derived from working times T C H RB i,c,t computed in constraints (19).The first sum includes each task i with a type T T ype i is in Z H RB. The second considers only the combinations c of the resource combination set (RC I D) which have the group recommended for the task ) are counted only in the opening window [S Beg e , S End e ].
H andCost s = (i,e)∈TS A t ∈[S Beg e ,S E nd e ] S I npCost e • Y i,e,t + (e,i)∈ST A t ∈[S Beg e ,S E nd e ] S OutCost e • X e,i,t (32) ∈H (T CT i,c,t + T C H RB i,c,t ) • RC PowerEquation (34) for CO 2 emissions is almost as simple.Indeed, we know for each resource combination the gasoil consumption R Fuel in liters per hour, while table PARAMETERS indicates the number of kilograms of carbon dioxide emitted per liter of gasoil, K gC O2 (nearly 2.6 kg/l).

Table 6 -
Data for locations and their attributes.

Table 7 -
Annual demands for each crop part.