PEDOTRANSFER FUNCTIONS TO PREDICT WATER RETENTION FOR SOILS OF THE HUMID TROPICS: A REVIEW

Over the past three decades, pedotransfer functions (PTFs) have been widely used by soil scientists to estimate soil properties in temperate regions in response to the lack of soil data for these regions. Several authors indicated that little effort has been dedicated to the prediction of soil properties in the humid tropics, where the need for soil property information is of even greater priority. The aim of this paper is to provide an up-to-date repository of past and recently published articles as well as papers from proceedings of events dealing with water-retention PTFs for soils of the humid tropics. Of the 35 publications found in the literature on PTFs for prediction of water retention of soils of the humid tropics, 91 % of the PTFs are based on an empirical approach, and only 9 % are based on a semiphysical approach. Of the empirical PTFs, 97 % are continuous, and 3 % (one) is a class PTF; of the empirical PTFs, 97 % are based on multiple linear and polynomial regression of nth order techniques, and 3 % (one) is based on the k-Nearest Neighbor approach; 84 % of the continuous PTFs are point-based, and 16 % are parameter-based; 97 % of the continuous PTFs are equation-based PTFs, and 3 % (one) is based on pattern recognition. Additionally, it was found that 26 % of the tropical water-retention PTFs were developed for soils in Brazil, 26 % for soils in India, 11 % for soils in other countries in America, and 11 % for soils in other countries in Africa.


INTRODUCTION
Soil hydraulic properties, e.g., water retention, are cumbersome, time-consuming, and costly to measure, and they also change over time. Therefore, soil scientists and hydrologists have searched for alternative methods for fast and accurate prediction of difficult-to-measure soil properties. Over the past three decades, estimation methods called pedotransfer functions (PTFs) have been widely used by soil scientists in temperate regions in response to the lack of measured soil property information. Bouma (1989) described the term pedotransfer function as "translating data we have into what we need". Pedotransfer functions are predictive functions that relate more easily measurable soil data, such as soil texture (sand, silt, and clay content), bulk density (BD), organic matter (OM) or organic carbon (OC) content, and/or other data routinely measured or registered in soil surveys, to hydraulic parameters, such as the soil water retention curve, SWRC (Bouma & van Lanen, 1987; Bouma, 1989; van den Berg et al., 1997). The most readily available data come from soil survey reports and soil databases.
A thorough review on the use and development of hydraulic PTFs was provided by Wösten et al. (2001). Later, Shein & Arkhangel'skaya (2006) analyzed the potential, state of the art, and outlook of using PTFs in soil science. McBratney et al. (2002) reported that the estimation of soil water retention constitutes the most comprehensive research topic in development of PTFs. This may be due to the effort, time, and cost of measuring this hydraulic property and the need to obtain information on this property for large-scale studies, together with the availability of large databases containing information on water retention. However, Schaap (2005) wrote that "with the exception of a few studies, hydraulic data and corresponding indirect methods about tropical soils are a virtual terra incognita". This situation has not changed much since. Minasny & Hartemink (2011) indicated that little effort is devoted to prediction of properties of soils of the tropics, where the need for accurate and up-to-date soil property information is even more urgent. They published a review paper on PTFs for predicting physical and chemical properties of soils in the tropics. First, the authors discussed the guiding principles of prediction and the type of predictors, followed by a discussion on PTFs for soil physical and chemical properties, and they then discussed infrared spectroscopy, proximal sensing, and remote sensing. Several authors evaluated the prediction performance of various tropical as well as temperate water-retention PTFs on their local soil datasets (van den Berg et al., 1997; Medina et al., 2002; Tomasella & Hodnett, 2004; Nebel et al., 2010; Botula et al., 2012). In fact, the last review paper exclusively dedicated to published PTFs that predict water retention in tropical soils was published by Tomasella & Hodnett (2004).
Therefore, there is a need to provide an up-to-date repository of past and recently published articles as well as papers from proceedings of events dealing with water-retention PTFs for soils of the humid tropics. Another important contribution of this article is the categorization of published water-retention PTFs based on various approaches and their application to soils of the humid tropics based on data available in the literature. This will allow identification of the more common and less common approaches used in estimating water retention of soils in tropical regions. Thus, topics for future research in development of PTFs for soils of the humid tropics are identified, and recommendations for future research are formulated.
R. Bras. Ci. Solo, 38:679-698, 2014

Categorization of water-retention PTFs
Based on different criteria used by various authors over the past three decades, PTFs used to estimate water retention of soils can be categorized as: class PTFs and continuous PTFs; point-based PTFs, parameter-based PTFs and pseudo-continuous PTFs; PTFs based on a specific approach; and equation-based PTFs and pattern-recognition PTFs.
In this paper, the terms point-based PTFs and parameter-based PTFs are preferred to the more widely used point PTFs and parametric PTFs to avoid confusion with the term non-parametric PTFs found in the literature (Nemes et al., 2006a,b).

Class and continuous PTFs
Within the PTFs used to generate soil hydraulic characteristics, Wösten et al. (1990; 1995) made a subdivision based on the amount of available information. They distinguished class PTFs and continuous PTFs.

Class PTFs
A class PTF predicts the hydraulic characteristics of a texture class (e.g., loamy sand) and is based on a preliminary grouping (Wösten et al., 1990; 1995). Therefore, class PTFs are cheaper and easier to use than continuous PTFs because they require only identification of the texture class to which the soil belongs. However, the accuracy achieved is limited because only one average value of a hydraulic characteristic is provided for each textural class (Wösten et al., 1995). Bruand (2004) distinguished six main grouping criteria for PTFs: genetic-based groupings, horizon-based groupings, texture groupings, groupings based on structure and BD, groupings of parent material, and consecutive groupings. Several class PTFs have been developed for soils of temperate regions (Jamagne et al., 1977; Clapp & Hornberger, 1978; Carsel & Parrish, 1988; Vereecken et al., 1989; Wösten et al., 1995; Schaap et al., 2001; Bruand et al., 2002, 2003, 2004; Al Majou et al., 2008; Baker, 2008). Class PTFs are rare for soils of the humid tropics. The class PTFs of Hodnett & Tomasella (2002) developed for the parameters of the van Genuchten (1980) equation are among those few published for soils of tropical regions. One of the major constraints to their development is the lack of availability of large databases of soils of the humid tropics to provide a sound, statistically based grouping. Moreover, class PTFs generally seem to be less attractive than continuous PTFs due to less flexibility and the occurrence of larger estimation errors in some cases. For instance, the results obtained by Hodnett & Tomasella (2002) showed that the use of class PTFs may lead to significant errors because of the variation within a given textural class.
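As a minimal sketch of how a class PTF operates, the lookup below returns one average value per texture class. The table entries, the dictionary name `CLASS_PTF`, and the function `class_ptf` are hypothetical placeholders chosen for illustration; they are not taken from any published class PTF.

```python
# Hypothetical class-average water contents (m3 m-3); placeholders, not published data.
CLASS_PTF = {
    "sand": {"theta_fc": 0.10, "theta_pwp": 0.04},
    "loamy sand": {"theta_fc": 0.14, "theta_pwp": 0.06},
    "clay": {"theta_fc": 0.42, "theta_pwp": 0.28},
}


def class_ptf(texture_class):
    """Return the class-average water contents for a texture class."""
    key = texture_class.strip().lower()
    if key not in CLASS_PTF:
        raise KeyError(f"no class PTF entry for texture class {texture_class!r}")
    return CLASS_PTF[key]


# Any two soils assigned to the same class receive identical estimates,
# whatever their actual clay, silt, or organic matter contents.
estimate_a = class_ptf("Loamy sand")
estimate_b = class_ptf("loamy sand ")
```

Because the only input is the class label, the within-class variation noted by Hodnett & Tomasella (2002) cannot be represented by such a table.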

Continuous PTFs
A continuous PTF is developed without grouping the data, but instead, using the complete dataset to derive equations (Wösten et al., 1990). It estimates the hydraulic characteristics using, for example, the actually measured percentages of clay, silt, and organic matter content (Wösten et al., 1995). Most existing PTFs developed to date fall in this category. They will be discussed in the next sections.
Point-based PTFs, parameter-based PTFs, and pseudo-continuous PTFs
In the literature, some authors (Wösten et al., 2001; Cornelis et al., 2001; Sharma et al., 2006) make a distinction between PTFs that predict the water content at some chosen matric potentials (point-based PTFs) and PTFs that estimate the parameters of analytical expressions of the SWRC (parameter-based PTFs). They are referred to as Type 2 and Type 3 PTFs in Wösten et al. (2001). Additionally, a recently published type of PTF that falls somewhere between the above two categories was introduced by Haghverdi et al. (2012) and referred to as a pseudo-continuous PTF. Figure 1 provides a schematic representation of point-based, parameter-based, and pseudo-continuous PTFs.

Point-based PTFs
The PTFs of Gupta & Larson (1979), , Saxton et al. (1986), and Saxton & Rawls (2006) are among early, widely-applied, published, point-based PTFs developed for soils in temperate areas. Gupta & Larson (1979) used 43 different soil materials originating from ten locations in the eastern and central USA to develop their PTFs. The twelve PTFs that estimate soil moisture content at matric potentials ranging from -4 to -1500 kPa were developed from disturbed samples, containing mixtures of dredged sediment and productive soil in different proportions.  estimated water content within the same matric potential range using the same soil properties. Their data originate from 2543 horizons from across the USA. Saxton et al. (1986) developed point-based PTFs from soils of the USDA dataset. These PTFs have been successfully applied to a wide variety of studies related to agricultural hydrology and water management, together with models like SPAW (Saxton & Wiley, 2006) and AquaCrop (Steduto et al., 2009). Using the soil database currently available from the USDA, Saxton & Rawls (2006) formulated PTFs similar to those previously reported by Saxton et al. (1986) but including more variables, with a wider application range, based on data from 1722 soil samples. In this updating process, the initial equations were combined with equations of hydraulic conductivity, also considering the effects of density, gravel, and salinity.
In the (sub)humid tropics, various efforts have been made to develop point-based PTFs from soil datasets specific to these regions. Most of these PTFs have been developed for application within restricted geographical domains for a limited range of soil textures and soil types. Pidgeon (1972) used a dataset including a wide textural range of ferrallitic soils (from loamy sands to clay) from ten sites in Uganda to derive point-based PTFs. Kaolinite was the dominant clay mineral, but, in two sites, the dominant clay minerals were illite and montmorillonite, respectively. He estimated gravimetric water content at field capacity (FC), permanent wilting point (PWP), and available water capacity (AWC), among other things. The PTF developed by Pidgeon (1972) estimates FC as equivalent to the soil moisture of a wetted plot after 48 h of free drainage. The author also provided equations that convert this value into water content of undisturbed cores at -10 and -33 kPa. MacLean & Yager (1972) derived PTFs to predict AWC 10-1500 kPa based on texture, OC, and soil sample depth for soils of Zambia. Simple relationships between clay content and gravimetric water content at -1500 kPa (PWP) have also been derived for ferralic and oxic horizons in various tropical regions, as reported in FAO (1974) and Soil Survey Staff (1975;. Lal (1978;1981) derived point-based PTFs to predict gravimetric water content (at -10 kPa, -33 kPa, and -1500 kPa) and AWC 10-1500 kPa based on a dataset of soils developed from two different parent materials in Southern Nigeria. The PTF development datasets included mostly strongly weathered soils, but some hydromorphic soils and high activity clay soils were also present.
Aina & Periaswamy (1985) related measured volumetric water content of undisturbed and sieved soils at -33 and -1500 kPa and AWC 33-1500 kPa to soil texture and BD. The soil dataset comprised different Ultisols and Alfisols containing predominantly kaolinite. They constructed a PTF estimating AWC 33-1500 kPa for core samples from silt, clay, and BD; and a PTF relating AWC 33-1500 kPa for sieved samples to sand and BD. Arruda et al. (1987) used well drained and mainly highly weathered soils from Southeast Brazil to derive gravimetric point-based PTFs predicting water content at -33 and -1500 kPa from silt and clay. Dijkerman (1988) related sand and clay content of soils from Sierra Leone to gravimetric water content at -33 and -1500 kPa. The dataset used included mostly strongly weathered Ultisols and some hydromorphic soils. Bhavanarayana et al. (1986) and Rao et al. (1988) developed point-based PTFs to predict volumetric water content at FC and PWP for Indian soils. A statistical relationship was established between clay and OC content, and gravimetric water content at -1500 kPa (Soil Survey Staff, 1992). Bell & van Keulen (1995) derived PTFs to predict water content at PWP for four groups of Mexican soils. Van den Berg (1996) developed PTFs to predict water content at -10 and -1500 kPa and AWC 10-1500 kPa for strongly weathered soils in south and Southeast Brazil. Van den Berg et al. (1997) used two datasets of soils originating from South America, Africa, and Southeast Asia. The first dataset was used to derive volumetric point-based PTFs to calculate water content at -10 and -33 kPa and AWC 10-1500 kPa , as well as to estimate the parameters of the van Genuchten (1980) equation. The second dataset was used for validation purposes. Singh (2000) derived point-based PTFs to predict volumetric water content at saturation, -33 kPa, and -1500 kPa based on sand and clay content of soils from India. 
Mdemu & Mulengera (2002) developed local PTFs to predict water retention at eight different matric potentials and AWC for soils in Morogoro, Tanzania. Point-based PTFs were derived by Igwe et al. (2002) to predict water content of some soils of the  Tomasella et al. (2003) used a large dataset including soils from different geomorphic regions of Brazil to relate basic soil properties to volumetric water contents at different matric potentials (-6, -10, -33, -100, and -1500 kPa). The point-based PTFs of Tomasella et al. (2003) use moisture equivalent as an input. It is defined as the water content remaining in a sample (fraction <2 mm) after centrifuging at 2400 rpm for 30 min, generally expressed in gravimetric units. In the PTF development process, moisture equivalent has been used by Tomasella et al. (2003) as a predictor of water retention because it is basic information found in most Brazilian soil survey reports. Point-based PTFs were developed by Saikia & Singh (2003) Reichert et al. (2009) generated point-based PTFs to predict soil water retention at various matric potentials (-6, -10, -33, -100, -500, and -1,500 kPa) based on texture, OM, and BD of soils of Rio Grande do Sul in Brazil. Minasny & Hartemink (2011) developed PTFs to predict water content at -10, -33, and -1500 kPa based on soil texture and BD. The development dataset and the validation dataset were composed exclusively of soils from the tropics. These soil datasets are parts of the IGBP-DIS soil database obtained from ISRIC in Wageningen (the Netherlands). Chakraborty et al. (2011) developed PTFs from a wide textural range of Indian soils for four points of the SWRC, namely -33, -100, -500, and -1500 kPa.
Recently, Obalum & Obi (2012) proposed point-based PTFs for kaolinitic and coarse-textured tropical soils from southeastern Nigeria. Santos et al. (2013) generated and validated PTFs to predict gravimetric water content at -33 and -1500 kPa for different soil classes from the central-south portion of the State of Rio Grande do Sul in Brazil.
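Many of the point-based PTFs listed above report available water capacity (AWC) alongside water contents at fixed matric potentials. The sketch below shows the arithmetic that links them, assuming volumetric water contents as inputs; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
def available_water_capacity(theta_fc, theta_pwp):
    """AWC (m3 m-3) as the difference between water content at FC and at PWP."""
    if theta_fc < theta_pwp:
        raise ValueError("water content at FC must not be below that at PWP")
    return theta_fc - theta_pwp


def awc_depth_mm(theta_fc, theta_pwp, layer_thickness_mm):
    """Available water expressed as a depth of water (mm) for one soil layer."""
    return available_water_capacity(theta_fc, theta_pwp) * layer_thickness_mm


# Hypothetical example: 0.32 m3 m-3 at -10 kPa, 0.20 m3 m-3 at -1500 kPa,
# for a soil layer 300 mm thick.
awc = available_water_capacity(0.32, 0.20)
depth = awc_depth_mm(0.32, 0.20, 300.0)
```

Gravimetric water contents would first have to be multiplied by the ratio of bulk density to water density before being used in this way.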

Parameter-based PTFs
Parameter-based PTFs are equations that estimate the parameters of analytical expressions describing the SWRC, such as the Brooks & Corey (1964), the Campbell (1974), and the widely applied van Genuchten (1980) equations. Parameter-based PTFs generate continuous curves describing the hydraulic characteristics of soils. This is very important for modeling purposes (Tietje & Hennings, 1993; van den Berg et al., 1997; Cornelis et al., 2001) because some soil water and solute transport models require the complete SWRC as input. Furthermore, parameter-based PTFs allow the computation of hydraulic values at arbitrary pressures, as indicated by Borgesen & Schaap (2005).
The first parameter-based PTFs were developed using datasets of soils from temperate regions. Cosby et al. (1984) and Rawls & Brakensiek (1985) developed regression equations for the Brooks & Corey (1964) model based on soils from the USA. Saxton et al. (1986) used the percentage of clay and sand to calculate the parameters of a model that was derived from the SWRC model of Campbell (1974). Vereecken et al. (1989) developed PTFs widely used for estimation of the parameters of the van Genuchten (1980) model based on the physicochemical characteristics (sand, clay, OC, and BD) of 182 horizons of 40 different Belgian soil series. Wösten et al. (1999) predicted the parameters of the van Genuchten (1980) model using the HYPRES database including data from 4030 horizons from all over Europe.
For tropical regions, parameter-based PTFs were developed by van den Berg et al. (1997) to predict the water-retention parameters of the van Genuchten (1980) analytical equation. Tomasella & Hodnett (1998) developed PTFs to predict the parameters of the Brooks & Corey (1964) equation from texture and OC using a dataset of various soils from Brazilian Amazonia. Tomasella et al. (2000) derived parameter-based PTFs for the van Genuchten (1980) model using soil information from a dataset containing 517 soil horizons from various regions in Brazil. Tomasella et al. (2000) stated that the van Genuchten (1980) analytical function is very popular in the modeling community although it may not be the best one to properly describe the hydraulic behaviour of soils such as Oxisols. Earlier, van den Berg et al. (1997) found that the van Genuchten (1980) equation can adequately describe moisture retention curves of soils with low activity clays in the southern part of Brazil. Later, Hodnett & Tomasella (2002) arrived at the same conclusion for Brazilian soils. Hodnett & Tomasella (2002) used part of the IGBP-DIS soil database obtained from ISRIC in Wageningen (the Netherlands) to calculate the four parameters of the van Genuchten (1980) model. The authors referred to this dataset as the IGBP/T dataset, which exclusively contained soils from tropical climates. Santra & Das (2008) developed parameter-based PTFs for the van Genuchten (1980) model to predict water retention of soils from a hilly watershed in eastern India. Adhikary et al. (2008) did the same for the Brooks & Corey (1964) model to provide a prediction based on soils from various parts of India.
From this review, one can see that most of the PTFs developed for soils in the (sub)humid tropics were point-based PTFs. Using validation statistics, several authors (Pachepsky et al., 1996;Tomasella et al., 2003;Dashtaki et al., 2010;Vereecken et al., 2010) noted that the point-based PTFs better predicted water retention than the parameter-based PTFs. This may be attributed to the fact that water content is controlled by different soil properties, depending on the level of soil matric potentials. The point-based PTFs allow for more appropriate independent variables to describe the water content variation than do the parameter-based PTFs. This may partially explain why most of the PTFs developed for tropical soils fall in the group of point-based PTFs. Most point-based PTFs are often limited to the prediction of water content at matric potentials generally recognized as representing FC, i.e., -10 and -33 kPa, and PWP, i.e., -1500 kPa. These values are typically used to calculate the water depth that should be applied through irrigation (Hansen et al., 1980) and to calculate soil water availability, which is a key element in assessing the suitability of a given region for producing a given crop (Sys et al., 1991). However, this has been perceived as a weakness by several authors (Tietje & Hennings, 1993;Cornelis et al., 2001). They argue that most of the simulation models require a continuous function rather than the discrete description provided by measurement points. However, van den Berg et al. (1997) and Tomasella et al. (2003) showed that point-based PTFs used to estimate soil water retention, at least in Brazilian soils, provided more accurate results than parameter-based PTFs. Recently, Haghverdi et al. (2012) indicated that use of parameter-based PTFs has a number of drawbacks. In some cases, the real shape of the SWRC is not similar to the shape of the chosen equation. 
In addition, authors like Minasny & McBratney (2002b) reported some problems in correlating the parameters of SWRC models to basic soil properties. Furthermore, parameter-based PTFs determine a priori which equation has to be used by the potential user, which, for most published PTFs, is either the van Genuchten (1980) or the Brooks & Corey (1964) closed-form equations. Vereecken et al. (2010) conducted a detailed review of temperate PTFs developed to estimate the parameters of the van Genuchten (1980) SWRC model.
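To illustrate what a parameter-based PTF ultimately feeds, the sketch below evaluates the two closed-form equations named above once their parameters are known. It assumes the commonly used restriction m = 1 - 1/n for the van Genuchten (1980) equation and expresses suction as a positive head; the parameter values are hypothetical and do not come from any published PTF.

```python
def van_genuchten(h, theta_r, theta_s, alpha, n):
    """Water content at suction h (h >= 0, in the length unit of 1/alpha), m = 1 - 1/n."""
    if h <= 0.0:
        return theta_s  # saturated soil
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m


def brooks_corey(h, theta_r, theta_s, h_b, lam):
    """Water content at suction h; the soil stays saturated up to the air-entry suction h_b."""
    if h <= h_b:
        return theta_s
    return theta_r + (theta_s - theta_r) * (h_b / h) ** lam


# Hypothetical parameter set (alpha in cm^-1), e.g. as a parameter-based PTF might return.
params = dict(theta_r=0.10, theta_s=0.50, alpha=0.05, n=1.5)

# Saturation and suctions of roughly 10, 33, and 1500 kPa expressed in cm of water.
suctions_cm = [0.0, 100.0, 330.0, 15000.0]
curve = [van_genuchten(h, **params) for h in suctions_cm]
```

Any other suction can be passed to the same function, which is the practical advantage of parameter-based PTFs for simulation models; the cost is that the shape of the curve is fixed in advance by the chosen equation.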

Pseudo-continuous PTFs
Haghverdi et al. (2012) introduced pseudo-continuous PTFs, in which the natural logarithm of matric potential is considered as an input parameter, enabling the user to derive water content at any desired matric potential. Consequently, there is only one output parameter, θ, which shows the water content at the predefined matric potential, i.e., different values of matric potential yield different water contents. This recent approach has only been tested for soils of dry regions.
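The data layout implied by this idea can be sketched as follows: every measured point of every retention curve becomes one training row, with the logarithm of the suction appended to the basic soil properties. The choice of predictors (sand, clay, BD), the use of the absolute value of matric potential in kPa, and all names and values below are assumptions made for illustration, not the setup of Haghverdi et al. (2012).

```python
import math


def to_pseudo_continuous_rows(samples):
    """Expand each sample's retention points into (inputs, water content) rows."""
    rows = []
    for sample in samples:
        for suction_kpa, theta in sample["retention"]:
            # ln(suction) is an input, so one model covers all matric potentials.
            inputs = [sample["sand"], sample["clay"], sample["bd"], math.log(suction_kpa)]
            rows.append((inputs, theta))
    return rows


# Hypothetical samples; suction is the absolute value of matric potential (kPa).
samples = [
    {"sand": 60.0, "clay": 25.0, "bd": 1.35,
     "retention": [(10.0, 0.30), (33.0, 0.26), (1500.0, 0.18)]},
    {"sand": 20.0, "clay": 55.0, "bd": 1.10,
     "retention": [(10.0, 0.45), (1500.0, 0.30)]},
]
rows = to_pseudo_continuous_rows(samples)
```

The resulting rows can be passed to any regression or pattern-recognition technique; samples need not share the same set of measured matric potentials.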

Specific approach-based PTFs
McBratney et al. (2002) stated that there are various ways to derive PTFs. Generally, they can be classified into two approaches: the semi-physical approach, which attempts to describe a physical or chemical model relating the basic properties to the predicted properties; and the empirical approach, the most widespread one, which links the basic soil properties to the more difficult-to-measure soil properties by means of different numerical fitting methods.

Semi-physical approach
Semi-physical methods recognize the similarity between the shape of the particle size distribution (PSD) and water retention curves. They offer valuable conceptual insights into the physical relations between texture distribution and pore size distribution (POD). A drawback of these methods is that they often require a very detailed PSD, making them almost as difficult to apply as direct measurements (Schaap, 2005).
Arya & Paris (1981) and Haverkamp & Parlange (1986) translated PSD data into a water retention curve by means of the capillary equation. They assumed that the network of pores in the soil is a bundle of cylindrical capillaries. Pedotransfer functions of this group require a detailed PSD (more than only clay, silt, and sand content). Khlosi (2003) found that eight particle size mass fractions are sufficient to estimate the water retention curve relatively accurately. Tyler & Wheatcraft (1990) used fractal mathematics and scaled similarities to show that the empirical constant in the Arya & Paris (1981) model is equivalent to the fractal dimension of the tortuous fractal pore. The fractal dimension described by Mandelbrot (1983) is a measure of the degree of irregularity of the object seen in all scales (or resolutions) of observation, where the fractal structure is the one in which parts of it are similar to all of it. In simple words, a small piece of the object looks rather like a larger piece or the object as a whole. Therefore, the key property of fractal geometry is a degree of self-similarity across a range of spatial scales (or resolutions) of observation (Feder, 1988).
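The capillary equation on which these methods rely can be sketched for a single cylindrical pore as below. This covers only the step from pore radius to suction head, not the conversion of particle-size fractions into pore radii that the cited models also perform. The function name and the default constants (surface tension of water near 20 °C, zero contact angle) are assumptions for illustration.

```python
import math


def capillary_head_m(pore_radius_m, surface_tension=0.0728, contact_angle_rad=0.0,
                     water_density=1000.0, gravity=9.81):
    """Suction head (m of water) at which a cylindrical pore of the given radius drains."""
    return (2.0 * surface_tension * math.cos(contact_angle_rad)
            / (water_density * gravity * pore_radius_m))


# A pore with a radius of 10 micrometres drains at a head of about 1.5 m of water.
head = capillary_head_m(1.0e-5)
```

Since the head is inversely proportional to the radius, finer pores, and hence finer particle-size fractions, correspond to the dry end of the retention curve.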
The Rieu & Sposito (1991a,b) model appears to be the first mass fractal model of the soil water retention characteristics (Millàn et al., 2006). Perrier et al. (1996) developed a general model of the SWRC for any soil whose POD is fractal using Mandelbrot (cumulative number vs. size) distribution (Mandelbrot, 1983). Based on the previous works of Rieu & Sposito (1991a,b), Perfect et al. (1998) and Perfect (1999) used the definition of the volumetric water content of the prefractal Menger sponge and came up with a reduced two-parameter model of the SWRC. Perrier et al. (1999) proposed a symmetric pore-solid fractal (PSF) model. This PSF model is characterized by the same geometric shape in the distribution of soil pores and the distribution of soil solids; both are assumed to be power functions with the same scaling component. Based on the PSF model, Bird et al. (2000) developed a new SWRC fractal model, which includes the Tyler & Wheatcraft (1990) and Rieu & Sposito (1991a,b) models as special cases. Based on the observation that soil water retention is usually sensitive to both soil structure and texture, Millàn & González-Posada (2005) assumed that two fractal regimes, each with different fractal dimensions, could be present in most soils. They extended the model of de Gennes (1985), which is similar to the model of Tyler & Wheatcraft (1990), to a model with two fractal regimes. Cihan et al. (2007) introduced a general scale-variant fractal drainage model, which can be simplified into two scale-variant and scale-invariant models. In their general model, the proportion of pores that drain at a given matric potential depends on both the mass fractal dimension of the drained pore phase and the proportion of pores that drain at air-entry value. Ghanbarian-Alavijeh et al. 
(2010) (2005) presented a piecewise fractal approach to approximate the soil water retention data and tested their model with previously published soil datasets and two unpublished datasets corresponding to clay loam and silty clay loam soils located within a hydrographical basin in South Cuba. Andrade et al. (2008) used fractal theory to incorporate a fractal dimension based on the SWRC and/or the PSD in the Brooks & Corey (1964) water retention model to estimate the available water in a soil from Brazil.

Empirical approach
The empirical approach is the one most used to develop water retention PTFs in temperate as well as in tropical regions. The most commonly used techniques for fitting or deriving PTFs are statistical regressions: Multiple Linear Regressions (MLR) and polynomials of the nth order. Other modern numerical and statistical methods applied are Generalized Linear Models (GLM), Generalized Additive Models (GAM), the Group Method of Data Handling (GMDH), and Multivariate Adaptive Regression Splines (MARS). Currently, data-mining techniques are gaining popularity in the PTF-research field with the application of nonconventional statistical methods, e.g., Artificial Neural Networks (ANNs), Classification and Regression Trees (CART), k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Genetic Algorithms (GA), and Genetic Programming (GP).

Multiple linear regressions and polynomials of the nth order
Many of the available and well-established PTFs for predicting soil hydraulic properties from continuous soil properties are based on multiple linear regressions (MLR) or polynomials of the nth order (Vereecken & Herbst, 2004). In general, three main objectives can be distinguished when using statistical regressions to model relations between two sets of variables: prediction, model specification, and parameter estimation. Multiple linear regression equations are a common statistical tool used for the prediction of the response variable $y$ from a number of $n$ predictor variables $x_i$. A multiple linear regression equation can be written as (Herbst & Diekkrüger, 2002):

$$y = a + \sum_{i=1}^{n} b_i x_i + \varepsilon$$

with the constant $a$ (intercept), the regression coefficients $b_i$, and the error $\varepsilon$. A nonlinear regression equation based on a second-order polynomial has the following form:

$$y = a + \sum_{i=1}^{n} \left( b_i x_i + c_i x_i^2 \right) + \varepsilon$$

where besides the intercept $a$, two regression coefficients $b_i$ and $c_i$ have to be determined for every predictor variable $x_i$ (Rawls & Brakensiek, 1985). Gupta & Larson (1979) used MLR equations of the following form:

$$\theta_p = a\,\mathrm{sand} + b\,\mathrm{silt} + c\,\mathrm{clay} + d\,\mathrm{OM} + e\,\mathrm{BD}$$

to predict the soil water content ($\theta_p$, m$^3$ m$^{-3}$) for 12 different matric potentials, where $a$, $b$, $c$, $d$, and $e$ are regression coefficients, OM is organic matter content, and BD is bulk density. Intermediate values could be determined by fitting one of the analytical SWRC expressions.
 estimated water content within the same matric potential range with the following model:
Tomasella & Hodnett (1998) studied Brazilian soils and derived MLR PTFs for water contents θ at nine matric potentials:
Most of the aforementioned point- and parameter-based PTFs developed for soils of the humid tropics are MLR PTFs and some PTFs, e.g., Tomasella et al. (2000), are polynomials of nth order. Scheinost et al. (1997) found difficulty in estimating the scaling (α) and shape (n) parameters of the van Genuchten (1980) equation using the regression approach.
Realizing the overparametrization (too many adjustable parameters relative to number of data points) of the van Genuchten (1980) equation, they proposed the following approach: (1) set up the expected relationship between the parameters of the hydraulic model and soil properties; and (2) insert the relationship into the model and estimate the parameters of the relationship simultaneously by fitting the extended model using nonlinear regression for all data.
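How an MLR point-based PTF is derived from a dataset can be sketched with ordinary least squares, here written in plain Python through the normal equations. The predictors (clay content and BD), the data, and the "true" coefficients used to generate the synthetic water contents are all hypothetical; the example only demonstrates the fitting technique, not any published PTF.

```python
def fit_mlr(X, y):
    """Ordinary least squares; returns [a, b_1, ..., b_n] with the intercept first."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented normal-equation system [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(p):
            if k != col:
                factor = M[k][col] / M[col][col]
                M[k] = [v - factor * w for v, w in zip(M[k], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict_mlr(coeffs, x):
    """Evaluate y = a + sum(b_i * x_i) for one set of predictor values."""
    return coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))


# Synthetic, noise-free data: (clay %, BD in g cm-3) -> water content (m3 m-3)
X = [(10.0, 1.5), (25.0, 1.4), (40.0, 1.3), (55.0, 1.1), (30.0, 1.6), (60.0, 1.2)]
y = [0.05 + 0.004 * clay + 0.02 * bd for clay, bd in X]

coeffs = fit_mlr(X, y)                    # recovers approximately [0.05, 0.004, 0.02]
theta = predict_mlr(coeffs, (35.0, 1.3))  # estimate for a new sample
```

With real data the fit would not be exact, and the residual error is what validation statistics of PTFs summarize. A second-order polynomial PTF can be fitted with the same routine by adding the squared predictors as extra columns.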

Extended nonlinear regression
This approach is referred to as extended nonlinear regression (ENR) by Minasny et al. (1999). Using soils from Australia, they compared MLR and ENR approaches in developing point- and parameter-based PTFs for water retention. The authors found that ENR was the most adequate approach for parameter-based PTFs. For soils in the humid tropics, this approach was used by Hodnett & Tomasella (2002) to develop PTFs for the parameters of the van Genuchten (1980) equation.
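The two steps of ENR can be written out for the van Genuchten (1980) equation as a sketch. The restriction m = 1 - 1/n, the generic functions $f_p$, and the symbols $\mathbf{x}_j$ (basic soil properties of sample $j$), $\boldsymbol{\beta}$ (all PTF coefficients), and $\theta_{jk}$ (water content measured on sample $j$ at suction $h_{jk}$) are notation assumed here for illustration; the cited authors' exact functional forms are not reproduced.

```latex
% Step 1: each model parameter is expressed as a function of basic soil properties
p(\mathbf{x}) = f_p(\mathbf{x}; \boldsymbol{\beta}_p),
\qquad p \in \{\theta_r, \theta_s, \alpha, n\}

% Step 2: the relationships are inserted into the retention model ...
\theta(h; \mathbf{x}, \boldsymbol{\beta}) =
  \theta_r(\mathbf{x}) +
  \frac{\theta_s(\mathbf{x}) - \theta_r(\mathbf{x})}
       {\left[ 1 + \left( \alpha(\mathbf{x})\, h \right)^{n(\mathbf{x})} \right]^{1 - 1/n(\mathbf{x})}}

% ... and all coefficients are estimated simultaneously from all measured points
\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}}
  \sum_{j=1}^{N} \sum_{k=1}^{K_j}
  \left[ \theta_{jk} - \theta\!\left( h_{jk}; \mathbf{x}_j, \boldsymbol{\beta} \right) \right]^2
```

In contrast to a two-step procedure that first fits the model parameters to each sample and then regresses them on soil properties, the objective here is stated directly in terms of measured water contents.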

Generalized linear and additive models and multivariate adaptive regression splines
Generalized linear models (GLM) extend the linear regression models to accommodate the non-normal response distributions (Hastie & Pregibon, 1992). The theory and its applications to soil science have been reviewed by Lane (2002). For example, McKenzie & Austin (1993) used GLMs to predict soil attributes such as clay content, cation exchange capacity (CEC), pH, and BD, etc. using environmental variables (geomorphic unit, local relief, etc.) as predictors. Gessler et al. (1995) used GLMs to predict the presence or absence of a bleached A2 horizon using digital terrain information. Gessler (1996) extended this flexibility by using generalized additive models (GAM). These models attempt to characterize the nonlinear effect which is not considered in GLM. This is done by allowing arbitrary smooth functions of the predictors to replace some or all of the linear components of the GLM (Hastie & Tibshirani, 1990). Yet the use of GAM, as shown in the soil science literature, has been minimal, as pointed out by McBratney et al. (2003).
Other models, such as multivariate adaptive regression splines (MARS), are used to model continuous variables (Friedman, 1991;Hastie et al., 2001). Shepherd & Walsh (2002) used MARS to develop prediction equations for some soil properties in eastern Africa from NIR diffuse reflectance spectra. To our knowledge, GLM, GAM, and MARS have not been used to derive hydraulic PTFs in the humid tropics.

Artificial neural networks
An artificial neural network (ANN) consists of many interconnected simple computational elements called nodes or neurons (Figure 2). Neural networks are sometimes described as universal function approximators, i.e., they can learn to approximate any continuous nonlinear function to any desired degree of accuracy (Hecht-Nielsen, 1990;Haykin, 1994). An advantage of ANN PTFs, as compared to MLR PTFs, is that they require no a priori concept of the relations between input and output data. During an iterative calibration procedure, the optimal relations between input and output data are found and implemented automatically. A drawback is that these relations are difficult to interpret because of the black-box nature of neural networks (Schaap & Leij, 1998). Koekkoek & Booltink (1999) used ANNs to predict water retention at various matric potentials based on Dutch and Scottish soil datasets. Schaap et al. (1999) developed ANN PTFs to determine the parameters of the van Genuchten (1980) equation. Their ANN PTFs were based on 1209 soil samples from the USA. Schaap et al. (2001) developed the ROSETTA software, a computer program that implements four hierarchical ANN PTFs for estimation of the van Genuchten (1980) water retention parameters. This stand-alone software combines neural network analyses with the bootstrap method (Efron & Tibshirani, 1993), thus allowing the program to provide uncertainty estimates of the predicted hydraulic parameters.
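The layered structure described above can be sketched as a single forward pass through a three-layer network (input, hidden, output). This is a minimal illustration, not the ROSETTA implementation: the tanh activation, the network size, the input order (sand, clay, BD), and all weights are assumptions made for the example. In a real ANN PTF the weights would come out of the iterative calibration procedure.

```python
import math


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One pass through an input -> hidden (tanh) -> output (linear) network."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical, uncalibrated weights: 3 inputs, 2 hidden nodes, 1 output.
W_HIDDEN = [[0.02, -0.01, 0.5], [-0.03, 0.04, -0.2]]
B_HIDDEN = [0.1, -0.1]
W_OUT = [0.15, 0.20]
B_OUT = 0.25

# Inputs: sand (%), clay (%), bulk density (g/cm3).
print(forward([30.0, 45.0, 1.2], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT))
```

The nested, nonlinear combination of weights is what makes the fitted relations hard to interpret, which is the black-box drawback noted above.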

Minasny & McBratney (2002a) proposed a new objective function for parameter-based ANN PTFs. The authors argued that this new method, called the neuro-m method, provides better accuracy and less bias than the ROSETTA program. This is because the network is set up so that the predicted parameters fit the measured data, instead of training the neural network to fit the estimated parameters. Sharma et al. (2006) [...] conductivity (Ksat) of soils of the Volta basin in Ghana.
Other studies used the ROSETTA program to derive the parameters of the van Genuchten (1980) SWRC model for shrink-swell and highly weathered soils and compared the results with locally derived or published PTFs based essentially on MLR techniques (Botula et al., 2012).

Figure 2. Schematic overview of a three-layer neural network. Source: Schaap & Leij (1998).

Group method of data handling
The group method of data handling (GMDH) combines the advantages of MLR techniques and ANNs (Hecht-Nielsen, 1990). The GMDH constructs a flexible neural network-type equation to relate inputs to outputs and, at the same time, has a built-in algorithm to retain only essential input variables (Farlow, 1984). Some authors have used this method in an attempt to improve the accuracy of hydraulic PTFs of soils of temperate and tropical regions. Pachepsky et al. (1998) used GMDH to develop PTFs from texture, BD, penetration resistance, and water content at 0, -5, -10, -20, -100, and -1500 kPa in 180 soil samples from New Zealand. Nemes et al. (2005) used the GMDH technique to develop Ksat PTFs based on data originating from the USA, Hungary, and the European HYPRES database. Ungaro et al. (2005) developed hydraulic GMDH PTFs for the soils of the Pianura Padano-Veneta region in northern Italy. The derived PTFs estimate Brooks & Corey (1964) water retention parameters and moisture content at -5, -10, -33, and -1500 kPa. However, the authors commented that the use of GMDH models in earth sciences is still fairly uncommon, although such models can provide good predictive models that successfully compete with MLR and ANN models. Tomasella et al. (2003) applied GMDH to obtain tropical point-based and parameter-based PTFs using the database of Tomasella et al. (2000) complemented with new data from a great variety of soils from Brazil. Tomasella et al. (2003) found that GMDH point-based PTFs predict water retention better than GMDH parameter-based PTFs for Brazilian soils.

Regression trees
A regression tree is a special type of decision tree that can predict continuous variables (McBratney et al., 2002). Regression tree (RT) modeling is an exploratory technique that uncovers structure in data by partitioning the sample data to find both the best predictors and the best grouping of samples (Clark & Pregibon, 1992). The resulting model divides data first into two groups, then into four groups, and so on, providing groups as homogeneous as possible at each of the levels of partitioning. Each partitioning can be viewed as a branching, and the final fit of the model to the data looks like a tree with two branches originating at each node (Figure 3). Both categorical and numerical variables can be used as predictors in RT (Breiman et al., 1993). [...] used RT in their exploratory study on the potential value of structural information in the development of more accurate PTFs in modeling water transport in soils. They used data from the Unsaturated Soil Hydraulic Database (UNSODA). Pachepsky et al. (2006) used Classification and Regression Trees (CART) to develop and discuss a PTF relating soil structure to soil water retention. This study was based on a subset of 2149 samples from the U.S. National Soil Characterization Database. No studies were found in which CART had been applied to predict soil water content of humid tropical soils.
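The partitioning step described above can be sketched as a search for the single binary split that makes the two resulting groups as homogeneous as possible. The sketch assumes homogeneity is measured by the within-group sum of squared errors, which is a common choice but is not stated in the studies cited; the function names and the five-sample dataset (clay content, BD, water content at -33 kPa) are hypothetical. A full tree would apply the same search recursively to each group.

```python
def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Return (predictor_index, threshold, total_sse) of the best binary split."""
    best = None
    for j in range(len(rows[0])):
        levels = sorted(set(r[j] for r in rows))
        # Candidate thresholds lie midway between consecutive observed values.
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2
            left = [t for r, t in zip(rows, targets) if r[j] <= thr]
            right = [t for r, t in zip(rows, targets) if r[j] > thr]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (j, thr, score)
    return best


# Hypothetical samples: (clay %, bulk density g/cm3) -> water content at -33 kPa.
ROWS = [(10, 1.5), (15, 1.3), (40, 1.4), (55, 1.2), (60, 1.1)]
TARGETS = [0.12, 0.15, 0.30, 0.35, 0.37]

print(best_split(ROWS, TARGETS))
```

In this toy dataset the search selects clay content as the splitting predictor, separating the two low-clay samples from the three high-clay samples.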

k-Nearest Neighbor
The k-Nearest Neighbor (k-NN) technique is referred to as a lazy learning algorithm that has been used for classifying sets of instances based on nearest training instances in a space of multi-dimensional features. It is said to be lazy since it passively stores the data until the time of application. All calculations are performed in real time, i.e., only when estimations need to be generated. Once the k-NN algorithm stores a set of training instances, application of the k-NN technique means identifying and retrieving the instances most similar to the target object from that set of stored instances, based on their input attributes (Figure 4). More theoretical details on this similarity-based approach are given in Dasarathy (1991). The k-NN approach is considered by several authors (Buishand & Brandsma, 2001;Bannayan & Hoogenboom, 2009) as one of the most attractive pattern classification algorithms. Nemes et al. (1999) used a k-NN variant, which they termed the similarity technique, to estimate missing soil PSD points from other existing PSD points to harmonize data of the European HYPRES database. Jagtap et al. (2004) used a k-NN technique to estimate the drained upper limit and lower limit of plant water availability from soil water retention data measured in situ. Nemes et al. (2006a,b) developed another variant of the k-NN technique to predict soil water retention at -33 and -1500 kPa, and they also performed a detailed sensitivity analysis of this technique. The newly developed k-NN algorithm proved its robustness in different scenarios. Based on the satisfactory results yielded by their k-NN algorithm, Nemes et al. (2008) developed user-friendly software called "k-Nearest" with the option of estimating the uncertainty of the prediction. Elshorbagy et al. (2010a,b) identified the k-NN technique as an attractive modeling technique for hydrological applications because of its high level of flexibility. Recently, Patil et al. (2012) used the k-NN software developed by Nemes et al. (2008) to estimate water content at -33 and -1500 kPa of 157 shrink-swell soils in India to derive their AWC. The ability of the k-NN approach to estimate water content at different matric potentials of highly weathered soils in the humid tropics was tested for the first time by Botula et al. (2013). They applied a variant of the k-NN technique to predict soil water retention in a humid tropical region of Central Africa with high accuracy.
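The store-then-retrieve logic of k-NN can be sketched in a few lines. This is a generic illustration and not the algorithm of Nemes et al. (2006a,b) or the "k-Nearest" software: the range normalization of the input attributes, the Euclidean distance, the inverse-distance weighting, and the small reference dataset (sand, clay, BD, and water content at -33 kPa) are all assumptions made for the example.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Estimate the target for `query` from its k most similar stored samples."""
    n_attr = len(query)
    lows = [min(r[j] for r in train_x) for j in range(n_attr)]
    # Range of each attribute; fall back to 1.0 if an attribute is constant.
    spans = [(max(r[j] for r in train_x) - lows[j]) or 1.0 for j in range(n_attr)]

    def scale(row):
        return [(row[j] - lows[j]) / spans[j] for j in range(n_attr)]

    q = scale(query)
    # Nothing is fitted in advance: distances are computed only at query time.
    ranked = sorted((math.dist(scale(r), q), y) for r, y in zip(train_x, train_y))
    nearest = ranked[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Hypothetical reference samples: (sand %, clay %, BD g/cm3) -> theta at -33 kPa.
TRAIN_X = [(70, 15, 1.50), (55, 30, 1.40), (30, 50, 1.25), (20, 65, 1.10), (45, 40, 1.30)]
TRAIN_Y = [0.14, 0.22, 0.31, 0.38, 0.27]

print(round(knn_predict(TRAIN_X, TRAIN_Y, (40, 45, 1.28), k=3), 3))
```

Because the estimate is a weighted average of stored measurements, it always stays within the range of the reference data, which is one reason the composition of the reference dataset matters for tropical applications.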

Support vector machines
Recently, support vector machines (SVM) have gained popularity in many fields traditionally dominated by ANN (Lamorski et al., 2008). They are considered a pattern-recognition method that presents the advantage of eliminating the local minimum issue, which is one of the main weaknesses of the ANN approach. Lamorski et al. (2008) used SVM to predict water retention of soils from Poland at eleven matric potentials using sand, clay, and BD, while Twarakavi et al. (2009) used this technique to predict the parameters of the van Genuchten (1980) equation using four different levels of the following input variables: sand, silt, clay, and water content at -33 kPa and at -1500 kPa. The interested reader can refer to Vapnik (1995;1998) and Noble (2006) for further theoretical insights. To date, this technique has not been applied to soils of the humid tropics.

Genetic programming
Koza (1992) proposed an automatic programming technique called genetic programming (GP) for evolving computer programs to solve, or approximately solve, problems. Genetic programming is a method for constructing populations of models using stochastic search methods, i.e., evolutionary algorithms. An important characteristic of GP is that both the variables and constants of the candidate models are optimized. Hence, it is not necessary to choose the model structure a priori as in other regression techniques. The GP technique has only recently been introduced in soil water related studies such as soil moisture (Makkeasorn et al., 2006), evapotranspiration (Parasuraman et al., 2007a), Ksat estimation (Parasuraman et al., 2007b), and hydrological modeling (Parasuraman & Elshorbagy, 2008;Elshorbagy et al., 2010a, b;Selle & Muttil, 2011). To our knowledge, GP has not yet been used to predict water content of soils of the humid tropics.

Ensemble pedotransfer functions
Parasuraman et al. (2007b) stated that adopting the ensemble technique in PTF development not only assists in evaluating the uncertainty of the developed PTFs but also helps in addressing one of the pertinent issues in any machine learning (e.g., ANNs, GP) algorithm, namely generalization. The word ensemble is French, meaning "together" or "at the same time" and usually refers to a unit or group of complementary parts that contribute to a single effect. In predictive modeling, an ensemble is a set of individual models in which the component models (also known as members) are redundant in that each provides a solution to the same task, even though this solution may be obtained by different means (Baker & Ellison, 2008b). Parasuraman et al. (2006) [...] Hodnett (1998) for Brazilian soils in the Amazon region. Using the ensemble approach, they developed a new PTF to estimate soil water retention. This ensemble PTF was implemented in a program called CalcPTF (Guber & Pachepsky, 2010).
To date, no ensemble PTF has been developed based exclusively on PTFs from tropical regions.
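The ensemble idea can be sketched by running several member PTFs on the same sample and summarizing their predictions. The three member functions below are invented placeholders with made-up coefficients, not published PTFs, and the unweighted mean with the standard deviation as a spread measure is only one possible way of combining members; it is not claimed to be the scheme used in CalcPTF.

```python
from statistics import mean, stdev


# Hypothetical member PTFs: each maps (clay %, BD g/cm3) to theta at -33 kPa.
def ptf_a(clay, bd):
    return 0.05 + 0.0040 * clay


def ptf_b(clay, bd):
    return 0.10 + 0.0030 * clay - 0.02 * bd


def ptf_c(clay, bd):
    return 0.02 + 0.0045 * clay


MEMBERS = [ptf_a, ptf_b, ptf_c]


def ensemble_predict(members, clay, bd):
    """Return the mean prediction and the spread (sample std) across members."""
    preds = [member(clay, bd) for member in members]
    return mean(preds), stdev(preds)


estimate, spread = ensemble_predict(MEMBERS, clay=40, bd=1.2)
print(round(estimate, 3), round(spread, 3))
```

The spread across members gives a rough indication of prediction uncertainty, which is the benefit of the ensemble technique highlighted by Parasuraman et al. (2007b).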

Equation-based PTFs and pattern-recognition PTFs
The PTFs described above can be categorized as equation-based PTFs and pattern-recognition PTFs. Equation-based PTFs are directly related to a mathematical model. Their formulation is based on conventional statistical procedures such as MLR, ENR, GLM, GAM, MARS, and, to some extent, GMDH. In contrast, in pattern-recognition PTFs, no a priori model needs to be defined. They are based on pattern recognition and make use of recently developed data-mining and machine-learning techniques: ANN, RT, k-NN, SVM, and GP. Figure 5 provides a schematic representation of the different types of water-retention PTFs, categorized according to various criteria.

Considerations
This review of water-retention PTFs for soils in the humid tropics reveals that: 97 % of the PTFs based on the empirical approach are continuous PTFs, and 3 % are class PTFs; 91 % of the PTFs are based on the empirical approach, and only 9 % are based on the semi-physical approach; 97 % of the empirical PTFs were derived based on the MLR and polynomial of the nth order techniques, and 3 % are based on the k-NN approach; 84 % of the continuous PTFs are point-based, 16 % are parameter-based, and 0 % are pseudo-continuous PTFs; 97 % of the continuous PTFs developed for soils in the humid tropics are equation-based PTFs, and 3 % are pattern-recognition PTFs; 26 % of the tropical water-retention PTFs were developed for soils in Brazil, 26 % for soils in India, 11 % for soils in other countries in America (USA, Mexico and Cuba), and 11 % for soils in other countries in Africa (Sierra Leone, Tanzania, Uganda and Zambia).
Table 1 shows that the aforementioned tropical PTFs are derived based on different scales of data collection, from the local to the international level. Most of the soils in the development datasets were highly weathered soils dominated by low activity clay minerals such as kaolinite. They mainly belong to the FAO Soil Group of Ferralsols and related soils (Acrisols and Nitisols). However, other soils like shrink-swell soils (Vertisols) and hydromorphic soils were also present, though to a limited extent. There is wide variation in the size of the development dataset, ranging from 13 to 685 soil samples. Most of the local tropical PTFs published in peer-reviewed journals originated from three countries, located in South America (Brazil), West Africa (Nigeria), and South Asia (India). This suggests that in other regions of the humid tropics, such as Central Africa, little effort has been made to develop local water-retention PTFs. Recently, Botula et al. (2012) emphasized the need for development of PTFs to predict water retention of soils in Central Africa.
[...] equivalent (Tomasella et al., 2003), micro-porosity, and total porosity (Obalum & Obi, 2012) are rarely found in tropical soil databases and, therefore, have been rarely used as predictors in the development of tropical PTFs. Other predictors related to soil structure (Pachepsky et al., 2006) and topography (Sharma et al., 2006) have been suggested to improve the predictive ability of PTFs, but they have not yet been used for tropical PTFs.
Table 2 shows that gravimetric or volumetric water contents at -10, -33, and -1500 kPa are the most frequently selected output variables. The first two matric potentials are related to FC of tropical soils, whereas the third one is related to PWP. The PTFs of Pidgeon (1972) and van den Berg et al. (1997) predict volumetric water content at -10 kPa but not at -33 kPa, whereas most of the aforementioned PTFs do. According to various authors (Pidgeon, 1972;Lal, 1978;Babalola, 1979;Reichardt, 1988), the water content at -33 kPa, when taken as the water content at FC, is too low for tropical soils, and they suggested measuring soil moisture at -10 kPa instead. Other authors, such as Ottoni Filho & Ottoni (2010), suggested even a matric potential of -6 kPa as providing a more accurate estimation of FC for Brazilian soils. Water retention at -100 kPa was also a recurrent output variable in all tropical PTFs that predict water content at several matric potentials. The critical matric potentials at which many crops undergo water stress are around -100 kPa for excessively dry conditions (Taylor & Ashcroft, 1972). In a similar vein, Pidgeon (1972) indicated that -100 kPa has often been considered as the limit for freely available water since below this matric potential, the growth of many crops may be reduced.
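The practical consequence of choosing -10 kPa rather than -33 kPa for FC can be illustrated with a short calculation. The sketch assumes that available water capacity is computed as the difference between the volumetric water contents at FC and PWP multiplied by the soil depth; the function name, the water contents, and the depth are hypothetical values chosen for the example, not measurements from the reviewed studies.

```python
def available_water_capacity(theta_fc, theta_pwp, depth_mm):
    """Available water (mm) in a layer of given depth, from volumetric water contents."""
    return (theta_fc - theta_pwp) * depth_mm


# Hypothetical volumetric water contents (cm3/cm3) for one soil layer.
THETA_10 = 0.32    # at -10 kPa
THETA_33 = 0.28    # at -33 kPa
THETA_1500 = 0.18  # at -1500 kPa (PWP)
DEPTH_MM = 300.0

awc_fc10 = available_water_capacity(THETA_10, THETA_1500, DEPTH_MM)
awc_fc33 = available_water_capacity(THETA_33, THETA_1500, DEPTH_MM)
print(round(awc_fc10, 1), round(awc_fc33, 1))
```

With these illustrative numbers, defining FC at -33 kPa instead of -10 kPa lowers the estimated available water in the layer by more than a quarter, which shows why the choice of output matric potential matters for tropical PTFs.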
Virtually all parameter-based tropical PTFs predict the parameters of the van Genuchten (1980) model, with the exception of PTFs developed by Adhikary et al. (2008), which predict the parameters of the Brooks & Corey (1964) model.
These figures confirm once more the earlier statement of Schaap (2005) that "with the exception of a few studies, hydraulic data and corresponding indirect methods about tropical soils are a virtual terra incognita". This situation has not changed much to date. Data-mining techniques such as ANNs, k-NN, and SVM, which are gaining popularity in pedotransfer modeling in temperate regions, have rarely been applied to develop PTFs for soils of the humid tropics. Pedotransfer functions derived from the semi-physical approach, such as the pore-solid fractal approach, have not yet been applied to soils in the humid tropics. Moreover, additional studies should be devoted to the development of tropical PTFs based on new and/or promising predictors such as DCB-Fe (Botula et al., 2012).