Predicting large-scale spatial patterns of marine meiofauna: implications for environmental monitoring


Gallucci et al.

those patterns. Such knowledge is needed to provide the baseline for sound monitoring designs built around clear and objective scientific questions. These questions determine which ecological indicators to measure, the spatial extent and frequency of sampling, and, ultimately, the usefulness and limits of the data for assessing and monitoring relevant impacts. However, the absence of sufficient baseline data has very often made impact assessment challenging (Korpinen and Andersen, 2016; Ridall and Ingels, 2021).
Choosing the appropriate ecological indicators is one of the main challenges for large-scale monitoring programs. Among soft-sediment communities, the small benthic metazoans termed meiofauna (Mare, 1942) are well documented as sentinels for monitoring ecosystem change worldwide, particularly where larger organisms are scarce and smaller organisms dominate local biomass and diversity and largely maintain ecosystem function, as in offshore sediments and the deep sea (Ingels et al., 2021). Advantages of using meiofauna as biological indicators of environmental change include their ubiquitous presence in diverse ecosystems, small size, high abundance, rapid generation time, lack of a pelagic larval phase, and close dependence on the sediment, which contribute to quick and reliable responses to local changes (Kennedy and Jacoby, 1999). Moreover, the simultaneous sampling and analysis of a variety of indicator taxa that can differ in microhabitat use, niche breadth, ecological function, and response to environmental change increases sensitivity to diverse environmental disturbances. As such, meiofauna have been considered cost-efficient and effective bioindicators (Ingels et al., 2021; Ridall and Ingels, 2021).
Information on the spatial patterns of meiofauna assemblages is gathered by sediment sampling using corers (e.g., box corer, multiple corer) followed by sample processing and identification in a laboratory (Giere, 2009; Somerfield and Warwick, 2013). Sampling can be very expensive, especially at sites that can only be reached by ship, and sample processing and taxonomic identification are labor-intensive and often require the work of specialists. Moreover, biological samples are by nature 'station-based' and thus need interpolation to obtain full-coverage predictions of benthic assemblage distributions. To overcome these difficulties and obtain accurate predictions of meiofauna distribution, meiofauna indicator parameters can be modeled using a set of environmental variables as predictors, which should ideally be collected at a higher resolution. In benthic studies, variables such as water depth, sediment type, and proxies of food resources are typically used as predictors. Although this modeling rationale is simple, such predictive modeling has rarely been used with meiofauna taxa (Ostmann and Martínez Arbizu, 2018), for two reasons. First, most benthic studies lack higher-resolution maps of their environmental variables. Second, dealing with multiple spatial scales in meiofauna data is a challenge (Gambi and Danovaro, 2006; Fonseca et al., 2010): small- to meso-scale variability may be as important as large-scale variability (Azovsky, 2009), hampering accurate predictions. As such, a sound monitoring program for meiofauna taxa requires two key elements: a model capable of accurate predictions across spatial scales and a high-resolution environmental map.
This study uses machine learning (ML) techniques to model meiofauna indicators in relation to environmental variables and to interpolate them onto a high-resolution grid map of the Santos Basin (approx. 350,000 km²), on the southeastern Brazilian continental margin, providing baseline information for future monitoring programs. ML follows a principle of predictive analytics and life-long learning, which means that a monitoring program based on ML can be constantly recalibrated as new data are collected to generate better predictions (L'Heureux et al., 2017). Moreover, in comparison with traditional statistics, when a large amount of information is available, ML algorithms can better handle some of the shortcomings typically faced by large-scale studies, such as unbalanced designs, covariate effects, missing data, and non-linearity (Fonseca and Vieira, 2023). The area under study, the Santos Basin, shows increasing industrial activities related to oil and gas exploration from its coast to the deep sea (Perez et al., 2020; ANP, 2021; Sumida et al., 2022; Moreira et al., 2023). Such routine activities may cause a variety of disturbances to the soft-sediment environment, which may have detrimental effects on benthic communities (Netto et al., 2010; Ellis et al., 2012; Washburn et al., 2016; De Léo et al., 2020). This study, which is part of the "Santos Project - Santos Basin Environmental Characterization - coordinated by PETROBRAS," supports future monitoring programs for the area.

Study area and sampling design
The Santos Basin (SB) is located between the Cabo Frio High (22°S), in Rio de Janeiro state (RJ), and the Florianopolis High (28.5°S), in Santa Catarina state (SC), in southeastern Brazil (Figure 1). The basin occupies an area of approximately 350,000 km², bordering several Brazilian states and reaching a water depth of 2,500 m on the São Paulo Plateau. The continental shelf is narrower off Cabo Frio (RJ) (70 km) and wider off the municipality of Santos (SP) (230 km), with a declivity ranging from 1:600 to 1:1,300 and a shelf break depth varying from 120 to 180 m (Mahiques et al., 2010). Sedimentation in the study area is strongly influenced by water mass dynamics and shelf circulation, which differentiate the sectors north and south of São Sebastião Island (24°13'S) (Piola et al., 2000; Mahiques et al., 2002, 2004; Ciotti et al., 2014). The Brazil Current plays an important role in the dynamics of the study area, carrying southward Tropical Water (TW), South Atlantic Central Water (SACW), and, to the south of the Santos Bifurcation (28°S), Antarctic Intermediate Water (AAIW) (Castro et al., 2006; Silveira et al., 2020). More details on these oceanographic features during the sampling periods can be found in Silveira et al. (2023) and Dottori et al. (2023). (Moreira et al., 2023). This study focuses only on the 2019 benthic sampling (cruises SANSED1 to 6). A total of 100 sampling stations were distributed along eight transects (A-H along a S-N gradient) perpendicular to the coast. In each transect, samples were collected at 11 isobaths (25, 50, 75, 100, 150, 400, 700, 1,000, 1,300, 1,900, and 2,400 m; Figure 1). An additional 12 stations were sampled from 1,400 to 2,250 m depth in the São Paulo Plateau region, where most of the basin's oil and gas production takes place. Sampling was carried out in July 2019 on the continental slope and plateau (400 to 2,400 m) and in November 2019 on the continental shelf (25 to 150 m).
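As a quick consistency check on the design described above, the following minimal Python sketch enumerates the transect-by-isobath grid and adds the plateau stations. The label format (transect letter plus isobath rank, e.g. "B5") is an assumption inferred from the station names used later in the text, and the variable names are illustrative only.

```python
from itertools import product

# Design figures taken from the text. The label format (transect letter +
# isobath rank) is an assumption inferred from station names such as "B5".
TRANSECTS = "ABCDEFGH"  # eight transects, south to north
ISOBATHS_M = [25, 50, 75, 100, 150, 400, 700, 1000, 1300, 1900, 2400]
N_PLATEAU = 12          # additional Sao Paulo Plateau stations
REPLICATES = 3          # validated deployments pursued per station


def transect_stations():
    """Return {label: depth_m} for the transect-by-isobath grid."""
    return {
        f"{t}{rank}": depth
        for t, (rank, depth) in product(TRANSECTS, enumerate(ISOBATHS_M, start=1))
    }


stations = transect_stations()
n_stations = len(stations) + N_PLATEAU
print(len(stations), n_stations, n_stations * REPLICATES)
```

The nominal totals (88 transect stations, 100 stations overall, 300 replicate samples) match the figures reported in the text; the number of samples actually obtained is lower wherever deployments failed.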

Sampling and sample processing
Sediment samples were collected onboard the R/V Ocean Stalwart with a spade-type box corer (0.25 m² surface area). At 20 stations, sampling was only possible with a modified Van Veen grab (231 L, 0.75 m² surface area). Both instruments have large top lids for sample access. At each station, three successfully validated deployments were pursued. Deployments were validated by visual inspection of the integrity of the sediment structure and the retention of the overlying water, which was then carefully siphoned off. In a multidisciplinary approach, in addition to meiofauna, many biological and environmental variables were collected from each box core or Van Veen grab and analyzed by other research parties. Methods for these variables are available in Moreira et al. (2023) and Carreira et al. (2023), and are summarized in Tables S1 and S2 (Supplementary Material). To collect the meiofauna samples, a cylindrical corer (5 cm diameter, 10 cm high, 19.63 cm² area) was carefully inserted into the sediment and the material was extracted, stored, and fixed with 10% buffered formalin. Sampling was incomplete at stations P1 and B5 (only two successful replicates), A7, H4, and G9 (only one successful replicate), and G11 (no successful sampling). In the laboratory, meiofauna samples were washed through two stacked sieves (500- and 45-µm apertures) to eliminate coarser sediment and larger organisms, and fine sediments, respectively. Meiofauna organisms were extracted from the material retained on the 45-µm sieve by density flotation, using Ludox TM 50 (Sigma-Aldrich) adjusted to a specific gravity of 1.18 g cm⁻³ (Somerfield and Warwick, 2013). Flotation was repeated three times for each sample. Meiofauna organisms were then transferred to 10% formalin and stained with Rose Bengal. All specimens were counted in a Dollfus plate and identified to higher taxonomic groups using a stereomicroscope. The meiofauna samples were deposited in the Biological Collection "Prof. Edmundo F. Nonato" (COLBIO-IOUSP, 2022).
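The corer dimensions given above fix the conversion from raw counts per core to the ind. 10 cm⁻² density unit used throughout this paper. The short Python sketch below shows that arithmetic; the function names are illustrative and not part of the original workflow.

```python
import math

CORE_DIAMETER_CM = 5.0  # corer inner diameter reported in the text


def core_area_cm2(diameter_cm=CORE_DIAMETER_CM):
    """Surface area sampled by a cylindrical corer, in cm^2."""
    return math.pi * (diameter_cm / 2.0) ** 2


def density_per_10cm2(count, diameter_cm=CORE_DIAMETER_CM):
    """Convert a raw count per core to individuals per 10 cm^2."""
    return count * 10.0 / core_area_cm2(diameter_cm)


print(round(core_area_cm2(), 2))           # 19.63, as stated in the text
print(round(density_per_10cm2(1000), 1))   # 509.3 ind. per 10 cm^2
```

With this corer, a raw count is roughly halved when expressed per 10 cm².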

Data analysis
The following meiofauna univariate parameters were considered: total meiofauna density, densities of the most frequent groups (present in >50% of the samples), taxa richness (S), and the Shannon diversity index (H', log base 2). Densities were adjusted to ind. 10 cm⁻², and richness and diversity were calculated per sample in R (R Core Team, 2021). The response of the univariate meiofauna parameters to environmental variables, bathymetry, and coordinates was analyzed with random forest techniques (Breiman, 2001). Meiofauna descriptors with asymmetric distributions were log (x+1) transformed. To evaluate the effect of small-scale variability on the performance and accuracy of the models, the dataset was analyzed both with the three replicates kept separate (total of 300 sampling points) and with replicates aggregated by sampling station (total of 100 stations). In both cases, our analysis followed the analytical pipeline proposed by Fonseca and Vieira (2023), which was implemented using the iMESc application (Vieira and Fonseca, 2023).

The pipeline is composed of a base model and a meta model. The base model analyzes meiofauna data in response to the 38 available environmental variables. The meta model uses the predictions of the base model to construct a reduced model (based solely on latitude, longitude, and depth) that projects the parameter onto a high-resolution bathymetric grid. Briefly, for the base model, 80% of the data were used for training and the remaining 20% for testing (validation). Random forest (RF) regression was tuned with 500 trees, repeated cross-validation (5 folds, 10 repetitions), and a random search with a tune length of 10 over the number of variables considered as candidates at each split (mtry). The tuning parameters with the lowest root mean square error (RMSE) were selected for the optimal model, and predictions were evaluated against the testing data. For the meta model, latitude, longitude, and bathymetry were used as predictors. Predictions of the meiofauna parameters were calculated for a high-resolution 2 km × 2 km bathymetric grid, totaling 100,555 data points. The same tuning settings used for the base model were applied to the meta model. Both predicted data and mean absolute percentage errors (MAPE) were plotted on the geographical map for interpretation. Depth, latitude, and longitude were also used as predictors to model and plot the top five environmental variables selected by the random forest analyses across all meiofauna indicators.
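The univariate descriptors and the replicate aggregation described in this section can be sketched in a few lines. The authors computed them in R; the Python version below is only an illustration, the function names and example counts are hypothetical, and the use of the natural logarithm for the log (x+1) transformation is an assumption because the text does not state the base.

```python
import math
from collections import defaultdict
from statistics import mean


def richness(counts):
    """Taxa richness S: number of taxa with at least one individual."""
    return sum(1 for c in counts.values() if c > 0)


def shannon_log2(counts):
    """Shannon diversity H' with log base 2, as specified in the text."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def log_x_plus_1(x):
    """log(x + 1) transformation (natural log assumed here)."""
    return math.log1p(x)


def station_means(replicates):
    """Aggregate (station, value) replicate pairs into station means."""
    grouped = defaultdict(list)
    for station, value in replicates:
        grouped[station].append(value)
    return {s: mean(v) for s, v in grouped.items()}


# Hypothetical counts for one sample, for illustration only.
sample = {"Nematoda": 80, "Copepoda": 10, "Polychaeta": 5, "Kinorhyncha": 5}
print(richness(sample), round(shannon_log2(sample), 3))
print(station_means([("A1", 10.0), ("A1", 20.0), ("A1", 30.0), ("B5", 4.0)]))
```

Averaging replicates per station removes within-station (small-scale) variance from the response, which is the contrast the two analyses (300 replicate points versus 100 station means) are designed to expose.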
Meiofauna mean density varied from 55 to 2,001 ind. 10 cm⁻². It was higher at stations along the continental shelf down to 75 m depth and at a few stations on the slope off Rio de Janeiro State, in the northern portion of the basin (Figure 2A). We observed low meiofauna densities at depths greater than 2,000 m, along the lower slope and plateau. As for total meiofauna, all meiofauna taxa that occurred in more than 50% of samples showed higher density on the shallow continental shelf, particularly in the northern portion of the basin, with the highest values just off Cabo Frio (transect H). Nematode mean density varied from 40 to 1,758 ind. 10 cm⁻² and, in addition to high densities on the northern shelf, showed high numbers on the inner continental shelf in the southern region (Figure 2C). For this taxon, we observed particularly low densities at the 150 m isobath. Copepoda densities varied from 8 to 600 ind. 10 cm⁻² and were higher in the shallow northern sector of the continental shelf (Figure 2D). Kinorhyncha and Polychaeta, although less abundant (densities ranging from 0 to 35 and from 0 to 71 ind. 10 cm⁻², respectively), also showed higher densities at the shallowest stations on the southern continental shelf, particularly at transect B, facing the Paranagua Estuary (Figure 2E-F). All taxa showed decreasing densities toward the lower slope, with the lowest values at stations located on the São Paulo Plateau (Figure 2C-F).

Environmental drivers of meiofauna taxa: base-models
When considering the three replicates, the accuracy (R²) of the RF base models for the univariate descriptors of the fauna varied from 0.32 to 0.69 on the training part of the data (Table 1). All models with three replicates had low accuracy when predicting new data (R² on the testing part < 0.18). Overall, the base models on mean values had higher training accuracies than the models dealing with replicates. Meiofauna and nematode density showed the highest accuracies, with an R² of 0.74 and 0.76, respectively. In contrast, taxa richness was the descriptor least explained by the set of environmental variables (R² 0.54; Table 1). On the testing part of the mean dataset, RF models showed accuracies varying from 0.33 to 0.79, with meiofauna, nematode, and copepod densities having R² values higher than 0.60.
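For readers comparing the training and testing accuracies above, the sketch below shows how RMSE and R² can be computed from observed and predicted values. The text does not state which R² definition the software reports, so both common forms are given as an assumption: the squared Pearson correlation and the coefficient of determination. The numbers in the example are made up.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2_squared_correlation(obs, pred):
    """R^2 as the squared Pearson correlation (insensitive to constant bias)."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def r2_determination(obs, pred):
    """R^2 as 1 - SS_res / SS_tot (penalizes bias as well as scatter)."""
    mo = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Made-up example: predictions track the observations but are biased by +0.5.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]
print(rmse(obs, pred), r2_squared_correlation(obs, pred), r2_determination(obs, pred))
```

In this example the squared correlation is 1 while the coefficient of determination is 0.8, so it is worth knowing which definition underlies a reported R² before comparing models across studies.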

Selected environmental variables
The most important environmental variables selected by the base models (among the top five with the lowest mean depth position for each meiofauna parameter) showed great variability across the Santos Basin (Figure 4). Apart from δ¹³C, the observed difference between the minimum and maximum values spanned several orders of magnitude (Table 2). The models of the spatial distribution of these variables in relation to water depth and location (latitude and longitude) showed variable levels of accuracy, with R² values varying from 0.32 to 0.78. Carbonate percentages, which were important to all meiofauna indicators, showed one of the highest R² values. We observed the highest percentages (> 60%) at the deepest stations (São Paulo Plateau) and along the 150 m isobath (Figure 4A). The standard deviation of the sediment grain size was only partially structured with depth and location (R² 0.46), with higher values at the deepest stations (Figure 4B).
Among the indicators of organic matter, phaeopigments were the most structured with depth and location (R² 0.68), showing the highest concentrations between the 25 and 100 m isobaths, with two peaks: one in the north, just off Cabo Frio, extending southward to Ilha Grande (RJ) and São Sebastião Island (SP), and another extending from the southernmost part of the Basin up to latitude 26°S. The model for chlorophyll-a (Chl.a) was less accurate than that for phaeopigments (R² 0.55). Nevertheless, it also showed higher values on the continental shelf, peaking in the northern region, just off the Cabo Frio upwelling area and extending to the southern part of RJ state, and, to a lesser extent, in the southern region of the SB. For Chl.a in particular, the central part of the continental shelf showed reduced values (Figure 4C). Although poorly related to depth and location (R² 0.32), the higher concentrations of lipids were mostly associated with the 100 m isobath, and intermediate values occurred on the slope between the 400 and 700 m isobaths (Figure 4E). Depth and location explained 66% of the spatial variability in δ¹³C (Table 2). We found the most negative values (around -24‰) on the central and northern coast, whereas we observed the least negative values (-19‰) at the deepest stations in the southern portion adjacent to the São Paulo Plateau. Inorganic phosphorus showed the highest accuracy (R² 0.78), although this variable was among the most important only for taxa richness. Its distribution was similar to that of carbonates, with low concentrations on the inner and mid-shelf and a peak around the 150 m isobath, but, unlike carbonates, it gradually decreased toward the deep sea.

Meta model: integrating the fauna and environmental variables
The accuracy of the RF meta models varied from 0.74 to 0.82 (Table 3), with a small standard deviation for each model (R² SD < 0.14). Overall, the predictions of the meta models showed the highest values on the continental shelf and slope for all meiofauna descriptors (Figure 5). The northern portion of the SB, close to the coast, had the highest densities of organisms, associated with a peak in chlorophyll-a. Total meiofauna and nematodes showed the highest densities on the continental shelf (50 and 100 m isobaths) in the northernmost region, followed by lower but still high abundances all along the coastline down to 100 m depth.
This distribution agrees with that of phytopigments (Chl.a and Phaeo) at the northernmost and southernmost stations but not in the central part of the continental shelf, where phytopigment values were particularly low at these depths. Total meiofauna and nematodes also showed a conspicuous reduction at the 150 m isobath, which coincides with carbonate concentrations above 60%. The intermediate values of meiofauna organisms on the slope, especially between the 400 and 1,000 m isobaths, are associated with intermediate lipid concentrations. The deepest regions of the SB are characterized by low densities of organisms, low concentrations of phytopigments, high concentrations of carbonates, and a heterogeneous sediment composition. Taxa richness and kinorhynch, polychaete, and copepod densities were high along the coastline but discontinuous in the central portion of the Basin (Figure 5), as observed for Chl.a (Figure 4C). Copepods, in particular, showed the highest densities in the shallowest part of the northern region (Figure 5A). Although much lower than in the northern region, copepod densities were also pronounced in the southern part of the continental shelf and between the 400 and 1,000 m isobaths on the slope. At the deepest stations, nematode abundances and taxa richness were the lowest, and Kinorhyncha, Polychaeta, and Copepoda were nearly absent.
The mean absolute percentage error (MAPE) of the meta models for taxa richness, total meiofauna, and nematode densities was relatively higher at the 150 m isobath and at deeper stations in the southern region, with errors varying from 4% to 8% (Figure S1). Polychaetes and kinorhynchs showed the highest error estimates on the Plateau, 30% and 100%, respectively.
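Because MAPE scales each error by the observed value, it is worth seeing how it behaves for sparse taxa. The sketch below is a generic implementation, not the one used in the pipeline; skipping observations equal to zero is an assumption made here to avoid division by zero, and the example values are made up.

```python
def mape(obs, pred):
    """Mean absolute percentage error, in percent.

    Observations equal to zero are skipped (an assumption of this sketch),
    because the relative error is undefined for them.
    """
    terms = [abs(o - p) / abs(o) for o, p in zip(obs, pred) if o != 0]
    if not terms:
        raise ValueError("MAPE is undefined when all observations are zero")
    return 100.0 * sum(terms) / len(terms)


# Made-up examples: an abundant descriptor versus a sparse taxon.
abundant = mape([100.0, 200.0], [96.0, 216.0])  # errors of 4% and 8%
sparse = mape([1.0, 2.0], [2.0, 4.0])           # off by only 1-2 individuals
print(round(abundant, 1), round(sparse, 1))
```

An absolute error of one or two individuals yields a percentage error near 100% when observed densities are close to zero, which is one plausible reading of the high error estimates for polychaetes and kinorhynchs on the Plateau, where these taxa were nearly absent.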

DISCUSSION

Meiofauna distribution and environmental drivers
The ranges of meiofauna densities and richness observed in the Santos Basin are comparable to those previously found on the shelf off Ubatuba (SP) and Cabo Frio (RJ) (Argeiro, 2009; Yaginuma, 2010) and in the Campos (Petrobras, 2013; Fonsêca-Genevois et al., 2017) and Espirito Santo Basins (Santos et al., 2016, 2020; Venekey et al., 2016). Meiofauna densities in these three Basins exceeded those on the shelf off Sergipe, south of Alagoas (Pinto et al., 2018).
In the Santos Basin (SB), variations in the densities of meiofauna taxa and in richness along the continental shelf and slope reflected the complex interaction between depth variation, sediment composition, and the quantity and quality of the organic matter.
The overall picture of SB meiofauna suggests that they are organized in a mosaic of six benthic zones, which were delineated based on meiofauna and benthic environmental characteristics (Figure 6). The SB inner and mid-shelf have two zones associated with distinct oceanographic processes driven by water mass dynamics, shelf circulation, and wind patterns, which promoted high inputs of phytodetritus and organic matter into the sediments: the La Plata Plume zone (LPP), in the southern region, and the Cabo Frio Upwelling zone (CFU), in the northernmost region. Such an input of high-quality organic matter fueled meiofauna abundances and supported higher taxa richness. The central region of the continental shelf comprises the third benthic zone (CCS, Figure 6). This zone was characterized as less productive, with lower abundances for most taxa (except nematodes) and lower richness. The continental shelf break, along the 150 m isobath, contains a fourth zone, characterized by high carbonate percentages in the sediment corresponding to low densities of meiofauna taxa, especially nematodes (CB, Figure 6). A fifth zone, the Upper and Mid-Slope zone (UMS, Figure 6), extended from 400 to 1,000 m depth along the entire basin and was characterized by higher meiofauna abundances, taxa richness, and lipid content than the adjacent zones. Finally, the Lower Slope and Plateau zone (LSP, Figure 6) was characterized by negligible inputs of primary organic matter and high percentages of carbonates, which were reflected in lower taxa richness and generally low abundances.

The Cabo Frio Upwelling zone (CFU)
This benthic zone comprises the northernmost stations of the continental shelf, north of São Sebastião Island (24°S) (Figure 6). It was characterized by the highest densities of all meiofauna taxa, higher taxa richness (Figure 2), and the highest concentrations of chlorophyll-a and phaeopigments (Figure 4). This region is exposed to seasonal Ekman-driven upwelling that brings cold, nutrient-rich South Atlantic Central Water (SACW) nearshore (Castro Filho et al., 1987). The SACW intrusion promotes the deposition of high quantities of fresh organic matter on bottom sediments (Arasaki et al., 2004; Carreira et al., 2012), which has been shown to increase benthic biota density and biomass (Sumida et al., 2005; De Léo and Pires-Vanin, 2006). Additionally, regional-scale studies have shown that areas with high availability of food in the sediment are typically characterized by higher local diversity (e.g., Lambshead et al., 2000; Danovaro et al., 2008), as in this study. Our sampling took place in late austral spring, when upwelling events occur with higher frequency and intensity (Castro Filho et al., 1987). Also, the intrusion and resurgence of SACW nearshore and the related increase in primary production are most pronounced from near Cabo Frio, Rio de Janeiro (Coelho-Souza et al., 2012), to São Sebastião Island (SP) (Castro Filho et al., 1987; Brandini et al., 2018), which coincides with the benthic zone established in this study. Dottori et al. (2023) confirmed upwelling conditions for our sampling period within the CFU zone. These results corroborate the strong benthic-pelagic coupling already observed for other benthic components within this zone (Sumida et al., 2005; De Léo and Pires-Vanin, 2006) and highlight the primary importance of this flux of fresh and labile organic matter in driving meiobenthic densities and diversity in the northern part of the SB continental shelf.

The La Plata Plume zone (LPP)
This zone comprises the southernmost region of the inner and mid-continental shelf of the SB and showed relatively high meiofauna densities for all taxa, as well as higher richness, although neither was as high as in the Cabo Frio Upwelling zone. Within the LPP, sediments were marked by a conspicuous area of high phaeopigment concentration on the mid-shelf and a less prominent (but higher than in the surrounding area) chlorophyll-a concentration, indicating increased phytodetritus availability. Such a pattern is likely a result of the intrusion of low-salinity, cold, nutrient-rich waters from the Sub-Antarctic Argentinian shelf, the La Plata River discharge, and the Patos Lagoon, which the Brazilian Coastal Current (BCC) carries northward and which occupy the inner and mid-shelf of the southern SB region (Piola et al., 2000; Souza and Robinson, 2004; Brandini et al., 2018). These nutrient-rich waters enhance primary production (Brandini et al., 2018), which, in turn, influences the deposition of pelagic organic matter to the sediment (Mahiques et al., 2004). Our results indicate that all meiofauna taxa respond to this fresh phytodetritus input with increased density and taxa richness. The size of this benthic zone may vary seasonally as a consequence of the increased La Plata River discharge during winter (Guerrero et al., 1997) and the dominant southwest winds, which force the BCC northward and spread these waters as far north as 24°S, even during low river discharge periods (Souza and Robinson, 2004; Möller et al., 2008). In November 2019 specifically (the period of our sampling campaign), salinity data indicate the contribution of coastal waters influenced by the La Plata River plume up to 25°S (transect B) (Dottori et al., 2023), which coincides with the LPP we defined based on meiofauna data and phytodetritus indicators, suggesting a strong benthic-pelagic coupling within this zone.

The Central Continental Shelf zone (CCS)
This zone comprises the inner and middle shelf between the CFU and LPP zones. The descriptors of organic matter indicate that, during the campaign, the CCS showed oligotrophic conditions with a dominance of detritus from the continent (more negative δ¹³C values) and limited input from the pelagic system (low concentrations of Chl.a and Phaeo) (Carreira et al., 2023). The scarcity of phytodetritus was reflected in low kinorhynch, polychaete, and copepod densities. It is reasonable to conclude that these three taxa prefer the more productive zones of the continental shelf, with densities declining toward this central zone and toward the deeper and less productive zones. The literature supports the association of these three benthic taxa with pelagic processes. Harpacticoid copepods are diatom feeders (Troch et al., 2005; Mascart et al., 2013; Urban-Malinga, 2014) and their abundances are often related to the phytopigments, organic matter, and bacteria in the sediment (Mohamed et al., 2018; Veit-Köhler et al., 2018; Pruski et al., 2021). Kinorhynchs feed mainly on detritus and diatoms and are generally associated with sediments with high organic matter content (Giere, 2009; Landers et al., 2020). Meiofaunal annelids inhabit a variety of environments and their abundances are also related to an increase in organic content (Villora-Moreno et al., 1991; Giere, 2009; Pruski et al., 2021). An interesting aspect is that nematodes showed relatively high densities in this zone (similar to those registered for the LPP). The influence of terrestrial organic matter input in this region has been described by Corbisier et al. (2014) and Mahiques et al. (2014). The input of such terrestrial organic matter from the estuarine systems (Santos, Juréia-Itatins, and Cananéia-Iguape) may play an important role in supporting high nematode populations. Although nematodes also largely rely on phytodetritus, the high trophic diversity within the group might support equally dense populations of species that rely upon other food sources and associated microbiota (Moens et al., 2013).

Predicting spatial patterns of marine meiofauna. Ocean and Coastal Research 2023, v71(suppl 3):e23037. Gallucci et al.
Since our data are limited in time and taxonomic resolution, two important aspects require further consideration in a monitoring plan for this zone. The first is the temporal dynamics of the LPP and CFU oceanographic processes. Such a monitoring plan can expect benthic organisms to respond rapidly to changes (reduction or increase) in the spatial extent of these pelagic processes. LPP and CFU variability are well documented in the literature (Castro Filho et al., 1987; Castro et al., 2006; Piola et al., 2008). For instance, during wintertime, the contribution of the La Plata River reaches up to 24°S, covering half of the CCS. In the same way, the water mass that characterizes the CFU zone can move southward during spring-summer months. As such, the synchrony in space and time of the benthic-pelagic coupling might be decisive for understanding the characteristics of the CCS. The second aspect concerns species composition and endemism, which are critical for a monitoring plan (Faith et al., 2004). It is important to understand whether this zone can be distinguished by a particular set of species or whether it acts as a sink for species coming from the adjacent productive zones.

The Upper and Mid-Slope zone (UMS)
This zone comprises the continental slope, extending from 400 to 1,000 m depth along the entire basin (Figure 6). It is characterized by higher taxa richness and relatively higher densities of nematodes, copepods, and kinorhynchs than its surroundings (i.e., the CB and LSP zones). The UMS is also defined by a band of relatively high lipid content that was not accompanied by chlorophyll-a or phaeopigments. The same zone was also evaluated by Carreira et al. (2023). In addition to lipid content, the authors showed that the biopolymeric carbon content was high within this bathymetric zone, indicating the high nutritional value of the available organic matter (Carreira et al., 2023). In an oligotrophic context, as observed for the UMS, the quality of the organic compounds in the particulate organic carbon is critical to the fauna (Danovaro et al., 1999; Campanyà-Llovet et al., 2017) and might explain the higher densities and richness of meiofauna within this zone.
Unlike in the zones on the continental shelf (in which the meiofauna patterns could be clearly linked to oceanographic processes), the origin of the high-quality organic matter in the UMS zone is not evident. Carreira et al. (2023) proposed two processes, the first more plausible than the second: 1) this organic matter was laterally advected (supported by the absence of labile pigments) or 2) it was produced in the photic zone offshore and directly transported to the sediment. Although considered less plausible, the second hypothesis could be supported by the occurrence of a deep chlorophyll maximum layer (DCML), corresponding to a maximum chlorophyll-a concentration in subsurface waters at the base of the euphotic zone, described by Tura and Brandini (2020). The DCML occurs during events of oligotrophic water dominance above the continental shelf and slope together with shelf-break upwelling events, contributing to the substantial production and accumulation of subsurface particulate matter stocks in the water column (Tura and Brandini, 2020). Additionally, above the slope at around 500 m, the accumulation of particles is favored by the transition between the overlying southward Brazil Current and the underlying northward Intermediate Western Boundary Current, resulting in current velocities equal to zero (Mahiques et al., 2021).

The Carbonate zone (CB)
As shown by the RF analysis, the calcium carbonate percentage was an important predictor for all meiofauna indicators. The area has two distinct regions with high carbonate percentages: a conspicuous carbonate belt between the 100 and 200 m isobaths at the shelf break (classified as the Carbonate zone; CB, Figure 6) and a calcareous ooze below 1,300 m (Lower Slope and Plateau; LSP, Figure 6). The CB zone is a deposit of calcareous algae formed during the Quaternary as a consequence of sea-level oscillations (Dominguez et al., 2013). The predominance of carbonates at the shelf break, together with the low concentration of organic matter and phytopigments, indicates an area of strong hydrodynamics. Along the shelf break of the SB in particular, the main flow of the Brazil Current brings warm, salty, oligotrophic Tropical Water from the northeast (Brandini et al., 2018). These nutrient-poor waters and low sedimentation rates (Mahiques et al., 2004) contribute to the oligotrophic conditions observed in this benthic zone. The impoverished abundance of meiofauna (especially nematodes) in coarser sediments and oligotrophic areas is well documented (Giere, 2009). Among the benthic zones, error estimates in the CB zone were relatively high, suggesting that such reduced densities are more difficult to predict under a scenario of environmental change. Additional stations and seasonal sampling would improve our understanding of this zone.

The Lower Slope and Plateau zone (LSP)
This zone incorporates the isobaths from 1,300 to 2,400 m depth, including all the stations from the São Paulo Plateau. It was characterized by very low densities of all meiofauna taxa as well as an impoverishment in taxa richness since some of the taxa were restricted to the US and upper slope. Low abundances and higher taxa diversity in the lower slope and bathyal depths are well reported in the literature (Danovaro et al., 1999; Netto et al., 2005; Rex et al., 2006; Schmidt and Martínez Arbizu, 2015) and are generally associated with a low input of food resources (Soltwedel, 2000). On average, although 25 to 50% of the primary productivity reaches the seabed in coastal areas, only about 1% is delivered to the deep sea (Suess, 1980). In addition to these low quantities, the organic matter reaching the seafloor is generally highly refractory since bacteria attach to and transform descending phyto-detrital particles as they sink, degrading the most labile organic compounds before reaching the deep-sea floor (Danovaro et al., 1999). Likewise, indicators of food resources at the SB, such as phaeopigments, chlorophyll-a, and lipids, were extremely low in stations that comprised the LSP zone (Carreira et al., 2023).
Another conspicuous feature of this zone is a calcareous ooze just below 1,300 m, particularly at the São Paulo Plateau region, characterized by carbonate percentages as high as 80% (Figueiredo et al., 2023). In this zone, calcium carbonates stem from the deposition of biogenic detritus, mainly pelagic foraminiferans and pteropod shells. This deposition may not be recent but it strongly affects the sediment structure and contributes to a greater grain size standard deviation. Despite its importance as an environmental predictor, it is unlikely that this parameter overcomes the effects of low food availability at these greater depths, as discussed above. Information on assemblage composition at a higher taxonomic resolution may help to disentangle the different contributions of these environmental predictors in driving meiofauna communities in this particular zone.

Implications for the ecosystem-based management of the Santos Basin
An ecosystem-management approach consists of two principles: (1) understanding the potential risks of anthropogenic pressures and (2) evaluating how key ecosystem indicators will respond to those risks (Leslie and McLeod, 2007; Long et al., 2015). Mapping risks and modeling responses are, thus, critical issues for such a program to succeed and, most importantly, conserve ecosystems. This study showed that meiofauna descriptors are tightly linked to changes in pelagic and sedimentary environments. Nevertheless, as discussed above, although some zones can be recognized by most meiofauna descriptors, others are better characterized by specific ones. Such complementarity should be considered during the implementation of an integrated program to monitor multiple aspects of an ecosystem (Dale and Beyeler, 2001). Particularly in the SB, data suggest that nematodes might be more sensitive to changes in carbonate percentages, whereas copepods, polychaetes, and kinorhynchs might be more sensitive to subtle variations in primary productivity and its subsequent settling, as observed between the LPP and CCS zones. Copepods can also be sensitive to variations in the quality of organic matter, particularly in the UMS zone. Therefore, any anthropogenic pressure that may change these variables, such as climate change (Danovaro et al., 2004), mining (Miljutin et al., 2011), and oil and gas exploration (Netto et al., 2009; Montagna et al., 2013; Reuscher et al., 2017; Rohal et al., 2020), is expected to affect the meiofauna.
An additional important aspect to be considered is that, of the 38 evaluated environmental variables, only 15 have been used to retrieve accurate predictions, six of which were among the top five for all models. Such findings should also be considered to optimize future monitoring programs. In practice, this means that increasing sampling coverage and time intervals for a set of selected environmental variables will enable better predictions of the state of meiofauna indicators. In this monitoring scenario, as soon as an unexpected value (out of the predictions) is detected, an intensive but localized sampling campaign including additional environmental and fauna parameters should be conducted. The principle of adaptive monitoring programs (Iacono et al., 2010) is such an approach, reducing costs and improving the understanding of the system. It may also serve as the baseline to build a long-term machine learning monitoring program (Fonseca and Vieira, 2023). Within this context, a critical point to be considered is the establishment of tolerable change thresholds (Johnson, 2013). Based on our results, the thresholds for the SB can be spatialized in a 2 km × 2 km bathymetric grid. Thus, for each grid cell, we recommend the use of lower and upper quantiles (i.e., the 2.5% and 97.5% quantiles) of the predicted value of a fauna descriptor. The use of quantiles, instead of confidence intervals, is recommended because the latter assume a normal variance distribution at each grid cell (not the case in such a large multidisciplinary dataset). Note that this range must be constantly re-evaluated according to the objectives of the program (Fidler et al., 2006).
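The quantile-based thresholds described above can be illustrated with a minimal Python sketch. It assumes that, for each grid cell, a set of predicted values of a fauna descriptor is available (for example, from repeated model runs); how these values are generated is not specified here. The function names `quantile`, `cell_thresholds`, and `flag_unexpected` and the cell identifiers are hypothetical and serve for illustration only; this is not the implementation used in this study.

```python
from typing import Dict, List, Tuple


def quantile(values: List[float], q: float) -> float:
    """Quantile of a non-empty list by linear interpolation (q in [0, 1])."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def cell_thresholds(
    predictions: Dict[str, List[float]],
    low: float = 0.025,
    high: float = 0.975,
) -> Dict[str, Tuple[float, float]]:
    """Lower and upper quantile of the predicted values in each grid cell."""
    return {
        cell: (quantile(values, low), quantile(values, high))
        for cell, values in predictions.items()
    }


def flag_unexpected(
    observed: Dict[str, float],
    thresholds: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Grid cells whose observed value falls outside the predicted range."""
    flagged = []
    for cell, value in observed.items():
        lower, upper = thresholds[cell]
        if value < lower or value > upper:
            flagged.append(cell)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical predicted densities for two grid cells (assumed values).
    predicted = {
        "cell_001": [120.0, 135.0, 150.0, 160.0, 180.0],
        "cell_002": [10.0, 12.0, 15.0, 18.0, 25.0],
    }
    limits = cell_thresholds(predicted)
    print(flag_unexpected({"cell_001": 300.0, "cell_002": 14.0}, limits))
```

In an adaptive monitoring scheme, the cells returned by `flag_unexpected` would be candidates for the intensive, localized sampling campaign mentioned above. No normality assumption is needed because the limits are read directly from the sorted predicted values.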
Sampling coverage and resolution are two other aspects that require attention. The current sampling program assumed a regular bathymetric-oriented grid design with three replicates per station (Moreira et al., 2023). Based on the model accuracy and the observed error estimates, coverage and sampling intensity can be optimized. Without considering temporal variation, the coverage of future monitoring programs could be adjusted to concentrate more stations on the regions with higher error estimates (e.g., the CB zone and deeper stations in the southern regions) and more sparsely distribute stations where estimations are more precise (e.g., shallower stations). The inclusion of temporal data will certainly add extra variability, aiding the decision of a more appropriate sampling design. Regarding sampling effort at the smaller scale (hundreds of meters in this study), the models were well adjusted either using three replicates or mean values, but only the models on means obtained accurate predictions (test data, Table 1). This indicates that the models were unable to predict within-station variability across the basin. This can be interpreted in the light of the patch mosaic model (Gallucci et al., 2008), in which each station is under distinct structuring local fauna processes. The consequence for future large-scale monitoring programs is that composite samples should be considered when characterizing an area of hundreds to thousands of meters. The use of composite samples is already applied in terrestrial environments (e.g., Patil, 1995; King et al., 2006; Griffiths et al., 2016), but is less explored in oceanographic programs (Cho et al., 2021). We should emphasize that any design of sampling coverage and intensity (i.e., replicates) must be carefully analyzed with the sources of anthropogenic stressors and other ecological indicators (Long et al., 2015).
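As an illustration of how station coverage might be weighted by error estimates, the sketch below distributes a fixed number of stations among zones in proportion to an error measure (e.g., RMSE), with a minimum number of stations per zone. The proportional rule, the function name `allocate_stations`, and the example error values are assumptions made for illustration; they are not results or a procedure from this study.

```python
from typing import Dict


def allocate_stations(
    zone_error: Dict[str, float],
    total: int,
    minimum: int = 1,
) -> Dict[str, int]:
    """Distribute `total` stations among zones in proportion to their error
    estimates, guaranteeing at least `minimum` stations per zone."""
    if not zone_error:
        raise ValueError("zone_error must not be empty")
    if any(error < 0 for error in zone_error.values()):
        raise ValueError("error estimates must be non-negative")
    if total < minimum * len(zone_error):
        raise ValueError("total is too small for the requested minimum")

    spare = total - minimum * len(zone_error)
    error_sum = sum(zone_error.values())
    if error_sum == 0:
        # No information on error: split the spare stations equally.
        shares = {zone: spare / len(zone_error) for zone in zone_error}
    else:
        shares = {
            zone: spare * error / error_sum
            for zone, error in zone_error.items()
        }

    # Integer part first, then hand out the rest by largest remainder.
    allocation = {zone: minimum + int(share) for zone, share in shares.items()}
    leftover = total - sum(allocation.values())
    by_remainder = sorted(
        shares, key=lambda zone: shares[zone] - int(shares[zone]), reverse=True
    )
    for zone in by_remainder[:leftover]:
        allocation[zone] += 1
    return allocation


if __name__ == "__main__":
    # Hypothetical error estimates per zone (assumed values).
    print(allocate_stations({"CB": 3.0, "shallow shelf": 1.0}, total=10))
```

Such a rule only addresses spatial coverage; as noted above, any final design would also need to account for temporal variation, the sources of anthropogenic stressors, and other ecological indicators.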

CONCLUSION
Meiofauna in the Santos Basin are spatially organized in a mosaic of six benthic zones. For monitoring purposes, this study showed that, although some zones can be recognized by most meiofauna descriptors, others are better characterized by specific descriptors, implying that meiofauna indicators should be monitored concomitantly. Additionally, the results of our machine learning models showed that 15 environmental variables sufficed to retrieve accurate predictions, six of which were among the top five for all models. These results can support the optimization of future monitoring programs regarding the sampling coverage, intensity, and environmental variables needed to reduce costs and increase our understanding of the system. The monitoring program to be implemented should preferably be adaptive and based on long-term learning algorithms.

Figure 2. Mean densities of total meiofauna and most frequent taxa (ind. 10 cm⁻²) and taxa richness at stations sampled along the Santos Basin in the Brazilian continental margin. A) Total meiofauna; B) Taxa richness; C) Nematoda; D) Copepoda; E) Polychaeta; F) Kinorhyncha.

Figure 5. Results of the predictions based on meta models for the meiofauna descriptors in the Santos Basin. Total meiofauna (A) and Nematoda (C) are shown in ind. 10 cm⁻². Copepoda (D), Polychaeta (E), and Kinorhyncha (F) data were log(x+1) transformed.

Figure 6. Scheme of the Santos Basin depicting potential benthic zones based on the response of the meiofauna descriptors to the environmental conditions.
, in which we observed the highest values of phytodetritus and meiofauna indicators. Nonetheless, Cabo Frio upwelling fronts reach 300-400 km southward up

Predicting spatial patterns of marine meiofauna. Ocean and Coastal Research 2023, v71(suppl 3):e23037

Table 1. Results of the random forest of the Base Model for each meiofauna indicator for the training and test portion of the data set. mtry: number of variables randomly sampled as candidates at each split point, RMSE: root mean square error, R²: percentage of variance explained, MAE: mean absolute error, SD: standard deviation. N: meiofauna density, S: number of taxa.

Table 2. Minimum, mean, and maximum values of the environmental variables selected by the base models and results of the random forest model from data derived from the mean value observed per station. RMSE: root mean square error, R²: percentage of variance explained, MAE: mean absolute error, SD: standard deviation.

Table 3. Results of the Meta model for the variables that had R² larger than 0.6 on the test part of the Base model (Table 1). RMSE: root mean square error, MAE: mean absolute error, SD: standard deviation, N: meiofauna density, S: taxa richness.