Enhanced rock weathering (ERW) is moving from research and pilots into commercial deployment at scale. The first major tranche of deployment-scale measurement, reporting, and verification (MRV) data from ERW suppliers has now been published through the Isometric and Puro.earth registries, under successive versions of their respective methodologies: Isometric's Enhanced Weathering in Agriculture Protocol (versions 1.0, 1.1, and 1.2; London: Isometric) and Puro.earth's Enhanced Rock Weathering Methodology (first edition; Helsinki: Puro.earth). Now that meaningful tonnage is moving through verification, this kind of public, registry-grade data is enormously valuable. Every delivered tonne accompanied by publicly disclosed underlying data sharpens the field's collective understanding of how ERW behaves in the real world.

Mati will soon be adding to this data pool, and we want to take this opportunity to show why we believe our forthcoming carbon dioxide removal (CDR) claims rest on robust data and a rigorous understanding of the methodology behind them. In this piece, we walk through some of the challenges in producing ERW carbon removal credits, lay out what we think it takes to make a scientifically and statistically supported CDR claim, and share an interactive tool we've built to help others meet those goals. We hold our own work to these same expectations.

Solid-phase MRV is the methodology behind all of these claims. It samples soil before and after basalt application and reads the difference in soil chemistry to track how much rock has dissolved and how much CDR resulted.

Improvements to our logistical and analytical pipelines have meaningfully increased the quality of our solid-phase MRV datasets over the last year. In the process, we have built workflows that we trust as scientists and tools that we believe the enhanced weathering community needs in order to expand this CDR pathway.

The tool we're shipping today includes four datasets from three Mati deployment clusters, all from the Kharif (monsoon) 2024 season in India. Three are currently under review for crediting. The fourth, Seoni NCR (no credit), is a predecessor dataset to the creditable Seoni deployment also presented in the tool. We held it back from crediting because we judged its data as insufficiently robust, and we include it here to make that judgment inspectable.

A separate version of the explorer that accepts user-uploaded solid-phase MRV data is also available on GitHub.



What brought us here in the first place


Solid-phase, or soil-based, MRV is the primary method of carbon dioxide removal quantification for most ERW providers, including the deliveries published through Isometric and Puro.earth. The approach, nicknamed "TiCAT", is a simple mass-balance model formalized for ERW by Reershemius et al. (2023) (Reershemius, Tom, Mike E. Kelland, Jacob S. Jordan, Isabelle R. Davis, Rocco D'Ascanio, Boriana Kalderon-Asael, Dan Asael, et al. "Initial Validation of a Soil-Based Mass-Balance Approach for Empirical Monitoring of Enhanced Rock Weathering Rates." Environmental Science & Technology 57, no. 48 (2023): 19497–19507) and draws on long-standing concepts in the chemical weathering literature.

The basic idea is relatively simple: elements in rocks can generally be classified as "immobile" or "mobile." Immobile elements are locked up in refractory mineral phases that resist weathering, so they stay in the soil. Mobile elements are released into solution by weathering reactions and easily carried away by water. When we apply rock powder to soil and track the ratio of immobile to mobile elements over time, we can quantify the extent of weathering. From this, we can calculate how much rock weathering and corresponding CDR occurred.

Figure 1. The three-stage signal solid-phase MRV measures. Left: basalt is broadcast and tilled into the topsoil, mixing immobile (I) and mobile (X) elements from the rock down into the soil column (curved arrows) and raising both tracers above baseline. Middle: atmospheric CO2 dissolves in soil water to form carbonic acid (CO2 + H2O), which percolates into the soil (downward arrows) and attacks the basalt. Right: weathering releases the mobile elements (X, now dashed); they exit the field as bicarbonate-bound dissolved inorganic carbon (HCO3 + X, downward arrows), leaving the immobile elements (I) locked in place. The change in I/X between left and right is the fingerprint solid-phase MRV reads.

The mass-balance equations themselves are simple, written to describe a mixture of ERW feedstock and soil. Because we cannot directly measure how much rock has made it into a given soil sample, we must either assume our deployment logs are perfect and the rock was evenly spread, or infer it from the chemistry of that sample using the concentrations of immobile (I) and mobile (X) elements. Both quantities feed the mass balance, but the main fingerprint of rock addition typically comes from I, which is preserved through weathering.

The problem is that calculations of the weathering extent (τ) become unstable when the measured change in immobile tracer concentration is small. Solving the mass balance equations for τ gives

τ = 1 − (ΔX + r·Xs) / (r·Xb),

where Xb is the mobile-element concentration of the basalt feedstock as applied to the field (the rock layer in Figure 1, left), Xs is that of the baseline soil sampled before basalt application (the soil layer in Figure 1, left), and ΔX is the change in mobile-element concentration in the post-application soil sample relative to the baseline (Figure 1, right). The rock fraction (r) sits in the denominator. When the detected increase in an immobile element is small, the estimate of how much rock has weathered is drastically more sensitive to noise in the data. This dynamic makes it more difficult to resolve statistically significant rock-weathering signals. The same instability can appear when the immobile signal is large but not statistically distinguishable from noise.
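You can see the blow-up numerically with a few lines of Python. The compositions below are hypothetical placeholders chosen only to make the arithmetic visible; the point is how violently τ swings as r shrinks while the measured ΔX stays fixed.

```python
def tau(delta_X, r, X_s, X_b):
    """Weathering extent from the mass balance:
    tau = 1 - (delta_X + r*X_s) / (r*X_b)."""
    return 1.0 - (delta_X + r * X_s) / (r * X_b)

# Hypothetical compositions (e.g. Ca in wt%):
X_s, X_b = 0.5, 7.0   # baseline soil and basalt feedstock
delta_X = 0.02        # small measured mobile-element change

# A modest change in the inferred rock fraction swings tau wildly,
# eventually past the physical bounds [0, 1]:
for r in [0.02, 0.01, 0.005, 0.002]:
    print(f"r = {r:.3f} -> tau = {tau(delta_X, r, X_s, X_b):+.2f}")
```

As r drops from 2% to 0.2%, the implied τ slides from roughly +0.8 down through zero and into non-physical negative territory, which is exactly the instability the widget below visualizes.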

Whether the noise arises from subtle variations in soil geochemistry, sampling biases, or analytical uncertainty, the end result is the same: the apparent weathering extent can swing between arbitrarily large and small values, with large values being strongly favored. Ultimately, flawed or even erroneous CDR claims could be made based on such data. You can explore this effect for yourself in the widget below.

Figure 2. TiCAT asymptote explorer. The two panels show how mobile element change (ΔX), weathering extent (τ), and soil rock fraction (r) are related. Drag the sliders around to visualize how weathering extent varies nonlinearly with rock fraction at different ΔX. As you slide the rock fraction toward zero on either panel, note how the implied weathering extent goes haywire: this is the singularity that naturally arises from the 1/r limit baked into the mass balance equations used for solid-phase MRV. The hatched strip at small negative r is non-physical (rock fractions can't go below zero), but is shown to make the bidirectional nature of the swing across r = 0 visible. Suppliers that report only clipped values τ ∈ [0, 1] can mask this mathematical instability in their data.

Despite this model limitation, however, these mass-balance equations are at the heart of solid-phase MRV and constitute the primary carbon-crediting mechanism for enhanced rock weathering. Almost all suppliers use a version of these mass balances, so it's critical to figure out the model's limitations and the implications for CDR deliveries.

We'll note that the singularity is just one challenge amongst many for the ERW practitioner. In practice, soils are heterogeneous, sampling captures only a small slice of any given field, and analytical measurements carry their own error bars. These real-world sources of uncertainty compound with the model's inherent instability, making the process of pulling a distinct weathering signal out of field data more demanding than the equations alone might suggest.

That all said, we think we've figured out an approach that is robust and statistically significant.

Resolving the weathering signal from the noise

Navigating the combination of theoretical instability, operational noise, and associated measurement uncertainties for MRV requires more than just inspection of the equations described above. To that end, we developed an interactive tool that allows anyone to explore how solid-phase MRV datasets lead to CDR claims.

Specifically, this tool evaluates the statistical robustness of solid-phase MRV datasets. Its inputs are the essential components of any solid-phase MRV dataset: baseline soil composition, feedstock composition, and the changes in the soil's elemental composition after rock application and re-sampling. You can adjust the sliders to represent any study parameters: compositions of soil and feedstock, changes in mobile and immobile concentrations, and the variance (distributional width) of those changes.

The tool then outputs the following results:

  1. Statistical significance of each tracer's observed change — are the measured changes in the immobile and mobile tracers robust? Do the data constrain the credible intervals (e.g. 68% or 95% CI)?
  2. False-positive rate — what are the chances that these data represent a false-positive signal for weathering?
  3. Estimated rock application and weathering fractions — based on the tracer data alone, how much rock was applied to the field, and how much of that rock weathered away?

Passing statistical tests greatly improves the chances that the associated CDR claim is robust, which is why we assess statistical significance in two ways, forwards and backwards. Our forward tests look at the raw data and ask how likely it is that noise alone could produce the observed change in concentration relative to the soil baseline. Our backward (Bayesian) tests interrogate the claim itself by asking how plausible a range of rock fractions and weathering extents are given the data.

The tool presented here also visualizes the likelihood of rock-fraction and weathering-extent combinations being consistent with the data (i.e., the Bayesian posterior). It creates a bridge between the simplistic mixing theory laid out by previous workers and messy real-world data: for a given data set, you can see how well the data constrains r and τ, as well as how close the data sits to the unstable (low-r) regime.

Given the asymptotic behavior, in which small tracer signals tend to inflate CDR claims, testing the significance of the tracer signals is paramount. If, for example, an apparent immobile tracer shift were observed in a set of samples not because rock powder was present but simply because of natural variability in the soil, then the mobile tracer might not be correlated with it; the addition of raw, unweathered feedstock might never have been detectable in those particular samples. In effect, the resulting data would suggest that the feedstock completely weathered away when it was never even sampled in the first place.

If a dataset successfully clears the statistical gates we've imposed, then our tool allows one to quantify not just what fraction of the rock was weathered, but also how much rock (as a fraction of the soil sampled) was applied in the first place — because rock application is a noisy process too! The methodology we developed to work with these datasets allows us to account for variable rock spreading rates, something incredibly important for supporting ERW projects with smallholder farmers (Jordan, Jacob S., et al. "Enhanced rock weathering for improved smallholder farmer welfare: An at-scale case study for rice agriculture in India." CDRXIV preprint, 2026). We imagine it will be useful for those in other contexts, too.



Technical notes


Mass-balance equations for solid-phase MRV

The forward model is a three-endmember mass balance. A post-application soil sample is a mixture of untouched soil (s), unweathered basalt (b), and the residue of weathered basalt (w). For any element, the post-application concentration is a weighted sum of the three endmembers,

Epost = fs·Es + fb·Eb + fw·Ew

with the closure constraint that the three fractions sum to one,

fs + fb + fw = 1.

Two quantities determine CDR. The rock fraction (r) is the fraction of basalt-derived material in the sample. The weathering extent (τ) is the fraction of that basalt that has dissolved. Both follow directly from the three component fractions:

r = fb + fw,  τ = fw / r.

An immobile tracer such as titanium doesn't dissolve, so its post-application change isolates the rock fraction:

ΔI = r·(Ib − Is).

A mobile tracer (calcium, magnesium) is released by weathering; its observed change combines basalt arrival with subsequent dissolution:

ΔX = r·[(1 − τ)·Xb − Xs].

Solving this for the weathering extent puts the rock fraction (r) in the denominator. This is the structural reason small r inflates the implied τ — the asymptotic regime visualized in the explorer earlier:

τ = 1 − (ΔX + r·Xs) / (r·Xb).
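The forward model and its inversion can be sketched in a few lines of Python. This is a minimal round-trip check, not the tool's actual implementation; the compositions are hypothetical, and the weathered residue is assumed to retain no mobile elements, as in the equations above.

```python
def forward(r, tau, I_s, I_b, X_s, X_b):
    """Forward mass balance: tracer changes implied by rock fraction r
    and weathering extent tau (mobile residue assumed fully leached)."""
    dI = r * (I_b - I_s)
    dX = r * ((1 - tau) * X_b - X_s)
    return dI, dX

def invert(dI, dX, I_s, I_b, X_s, X_b):
    """Invert the same equations: r from the immobile tracer, then tau."""
    r = dI / (I_b - I_s)
    tau = 1 - (dX + r * X_s) / (r * X_b)
    return r, tau

# Hypothetical compositions (wt%): Ti-like immobile, Ca-like mobile.
I_s, I_b, X_s, X_b = 0.3, 1.5, 0.5, 7.0
dI, dX = forward(r=0.01, tau=0.4, I_s=I_s, I_b=I_b, X_s=X_s, X_b=X_b)
r_hat, tau_hat = invert(dI, dX, I_s, I_b, X_s, X_b)
print(r_hat, tau_hat)  # recovers r = 0.01, tau = 0.4
```

With noise-free synthetic data the inversion recovers the inputs exactly; the rest of this section is about what happens when the observed ΔI and ΔX carry real-world uncertainty.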

Statistical significance of the observed change: paired and unpaired t-tests

First, the tool will tell you whether the data describe a statistically significant change in both the immobile and mobile element concentrations of the soil between the baseline and post-application sampling. This is an extremely important property of any solid-phase dataset: it quantifies how likely it is that an observed change in a tracer could have been generated just by randomly resampling the baseline data. Before any data undergo further processing (e.g., bootstrapping), the statistical robustness of the observed signal's magnitude should be constrained; otherwise, resampled noise risks being interpreted as signal, understating the likelihood that no signal was detected at all. Importantly, the distribution of a noisy signal can still produce a weathering rate entirely above zero; this does not mean the signal is statistically significant.

In our tool, we show the results of both a paired t-test and a Welch's t-test to demonstrate the effect of pairing. Some datasets with more consistent sampling and analytical pipelines will produce paired data (samples collected from the same sampling locations) that provide information to constrain rock application that is unobservable at the population scale. Conversely, some datasets will come from noisier sampling and analytical pipelines but capture such a large signal that the change is statistically significant at the population scale. Note that we hardcoded the correlation (Pearson's r) between pre- and post-deployment sampling to 0.75, which sets the amount of information gained from pairing: higher correlation would translate to more information gained from pairing, whereas zero correlation would mean pairing adds nothing.
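To illustrate why pairing matters, here is a small numpy sketch (with entirely hypothetical numbers) that simulates correlated baseline and post-application samples and compares the paired t-statistic against the unpaired Welch one:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate paired baseline/post samples with correlation ~0.75, a small
# true shift, and field-scale scatter (all values hypothetical).
n, rho, shift, sigma = 30, 0.75, 0.05, 0.15
common = rng.normal(0.0, sigma * np.sqrt(rho), n)
baseline = 1.0 + common + rng.normal(0.0, sigma * np.sqrt(1 - rho), n)
post = 1.0 + shift + common + rng.normal(0.0, sigma * np.sqrt(1 - rho), n)

# Paired t: the variance of within-location differences cancels the
# shared (correlated) component of the scatter.
d = post - baseline
t_paired = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# Welch t: treats the two sets as independent, ignoring the pairing.
se = np.sqrt(post.var(ddof=1) / n + baseline.var(ddof=1) / n)
t_welch = (post.mean() - baseline.mean()) / se

print(f"t_paired = {t_paired:.2f}, t_welch = {t_welch:.2f}")
```

Because the two samples share a correlated component, the paired statistic is larger in magnitude than the Welch statistic for the same mean shift, which is exactly the extra information pairing buys.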

We also output the "retroactive" statistical power of the test, noting that statistical power is generally useful as a study-design tool rather than an evaluation metric, as it is functionally a reparameterization of a t-test. Still, we think it has some value here if the tool is used to explore theoretical signals instead of real cases.

Estimated rock fraction and weathering rate

We use the Bayesian construction proposed in Jordan et al. (2026) to show the rock fraction in soil and the weathering fraction inferred from the dataset. The HPD (highest posterior density) region describes the high-confidence region for these two parameters given the observed dataset. It's important to visualize the data this way because it underlines how the rock-fraction parameter, which is derived from the immobile tracer, directly interplays with the weathering rate. Importantly, an immobile tracer shift of any size or significance, coupled with an insignificant change in mobile cation concentration, will generate a high apparent weathering rate.
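The shape of such a posterior can be reproduced with a minimal grid-based sketch. This is not the tool's actual implementation: it assumes independent Gaussian likelihoods on the observed tracer changes, uniform priors on the grid, and hypothetical compositions and uncertainties throughout.

```python
import numpy as np

# Hypothetical compositions (wt%) and observed tracer changes with 1-sigma
# uncertainties; none of these values come from a real deployment.
I_s, I_b, X_s, X_b = 0.3, 1.5, 0.5, 7.0
dI_obs, sI = 0.012, 0.004   # immobile change +/- 1 sigma
dX_obs, sX = 0.020, 0.015   # mobile change +/- 1 sigma

r_grid = np.linspace(1e-4, 0.03, 300)
tau_grid = np.linspace(0.0, 1.0, 300)
R, T = np.meshgrid(r_grid, tau_grid)

# Forward-model predictions on the (r, tau) grid.
dI_pred = R * (I_b - I_s)
dX_pred = R * ((1 - T) * X_b - X_s)

# Uniform priors on the grid, so the posterior is proportional to the
# Gaussian likelihood of both tracer observations.
logL = (-0.5 * ((dI_pred - dI_obs) / sI) ** 2
        - 0.5 * ((dX_pred - dX_obs) / sX) ** 2)
post = np.exp(logL - logL.max())
post /= post.sum()

# Marginal posterior medians for r and tau.
r_med = r_grid[np.searchsorted(post.sum(axis=0).cumsum(), 0.5)]
tau_med = tau_grid[np.searchsorted(post.sum(axis=1).cumsum(), 0.5)]
print(f"median r ~ {r_med:.4f}, median tau ~ {tau_med:.2f}")
```

Tightening sI narrows the posterior along r while widening sX smears it along τ, which makes the interplay between the two parameters easy to see.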

False Positive Rate (FPR)

We also describe another method for scrutinizing data using a "permutation null" test. The rationale is simple: if the data contain a real signal, then randomly swapping the pre- and post-weathering data (to yield new synthetic ΔI and ΔX values) will produce a dataset that looks different from the original one; the swap nulls the signal. If the data are just random noise, however, then the permuted dataset will look similar to the original. After a large number of random draws, the fraction of null draws that match or exceed the original signal estimates the probability that the observed signal could have been produced by noise alone.

Because both data distributions in the web tool are modeled as Gaussians, the permutation null test is realized in the web app using a Monte Carlo approach: drawing from Gaussian distributions centered at zero, with 1σ corresponding to the slider values in the immobile (I) and mobile tracer (X) sections. A Pearson's correlation coefficient of 0.75 is applied to the two draws in the "paired" scenario to simulate the effect of sampling from the same physical locations. Each null draw runs through the same Bayesian inversion as the user's data and counts as a false positive when its inferred r or r·τ meets or exceeds the value from the user's data. The reported p-value is the fraction of 5,000 null draws that do.
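A simplified version of that Monte Carlo loop can be sketched as follows. Here a point inversion stands in for the full Bayesian inversion, and all numbers (compositions, observed signal, uncertainties) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical compositions (wt%), observed signal, 1-sigma widths of the
# tracer-change distributions, and the pairing correlation from the tool.
I_s, I_b, X_s, X_b = 0.3, 1.5, 0.5, 7.0
dI_obs, dX_obs = 0.012, 0.020
sI, sX, rho, n_draws = 0.004, 0.015, 0.75, 5000

def invert(dI, dX):
    """Simplified point inversion (a stand-in for the tool's full
    Bayesian inversion): r from the immobile tracer, then r*tau."""
    r = dI / (I_b - I_s)
    r_tau = r - (dX + r * X_s) / X_b
    return r, r_tau

# Correlated Gaussian nulls centered at zero.
cov = [[sI**2, rho * sI * sX],
       [rho * sI * sX, sX**2]]
null = rng.multivariate_normal([0.0, 0.0], cov, size=n_draws)

r_obs, rtau_obs = invert(dI_obs, dX_obs)
r_null, rtau_null = invert(null[:, 0], null[:, 1])

# A null draw counts as a false positive when its inferred r or r*tau
# meets or exceeds the observed values.
fp = np.mean((r_null >= r_obs) | (rtau_null >= rtau_obs))
print(f"false-positive rate ~ {fp:.4f}")
```

With a 3σ immobile signal, as in these made-up numbers, only a small fraction of the null draws clear the bar; shrink dI_obs toward sI and the false-positive rate climbs quickly.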

In the code we've uploaded to GitHub, which uses real data as an input, we use a full permutation-null sign flip for paired data or a pool-and-reshuffle approach for unpaired data using the distributions of real data. See the associated documentation for more details.

Data's importance to the posterior

Bayesian inversion for this problem asks: What is the likelihood that a certain amount of weathering occurred, given the data? To answer this question, one first needs to "guess" how much weathering occurred (a weathering "prior" in Bayesian parlance) and let the data turn that guess into a range of likelihoods (the weathering "posterior"). If the presence of data doesn't really influence the outcome, then you would say that the data aren't important to the posterior; the outcome primarily reflects the prior. In that case, the implied weathering extent is largely an artifact of the assumptions; the data are not decisive.

For example, the use of an uninformative or uniform prior for weathering extent (τ) bounded by the physical limits [0,1] is ultimately an assumption, and can generate artifacts in the posterior distribution at small sample sizes and/or large scatter in the data, in part because of the mass-balance formalism described above. Notably, results from the Bayesian inversion will be drawn towards the median of the prior.

We therefore evaluated the role of data in shaping the posterior by examining the change between the prior and posterior for weathering extent. The specific variable we use is the posterior shrinkage factor, which characterizes the width of the posterior compared to the prior.

Figure 3. Schematic showing how a posterior distribution for τ can be characterized by its shrinkage factor. Left: the uniform prior on [0, 1]. Right: three posteriors with the same mean centered at τ = 0.5 but increasingly tight variances — no shrinkage (the data added nothing), moderate shrinkage (~ 0.5, equal importance of data and prior), and high shrinkage (~ 1, the data dominate).

Mathematically, the posterior shrinkage factor s is defined as s = 1 − Var_post / Var_prior, with values ranging from zero to one. The narrower the posterior distribution, the smaller its variance, and thus the higher its shrinkage factor (see Figure 3).

Rearranged, the shrinkage factor gives the relative weight of the data versus the prior in shaping the posterior as the odds s/(1 − s). Shrinkage factors of 0.5, 0.75, and 0.8 therefore correspond to 1:1, 3:1, and 4:1 data-to-prior odds, respectively.
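The arithmetic is short enough to show directly. A uniform prior on [0, 1] has variance 1/12; the posterior standard deviations below are hypothetical examples chosen to span the shrinkage range:

```python
def shrinkage(var_post, var_prior):
    """Posterior shrinkage factor: s = 1 - Var_post / Var_prior."""
    return 1.0 - var_post / var_prior

# Uniform prior on tau over [0, 1] has variance 1/12.
var_prior = 1.0 / 12.0

# Hypothetical posterior standard deviations for tau:
for sd_post in [0.28, 0.14, 0.06]:
    s = shrinkage(sd_post**2, var_prior)
    odds = s / (1.0 - s)  # data-to-prior odds
    print(f"sd_post = {sd_post:.2f} -> s = {s:.2f}, odds = {odds:.1f}:1")
```

A posterior nearly as wide as the prior (sd near 0.29) gives negligible shrinkage, while a tight posterior pushes s above our 0.8 threshold.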

We have higher confidence in the estimates and credible intervals derived from posteriors with high shrinkage factors, and suggest a threshold of 0.8 (i.e., 4:1 data importance on the posterior) to mirror the minimum 80% statistical power required from the forward tests. While not strictly a measure of statistical significance, the data's importance to the posterior is nevertheless vital for our confidence in how a given ERW dataset is being interpreted, because it means that the results are primarily caused by the data rather than assumptions about the data in the forward model.

Significant immobile tracer additions and insignificant mobile tracer additions

Some solid-phase datasets may produce results where a significant immobile tracer addition is accompanied by an insignificant mobile cation addition. This is exactly what you would expect at the completion of weathering: the signal from the immobile tracer is left behind, but all of the mobile elements have leached away, so that distribution should look like the baseline. However, at very small rock fractions, the same behavior (an insignificant difference in mobile cations from the baseline) appears at much lower weathering extents. We show how this works in the figure below:

Figure 4. Bayesian posterior over rock fraction (r) and weathering extent (τ) when the immobile tracer change is small but significant (here: r ~ 1%), and the mobile tracer change is statistically insignificant. The hatched band marks where the mobile tracer cannot distinguish τ from 1. Drag the sliders to change the observed mobile change (ΔX) and its spread (σX); watch the posterior shift across the hatched region. The stats below report where the posterior actually sits — its median τ, its 30th-percentile τ, and the lowest dissolution fraction consistent with the data.

The hatched area in this figure marks where τ is statistically indistinguishable from 1. In this instance, the median τ is estimated close to 0.75, but the entire region between 0.5 and 1 is statistically indistinguishable. This poses a challenge for two reasons. First, what should actually get credited? Should it be the minimum possible τ that could explain the data? Second, how should we understand the progression of weathering in this dataset: when would we expect τ = 1 to actually be achieved in this system? We don't claim to know the answer, but we do know such situations commonly arise when application rates are small and sampling depths are large, so this dynamic should play an important role in MRV design for ERW deployments.

A note on spatial autocorrelation

These tests do not assess one important component of any solid-phase dataset: spatial structure. Spatial autocorrelation, the tendency for samples collected close together to have similar soil composition, can overstate the number of truly independent measurements, which is what our forward and backward models assume. We should therefore clarify that the sample-size slider represents the number of independent measurements, which may or may not equal the number of sampling locations, depending on their degree of spatial autocorrelation.