Methodology

How we build, validate, and monitor our automated valuation model. Full model documentation available on request.

1. What We Do

The Gadsden Valuations AVM produces estimates of residential property market value for properties in England and Wales. Each valuation includes a point estimate, a valuation range, a per-property confidence tier, and the comparable evidence that supports the estimate.

The model estimates the price at which a property would transact between a willing buyer and a willing seller, consistent with the definition of Market Value in IVS 102 (Bases of Value), which the RICS Red Book adopts.

Intended use

We operate as a specialist service organisation under IVS 105.20. Our outputs are designed for review by a RICS Registered Valuer, who applies professional judgement before issuing a Red Book-compliant valuation. The AVM is not a standalone replacement for a physical valuation where one is required by regulation or lender policy.

Use cases include portfolio revaluation for Basel 3.1 compliance, remortgage lending where physical inspection is not proportionate, audit and cross-check against physical valuations, and desktop review support for Registered Valuers.

2. Valuation Approach

The model implements a market comparison approach — the same fundamental method used by human valuers when they identify comparable transactions and adjust for differences. This is the primary RICS-accepted method for residential property valuation.

Rather than manually selecting and adjusting a small number of comparables, the model learns implicit adjustment factors from millions of observed transactions. The relationship between property characteristics and transaction prices is learned statistically, then applied to properties where the sale price is unknown.

Algorithm

The model uses LightGBM (Light Gradient Boosting Machine), a gradient-boosted decision tree ensemble. Gradient boosting is widely used in both the academic property valuation literature and commercial AVM implementations. We chose LightGBM over deep learning alternatives because published benchmarks generally show gradient-boosted trees matching or exceeding deep learning on structured tabular data, while offering substantially greater transparency.

LightGBM handles missing values natively, without imputation, which is critical given that data completeness varies significantly across properties. Its predictions can also be explained with SHAP (SHapley Additive exPlanations) values, which decompose every valuation into per-feature contributions. This supports the IVS 105.30(d) transparency requirement and the EAA ESSVM requirement that outputs must not be produced “through the sole deployment of black-box techniques.”

How comparable reasoning is encoded

The model’s learned representation captures the same logic a surveyor applies:

Location

Latitude, longitude, and derived spatial features capture micro-market effects. Properties in the same street with similar characteristics receive similar valuations.

Temporal adjustment

The ONS House Price Index and local price momentum features adjust for market movement between comparable transaction dates and the valuation date, analogous to a surveyor’s time adjustment.
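As an illustration of a surveyor-style time adjustment, the sketch below scales a historic sale price by the ratio of two index levels. It is a minimal example under stated assumptions: the function name and the index values are hypothetical, and the production model consumes HPI as an input feature rather than necessarily applying this exact formula.

```python
def hpi_adjust(sale_price: float, hpi_at_sale: float, hpi_at_valuation: float) -> float:
    """Scale a historic sale price by the ratio of index levels (illustrative only)."""
    if hpi_at_sale <= 0:
        raise ValueError("index level at the sale date must be positive")
    return sale_price * hpi_at_valuation / hpi_at_sale


# Hypothetical index levels: the local index rose from 100.0 to 110.0
# between the comparable's sale date and the valuation date.
adjusted = hpi_adjust(200_000, hpi_at_sale=100.0, hpi_at_valuation=110.0)
print(round(adjusted))  # 220000
```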

Property characteristics

Floor area, bedroom count, property type, construction age, and condition proxies function as the adjustment factors a surveyor applies when comparing properties of different specifications.

Spatial comparable pricing

The median price per square foot of recent same-type sales in the immediate area provides a direct hyperlocal market signal.
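The feature described above can be sketched as follows. This is an illustrative example only: the `Sale` record, the 500-metre radius, and the assumption that the input list has already been restricted to recent sales are all hypothetical choices, not the production definition.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from statistics import median
from typing import List, Optional


@dataclass
class Sale:
    lat: float
    lon: float
    property_type: str
    price: float
    floor_area_sqft: float


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def local_median_ppsf(lat: float, lon: float, property_type: str,
                      recent_sales: List[Sale], radius_m: float = 500.0) -> Optional[float]:
    """Median price per square foot of same-type sales within radius_m (assumed radius)."""
    rates = [
        s.price / s.floor_area_sqft
        for s in recent_sales
        if s.property_type == property_type
        and s.floor_area_sqft > 0
        and haversine_m(lat, lon, s.lat, s.lon) <= radius_m
    ]
    return median(rates) if rates else None


# Toy data: three nearby terraced sales, one flat, one distant terraced sale.
sales = [
    Sale(51.5010, -0.1000, "terraced", 300_000, 1_000),
    Sale(51.5020, -0.1000, "terraced", 440_000, 1_100),
    Sale(51.5030, -0.1000, "terraced", 450_000, 900),
    Sale(51.5010, -0.1000, "flat", 200_000, 500),
    Sale(51.6000, -0.1000, "terraced", 900_000, 1_000),
]
ppsf = local_median_ppsf(51.5000, -0.1000, "terraced", sales)
print(ppsf)  # 400.0
```

Returning `None` when no qualifying sales exist keeps the gap visible to downstream logic instead of substituting a default.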

3. Data Sources

Principles

All data sources are either UK government open data published under the Open Government Licence (OGL v3.0) or publicly available datasets with appropriate licensing. No proprietary mortgage data, no surveyor panel data, and no commercially restricted datasets are required.

This is a deliberate architectural choice. It ensures no intellectual property encumbrance on the model’s operation, full reproducibility, compliance with Red Book PS 1.4, and independence from any single commercial data supplier.

Primary sources

Source | Publisher | Records / coverage | Role
Price Paid Data | HM Land Registry | ~31M | Ground truth: actual sale prices for training and validation
Energy Performance Certificates | DLUHC / MHCLG | ~29M | Floor area, construction age, rooms, energy efficiency, wall/roof type
House Price Index | ONS | National series | Temporal adjustment: normalising prices across time periods
Census 2021 | ONS | Output area level | Neighbourhood demographics, deprivation, income
Schools data | DfE / Ofsted | ~52K | School proximity and quality features
Flood risk | Environment Agency | National coverage | Flood zone classification
Crime data | Home Office / Police.uk | LSOA-level | Local crime rate features
Heritage listings | Historic England | ~380K | Listed building status
Transport | ORR / NaPTAN | ~2,500 stations | Station proximity and journey times

Derived features

In addition to direct inputs, the model uses computed features including HPI-adjusted previous sale prices (where a property has transaction history), spatial neighbour pricing (median price per square foot of recent nearby sales), school proximity metrics, and local transaction volume indicators.

Data quality

Land Registry Price Paid Data is the ground truth. We exclude transfers at nil consideration, right-to-buy sales, auction premiums, and transactions below £10,000 or above £10,000,000. EPC data is matched to properties via UPRN (Unique Property Reference Number) with a 99.3% match rate.
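The price bounds and exclusion flags above can be expressed as a simple eligibility filter. This is a sketch, not the production pipeline: the dictionary keys (`price`, `nil_consideration`, `right_to_buy`) are hypothetical field names, and the auction exclusion is omitted because the text does not say how it is identified.

```python
MIN_PRICE = 10_000        # transactions below this are excluded
MAX_PRICE = 10_000_000    # transactions above this are excluded


def is_training_eligible(tx: dict) -> bool:
    """Return True if a transaction passes the stated exclusion rules."""
    if tx["price"] < MIN_PRICE or tx["price"] > MAX_PRICE:
        return False
    # Hypothetical boolean flags; absent keys are treated as False.
    if tx.get("nil_consideration") or tx.get("right_to_buy"):
        return False
    return True


transactions = [
    {"price": 250_000},
    {"price": 5_000},
    {"price": 12_000_000},
    {"price": 150_000, "right_to_buy": True},
]
eligible = [tx for tx in transactions if is_training_eligible(tx)]
print(len(eligible))  # 1
```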

4. How We Handle Missing Data

LightGBM handles missing values natively by learning optimal split directions for absent features during training. We do not impute missing values, for two reasons. First, missingness is itself informative: the absence of an EPC, for example, may correlate with property age or type. Second, imputation can introduce bias when data are not missing at random.

This approach directly supports our confidence model. Properties with more complete data receive higher confidence scores; properties with significant gaps are flagged accordingly.
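To make the idea of a learned split direction concrete, the sketch below evaluates one candidate split on a single feature and sends the missing rows to whichever side yields the lower squared error. It is a simplified illustration of the principle, not LightGBM's actual implementation; the function names and the toy data are assumptions.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_missing_direction(rows, threshold):
    """rows: list of (feature_value_or_None, target). Returns 'left' or 'right'."""
    left = [y for x, y in rows if x is not None and x <= threshold]
    right = [y for x, y in rows if x is not None and x > threshold]
    missing = [y for x, y in rows if x is None]
    loss_if_left = sse(left + missing) + sse(right)
    loss_if_right = sse(left) + sse(right + missing)
    return "left" if loss_if_left <= loss_if_right else "right"


# Toy data: (floor area in m2 or None, price in thousands of pounds).
# The rows with unknown floor area have prices that resemble larger homes.
rows = [(50, 150), (60, 160), (120, 300), (130, 320), (None, 310), (None, 305)]
direction = best_missing_direction(rows, threshold=90)
print(direction)  # right
```

In this toy case the rows with no floor area are routed with the larger properties because that grouping fits their prices better, which is the sense in which missingness carries information.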

5. Validation

Walk-forward backtesting

We use temporal splitting, not random sampling. The model is trained on transactions up to a cutoff date and tested on transactions after that date. This prevents data leakage and mirrors real-world conditions: the model never sees future market data during training.

The current test set comprises approximately 295,000 transactions — a statistically robust sample across all regions and property types.
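A temporal split of this kind can be sketched as below. The example assumes each transaction is a dictionary with a `sale_date` key and that several cutoff dates are evaluated in turn; both are illustrative assumptions, and the actual cutoff dates are not stated here.

```python
from datetime import date


def walk_forward(transactions, cutoffs):
    """Yield (cutoff, train, test): train on sales up to each cutoff, test on later ones."""
    ordered = sorted(cutoffs)
    for i, cutoff in enumerate(ordered):
        # Each test window runs until the next cutoff (or indefinitely for the last fold).
        window_end = ordered[i + 1] if i + 1 < len(ordered) else date.max
        train = [t for t in transactions if t["sale_date"] <= cutoff]
        test = [t for t in transactions if cutoff < t["sale_date"] <= window_end]
        yield cutoff, train, test


# Toy transactions and hypothetical cutoff dates.
sales = [
    {"sale_date": date(2023, 1, 15), "price": 210_000},
    {"sale_date": date(2023, 6, 1), "price": 325_000},
    {"sale_date": date(2023, 9, 30), "price": 180_000},
    {"sale_date": date(2024, 1, 10), "price": 450_000},
    {"sale_date": date(2024, 3, 5), "price": 275_000},
]
folds = list(walk_forward(sales, [date(2023, 6, 30), date(2023, 12, 31)]))
for cutoff, train, test in folds:
    print(cutoff, len(train), len(test))
```

Because every training row predates every test row within a fold, no future price information can leak into training.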

What we measure

Our headline metric is PE10, the proportion of valuations within ±10% of the actual Land Registry sale price. This is the industry-standard accuracy metric for AVMs. We also track PE15, PE20, MdAPE (median absolute percentage error), mean signed error (directional bias), and Forecast Standard Deviation (FSD) on the EAA 0–7 scale.
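For readers unfamiliar with these metrics, the following sketch shows how PE10, PE15, PE20, MdAPE, and mean signed error can be computed from paired estimates and sale prices. The function names and the four toy valuations are illustrative assumptions, not published results; FSD is omitted.

```python
from statistics import fmean, median


def pct_errors(estimates, actuals):
    """Signed percentage error of each estimate relative to the actual sale price."""
    return [(e - a) / a for e, a in zip(estimates, actuals)]


def pe(errors, band):
    """Share of valuations whose absolute percentage error is within the band."""
    return sum(abs(x) <= band for x in errors) / len(errors)


def mdape(errors):
    """Median absolute percentage error."""
    return median(abs(x) for x in errors)


def mean_signed_error(errors):
    """Average signed error; positive means over-valuation on average."""
    return fmean(errors)


# Toy data only: signed errors are +5%, -12%, 0%, and +18%.
actuals = [200_000, 300_000, 250_000, 400_000]
estimates = [210_000, 264_000, 250_000, 472_000]
errors = pct_errors(estimates, actuals)

pe10, pe15, pe20 = pe(errors, 0.10), pe(errors, 0.15), pe(errors, 0.20)
print(pe10, pe15, pe20)  # 0.5 0.75 1.0
```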

What we measure against

All accuracy metrics are measured against actual Land Registry completion prices — the amount paid in arm’s-length transactions. This is a harder benchmark than surveyor opinion (which is itself an estimate with ±5–10% variability) or asking prices (which reflect seller aspiration). We believe testing against actual sale prices produces more honest accuracy figures.

Live metrics are published on the accuracy page.

6. Confidence Model

Why confidence matters

A valuation supported by 20 recent comparable sales within 500 metres, for a standard property type with known floor area, is fundamentally more reliable than one for a unique property in a thin market with no comparable evidence. Our confidence model quantifies this difference, as required by IVS 105.30(a) and the EAA ESSVM.

How it works

Each valuation is assigned a confidence tier based on data availability and comparability:

Tier 1 (High)

Dense comparable evidence, known floor area and bedrooms, common property type, active local market. The valuation range is narrow. Suitable for valuer review in lending decisions.

Tier 2 (Medium)

Reasonable comparable evidence with some data gaps. Moderate valuation range. Desktop review recommended.

Tier 3 (Low)

Sparse comparables, missing key data, or unusual property. Wide valuation range. Physical inspection recommended.

Declined

Insufficient data for a meaningful valuation. The model does not produce an estimate.
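The tier logic can be pictured as a rule cascade like the one below. This is a hypothetical sketch: every threshold (20 comparables, 5 comparables) and every input name is an assumption made for illustration, and the actual tiering criteria are set out only in the full model documentation.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "Tier 1 (High)"
    MEDIUM = "Tier 2 (Medium)"
    LOW = "Tier 3 (Low)"
    DECLINED = "Declined"


def assign_tier(n_comparables: int, has_floor_area: bool,
                has_bedrooms: bool, common_type: bool) -> Tier:
    """Assign a confidence tier. All thresholds are illustrative assumptions."""
    if n_comparables == 0 and not has_floor_area:
        return Tier.DECLINED          # nothing to anchor an estimate on
    if n_comparables >= 20 and has_floor_area and has_bedrooms and common_type:
        return Tier.HIGH              # dense evidence, complete key data
    if n_comparables >= 5 and common_type:
        return Tier.MEDIUM            # reasonable evidence, some gaps
    return Tier.LOW                   # sparse evidence or unusual property


print(assign_tier(25, True, True, True).value)   # Tier 1 (High)
print(assign_tier(25, False, True, True).value)  # Tier 2 (Medium)
```

In this sketch, a property with dense comparables but no floor area lands in Tier 2, and supplying the floor area moves it to Tier 1.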

Confidence validation

Our confidence model has been validated against the backtest dataset. Higher-confidence valuations are empirically more accurate than lower-confidence ones, with strong rank correlation between predicted confidence and observed accuracy. The confidence tiers effectively separate properties into groups with meaningfully different accuracy profiles.
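One simple way to check that tiers separate accuracy is to compute PE10 per tier on backtest records and confirm it does not increase from Tier 1 to Tier 3. The sketch below does this on toy records; the record layout and the numbers are assumptions for illustration and are not backtest results.

```python
from collections import defaultdict


def pe10_by_tier(records):
    """records: iterable of (tier, estimate, actual). Returns {tier: PE10}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for tier, estimate, actual in records:
        totals[tier] += 1
        if abs(estimate - actual) / actual <= 0.10:
            hits[tier] += 1
    return {tier: hits[tier] / totals[tier] for tier in totals}


def tiers_are_ordered(pe10, order=(1, 2, 3)):
    """True if PE10 is non-increasing from the highest to the lowest confidence tier."""
    values = [pe10[t] for t in order if t in pe10]
    return all(a >= b for a, b in zip(values, values[1:]))


# Toy records: (tier, estimate, actual sale price).
records = [
    (1, 204_000, 200_000), (1, 310_000, 300_000),
    (2, 230_000, 200_000), (2, 255_000, 250_000),
    (3, 150_000, 200_000), (3, 420_000, 300_000),
]
by_tier = pe10_by_tier(records)
print(by_tier)                     # {1: 1.0, 2: 0.5, 3: 0.0}
print(tiers_are_ordered(by_tier))  # True
```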

Upgrading confidence

Confidence is not fixed. Where a property is classified as Tier 2 due to missing data, supplying an EPC (which provides floor area, rooms, construction age, and energy data) can upgrade the valuation to Tier 1. An EPC costs £60–120, substantially less than a physical valuation.

7. Known Limitations

We take the view that documenting limitations honestly builds more trust than ignoring them. These are real constraints on any AVM, including ours.

What the model cannot see

Internal condition

A recently refurbished property and one requiring complete renovation may have identical data profiles but materially different values. This is the primary limitation of any AVM and the principal reason IVS 105 requires professional judgement alongside model outputs.

Bespoke improvements

Extensions, loft conversions, and landscaping that have not triggered a new EPC or planning record are invisible to the model.

Legal encumbrances

Restrictive covenants, rights of way, party wall issues, and leasehold complications that go beyond Land Registry records are not visible to the model.

Micro-location effects

Street-level factors such as a particular view, aspect, or neighbour, which a local agent would factor into their advice, are not fully captured by the data.

Property types not valued

The model does not value commercial property, mixed-use property, agricultural land, new-build developments prior to first sale, non-standard dwellings, or properties above £10,000,000. It covers England and Wales only.

Market conditions where accuracy may deteriorate

In rapidly moving markets, the ONS HPI temporal adjustment may lag by 1–3 months. In thin markets (fewer than 50 transactions per year for the relevant property type), comparable evidence is sparse. The confidence model flags both scenarios.

Data lags

Land Registry data lags 2–4 months from completion to publication. In a fast-moving market, the most recent comparables visible to the model may be 3–4 months old. The HPI adjustment partially compensates, but extreme short-term movements may not be fully captured.

Accuracy is not uniform

The model performs best for standard property types in areas with high transaction volumes and worst for unusual properties in thin markets. Our confidence model makes this variation explicit and actionable rather than hiding it behind a single headline number.

8. Explainability

Every valuation is explainable at three levels, satisfying IVS 105.30(d) and the EAA requirement for outputs produced “in a replicable, explainable, traceable manner.”

Global feature importance

SHAP analysis reveals which features most influence valuations across the entire property stock: floor area, location, property type, previous sale history, and neighbourhood characteristics dominate.

Per-valuation decomposition

For every individual valuation, SHAP values show why the model valued this specific property at this specific price. A Registered Valuer can see that, for example, floor area contributed −£8,000 (below average), while location contributed +£12,000 (desirable suburb), and challenge any element they disagree with.
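The arithmetic behind such a decomposition is additive: a baseline value plus the per-feature contributions reproduces the estimate. The sketch below shows that sum using the two contributions quoted above; the baseline of £250,000 and the `property_type` contribution are hypothetical, and it assumes contributions are expressed directly in pounds (a model trained on log prices would decompose in log units instead).

```python
def reconstruct_estimate(base_value: float, contributions: dict) -> float:
    """Baseline plus the sum of per-feature contributions."""
    return base_value + sum(contributions.values())


# floor_area and location use the figures quoted in the text;
# the baseline and property_type values are hypothetical.
contributions = {"floor_area": -8_000, "location": 12_000, "property_type": 3_500}
estimate = reconstruct_estimate(250_000, contributions)
print(estimate)  # 257500

# Rank features by absolute impact so a reviewer sees the largest drivers first.
ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
for feature, value in ranked:
    print(f"{feature}: {value:+,}")
```

A reviewer who disagrees with one contribution can see how much of the estimate depends on it by removing that term from the sum.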

Comparable evidence

Each valuation is accompanied by the most similar recent sales in the area, showing address, sale date, sale price, key characteristics, and distance. This gives the Registered Valuer the same evidence they would seek independently.

9. Ongoing Monitoring

Quarterly retraining

The model is retrained quarterly to incorporate approximately 250,000 new Land Registry transactions, updated EPCs, and refreshed ONS HPI data.

Drift detection

We monitor:

  • Accuracy drift — quarter-on-quarter comparison of all headline metrics
  • Feature drift — changes in input data distributions
  • Bias drift — the mean signed error should remain near zero
  • Confidence calibration drift — the relationship between predicted confidence and actual accuracy

A sustained PE10 decline of more than 2 percentage points triggers investigation. Material market events may trigger ad hoc retraining outside the quarterly cycle.
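The investigation trigger can be sketched as a small check over quarterly PE10 readings. The text does not define "sustained", so the example assumes it means two consecutive quarters; that window, the function name, and the toy readings are all assumptions.

```python
def needs_investigation(baseline_pe10: float, recent_pe10: list,
                        threshold_pp: float = 2.0, sustained_quarters: int = 2) -> bool:
    """True if PE10 sits more than threshold_pp points below baseline
    for each of the last sustained_quarters quarters (assumed definition)."""
    if len(recent_pe10) < sustained_quarters:
        return False
    window = recent_pe10[-sustained_quarters:]
    return all(baseline_pe10 - value > threshold_pp for value in window)


# Toy readings in percentage points, not published metrics.
print(needs_investigation(80.0, [79.5, 77.5, 77.0]))  # True
print(needs_investigation(80.0, [79.0, 77.0, 79.0]))  # False
```

Requiring the shortfall in every quarter of the window avoids raising an alert on a single noisy quarter.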

Version control

Every model version is retained with its training script, data specification, and full metric report. Any valuation can be reproduced using the model version that was live on the valuation date, satisfying IVS 106.20 documentation requirements.
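Reproducing a past valuation starts with identifying which model version was live on the valuation date. A minimal lookup, assuming a date-sorted release log with hypothetical version identifiers, might look like this:

```python
from bisect import bisect_right
from datetime import date


def version_live_on(releases, valuation_date):
    """releases: list of (go_live_date, version_id) sorted by go_live_date."""
    go_live_dates = [d for d, _ in releases]
    index = bisect_right(go_live_dates, valuation_date)
    if index == 0:
        raise LookupError("no model version was live on that date")
    return releases[index - 1][1]


# Hypothetical release log reflecting a quarterly retraining cycle.
releases = [
    (date(2025, 1, 6), "v2025Q1"),
    (date(2025, 4, 7), "v2025Q2"),
    (date(2025, 7, 7), "v2025Q3"),
]
print(version_live_on(releases, date(2025, 5, 20)))  # v2025Q2
```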

10. Standards Alignment

This model and its documentation are prepared with reference to:

IVS 105 (Valuation Models)

Model characteristics, selection, use, and documentation requirements. The full model documentation maps every IVS 105 requirement to a specific section.

RICS Red Book Global Standards (January 2025)

PS 1 (compliance for written valuations incorporating AVM output), VPS 5 (valuation models), VPS 6 (reporting).

PRA SS1/23 (Model Risk Management)

Documentation expectations for third-party vendor models used by regulated firms. Our documentation is structured to support lender vendor assessment processes.

EAA ESSVM 3rd Edition (2022)

European AVM Alliance standards for model description, methodology disclosure, accuracy reporting, and confidence scoring. We build to ESSVM standards with the intention of membership application once eligibility criteria are met.

11. Full Documentation

This page summarises our approach. The complete model documentation — including detailed feature descriptions, training parameters, segmented accuracy analysis, bias reporting, extreme scenario testing, and governance procedures — is available on request.

If you are a Registered Valuer, lender risk team, or regulator evaluating our model, please contact us to request the full documentation package (document reference GV-AVM-DOC-001).

© 2026 Gadsden Valuations. Not a RICS-regulated firm. Valuations are automated statistical estimates, not formal Red Book valuations.