Written by: Kacper Osiewalski, Lead Backend Engineer, Digital Colliers

The LinkedIn post said the hardest problem in high-transaction systems isn't throughput, it's agreement. This is the longer version of that argument. If you run a backend at a licensed operator, you already know the shape. Three teams look at the same customer, pull the same numbers, and land on three different answers. Nobody is wrong on their own terms. That's the problem.

What a semantic layer actually is

Strip the data-warehouse marketing off the term and a semantic layer is just this: a single, versioned place where business terms get defined in code. Not in a wiki. Not in a Confluence page nobody reads. In code, with tests, sitting between your raw event stream and everything that reads from it.

So when finance says "net deposits", risk says "net deposits", and the compliance team says "net deposits", all three are hitting the same function. Same filters. Same treatment of reversed transactions, bonus money, chargebacks, and cross-brand wallet transfers. If the definition changes, it changes in one place and every downstream consumer picks it up on the next run.

That's it. There's no magic. The magic is the discipline of not letting anyone bypass it.
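
As a minimal sketch of the "same function" idea, the snippet below puts one definition of net deposits in code. The `Txn` shape, the transaction kinds, and the specific treatment of reversals, bonus money, chargebacks, and wallet transfers are assumptions made for illustration, not a recommended policy. The point is only that the rules live in one place.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction row; field names are illustrative."""
    customer_id: str
    kind: str             # "deposit", "withdrawal", "bonus", "chargeback", "wallet_transfer"
    amount: Decimal       # always positive; direction comes from `kind`
    is_reversed: bool = False


def net_deposits(txns):
    """Single shared definition of net deposits.

    Assumed rules (illustrative only):
      - reversed transactions are ignored
      - withdrawals and chargebacks reduce the total
      - bonus money and cross-brand wallet transfers are excluded
    """
    total = Decimal("0")
    for t in txns:
        if t.is_reversed:
            continue
        if t.kind == "deposit":
            total += t.amount
        elif t.kind in ("withdrawal", "chargeback"):
            total -= t.amount
        # "bonus" and "wallet_transfer" deliberately fall through untouched
    return total
```

Finance, risk, and compliance would all import this one function rather than re-implementing the filters in their own SQL.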

The three-team disagreement pattern

Here's the pattern I keep seeing at operators between roughly £50M and £500M in gross gambling yield (GGY). Finance owns a warehouse. Risk owns a real-time feature store. Compliance owns a set of SQL queries that get run against production replicas whenever the regulator asks.

Each team defines the same customer metrics slightly differently:

Finance treats a bonus-funded stake as revenue when the bonus is wagered through. Compliance treats it as exposure the moment it's granted.
Risk counts a deposit at authorisation. Finance counts it at settlement. Compliance wants both, timestamped separately.
All three handle self-excluded accounts, dormant accounts, and multi-brand wallets differently.
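
To make the drift concrete, here is a small sketch of the authorisation-versus-settlement difference described above. The `Deposit` shape, the dates, and the amounts are invented for illustration. Both functions are reasonable on their own terms, yet they give different answers for the same customer on the same day.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Deposit:
    """Hypothetical deposit record with both timestamps kept separately."""
    amount: Decimal
    authorised_on: date
    settled_on: Optional[date]   # None while settlement is still pending


def deposits_at_authorisation(deposits, as_of):
    """Risk-style view: a deposit counts once it is authorised."""
    return sum((d.amount for d in deposits if d.authorised_on <= as_of), Decimal("0"))


def deposits_at_settlement(deposits, as_of):
    """Finance-style view: a deposit counts only once it has settled."""
    return sum(
        (d.amount for d in deposits if d.settled_on is not None and d.settled_on <= as_of),
        Decimal("0"),
    )


# Made-up data: the second deposit is authorised in March but settles in April.
deposits = [
    Deposit(Decimal("100"), date(2025, 3, 1), date(2025, 3, 2)),
    Deposit(Decimal("80"), date(2025, 3, 30), date(2025, 4, 1)),
]
month_end = date(2025, 3, 31)
risk_view = deposits_at_authorisation(deposits, month_end)
finance_view = deposits_at_settlement(deposits, month_end)
```

With this data the two views differ by the £80 deposit that was in flight at month end, which is exactly the kind of gap that surfaces only when someone asks for a number as of a specific date.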

None of this matters until a regulator asks a specific question. Then it matters a lot. The UK affordability threshold kicks in at £150 of net deposits in a rolling 30-day window, and if your three teams can't agree on what "net deposits" means, you cannot prove you flagged the right customers at the right time. Around 1 in 4 UK-licensed operators fails to achieve a satisfactory AML rating on first assessment, and in the reviews I've seen up close, definition drift is usually somewhere in the root cause.
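
A rough sketch of the rolling-window check follows. The £150 figure and the 30-day window come from the text above; everything else is an assumption: the event shape (date, signed amount), the inclusive window boundaries, and the choice of "at or above" rather than "strictly above" as the trigger. Those boundary choices are precisely the details a shared definition has to pin down.

```python
from datetime import date, timedelta
from decimal import Decimal

THRESHOLD = Decimal("150")   # from the text: £150 net deposits
WINDOW_DAYS = 30             # from the text: rolling 30 days


def rolling_net_deposits(events, as_of):
    """Sum signed amounts over the 30 days ending on `as_of` (inclusive).

    `events` is a list of (day, signed_amount): deposits positive,
    withdrawals negative. Inclusive boundaries are an assumption.
    """
    start = as_of - timedelta(days=WINDOW_DAYS - 1)
    return sum((amt for day, amt in events if start <= day <= as_of), Decimal("0"))


def first_breach_date(events):
    """First day on which the rolling total is at or above the threshold.

    Every calendar day is checked, not just event days, because an old
    withdrawal dropping out of the window can push the total up on a day
    with no new activity.
    """
    if not events:
        return None
    days = [d for d, _ in events]
    day, last = min(days), max(days) + timedelta(days=WINDOW_DAYS)
    while day <= last:
        if rolling_net_deposits(events, day) >= THRESHOLD:
            return day
        day += timedelta(days=1)
    return None
```

Being able to recompute the first breach date from raw events is what lets you show that a customer was flagged at the right time, not merely that they were flagged.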

What a good semantic-layer contract looks like

A usable contract has five properties. None of them are exotic. All of them get skipped under delivery pressure.

Every metric has one owner and one definition file. Not one team. One human name in the CODEOWNERS file.
Every definition ships with example inputs and expected outputs. Golden records. Regression tests run on every merge.
Every definition is versioned. When you change how "lifetime deposits" treats reversed transactions, the old version keeps running for reports already in flight.
Every consumer is registered. You know which dashboards, which risk models, and which compliance queries read from each metric. Change impact is a query, not a Slack thread.
The metrics relevant to remote customer interaction (RCI) are tagged as such. UK Gambling Commission RCI guidance has been in force since 31 August 2022 and expanded in 2024, and the regulator increasingly wants to see the logic behind your interventions, not just the outcomes.

The contract is boring. That's the point. Boring is what survives a section 116 review.
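
The five properties can be sketched as a small registry. This is one possible shape under stated assumptions, not a reference design: `MetricContract`, `register`, `impacted_consumers`, the owner and consumer names, and the golden records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MetricContract:
    """Hypothetical contract record for one version of one metric."""
    name: str
    version: int
    owner: str                   # one named human, not a team
    compute: Callable            # the definition itself
    golden: tuple                # ((inputs, expected_output), ...)
    consumers: frozenset         # registered dashboards, models, queries
    tags: frozenset = frozenset()

    def check_golden(self):
        """True when every golden record reproduces its expected output."""
        return all(self.compute(list(inp)) == exp for inp, exp in self.golden)


REGISTRY = {}


def register(contract):
    """Add a version; refuse duplicates and definitions that fail their golden records."""
    key = (contract.name, contract.version)
    if key in REGISTRY:
        raise ValueError(f"{key} is already registered; publish a new version instead")
    if not contract.check_golden():
        raise ValueError(f"{key} fails its golden records")
    REGISTRY[key] = contract


def impacted_consumers(name):
    """Change impact as a query: every consumer of any version of `name`."""
    hit = set()
    for (metric, _version), contract in REGISTRY.items():
        if metric == name:
            hit |= contract.consumers
    return sorted(hit)


# Illustrative rows are (amount, is_reversed). v1 counts reversed rows, v2 does not.
rows = ((100, False), (50, True))
register(MetricContract("lifetime_deposits", 1, "a.owner",
                        lambda r: sum(a for a, _ in r),
                        ((rows, 150),), frozenset({"finance_dashboard"})))
register(MetricContract("lifetime_deposits", 2, "a.owner",
                        lambda r: sum(a for a, rev in r if not rev),
                        ((rows, 100),), frozenset({"risk_model", "aml_query"}),
                        frozenset({"RCI"})))
```

Version 1 stays registered and runnable after version 2 lands, so reports already in flight keep their old behaviour, and `impacted_consumers("lifetime_deposits")` answers the change-impact question without a Slack thread.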

How this fixes audits, and the left-behind risk

When the compliance team can point at a versioned metric definition, show the test suite that proves it behaved correctly on the day in question, and produce the list of customers it flagged, an audit becomes a query instead of a project. Kindred publicly reported a £14M compliance-team cost in 2023. A material chunk of that kind of spend, at any operator, is people manually reconciling numbers that a semantic layer would reconcile once.
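
"An audit becomes a query" can be sketched as follows: look up which version of a rule was in effect on the day in question, then replay it against that day's data. The version history, the effective dates, the v1 threshold, and the customer figures below are all invented for illustration.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical version history: (effective_from, label, flagging rule on net deposits).
# Must be kept sorted by effective_from. The thresholds are illustrative only.
VERSIONS = [
    (date(2024, 1, 1), "v1", lambda net: net > 200),
    (date(2024, 9, 1), "v2", lambda net: net >= 150),
]


def version_in_effect(as_of):
    """Return the (effective_from, label, rule) entry that applied on `as_of`."""
    starts = [v[0] for v in VERSIONS]
    i = bisect_right(starts, as_of) - 1
    if i < 0:
        raise LookupError(f"no definition was in effect on {as_of}")
    return VERSIONS[i]


def replay_flags(snapshot, as_of):
    """Re-run the rule that applied on `as_of` against that day's figures.

    `snapshot` maps customer id to net deposits as recorded for that day.
    """
    _, label, rule = version_in_effect(as_of)
    return label, sorted(c for c, net in snapshot.items() if rule(net))


snapshot = {"c1": 160, "c2": 250, "c3": 90}
before_change = replay_flags(snapshot, date(2024, 6, 30))
after_change = replay_flags(snapshot, date(2024, 10, 1))
```

The same snapshot yields different flagged lists depending on the date asked about, and each answer carries the version label that produced it, which is the evidence an auditor wants in writing.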

The left-behind risk is real and it's near-term. UK penalties for the most serious AML breaches reach up to 15% of gross gambling yield. GDPR fines still sit at €20M or 4% of global turnover. Operators shipping into 2026 with three teams and three definitions are one bad quarter from a finding they cannot defend in writing.

The teams that get out ahead of this aren't buying a product. They're picking a small number of metrics, usually the ones tied to affordability, AML, and RCI, and putting them under contract first. Then they expand. It's unglamorous work. It's also the difference between an audit that takes two weeks and one that takes two quarters.

