Essay · dashboard-vincolo-regole-codificate

The dashboard is not the product. It's the rule that stops it from lying.

For the class of problems with subtle domain rules and very few end users, the cost of building the tool that encodes those rules in-house has collapsed. The real asset isn't the dashboard, it's the library of codified rules written as tested functions.

Andrea Iorio
Executive · AI builder
Tuscany / IT
MARGIN · MONTHLY READING NAIVE · DRIFT RULE-ENFORCED Q1 Q2 Q3
FIG. 01 — il margine, senza e con il vincolo

Anyone running a business unit that sells through a third-party channel, an international marketplace, a distributor, a wholesaler, knows this scene. Every month a settlement file arrives: orders, returns, refunds, commissions, payment fees, contributions, duties, adjustments. Dozens of columns, a few thousand rows. From that file you reconstruct the channel's real margin: what came in, what went out, what is left. From those numbers you decide the next buying campaign, the next push on a category, the next negotiation with the partner.

The problem isn't volume. The problem is that this file is someone else's accounting summary, written according to rules that serve that someone else, and your P&L reads it differently. Three recurring asymmetries trip up every generic tool.

First: the same cost is written on two separate rows. Marketplaces issue, for every order, an "order" level row (shipping, contributions, duties) and an "item" level row (commissions, payment fees, discounts, refunds). For costs that can appear at both levels, the credit card fee is the typical case, the same amount is written on both rows. A naive sum counts it twice. On a channel doing hundreds of thousands of euros in volume a month, the double count distorts the real margin by a few percentage points every month. No generic BI knows this specific trap, because it's a channel convention, not a property of the CSV.

Second: the manufacturer name isn't in a dedicated field. It has to be inferred from the description text, usually a split on the hyphen, with all the edge cases (multi-word names, hyphens inside the model, malformed descriptions, suppliers using proprietary schemes). If brand classification gets it wrong, every "by brand" aggregation that buying decisions rest on is wrong.

Third: record classification (sale, return, partial refund, adjustment) and the join against your own internal price list require field combinations that vary by season. The same product may have been sold across two collections with different codings. The internal price list and the marketplace file use different identifiers and neither guarantees retroactive consistency.

Every generic tool solves 80% of the read pipeline. It shows charts, aggregates, filters. What it cannot do is the remaining 20%: apply these three families of subtle rules. For years that 20% was handled by hand by a person reopening the file in Excel and patching the totals with formulas only she knew, or it was simply ignored, and a few important decisions were made on numbers that were systematically off by some percentage. A percentage that, across the year, becomes a line on the balance sheet.

This article is about how that calculation has changed, and why anyone running a P&L today has an option they didn't have until recently. The thesis: for the class of problems characterised by subtle domain rules and very few end users, the cost of building the tool that encodes those rules in-house has collapsed, enough to make it rational, in many specific cases, to stop bending to the generic tool and start building your own. Not as a tech whim: as an allocation decision. The real asset you build isn't the dashboard. It's the library of codified domain rules, written as tested functions, which becomes a portable, vendor-independent piece of IP.

For the thesis to hold, the mechanism has to hold. It is worth looking at what actually happens underneath.

The need, translated into three operational guarantees

Put yourself in the position of someone running the channel's P&L. There are three needs, and each translates into a property the system must own, not as a soft objective, but as an enforced constraint.

Trust in the number. The monthly margin reported to the board has to be right to the cent, or at least not systematically wrong. A random error of a few euros is folklore. A systematic error that deflates or inflates the margin by a few percentage points every month steers long-term decisions. Translated into a constraint: the calculation of cost items that can appear at both levels (credit card fee, effective commission, promo code cost, price differences, return contributions, cross-border duties) has to be written once, in a pure function, and there has to be an automated test that states in explicit language: if the same fee appears at order level with value X and at item level with the same value X, the total is X, not 2X. That test has to sit on the list of non-negotiable tests, run in CI before every release. It's one line of code. It's worth, in many businesses, tens of thousands of euros of correctly seen margin every year.

Repeatability. A report done by hand by a smart person isn't a system, it's a service. If the person leaves, the report breaks; if she gets sick, the board waits; if this month she applies an "improved" formula the historical series becomes incomparable. Translated into a constraint: the process has to be idempotent by construction. The same file re-imported two, three, five times has to produce zero new rows. The implementation doesn't require a staging table or a diff system: it requires a composite UNIQUE constraint on the orders table, in the typical case UNIQUE(order_id, sku, transaction_type, entry_type, transaction_date, posting_date, version), combined with INSERT OR IGNORE and the whole file handled as a single SQL transaction. On unhandled exception, ROLLBACK and the file stays in the drop directory waiting for retry. On success, commit and move the file to archive with a timestamp prefix. The property that comes out, "I can always redo the process, without fear", is the kind of guarantee that those who have never had it don't understand they want. Anyone who has lost it once, after an afternoon spent cleaning up a duplicated database, demands it forever.

Control of the data. Settlement files contain PII (customer names, addresses, tracking codes, VAT numbers), confidential prices from your own list, per-product margins. Translated into a constraint: PII fields are never copied into the artefacts the dashboard consumes. They stay confined to the local database. The distinction isn't a policy written in a compliance PDF: it's a function, at export time, that explicitly lists the fields to exclude (Ship-to Name, Ship-to Address, Ship-to Phone, Ship-to ZIP, Tax Registration Number, Tracking Code, AirWayBill No, Invoice No) and filters them at source. PII minimisation by default: what isn't needed, doesn't leave. The difference between a policy and a function is the same as the difference between a good intention and a guarantee.

These three needs, trust, repeatability, control, aren't obvious to satisfy together. The options the market has offered for years satisfy them only partially, each in its own way.

Yesterday's three options, and what each one gave up

Buying a cloud BI. It satisfies repeatability (the process is the platform's process) and gives you excellent charts out of the box. It gives up on point one: the vendor doesn't know the double-level rule of your specific channel, and to teach it you need custom configuration, semantic layer, calculated fields, derived measures, which costs weeks of consulting and which stays an asset of the vendor, not yours. It also gives up on point three: data has to leave the company to be useful, history settles into someone else's platform, and any change of supplier three years later requires a migration controlled by whoever owns the destination database. For those paying a licence every month, the math is "I give up 20% of accuracy to get 100% of a nice view". For many businesses, that's rational. For many others, it's what you do while waiting for a real solution.

Commissioning custom software from a software house. In theory it satisfies everything. In practice it introduces two costs that often aren't budgeted. The first is the cost of translation: the people who know the domain rules don't write the code, the people who write the code don't know the rules, and what comes out after three rounds of "not quite this, try again" is a compromise between the two misunderstandings. The second is permanent dependency: the day a change to the effective commission rule would take half a day, it goes through a quote request, a sprint, an invoice. The core property of the software you want is "modifiable from the domain", and this option generally doesn't guarantee it.

Excel. It survives not because it's good, but because it's under control. It satisfies point three (the file lives on disk) and ignores the other two on purpose. Excel doesn't offer an enforced UNIQUE constraint: copying a wrong formula one cell down can warp a quarter without anyone noticing. It doesn't offer a test that turns red when someone modifies a formula incorrectly. It doesn't offer separation between source of truth (the structured raw data) and visualisation (the pivot you look at). It doesn't offer idempotency: re-opening a file twice is trivial, re-applying a transformation twice knowing it won't duplicate anything is expensive, and Excel doesn't help. Above all, it doesn't offer the practice of writing the rule in explicit language before applying it: the double-level rule, in Excel, is a nested SUMIFS that whoever reads it a year later can't interpret; in an engineered system it's a SQL CASE WHEN entry_type='Line' THEN ... ELSE 0 clause called by a compute_monthly_kpis function with an associated test named, in human language, test_cc_fee_no_double_count.

For years these three options covered the space. Those running P&Ls with their own rules knew they were living with a compromise. It was naturalised enough to no longer be discussed.

The fourth option: why it holds technically, and what it costs

The option that exists today, and that wasn't practical until recently, is the fourth: the business unit lead has a tool built, with the help of a mature AI, that belongs to him, that encodes the rules of his own domain in explicit, testable form. Single-user or small team. Deliberately minimal stack: Python with pandas to parse the CSV, SQLite from the stdlib as the database (zero external dependencies, file on disk), static HTML with Chart.js served from disk for the dashboard. No Docker, no container orchestration, no runtime CDN, no monthly subscription to anyone.

This stack holds technically for a precise reason: the real volumes of the problem. A few thousand order rows a month, a few tens of thousands of price list rows. Over a five-year horizon you're looking at a few hundred thousand total order rows, which in denormalised form, a single data/orders.json with everything in it, weighs a few dozen megabytes. The browser loads it once when the page opens and filters in JavaScript in memory. A full re-render of filters on fifty thousand rows stays under half a second; year-scale aggregations stay under a hundred milliseconds. It's an architecture that at a million rows would be wrong and that at a hundred thousand is unbeatable for total cost, perceived latency, and debuggability, the data opens in any text editor and reads.

The compression trick that makes it light deserves a moment. The export JSON isn't an array of {order_id: "X", sku: "Y", ...} objects repeated for every row. It's an object with two keys: schema (the list of column names, once) and rows (an array of arrays of values). Size reduction around forty per cent compared to the object-per-row format, and the dashboard reads the schema once on startup, builds a column-name → array-position index, and from then on indexes each row with row[IDX.brand_norm]. It's a choice that takes a few lines of code and shifts the asymptote of the problem by a factor of two, the difference between instant loading and perceptible loading.

The system runs in one-shot batch, no scheduler, no daemon. Drop a CSV into an input directory, run a script, move the processed file to archive with a timestamp, open the dashboard in the browser. A portable bundle that unpacks on a corporate Windows machine in about ten minutes, served behind VPN. No Docker, no cloud, no monthly subscription to anyone.

The concrete solutions, seen as answers to needs

It's worth looking, paragraph by paragraph, at how each business need translates into a technical constraint, and what follows for the person deciding.

Don't sum twice a cost the file repeats at two levels. The subtle rule: item-level costs (effective commission, credit card fee, promo code cost, price differences, free returns contribution, cross-border duties) are summed only from entry_type='Line'; order-level costs (shipping subsidies, shipping order, partner contributions on shipping, origin and destination digital services tax) are summed only from entry_type='Header'; special items (special payment, adjustment, no-stock voucher) follow the Line row if it exists, otherwise the Header. Implemented as CASE WHEN entry_type='Line' directly in SQL, not as a Python filter post-fetch, so the constraint lives in the calculation engine and not in an application layer that might not be called. Associated test: test_cc_fee_no_double_count, declared explicitly in the mandatory regression suite. And so: the margin reported to the board never deflates from double-count, and any future system change that breaks this property fails in CI before reaching production. The difference between the system with this constraint and the one without is measured in percentage points of wrong margin every month.

Reconstruct the brand when the file doesn't provide it cleanly. The fallback chain is explicit and ordered: (1) split description.split(" - ") with normalisation, (2) lookup against the price list on (brand_norm, sku_norm, season) with the brand taken from the matched key, (3) manual override curated in brand_overrides.csv on designer_id, (4) final fallback brand='UNKNOWN' with WARN in the log. Every row carries, in a brand_source field, which level of the chain resolved it: description, listino_lookup, override, unknown. Consequence: the data quality question anyone running the P&L should ask themselves every quarter, what percentage of my orders is going through the weakest fallback?, becomes a one-line SQL query. Without the trace, the question can't even be asked. With the trace, the check costs thirty seconds and enables a concrete conversation with the supplier when the description percentage drops below threshold.

Match orders against the internal price list to measure the real margin. Two-pass match. Step 1 (preferred): supplier product identifier (product_id) against item_id in the price list for the same season, exact, ignoring brand noise and normalised SKU. Step 2 (fallback): tuple (brand_norm, modello_variante_norm, season) with four increasing tolerances, strict (exact match), season_fb (adjacent season, e.g. AW25 found in SS25 if AW25 is missing), relaxed (ignore season), relaxed_fb (combine the two), plus special rules for suppliers with non-standard SKUs. Every row carries how it was matched in prezzo_match_source. Result: the day a margin looks strange, the question "is it a calculation error or an overly generous fallback?" has an answer in thirty seconds via a filter on provenance, not in an afternoon of investigation. And when you update the current season's price list, incremental caching via mtime reloads only the modified file (not the entire seventy-thousand-row list), and a backfill function automatically re-evaluates existing orders with the new price. The operational property that comes out: updating the price list is a one-minute action, not a migration.

Handle errors without bringing everything down. Explicit three-level taxonomy, with different effects on file and on DB. FATAL (CSV header mismatch, empty file, encoding not utf-8-sig, unrecoverable exception during pd.read_csv): rollback of the whole transaction, the file stays in fatture/ for retry, structured log in errori_import_YYYYMMDD.log. ROW_REJECT (unparseable decimal, invalid dates, mandatory fields NULL): the single row is discarded and tracked in errori_righe.csv, the file proceeds. WARN (unknown transaction type, unresolvable brand, season outside the price archive): the row is imported with an explicit flag and tracked in the type log (warn_*.csv), the file proceeds. And so: whoever operates the pipeline always has a deterministic summary after every import (X rows inserted, Y skips, Z warns), and errors go into three separate queues that can be handled independently. No files in limbo, no database in inconsistent state, no "I don't know what happened".

Handle supplier corrections without losing history. Marketplaces periodically correct orders issued months earlier, and re-send them with the same order_id and an incremented version field. The UNIQUE constraint includes version, and the upsert_row function DELETEs the previous version before INSERTing the new one only if the new one is actually greater. Re-importing v1 after v2 has already been ingested is a no-op. It follows that: the historical series stays consistent over time even in the face of asynchronous supplier corrections, and anyone looking at a retroactive margin always sees it as the most recent version, not as the first submission. This is a property you notice only when it isn't there: it usually shows up as a duplicate that inflates revenue three months later, and that nobody can explain anymore. An integration test in the regression suite (test_versioning.py) guarantees that v1→v2 substitution works identically at every release.

Aggregate cost items that don't come from the supplier file. There are periodic costs (Performance Program, Main Incentive, Packaging Incentive) the channel file doesn't contain and that live in a separate Excel sheet curated by hand. The system loads them into a monthly_extras table with key (year, month, cost_type). Source-of-truth is the file: absent in DB → INSERT, different → UPDATE (the xlsx wins), identical → no-op. Rows in DB no longer present in the file stay, so historical corrections aren't lost. Sign convention: positive = cost incurred, negative = credit. And so: the "unstructured" cost items every company has, the ones that in many generic BIs end up in a free-text field or, worse, outside the calculation, enter the P&L waterfall in a controlled way, and a correction to the Excel sheet propagates to the margin at the next import without technical intervention.

Visualisation and PDF. The dashboard is multi-page in Vanilla JavaScript, no framework, no bundler. Every page exposes the same control contract, and an export page concatenates all sections into a single A4 layout on top of a print stylesheet. The user presses Ctrl+P, picks "Save as PDF", and gets a board-ready monthly report with header and page counter. Generation engine: the browser itself. Additional server-side dependencies: zero. And so: the monthly PDF doesn't pass through any server pipeline, doesn't need Playwright or Puppeteer, has no extra infrastructure cost. Single-user tool served by an architecture coherent with its own scale.

What follows, for the person deciding

The recurring move in every paragraph above is the same. Each business need is translated into a small, local, explicit constraint: a one-line test, a composite SQL constraint, a PII filter function, a fallback chain with tracked provenance. Each constraint is written once and protects forever, not because someone remembers to respect it, but because the system enforces it. That's exactly the difference between a practice and a system: the practice depends on the discipline of whoever applies it today, the system depends on the constraint written yesterday.

From this move flow four consequences the C-level should look in the face.

The library of codified rules becomes a capitalisable asset. The value that accumulates over time isn't the dashboard, it's the list of rules, the double-level rule, the brand extraction chain, the price match tolerances, the cost items at both levels, the PII filter at source, the UPSERT versioning. Every rule written is a piece of the company's operating model in machine-readable form. Three years out it's worth more than the tool that contains it, and it's easily portable to any later technology, because it lives as pure functions with associated tests, not as proprietary vendor configuration. For a CFO or COO it's a transferable asset, not an operating cost.

Vendor lock-in disappears for this class of tool. History, three, five years of margin by brand, season, gender, geography, lives in a local SQLite file you copy, version, archive. You don't negotiate with anyone to get it back. It's a concrete operational risk item that leaves the register. Anyone who has tried to extract history from a cloud BI when changing vendors knows how heavy that item is.

Who can start an internal project changes. Until yesterday, to get a tool dedicated to your own P&L you had to open an RFP, secure budget, choose a vendor, wait four or six months. It was a long enough path to discourage most of the projects that, in the abstract, would have been useful, and that stayed undone. Today the business unit lead can prototype the tool in 40-120 hours spread over two-three months, find out if it holds on real data, and only if needed hand it to a development team to industrialise it. For anyone running an organisation, that's good news: most of the internal tools that today never get built, and whose absence is endured in silence, become cheap to try.

Asymmetry of the bet. A week of prototyping with possible abandonment costs little. Years of subscription to a SaaS that doesn't answer to your rules cost a lot, in licence, in time lost reconciling numbers, in wrong decisions made on "close enough" data. When the ratio between what you risk and what you can gain is this skewed, the rational default changes.

What does NOT change, and which objections hold

Intellectual honesty: this option doesn't replace everything. For systems that serve many users concurrently, that integrate many external suppliers, that have enterprise security requirements, that need to scale to millions of records, professional software built by professional teams remains indispensable. The point isn't "anyone builds anything". It's that there is today a subset of problems, the ones with one or a few users and business-specific rules, that has left the perimeter of necessary consulting and entered the perimeter of reasonable in-house production.

And let's say it clearly: AI alone isn't enough. Without the client's engineering discipline, the practice of writing the rule before the code, testing it in isolation, locking architectural decisions with their rationale, keeping an open technical-debt register, what comes out is a jumble of plausible features that breaks at the first change. In the system described above the discipline is explicit: a .planning/ directory with sixteen requirements tracked with stable identifiers, thirty-five architectural decisions locked with rationale, a regression suite of five mandatory tests (UPSERT versioning, CC fee double-count, unknown transaction type, re-import idempotency, FATAL header mismatch), a debt tracker with open and closed items, an architectural reference of six hundred lines that documents SQLite schema, import pipeline, Header/Line rule, fallback chain, price match, KPI engine, frontend, deploy. Discipline doesn't come free from the model. It comes from the client demanding it.

Three objections deserve a serious answer.

Who maintains this tool if the person who built it is no longer there? Honest answer: the risk isn't zero, it's manageable on one condition, that documentation is part of the deliverable from the start. Architectural reference, decision register with rationale, debt tracker. With this base, a third-party developer picks up the system in days, not weeks. Without, the risk is high. The difference between "makes sense" and "doesn't make sense" sits here.

Excel with macros does the same thing. No, for the reasons seen above: no enforced UNIQUE constraint, no test that turns red on a wrong formula, no separation between source of truth and visualisation, no practice of writing the rule in explicit language before applying it. The difference isn't cosmetic.

AI makes mistakes, code has bugs. True. The mitigation is the same one you would apply to any fallible code writer: TDD on the pure functions (realistic one hundred per cent branch coverage, because they are pure functions), non-negotiable regression suite in CI, manual E2E against a set of reference totals computed by hand (for example tests/e2e/test_full.py comparing the KPIs computed on four thousand real rows against a manual_totals.json to the cent). It isn't a guarantee of zero errors. It's a standard control arsenal applied to the collaborator you have. It works if you actually apply it, fails if you write code without tests and hope.

What changes, from tomorrow, for the person deciding

For anyone running a business unit with their own sensitive data, and domain rules the generic SaaS doesn't see: the default of the "buy or build?" choice deserves to be reconsidered for the specific category of low-user internal tools. Not for everything, not always. But a week of experimentation has a low enough cost to pay for it to find out which side of the calculation you're on. Concretely: take the settlement file of your most important channel, identify the three subtle rules that today are applied by hand (or ignored), and try to encode them as tested functions. If in two weeks there's a pipeline running with a green test, the math is done. If there isn't, you've learned something about the specific problem, and the cost was still lower than a quarter of subscription to the wrong BI.

For vertical and generic SaaS vendors: margin erodes where the customer simultaneously has an AI available and a domain expert willing to apply discipline. It doesn't erode everywhere, and not on everything. It erodes exactly in the cases where the difference between "close enough" and "exact" is worth money visible in the accounts. The sensible response isn't to deny the phenomenon; it's to ask which part of the product still has value, certified connectors, compliance, enterprise support, scale, complex integrations, and to invest there. The vendor's product that stays competitive is the one whose value proposition isn't "I show you your numbers" (the user can now do that himself) but "I guarantee the integration, the compliance, the SLA, and the scale management you wouldn't want to handle on your own".

And the competitive asset to build over the years isn't the dashboard. It's the library of codified rules. Every rule you write in code is a piece of IP that accumulates: the double-level rule of the settlement file, the brand extraction chain, the price match tolerances, the PII exclusions at source. Four lines of code and a test, each. Summed, they become the company's operating model written so a machine can apply it every month without forgetting anything. In five years that library will be worth more than the tool that contains it. The dashboard is how you look. The rule is what you know to be yours.

The central point, in one sentence: the value isn't in the chart, it's in the constraint that stops the chart from ever lying. Anyone with this intuition who puts it into practice today builds an asset that a few years ago would have been unthinkable to build without a dedicated development team. It's a small, local, unflashy change. It's exactly the kind of change that, summed over time, shifts the balance of entire markets.

Il valore non è nel grafico. È nel vincolo che impedisce per sempre al grafico di mentire. POV
◆ ◆ ◆