The Problem with Manual Quotation at Scale

In traditional distribution operations, sales administrators spend countless hours matching incoming customer enquiries to tens of thousands of SKUs. Enquiries arrive in inconsistent formats — ranging from supplier part codes and colloquial trade names to partial descriptions and mis-spellings — while inventory names follow internal conventions that rarely align with how customers phrase requests.

With over 15,000 items in stock, a single line item in an enquiry may produce dozens of plausible candidate matches. A human operator must then reason across multiple attributes — dimensions, material grade, brand, finish, and packaging unit — to determine the correct SKU. This process is slow, error-prone, and heavily reliant on tacit domain knowledge that takes years to accumulate.

As business volume grows, the problem compounds. Delays and errors increase, headcount must grow in step with enquiry volume, and onboarding new hires creates a continuous knowledge-transfer bottleneck that no SOP document fully solves.

The system described here eliminates this bottleneck. It automates the full quotation pipeline — from raw enquiry parsing through SKU resolution, stock availability checking, and quotation draft generation — enabling distributors to scale throughput without proportional headcount growth.


System Architecture

The solution is built as a modular inference pipeline with five distinct stages:

1. Enquiry Parsing and Normalization

Incoming enquiries — whether from email, WhatsApp, or web forms — are first passed through a normalization layer. This stage performs:

  • Tokenization and entity extraction: Identifies candidate attribute tokens such as dimensions (e.g., 50mm, 1/2"), material grades (e.g., SS316, GI), product categories, and brand names using a combination of regex patterns and a fine-tuned NER (Named Entity Recognition) model.
  • Unit harmonization: Converts mixed unit systems (imperial/metric) and abbreviations into canonical internal representations before embedding.
  • Query segmentation: Splits multi-line enquiries into discrete line items, each treated as an independent matching task.
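
The segmentation and unit-harmonization steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production normalizer: it uses regex only (the fine-tuned NER model is out of scope here), and the names segment, normalize_line, MATERIAL_ALIASES, and the choice of millimetres as the canonical unit are all hypothetical.

```python
import re
from fractions import Fraction

INCH_TO_MM = 25.4

# Hypothetical alias table; a real deployment would maintain its own.
MATERIAL_ALIASES = {"ss316": "SS316", "s/s 316": "SS316", "gi": "GI", "galv": "GI"}

# Matches a number (fraction or decimal) followed by a unit, e.g. 50mm, 1/2", 2 in.
DIMENSION_RE = re.compile(
    r'(?P<value>\d+/\d+|\d+(?:\.\d+)?)\s*(?P<unit>mm\b|cm\b|inch\b|in\b|")',
    re.IGNORECASE,
)


def segment(enquiry: str) -> list[str]:
    """Split a multi-line enquiry into line items, dropping list markers."""
    items = []
    for raw in re.split(r"[\n;]+", enquiry):
        # Strip leading "1. ", "2) ", "- " or bullet markers, but not "1.5mm".
        item = re.sub(r"^\s*(?:\d+[.)]\s+|[-*•]\s*)", "", raw).strip()
        if item:
            items.append(item)
    return items


def to_mm(value: str, unit: str) -> float:
    """Convert a matched dimension to millimetres (assumed canonical unit)."""
    number = float(Fraction(value))
    unit = unit.lower()
    if unit in ('"', "in", "inch"):
        return round(number * INCH_TO_MM, 2)
    if unit == "cm":
        return round(number * 10, 2)
    return round(number, 2)


def normalize_line(item: str) -> dict:
    """Extract canonical dimensions and material grades from one line item."""
    dims = [to_mm(m["value"], m["unit"]) for m in DIMENSION_RE.finditer(item)]
    lowered = item.lower()
    materials = sorted({
        canon for alias, canon in MATERIAL_ALIASES.items()
        if re.search(rf"(?<![a-z0-9]){re.escape(alias)}(?![a-z0-9])", lowered)
    })
    return {"raw": item, "dimensions_mm": dims, "materials": materials}


if __name__ == "__main__":
    enquiry = '1. 50mm SS316 ball valve\n2) 1/2" GI elbow; galv tee 2 in'
    for line in segment(enquiry):
        print(normalize_line(line))
```

Each line item comes out as a small structured record, which is the form the later attribute-matching stage needs.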

2. Autoencoder-Based Semantic Matching

At the core of the engine is an autoencoder trained on historical quotation pairs, i.e., (enquiry description, resolved SKU) pairs extracted from years of past transactions.

The encoder maps each text description into a dense, low-dimensional latent vector that captures the semantic "meaning" of a product — independent of surface-level phrasing. Crucially, the latent space is structured such that SKUs sharing attributes (same category, similar dimensions, same material) cluster together geometrically. This allows the system to retrieve correct matches even when the enquiry wording has zero lexical overlap with the inventory name.

Architecture highlights:

  • Input representation: Concatenation of TF-IDF sparse features and character-level n-gram embeddings to handle both known vocabulary and novel abbreviations
  • Encoder: 3-layer feedforward network with ReLU activations and dropout regularization (p=0.2) to prevent overfitting on frequent SKU patterns
  • Bottleneck dimension: 128-dimensional latent space (tuned empirically against top-k retrieval accuracy on held-out validation data)
  • Training objective: Reconstruction loss (MSE) + contrastive loss term to push semantically dissimilar SKUs apart in latent space
  • Inference: Approximate Nearest Neighbour (ANN) search via FAISS over the precomputed SKU embedding index, returning ranked top-k candidates in <20ms
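
To make the training objective above concrete, here is a minimal pure-Python sketch of a reconstruction term plus a pairwise contrastive term. The source does not specify the exact contrastive formulation, so the margin-based form, the margin value, and the weight between the two terms are assumptions for illustration. A real implementation would compute this on batched tensors in a deep learning framework.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between input and decoder output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def euclidean(u, v):
    """Euclidean distance between two latent vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive(z_a, z_b, same_sku, margin=1.0):
    """Margin-based pairwise contrastive term (assumed form).

    Pairs resolving to the same SKU are pulled together; pairs resolving to
    different SKUs are pushed apart until they are at least `margin` apart.
    """
    d = euclidean(z_a, z_b)
    if same_sku:
        return d ** 2
    return max(0.0, margin - d) ** 2


def total_loss(x, x_hat, z_a, z_b, same_sku, weight=0.5, margin=1.0):
    """Reconstruction loss plus weighted contrastive term (weight is assumed)."""
    return mse(x, x_hat) + weight * contrastive(z_a, z_b, same_sku, margin)


if __name__ == "__main__":
    # Two different SKUs whose latents collapsed to the same point get penalized.
    print(total_loss([1.0, 2.0], [1.0, 4.0], [0.0, 0.0], [0.0, 0.0], same_sku=False))
```

The dissimilar-pair branch contributes nothing once two SKUs are further apart than the margin, so the term only acts on near-miss pairs that the latent space has not yet separated.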

3. Index-Bound Output — Hallucination Eliminated by Design

This is a critical architectural property that distinguishes this approach from naive LLM-based solutions: the model never generates product information — it retrieves it.

The ANN search returns integer indices into the live inventory database. Every resolved SKU, its description, price, and stock level are pulled directly from the source-of-truth record corresponding to that index. There is no generative step that could fabricate a product code, invent a description, or hallucinate a price.

This means:

  • Non-existent SKUs cannot appear in output — the output space is strictly bounded by the inventory index
  • All quoted attributes are database-sourced — dimensions, unit pricing, and availability are fetched from live records, not inferred
  • LLM involvement is constrained to prose formatting only — if an LLM is used to compose the quotation email body, it receives only verified, index-resolved structured data as context, with no freedom to generate product-level details

This index-bound retrieval architecture provides a stronger correctness guarantee than prompt engineering or output validation ever could — hallucination is architecturally impossible at the product resolution layer, not merely discouraged.
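
The index-bound property can be illustrated with a small sketch. A brute-force cosine search stands in for the FAISS index here, and SkuRecord, top_k_indices, and resolve are hypothetical names. The point is only that the search returns integer positions and every quoted field is read from the inventory record at that position.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SkuRecord:
    code: str
    description: str
    unit_price: float
    stock: int


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_indices(query_vec, sku_vecs, k=3):
    """Brute-force stand-in for ANN search: returns integer indices only."""
    ranked = sorted(
        range(len(sku_vecs)),
        key=lambda i: cosine(query_vec, sku_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def resolve(indices, inventory):
    """Every output record comes straight from the inventory list by index."""
    return [inventory[i] for i in indices]


if __name__ == "__main__":
    inventory = [
        SkuRecord("VALVE-SS316-50", "Ball valve SS316 50mm", 42.0, 12),
        SkuRecord("ELBOW-GI-15", "GI elbow 15mm", 1.8, 300),
        SkuRecord("VALVE-SS304-50", "Ball valve SS304 50mm", 35.0, 0),
    ]
    sku_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy embeddings
    hits = top_k_indices([1.0, 0.1], sku_vecs, k=2)
    for record in resolve(hits, inventory):
        print(record)
```

Because resolve can only index into the inventory list, a product code that is absent from inventory has no way of reaching the output.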


4. Hybrid Re-Ranking with Fuzzy Logic

Pure semantic similarity can fail on highly constrained industrial SKUs where a single attribute difference (e.g., M8 x 1.0 vs M8 x 1.25 thread pitch) means an entirely different, non-interchangeable product. To handle these edge cases, the system applies a hybrid re-ranking layer that combines:

  • Attribute-level exact match scoring: Extracted structured attributes (from Stage 1) are compared directly against the structured attribute fields of each candidate SKU. Mismatches on critical dimensions apply a configurable penalty weight.
  • Fuzzy string similarity: Levenshtein distance and Jaro-Winkler similarity scores are computed between the normalized enquiry token and each candidate's product code, catching common transpositions and abbreviation variations.
  • Weighted ensemble score: The final ranking is a weighted sum of the semantic similarity score (from the autoencoder), the attribute match score, and the fuzzy string score. Weights are tuned per product category, as criticality of exact attribute matching varies (e.g., fasteners vs. consumables).

This hybrid approach gives the system the recall advantages of neural embeddings with the precision of rule-based systems — avoiding the failure modes of each when used alone.
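
A rough sketch of the weighted ensemble follows, using the thread-pitch example from this section. The weights, the penalty value, the field names, and the normalization of Levenshtein distance into a 0 to 1 similarity are all assumptions. Jaro-Winkler is omitted for brevity, so only the Levenshtein component of the fuzzy score is shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def attribute_score(query_attrs, sku_attrs, critical, penalty=1.0):
    """Fraction of query attributes matched, minus a penalty per critical mismatch."""
    if not query_attrs:
        return 0.0
    matched = sum(1 for k, v in query_attrs.items() if sku_attrs.get(k) == v)
    critical_misses = sum(
        1 for k, v in query_attrs.items()
        if k in critical and k in sku_attrs and sku_attrs[k] != v
    )
    return matched / len(query_attrs) - penalty * critical_misses


def rerank(candidates, query_attrs, query_code, weights, critical):
    """Return (score, code) pairs sorted best-first by the weighted sum."""
    w_sem, w_attr, w_str = weights
    scored = []
    for c in candidates:
        score = (
            w_sem * c["semantic"]
            + w_attr * attribute_score(query_attrs, c["attrs"], critical)
            + w_str * string_similarity(query_code, c["code"])
        )
        scored.append((score, c["code"]))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    candidates = [
        {"code": "BOLT-M8-125", "semantic": 0.95, "attrs": {"thread": "M8", "pitch": 1.25}},
        {"code": "BOLT-M8-100", "semantic": 0.90, "attrs": {"thread": "M8", "pitch": 1.0}},
    ]
    ranked = rerank(
        candidates,
        query_attrs={"thread": "M8", "pitch": 1.0},
        query_code="M8-100",
        weights=(0.5, 0.4, 0.1),  # hypothetical per-category weights for fasteners
        critical={"pitch"},
    )
    print(ranked)
```

In this toy case the M8 x 1.25 candidate has the higher semantic score, but the critical pitch mismatch pushes it below the correct M8 x 1.0 SKU after re-ranking.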

5. Quotation Draft Generation

Once the top-k candidates have been re-ranked, the system queries the ERP/inventory system via API for real-time stock levels and pricing, then assembles a structured quotation draft. Matches that clear the confidence threshold are included automatically; low-confidence matches are flagged for human review rather than silently passed through. This human-in-the-loop design is intentional: it preserves operator oversight on ambiguous cases while automating the high-confidence majority.
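
The routing logic of this stage might look like the sketch below. The ERP call is replaced by a plain dictionary lookup, and the 0.8 threshold, the field names, and build_draft itself are hypothetical. The actual threshold and API shape are not given in this description.

```python
def build_draft(matches, erp, threshold=0.8):
    """Assemble quotation lines; flag anything below the confidence threshold.

    `erp` is a dict standing in for a live ERP/inventory API response.
    """
    draft = []
    for m in matches:
        record = erp[m["sku"]]  # price and stock come from the record, never inferred
        draft.append({
            "sku": m["sku"],
            "qty": m["qty"],
            "unit_price": record["unit_price"],
            "in_stock": record["stock"] >= m["qty"],
            "needs_review": m["confidence"] < threshold,
        })
    return draft


if __name__ == "__main__":
    erp_snapshot = {
        "BOLT-M8-100": {"unit_price": 0.12, "stock": 5000},
        "ELBOW-GI-15": {"unit_price": 1.80, "stock": 3},
    }
    matches = [
        {"sku": "BOLT-M8-100", "qty": 200, "confidence": 0.93},
        {"sku": "ELBOW-GI-15", "qty": 10, "confidence": 0.55},
    ]
    for line in build_draft(matches, erp_snapshot):
        print(line)
```

Flagged lines stay in the draft with needs_review set, so the operator sees the system's best guess alongside the request for confirmation.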


Why Training Data Quality Is the Real Moat

The phrase "Garbage In, Garbage Out" is repeated often in ML circles, yet consistently underestimated in project planning. Training data quality is not just a concern — it is the primary determinant of whether the system is operationally viable or a costly demo.

Historical quotation records from distributors are typically messy: inconsistent column schemas across years, free-text description fields with no normalization, resolved SKUs sometimes recorded as descriptions rather than codes, and duplicate or contradictory entries from manual corrections. Raw data cannot be fed directly into training.

The data engineering pipeline addresses this through:

  • Deduplication and conflict resolution: Identifying and resolving contradictory (enquiry, SKU) pairs using majority-vote aggregation across transaction history
  • Schema normalization: Mapping legacy column formats across multi-year export files into a unified canonical schema
  • Negative sampling strategy: Generating hard negative examples (near-miss SKUs that differ on a single critical attribute) to force the model to learn fine-grained discrimination rather than coarse category matching
  • Stratified train/validation split: Ensuring rare SKU categories are represented in validation to prevent inflated accuracy metrics driven by high-frequency items
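
Two of the steps above, majority-vote conflict resolution and hard negative generation, are simple enough to sketch directly. The function names, the lowercase/strip normalization of enquiry text, and the attribute dictionaries are assumptions for illustration. Ties in the vote fall to whichever SKU was seen first, which a real pipeline would likely escalate for manual review instead.

```python
from collections import Counter, defaultdict


def majority_vote(pairs):
    """Resolve contradictory (enquiry, SKU) pairs by most frequent SKU."""
    votes = defaultdict(Counter)
    for enquiry, sku in pairs:
        votes[enquiry.strip().lower()][sku] += 1
    return {enq: counts.most_common(1)[0][0] for enq, counts in votes.items()}


def hard_negatives(target, catalogue, critical):
    """SKUs that differ from `target` on exactly one attribute, and it is critical."""
    target_attrs = catalogue[target]
    negatives = []
    for code, attrs in catalogue.items():
        if code == target:
            continue
        differing = [k for k in target_attrs if attrs.get(k) != target_attrs[k]]
        if len(differing) == 1 and differing[0] in critical:
            negatives.append(code)
    return sorted(negatives)


if __name__ == "__main__":
    history = [("M8 bolt", "BOLT-M8-125"), ("m8 bolt ", "BOLT-M8-125"), ("M8 bolt", "BOLT-M8-100")]
    print(majority_vote(history))

    catalogue = {
        "BOLT-M8-100": {"thread": "M8", "pitch": 1.0, "material": "SS316"},
        "BOLT-M8-125": {"thread": "M8", "pitch": 1.25, "material": "SS316"},
        "BOLT-M10-150": {"thread": "M10", "pitch": 1.5, "material": "SS316"},
    }
    print(hard_negatives("BOLT-M8-100", catalogue, critical={"pitch"}))
```

The near-miss SKUs returned by hard_negatives are the pairs the contrastive term needs most, since they look alike on the surface yet are not interchangeable.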

This data pipeline is the unglamorous foundation that determines everything else. A well-architected neural model trained on poor data will underperform a simpler model trained on clean, well-curated data. The boring things done right, not the jargon-heavy architecture diagrams, are the true differentiator in applied ML.


Business Impact

AI has reached operational maturity for mass adoption in distribution workflows and is actively replacing repetitive cognitive labour. The competitive gap between early adopters and laggards is widening, and it is measured in quarters rather than years.

Distributors adopting this system can expect:

  • Up to 66% reduction in labour cost for the quotation function, as high-confidence matches are handled fully automatically
  • At least 20% increase in revenue through faster turnaround times (minutes vs. hours) and improved quotation consistency that reduces lost deals from delayed responses
  • Scalable operations without linear headcount growth — the system handles 10x enquiry volume with no infrastructure changes beyond compute scaling
  • Staff time refocused on high-value tasks — reviewing flagged edge cases, managing customer relationships, and closing deals rather than searching inventory
  • Zero hallucinated inventory — every item in every quotation is a verified, database-backed record, giving customers and internal teams full confidence in output accuracy

The continuous retraining loop means the model improves over time as operators confirm or correct matches — creating a compounding accuracy advantage that grows with usage.