Case study

AI-Powered Supply Chain Marketplace

Building a scalable, autonomous data engine to index 32,000+ suppliers for a Series Seed startup


Client

Diagon
San Francisco, USA

Project Duration

8+ months

Team

6 people

Client Challenge

The client, Diagon, founded by ex-Tesla and Rivian veterans, set out to revolutionize the industrial machinery market by allowing manufacturers to source equipment "in minutes, not months".

While they had successfully validated their concept with a prototype, they faced a critical execution gap in scaling. The goal was to index and structure product data from over 32,000 disparate supplier websites—a task requiring massive scale and "human-like" reasoning to interpret unstructured technical specifications.

The existing infrastructure faced resource limits when handling heavy AI extraction workloads. To meet investor expectations, Diagon needed to transition from a lightweight MVP to a robust, enterprise-grade cloud architecture capable of autonomous decision-making to keep operational costs viable.

Service Process

We structured the engagement as a progressive partnership, adopting a "delivery-first" model to build trust before scaling the team.

  • Phase 1 (Foundation): We started with a DevOps Engineer and Frontend Developer to stabilize AWS infrastructure and streamline the UI.
  • Phase 2 (Core Intelligence): We deployed a Data Engineer and AI Engineer to design the complex scraping "Brain" using LangGraph.
  • Phase 3 (Structure & Scale): We introduced an Engineering Manager, Full-Stack Developer, and UX/UI Designer to manage growing complexity.

Collaboration Model

We implemented a Dual-Stream Agile Workflow:

  1. Core Stream: Managed by Diagon's Head of Engineering, focusing on infrastructure and the web app.
  2. Intelligence Stream: Managed by Diagon's Technical PM, focusing on data quality and AI prompts.

Application UI Designs

Project Results

The partnership successfully evolved Diagon’s platform into a robust, investor-ready data engine capable of handling massive data volume with high precision.

  • Intelligent AI Pipeline (The "Brain"): We moved beyond simple scraping to a cognitive agentic workflow. An AI Agent intelligently scans sitemaps to identify "Product Pages," reducing processing volume by ~80% and significantly cutting token costs.
  • Adaptive Extraction & 95% Recall: The system uses a tiered approach (cost-effective parsers first, auto-healing with tools like Firecrawler second). The pipeline achieves a 95% success rate in identifying and extracting valid product pages.
  • Infrastructure & Performance: We migrated heavy extraction workers to AWS ECS, eliminating timeouts and allowing for massive parallel processing. Terraform was implemented to ensure identical Staging and Production environments.
  • Human-in-the-Loop Validation: We decoupled AI logic using LangSmith, allowing domain experts to adjust prompts ("Prompt-Ops") without needing developer deployment.
  • Analytics & BI: We deployed a Dockerized Metabase instance to give leadership a real-time view of data coverage and supplier indexing progress.

Deliverables

  • Smart Filtering Agent: Reduces processing volume by ~80% by identifying valid product pages
  • Adaptive Extraction System: Auto-heals using advanced tools like Firecrawler only when necessary
  • "Prompt-Ops" Workflow: Decouples AI logic via LangSmith for non-technical expert tuning
  • Golden Dataset Testing: Automatic validation against verified datasets to ensure quality
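The Golden Dataset Testing deliverable can be sketched as a field-by-field comparison of pipeline output against a verified reference set. The URLs, field names, and the threshold below are hypothetical examples (the 0.95 merely echoes the 95% figure above), not Diagon's actual data.

```python
# Hypothetical golden dataset: verified records keyed by product-page URL.
GOLDEN: dict[str, dict] = {
    "https://supplier.example/products/cnc-1": {
        "name": "CNC Mill X200", "power_kw": 15,
    },
}

def score_extraction(extracted: dict[str, dict]) -> float:
    """Fraction of golden-set fields the pipeline reproduced exactly."""
    total = correct = 0
    for url, expected in GOLDEN.items():
        got = extracted.get(url, {})
        for field, value in expected.items():
            total += 1
            correct += got.get(field) == value  # bool counts as 0/1
    return correct / total if total else 0.0

def assert_quality(extracted: dict[str, dict], threshold: float = 0.95) -> None:
    """Fail the pipeline run when quality drops below the threshold."""
    score = score_extraction(extracted)
    if score < threshold:
        raise AssertionError(f"golden-set score {score:.2f} below {threshold}")
```

Run as part of CI or after each large crawl, a check like this catches silent regressions in prompts or parsers before bad data reaches the marketplace.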

Benefits

  • Operational Efficiency: Reduced operational costs by ~80% via intelligent filtering
  • Stability: Eliminated timeouts and enabled massive parallel processing via AWS ECS
  • Real-time Visibility: Business Intelligence dashboard tracks data acquisition and coverage
  • Agility: Empowered non-technical experts to tune data quality without coding

Want to Learn More? Need a Project Quote?

Reach Out Today!
We're always ready to help

Blazej Kosmowski
CTO

Marek Petrykowski
CEO
  • Get a reply within 24 hours
  • Discuss your needs with our expert
  • Receive your custom proposal in days


IT Staff Augmentation
Kafka Consulting
Software Development Team