Mayank Chaudhari
Microsoft / WHO / Just Analytics

Global Pandemic Data Pipeline

Architecting a high-scale scraping and data normalization engine for WHO and Microsoft during the COVID-19 crisis.

Role: Lead Automation Architect
Impact: Automated aggregation of global COVID-19 data sources; 100% client satisfaction.

Tech Stack

Azure Cloud
Node.js
Puppeteer/Selenium
Cognitive Services
RPA

The Data Fragmentation Crisis

At the onset of COVID-19, critical data (case counts, regulations, logistics) was scattered across thousands of government websites, published in different formats (PDFs, dashboards) and in different languages. Organizations like Microsoft and WHO needed a unified "source of truth", and they needed it fast.

I architected a Cognitive Scraping Pipeline on Azure to normalize this chaos into structured datasets.
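To make "normalizing into structured datasets" concrete, here is a minimal TypeScript sketch of the idea. It is not the production code. The schema (`CaseRecord`), the raw shape (`RawRecord`), the tiny country lookup, and the accepted date and number formats are all assumptions chosen for illustration, and the sample values are made up.

```typescript
// Hypothetical unified schema; field names are illustrative assumptions.
interface CaseRecord {
  countryCode: string; // ISO 3166-1 alpha-2
  date: string;        // ISO 8601 calendar date, YYYY-MM-DD
  confirmed: number;
  sourceUrl: string;
}

// Raw shape as a scraper might emit it: every field is an untrusted string.
interface RawRecord {
  country: string;
  date: string;
  confirmed: string;
  sourceUrl: string;
}

// Tiny illustrative lookup; a real pipeline would need a complete table.
const COUNTRY_CODES: Record<string, string> = {
  india: "IN",
  italy: "IT",
  italia: "IT",
  germany: "DE",
  deutschland: "DE",
};

// Parse an integer count, assuming commas, dots, and spaces are only
// thousands separators (case counts have no fractional part).
function parseCount(text: string): number | null {
  const digits = text.replace(/[\s.,]/g, "");
  return /^\d+$/.test(digits) ? Number(digits) : null;
}

// Accept YYYY-MM-DD, or day-first DD/MM/YYYY and DD.MM.YYYY (an assumption;
// month-first sources would need per-source configuration).
function parseDate(text: string): string | null {
  const value = text.trim();
  const iso = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (iso) return `${iso[1]}-${iso[2]}-${iso[3]}`;
  const dmy = /^(\d{1,2})[./](\d{1,2})[./](\d{4})$/.exec(value);
  if (dmy) {
    const day = ("0" + dmy[1]).slice(-2);
    const month = ("0" + dmy[2]).slice(-2);
    return `${dmy[3]}-${month}-${day}`;
  }
  return null;
}

// Map one raw record onto the unified schema; return null when any field
// cannot be interpreted, so bad rows are rejected rather than guessed.
function normalize(raw: RawRecord): CaseRecord | null {
  const countryCode = COUNTRY_CODES[raw.country.trim().toLowerCase()];
  const date = parseDate(raw.date);
  const confirmed = parseCount(raw.confirmed);
  if (countryCode === undefined || date === null || confirmed === null) {
    return null;
  }
  return { countryCode, date, confirmed, sourceUrl: raw.sourceUrl };
}

// Usage with made-up sample values (not real figures).
const sample = normalize({
  country: " Italia ",
  date: "05/03/2020",
  confirmed: "3.858",
  sourceUrl: "https://example.org/it",
});
console.log(sample);
```

Returning `null` for any unparseable field is a deliberate choice in this sketch: for public-health figures, dropping a row and flagging it for review is safer than emitting a guessed value.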

Architecture: The Intelligent Crawler

1. The Cloud Orchestrator

2. Cognitive Extraction

3. Impact