Professional Resume
Vijay Anand Pandian
Senior Data Engineer | Azure Databricks | Kafka | GCP | AWS | GenAI
London, United Kingdom
Professional Summary
Senior Data Engineer with 11+ years of experience designing and delivering modern data platforms, real-time pipelines, and analytics products across retail, media, healthcare, and security domains.
Currently at Marks and Spencer (M&S), building data and platform capabilities for the Sparks loyalty ecosystem. Strong track record in translating business problems into robust engineering solutions, with measurable outcomes in deployment speed, data quality, and operational reliability.
Value I bring
- Build and scale reliable data products from ingestion to governed consumption.
- Deliver automation that reduces cycle time, manual effort, and production defects.
- Partner effectively with product, analytics, and engineering stakeholders.
- Apply governance-first practices (PII controls, retention, access controls, auditability).
- Use GenAI pragmatically for internal engineering productivity and data workflows.
Core Skills
- Languages: Python, SQL, Scala, Bash
- Data Engineering: Azure Databricks, Spark, Kafka, Airflow/Composer, Delta Lake, dbt
- Cloud: Azure, AWS, GCP
- Storage and Warehouses: BigQuery, Snowflake, Redshift, PostgreSQL, Hive
- DevOps and Platform: Terraform, Docker, CI/CD (Azure DevOps, GitHub Actions, Jenkins)
- Governance: Data quality controls, PII masking, retention policies, access controls
- GenAI: LangChain, Azure OpenAI, RAG patterns, vector search (PoC and internal tooling)
Professional Experience
Senior Data Engineer | Marks and Spencer (M&S)
May 2025 - Present | London, United Kingdom
- Building data platform capabilities for the M&S Sparks loyalty programme on Azure.
- Architected and delivered components of the loyalty analytics platform on Azure Databricks using medallion architecture patterns.
- Built Mission Desk, an internal platform to manage mission configurations and deployment workflows; reduced deployment cycle time from a multi-day process to same-day delivery.
- Improved operational quality through stronger validation and standardized release controls, reducing production-facing configuration issues.
- Contributed to privacy-by-design implementation, including PII-aware processing and governed data access patterns.
- Delivered internal GenAI experiments (RAG and assistant workflows) focused on search and engineering productivity.
Tech: Azure Databricks, Azure Data Factory, PySpark, Python, Delta Lake, PostgreSQL, Terraform, Azure DevOps, Unity Catalog, LangChain, Azure OpenAI
Senior Data Engineer | Sky
Dec 2022 - Apr 2025 | London, United Kingdom
- Designed real-time ingestion architecture from on-prem sources to cloud analytics, reducing operational latency by 30%.
- Built a Python-based code generation framework (Jinja2) that significantly reduced onboarding time for new data acquisitions and minimized manual conversion effort.
- Developed Kafka Connect monitoring and auto-recovery tooling with alerting, improving reliability and reducing repeated task failures.
- Built engineering utilities for BigQuery monitoring and metadata extraction at scale.
- Implemented governance controls including retention standards, encryption practices, and PII compliance across pipelines.
Tech: Apache Kafka, kSQL, Kafka Connect, Python, Terraform, BigQuery, Composer (Airflow), Docker, SQL
Senior Data Engineer (Contract) | Channel 4
May 2022 - Nov 2022 | London, United Kingdom
- Built and optimized MarTech data pipelines integrating multiple ad and marketing data sources.
- Designed dimensional models to support reporting and analytics workloads.
- Reduced reporting latency and improved pipeline efficiency with incremental processing and orchestration best practices.
Tech: AWS (Redshift, Glue, Lambda, Fargate, S3), Snowflake, Spark, Scala, Python
Senior Associate Manager / Data Engineer | Eli Lilly and Company
May 2021 - Apr 2022 | Bengaluru, India
- Led cloud modernization workstreams for clinical data and integration systems.
- Delivered event-driven integration pipelines and improved platform performance and operational efficiency.
- Built Databricks-based ETL workflows for multi-source transformations and quality controls.
Tech: AWS (Lambda, Fargate, MQ, DynamoDB, S3, Step Functions), Azure Databricks, dbt, Snowflake, Python, Terraform
Software Engineer / Data Engineer | Optum (UnitedHealth Group)
Jul 2019 - Apr 2021 | Chennai, India
- Delivered full-stack and data engineering solutions for healthcare products.
- Migrated Oracle workloads to Spark-based data pipelines and data lake architecture.
- Supported recommendation-model data preparation pipelines and production analytics workflows.
Tech: Python, Django, React, PostgreSQL, Spark, Scala, AWS, TensorFlow, Bamboo, Jenkins
Data Analytics Engineer | Gartner (formerly CEB)
Sep 2015 - Jul 2019 | Chennai, India
- Delivered multiple data and analytics products across product, marketing, sales, and survey teams.
- Designed and operated daily ETL workflows moving 200TB+ of raw Google Analytics data from BigQuery to Hive, improving downstream data accessibility.
- Built Sqoop-based ingestion pipelines from Oracle to Hive for broader warehouse availability and reporting use.
- Developed the Competitors Dashboard (Data Harvester) that processed data from 100+ online sources/day, with Airflow orchestration and insight dashboards for events, locations, sponsors, and organizers.
- Contributed as a data engineer to the Next Best Action Recommendation Model, building end-to-end transformation pipelines and supporting AWS deployment with data science teams.
- Built an Email Bot analytics service using NLP (NLTK), Pandas, and MongoDB to automate sales/marketing email insights; helped improve renewal outcomes and reduced manual analysis effort.
- Developed a Data Cleansing Tool with Sentiment Analysis using supervised ML (Naive Bayes, Maxent, Decision Tree) to convert unstructured records into structured data with retraining support.
- Worked on internal R&D and engineering tools, including Resume Ranker, Android Survey App (Hackathon project), Log Analytics Dashboard, PDF-to-PPTX converter service, and a client-server web scraper.
Tech: Python, SQL, PySpark, Hive, Sqoop, GCP BigQuery, Google Cloud API, Airflow, TensorFlow, NLTK, Pandas, MongoDB, Elasticsearch, Kibana, AWS S3, Scrapy
Software Developer / SQA Engineer | Symantec
Jan 2014 - Aug 2015 | Chennai, India
- Delivered security automation and validation across Norton product engineering and managed security services.
- Automated security test suites for Norton for Mac AntiVirus and Symantec Cloud Security, including scenarios for port scanning, firewall validation, brute-force patterns, and malicious URL detection.
- Built and maintained license validation automation (including offline scenarios), improving consistency between server-side entitlements and client behavior.
- Performed telemetry ping validation by analyzing packet captures and HTTP payloads to verify correctness of client-to-server telemetry signals.
- Automated scheduled scan-policy testing (daily/weekly/monthly) and log validation for antivirus event completeness and correctness.
- Enhanced shared test framework components (logging utilities, wrappers, core helpers), reducing regression suite runtime by 40%.
- Completed 100+ license-related test scenarios under strict delivery timelines with strong quality outcomes.
- Contributed to innovation initiatives through hackathon participation, technical write-ups, and internal idea submissions.
- During the MSS internship, supported security operations reporting for breach and threat events under compliance requirements.
- Monitored and analyzed SIEM alerts (HP ArcSight, RSA Security Analytics) integrated with enterprise security controls (IDS/IPS, Cisco/Check Point firewalls, McAfee ePO, network infrastructure) to identify and escalate vulnerabilities.
- Built Python/Bash automation to simulate real-time security logs for testing and verification workflows.
- Built a Kafka + Java proof of concept to improve security signal processing and incident response visibility.
Tech: Python, Bash, Shell Scripting, Unix, Jenkins, Java, Apache Kafka, HP ArcSight, RSA Security Analytics, SIEM, Virtual Machines
Certifications
- Certified Scrum Product Owner (CSPO), Scrum Alliance (Oct 2024)
- Databricks Generative AI Fundamentals (Sep 2023)
- Preparing for Professional Data Engineer Journey, Google Cloud (Mar 2024)
- Analysis and Visualization, Google Cloud (Nov 2024)
- Google Cloud Big Data and Machine Learning Fundamentals (Apr 2023)
- Astronomer Airflow Fundamentals and DAG Authoring (Sep 2021)
Education
- M.Tech, Cyber Security - Amrita Vishwa Vidyapeetham, India
- B.Tech, Information Technology - University College of Engineering Tindivanam, India
Publications and Recognition
- Published research paper: A Novel Cloud Based NIDPS for Smartphones (Springer, 2014)
- Sky Star Award (2025) for engineering innovation
Open to Senior Data Engineer opportunities in data platform engineering, real-time analytics, and AI-enabled data products.