Country/Region:  IN
Requisition ID:  36163
Location:  INDIA - PUNE - BIRLASOFT OFFICE - HINJAWADI

Title:  Technical Lead - Data Engineering

Description: 

Area(s) of responsibility

Job Description: Azure Data Engineer (Databricks & PySpark)

Experience: 7–10 Years
Employment Type: Full-time


Role Summary

We are looking for a highly skilled Azure Data Engineer with strong hands‑on expertise in Azure Databricks, PySpark, ADF, and SQL. The candidate will be responsible for architecting, developing, and optimizing modern data platforms and analytical solutions on Azure. The role requires strong experience building enterprise-grade data pipelines, enabling ingestion from diverse sources, implementing complex transformations, and supporting scalable analytics initiatives.


Key Responsibilities

1. Data Solution Design & Architecture

  • Design end‑to‑end data engineering solutions using Azure Data Factory, Azure Databricks, PySpark, SQL, and other Azure-native services.
  • Architect and implement scalable and secure Modern Data Warehouse (MDW) and Lakehouse solutions leveraging Azure Data Lake Storage and Databricks.
  • Develop data models, integration patterns, and reusable frameworks aligned with best practices and enterprise architecture standards.
  • Participate in requirement discussions, solution blueprinting, and technical feasibility assessments.

2. Data Pipeline Development

  • Build and optimize robust, high-throughput ELT/ETL pipelines, enabling ingestion, transformation, and curation of structured, semi‑structured, and unstructured data.
  • Integrate data from multiple on‑premises and cloud‑based systems, APIs, and third-party sources.
  • Implement complex transformations using PySpark, ensuring performance efficiency and code modularity.
  • Build orchestration workflows in ADF, including pipelines, triggers, linked services, integration runtimes, and parameterized datasets.

3. Databricks & PySpark Engineering

  • Develop scalable transformation scripts using PySpark on Databricks, applying advanced optimizations such as caching, partitioning, and Delta Lake capabilities.
  • Implement Delta Lake features (ACID transactions, schema enforcement, schema evolution, and time travel) across the data lifecycle.
  • Perform performance tuning, handling bottlenecks related to cluster configuration, shuffle operations, joins, and parallelization.
  • Collaborate with platform teams to manage Databricks clusters, jobs, notebooks, and CI/CD integrations.

Mandatory Skills & Experience

  • 7–10 years of experience in data engineering, with strong hands‑on exposure to the Azure data ecosystem.
  • At least 2 years of production project experience with Azure Databricks (proofs of concept do not count).
  • At least 2 years of hands-on experience building data pipelines using Azure Data Factory (ADF).
  • At least 2 years of experience developing PySpark-based transformations in Databricks.
  • Experience with Azure Data Lake Storage (ADLS Gen2), data partitioning strategies, and file formats such as Parquet, Delta, and JSON.