Title: Data Engineering with Databricks - Technical Lead, Data Engineering
Area(s) of responsibility
Role Summary
We are looking for a highly skilled Azure Data Engineer with strong hands-on expertise in Azure Databricks, PySpark, Azure Data Factory (ADF), and SQL. The candidate will be responsible for architecting, developing, and optimizing modern data platforms and analytical solutions on Azure. The role requires strong experience building enterprise-grade data pipelines, enabling ingestion from diverse sources, implementing complex transformations, and supporting scalable analytics initiatives.
Key Responsibilities
1. Data Solution Design & Architecture
Design end-to-end data engineering solutions using Azure Data Factory, Azure Databricks, PySpark, SQL, and other Azure-native services.
Architect and implement scalable and secure Modern Data Warehouse (MDW) and Lakehouse solutions leveraging Azure Data Lake Storage and Databricks.
Develop data models, integration patterns, and reusable frameworks aligned with best practices and enterprise architecture standards.
Participate in requirement discussions, solution blueprinting, and technical feasibility assessments.
2. Data Pipeline Development
Build and optimize robust, high-throughput ELT/ETL pipelines, enabling ingestion, transformation, and curation of structured, semi-structured, and unstructured data.
Integrate data from multiple on-premises and cloud-based systems, APIs, and third-party sources.
Implement complex transformations using PySpark, ensuring performance efficiency and code modularity.
Build orchestration workflows in ADF, including pipelines, triggers, linked services, integration runtimes, and parameterized datasets.
3. Databricks & PySpark Engineering
Develop scalable transformation scripts using PySpark on Databricks, applying advanced optimizations like caching, partitioning, and Delta Lake capabilities.
Implement Delta Lake features (ACID transactions, schema enforcement, schema evolution, and time travel) across the data lifecycle.
Perform performance tuning, handling bottlenecks related to cluster configuration, shuffle operations, joins, and parallelization.
Collaborate with platform teams to manage Databricks clusters, jobs, notebooks, and CI/CD integrations.
4. Data Governance, Quality & Security
Implement data quality checks, audit mechanisms, and validation frameworks to ensure data accuracy and consistency.
Ensure compliance with organizational standards for data security, encryption, access control, and data lifecycle management.
Create and maintain technical documentation, data flow diagrams, and operational support guides.
5. Collaboration & Stakeholder Management
Collaborate closely with BI, analytics, and business teams to understand data requirements and deliver reliable, production-ready solutions.
Work with architects, product owners, and cross-functional engineering teams to align technical delivery with business objectives.
Provide guidance and mentoring to junior engineers when required.