Title: Lead Architect - Data Architect
Area(s) of responsibility
Job Description – Data Architect
Data Architect – Data & Insights
Summary
We are looking for a highly experienced Data Architect (Grade 7A) to lead the design and delivery of the enterprise Data & Insights platform on Microsoft Azure. The role demands deep expertise in cloud data architecture, data modeling, data lake/lakehouse design, vector-store-based RAG systems, GenAI, Agentic AI, and data governance with Microsoft Purview.
The ideal candidate has strong hands-on skills in Python, LangChain, LangGraph, Azure data services, and end-to-end SDLC execution.
Roles & Responsibilities
- Architect and design Azure-based data lake/lakehouse platforms, domain data models, and ingestion-to-consumption pipelines.
- Develop conceptual, logical, and physical cloud data models aligned with enterprise standards.
- Architect RAG pipelines including embeddings, chunking, vector stores, hybrid retrieval, reranking, and evaluation.
- Build Agentic AI workflows using LangChain and LangGraph; design tool orchestration, memory, and safety layers.
- Implement governance with Microsoft Purview for cataloging, lineage, PII tagging, and policy enforcement.
- Ensure platform security using Entra ID, private endpoints, VNets, Key Vault, and encryption controls.
- Lead solution architecture reviews, performance tuning, cost optimization, and NFR engineering.
- Oversee CI/CD (Azure DevOps), IaC (Terraform/Bicep), and observability (Azure Monitor, Application Insights).
- Mentor engineering teams and standardize best practices, patterns, and reusable components.
Technical Skills
Mandatory
- Azure Data Platform: ADLS Gen2, Synapse/Serverless SQL, Databricks/Spark, ADF/Synapse Pipelines
- Programming: Python, PySpark, SQL
- GenAI & Agentic AI: RAG architecture, vector stores (Azure AI Search, formerly Azure Cognitive Search; Pinecone; Weaviate; Qdrant), embeddings, reranking
- Frameworks: LangChain, LangGraph
- Data Modeling: Conceptual/logical/physical models, Delta/Parquet patterns, lakehouse modeling
- Data Governance: Microsoft Purview (catalog, lineage, classification, glossary, PII governance)
- Security: Entra ID, RBAC/ABAC, Key Vault, VNet integration, encryption
- SDLC & DevOps: Azure DevOps (CI/CD), Terraform/Bicep, ADRs, HLD/LLD documentation
- Performance & Cost Optimization across compute, storage, vector workloads, and pipelines
Preferred
- Microsoft Fabric / OneLake; Power BI semantic modeling
- dbt for transformations and testing
- Cosmos DB, PostgreSQL, Azure SQL Managed Instance
- Knowledge graphs (Neo4j) and graph-based retrieval
- LLMOps: evaluation, telemetry, safety assessment, drift monitoring
- FinOps optimization practices
- Multi-cloud experience (AWS/GCP equivalents)
- API design: REST, GraphQL, gRPC
Qualifications
- Bachelor’s or Master’s degree in Engineering, Computer Science, or related discipline.
- 12–14 years of total experience, including at least 5 years in cloud data architecture.
- Proven experience delivering Azure-based data platforms and production-grade GenAI/RAG systems.