Country/Region:  IN
Requisition ID:  31654
Work Model: 
Position Type: 
Salary Range: 
Location:  INDIA - NOIDA - BIRLASOFT OFFICE

Title:  Lead Architect

Description: 

Area(s) of responsibility

A hands-on lead architect for the SRE practice (Observability and DevOps). The key skills are listed below.

  • AWS architecture: VPC, Subnets, Routing, NAT, Security Groups, NACLs, Transit Gateway
  • Compute & container orchestration: EC2, ECS, EKS (Kubernetes), Fargate, Lambda
  • Storage & data: S3, EBS/EFS, RDS/Aurora, DynamoDB, ElastiCache
  • Networking & edge: ALB/NLB, API Gateway, Route 53, CloudFront, Global Accelerator
  • Identity & access: IAM policies/roles, STS, Organizations, Control Tower, SCPs
  • Reliability patterns: multi-AZ/region HA, DR (Pilot Light/Warm Standby/Active-Active), backup/restore automation
  • AWS stack: CloudWatch (metrics/logs/alarms), CloudTrail (audit), X-Ray (tracing), Config (drift/compliance)
  • Metrics & tracing: Prometheus, Grafana, Jaeger, OpenTelemetry (OTLP, SDKs, collectors)
  • Log aggregation & search: ELK/Elastic Stack (Elasticsearch, Logstash, Kibana), Fluentd/Fluent Bit, Splunk
  • APM tools: Datadog, New Relic, AppDynamics, Dynatrace (bonus)
  • SLO/SLI/SLA design, error budgets, golden signals, alert hygiene & runbook quality
  • Pipeline design: GitHub Actions, GitLab CI, Jenkins, AWS CodePipeline/CodeBuild/CodeDeploy
  • Deployment strategies: Blue/Green, Canary, Rolling, Feature Flags; automated rollbacks
  • Artifact & dependency management: Docker registries (ECR), SBOM, supply chain security
  • Release governance: trunk-based development, GitOps (Argo CD/Flux), approvals & gates
  • Terraform (modules, workspaces, remote state, data sources)
  • AWS CloudFormation/CDK (TypeScript/Python), nested stacks, custom resources
  • Ansible (playbooks, roles, vault), Packer (AMI pipelines), Helm charts for Kubernetes
  • Cluster lifecycle: node groups, CNI (Amazon VPC CNI/Calico), storage classes, ingress controllers
  • Service mesh: Istio/Linkerd (optional), mTLS, traffic policies, sidecars
  • Workload ops: HPA/VPA, pod disruption budgets, resource quotas/requests/limits
  • Observability for K8s: kube-state-metrics, Prometheus Operator, Grafana dashboards
  • Multi-tenancy, namespaces, RBAC, network policies; admission controllers & policy-as-code
  • Incident management: on-call practices, escalation, blameless postmortems, RCA depth
  • Chaos engineering: fault injection (chaos-mesh/litmus), game days, resilience scoring
  • Capacity planning & performance tuning: autoscaling, throughput/latency profiling, caching strategies
  • Availability engineering: circuit breakers, retries/backoff, bulkheads, graceful degradation
  • Cloud security: IAM least privilege, Secrets Manager/Parameter Store, KMS, VPC endpoints
  • Container security: image scanning (Trivy/Grype), runtime policies (Falco), admission controls
  • Policy-as-code: AWS Config rules, GuardDuty, Security Hub
  • Compliance: audit trails, encryption in transit/at rest, CIS/NIST/ISO mappings
  • Cost governance: tagging standards, cost allocation, savings plans/reserved instances, rightsizing
  • Strong programming for tooling/automation: Python/Go (preferred), Bash
  • Event-driven ops: Lambda/Step Functions for remediation; webhooks & bots (ChatOps)
  • API-first mindset: AWS SDK/CLI, tool integrations, custom exporters/collectors