Title: Dataiku Developer
Location: Pune/Mumbai/Chennai/Hyderabad/Bangalore/Noida
About the Role:
We are seeking a skilled Dataiku Developer to join our data science workbench program team. The ideal candidate will design, develop, and deploy scalable data pipelines, advanced analytics, and AI/ML solutions on the Dataiku platform. You will work closely with data scientists, engineers, and business stakeholders to enable data-driven decision-making.
Key Responsibilities:
- Design, develop, extend, and maintain end-to-end data workflows and pipelines in Dataiku DSS.
- Collaborate with data scientists and analysts to operationalize machine learning models.
- Leverage Generative AI models and tools within Dataiku to build advanced AI-powered applications and analytics solutions.
- Integrate Dataiku with various data sources (databases, cloud storage, APIs).
- Develop and optimize SQL queries and Python/R scripts for data extraction and transformation across relational and NoSQL databases.
- Work extensively with cloud data warehouses like Amazon Redshift and/or Snowflake for data ingestion, transformation, and analytics.
- Implement automation and scheduling of data workflows.
- Monitor and troubleshoot data pipelines to ensure data quality and reliability.
- Document technical solutions and best practices for data processing and analytics.
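To give candidates a concrete sense of the day-to-day work, the extraction-and-transformation scripting described above can be sketched as follows. This is a minimal, illustrative example: it uses Python's standard-library sqlite3 as a stand-in for a production warehouse such as Redshift or Snowflake, and the table and column names (`orders`, `region`, `amount`) are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for a production warehouse connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "west", 120.0), (2, "east", 75.5), (3, "west", 30.0)],
)

# Extraction in SQL: aggregate revenue per region.
rows = conn.execute(
    "SELECT region, SUM(amount) AS revenue FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()

# Transformation in Python: shape the result for a downstream consumer.
revenue_by_region = {region: round(revenue, 2) for region, revenue in rows}
print(revenue_by_region)  # {'east': 75.5, 'west': 150.0}
```

In a Dataiku DSS project, the same pattern would typically run inside a Python recipe, with datasets managed by the platform rather than an ad-hoc connection.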
Required Skills and Qualifications:
- 4+ years of proven experience working with Dataiku Data Science Studio (DSS) in a professional environment.
- Strong knowledge of data engineering concepts and ETL/ELT processes.
- Proficiency in Python and/or R for data manipulation and automation.
- Solid SQL skills and experience with relational databases (e.g., MySQL, PostgreSQL, Oracle).
- Familiarity with cloud platforms (preferably AWS; Azure or GCP also considered) and data storage technologies.
- Understanding of machine learning lifecycle and model deployment.
- Experience with REST APIs and integration tools.
- Strong analytical, problem-solving, and communication skills.
- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field.
Preferred Qualifications:
- Experience with big data technologies like Hadoop, Spark, or Kafka and cloud data stores like Redshift, Snowflake, or Amazon S3.
- Knowledge of containerization and orchestration (Docker, Kubernetes).
- Experience with CI/CD pipelines for data projects.
- Prior experience in agile development methodologies.