Data Science Engineer, Assistant Senior Manager at Diamond Trust Bank (DTB)
- Kenya
- Permanent
- Full-time
- Build and maintain ETL/ELT pipelines that feed modelling datasets from multiple banking systems (CBS, LMS, CRM, Cards, Mobile Banking, Bureau, Collections systems).
- Develop automated data preparation workflows for credit scoring, fraud, behavioural, and IFRS9 models.
- Create end-to-end ML pipelines integrating feature engineering, data validation, model deployment, and monitoring.
- Build and manage other enterprise ETL pipelines using tools such as Oracle Data Integrator (ODI) and Informatica.
- Develop scalable data-processing workflows using Spark, Hadoop, Kafka, Airflow, Flink, or similar tools.
- Optimize large datasets (transactional, bureau, behavioural, logs) for modelling in batch and real-time environments.
- Manage distributed computation and ensure reliability and fault tolerance.
- Design and maintain a centralized feature store for credit, fraud, marketing, and customer analytics models.
- Ensure feature consistency between training and serving environments.
- Implement versioning, lineage, documentation, and metadata management for data features.
- Collaborate with data scientists to deploy models using MLflow, Docker, Kubernetes, API gateways, CI/CD pipelines.
- Develop automated monitoring pipelines for model performance, drift detection, data quality, and explainability.
- Ensure models operate efficiently in real-time decision engines and batch scoring environments.
- Implement robust data validation, profiling, anomaly detection, and reconciliation checks.
- Work with Data Governance teams to ensure compliance with IFRS9, Basel, CBK, GDPR, and internal data standards.
- Manage data lineage, cataloguing, and documentation to support audits and regulatory reviews.
- Partner with Data Scientists, Risk, Credit, Fraud, Marketing, and BI teams to align data pipelines with business use cases.
- Work with IT and Infrastructure teams on cluster performance, security, access controls, and SLA adherence.
- Participate in sprint planning, architecture reviews, and model implementation committee sessions.
- Improve the efficiency, scalability, and cost of ML workloads.
- Optimize database queries, Spark jobs, Kafka streams, and storage systems.
- Strong academic foundation with a Bachelor's or Master's in Computer Science, Data Engineering, Data Science, Information Technology, or a related quantitative field.
- 3-7+ years of impactful, hands-on experience in data engineering, big-data processing, or building scalable ML infrastructure, ideally within fast-paced, data-driven environments.
- Advanced programming capability, with strong proficiency in Python, SQL, and PySpark; experience with Scala is an added advantage.
- Demonstrated expertise in modern data and ML platforms, including:
- Big-data technologies: Spark, Hadoop, Kafka, Airflow
- MLOps & containerization: MLflow, Docker, Kubernetes
- CI/CD pipelines: GitLab, Jenkins, GitHub Actions
- Cloud platforms: AWS, GCP, or Azure (highly preferred)
- Experience working with banking systems, risk data, or credit-modelling datasets is a significant advantage that accelerates success in this role.
- Strong understanding of data structures, distributed systems, and ML workflows.
- Excellent problem-solving, debugging, and optimization skills.
- Fast learner with the ability to adapt to new technologies.
- High attention to detail, documentation discipline, and data governance awareness.
- Strong collaboration and communication skills.