August 11
Data Engineer (Data Ops)
About the team
We are actively building a Data Warehouse, a key part of the product. We work with cutting-edge technologies (GCP, AWS, Airflow, Kafka, K8s) and make infrastructure and architectural decisions based on data. We are building large-scale data infrastructure for analytics, machine learning, and real-time recommendations.
Responsibilities
- Develop a data-driven culture within the company
- Develop processes for data processing, storage, cleaning, and enrichment
- Design and maintain data pipelines from collection to consumption
- Develop APIs (REST, gRPC) for high-load services
- Create infrastructure for storing and processing large datasets with Kubernetes and Terraform
- Automate testing, validation, and monitoring of data
- Participate in system design and architectural decision making
Requirements
- Expert in Python 3.7+
- Experience with PySpark
- Deep knowledge of SQL
- Extensive experience building ETLs with Airflow 2 (see the sketch after this list)
- Production experience with Kubernetes
- Understanding of data processing principles and algorithms
- Excellent knowledge of OOP, design patterns, clean architecture
- Productivity, responsibility, and the ability to take ownership
Would be a plus:
- Experience with high-load services
- DevOps skills and CI/CD automation experience

If you’re interested in working with big data, complex challenges, and cutting-edge technologies, we’d love to meet you!
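To give a concrete flavor of the Airflow 2 work described above, here is a minimal TaskFlow-style DAG sketch. It is purely illustrative: the DAG, task, and data names are made up, and the `schedule` argument is the Airflow 2.4+ spelling (older 2.x releases use `schedule_interval`).

```python
# A minimal, illustrative Airflow 2 TaskFlow DAG; all names are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_events_pipeline():
    @task
    def extract():
        # Placeholder extract step: in practice this might pull raw events
        # from Kafka, Pub/Sub, or object storage such as GCS/S3.
        return [{"event_id": 1, "payload": "raw"}]

    @task
    def transform(rows):
        # Placeholder transform step: cleaning and enrichment would go here.
        return [{**row, "processed": True} for row in rows]

    @task
    def load(rows):
        # Placeholder load step: a real pipeline would write to a warehouse
        # table (e.g. BigQuery) via an operator or a client library.
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))


example_events_pipeline()
```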
What we offer
- Stable salary, official employment
- Health insurance
- Hybrid work mode and flexible schedule
- Relocation package offered for candidates from other regions (only for Kazakhstan and Cyprus)
- Access to professional counseling services including psychological, financial, and legal support
- Discount club membership
- Diverse internal training programs
- Partially or fully paid additional training courses
- All necessary work equipment
Tech stack
- Languages: Python, SQL
- Frameworks: Spark, Apache Beam
- Storage and analytics: BigQuery, GCS, S3, Trino, other GCP and AWS stack components
- Integration: Apache Kafka, Google Pub/Sub, Debezium (see the sketch after this list)
- ETL: Airflow 2
- Infrastructure: Kubernetes, Terraform
- Development: GitHub, GitHub Actions, Jira
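To show how several of these components fit together, here is a minimal PySpark Structured Streaming sketch that reads from Kafka and writes to the console. The topic name and bootstrap server are hypothetical, and the sketch assumes the spark-sql-kafka connector package is available to Spark; a real pipeline would land the data in object storage or a warehouse table instead of the console sink.

```python
# Illustrative PySpark Structured Streaming job: Kafka in, console out.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("events-stream-sketch").getOrCreate()

# Read raw events from a (hypothetical) Kafka topic as a streaming DataFrame.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka keys and values arrive as bytes; cast them to strings before parsing.
decoded = events.select(
    col("key").cast("string").alias("key"),
    col("value").cast("string").alias("value"),
)

# Console sink for demonstration purposes only.
query = decoded.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```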
More information about the vacancy: Data Engineer (Data Ops) at inDrive