February 25, 2021

Data Scientist / (Databricks, Data Factory) Engineer

As a member of our team, you will translate fuzzy business requirements into well-defined Data Science solutions. We expect our new colleague to collaborate with the product team on feature requests, work with multiple data sources ranging from small to very large datasets, develop, validate, and deploy machine learning models, tune their performance, and integrate them into the data processing pipeline.

Responsibilities:

  • Set up reproducible experiments with Machine Learning models, create validation schemas, test models, monitor metrics, and deliver models to production;
  • Deal with both structured and unstructured data, implement ETL pipelines to prepare data for machine learning algorithms;
  • Not only solve technical tasks, but also understand business needs, propose appropriate solutions and data collection requirements, and explain the chosen approach to non-technical stakeholders;
  • Work on a diverse landscape of ML tasks, from classification and regression on tabular data to models built on textual and visual data, so broad ML experience is expected;
  • Integrate data preprocessing and model inference into the existing data processing pipeline;
  • Research new tools and papers, and generate ideas for continuously improving the Machine Learning parts of the projects.

Requirements:

  • Experience with predictive analytics, statistical modeling, machine learning, deep learning;
  • Expertise in Python programming;
  • Hands-on experience with machine learning libraries and frameworks:
    • the Python scientific stack;
    • scikit-learn, LightGBM, CatBoost, XGBoost;
    • PyTorch / TensorFlow (Keras);
  • Experience with Computer Vision modeling (classification, visual search);
  • Ability to implement space- and time-efficient algorithms and to judge which is preferable in a given situation;
  • Strong theoretical knowledge of machine learning and deep knowledge of mathematics and algorithms;
  • Good oral and written English for communication and reading/writing technical documentation.

Would Be a Plus:

  • Experience with OCR technologies;
  • Hands-on experience with developing parallel code in Python;
  • Experience with workflow composing frameworks (Airflow, Luigi, etc.);
  • Experience in software engineering: deployment and integration with data delivery systems and other components, building microservices, and providing APIs for model access;
  • Data visualization skills.

What We Offer:

  • Competitive compensation;
  • Flexible schedules available;
  • Generous benefits package from day one of employment: medical coverage, sport reimbursement, English classes, bonuses for special occasions (birthday, wedding, etc.), paid vacations and sick leaves;
  • Extensive training and growth opportunities.