Data Engineer / Big Data Architect Vacancies
- Apache Spark
- PySpark
- Python
- Palantir Foundry
- CI/CD
- Terraform
- AWS CloudFormation
We are looking for a Senior/Lead Big Data Engineer to join our team for long-term cooperation.
Role Overview:
As a Lead Big Data Engineer, you will combine hands-on engineering with technical leadership. You’ll be responsible for designing, developing, and optimizing Spark-based big data pipelines in Palantir Foundry, ensuring high performance, scalability, and reliability. You will also mentor and manage a team of engineers, driving best practices in big data engineering, ensuring delivery excellence, and collaborating with stakeholders to meet business needs. Our project uses Palantir Foundry, but prior experience with it is a plus rather than a requirement.
Key Responsibilities:
- Lead the design, development, and optimization of large-scale, Spark-based (PySpark) data processing pipelines.
- Build and maintain big data solutions using Palantir Foundry.
- Ensure Spark workloads are tuned for performance and cost efficiency (see the PySpark sketch after this list).
- Oversee and participate in code reviews, architecture discussions, and best practice implementation.
- Maintain high standards for data quality, security, and governance.
- Manage and mentor a team of Big Data Engineers, providing technical direction.
- Drive continuous improvement in processes, tools, and development practices.
- Foster collaboration across engineering, data science, and product teams to align on priorities and solutions.
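For illustration, here is a minimal sketch of the kind of PySpark batch pipeline and tuning work described above (broadcast joins, adaptive query execution, partitioned output). It is a sketch only: the dataset paths, column names, and configuration values are hypothetical, and a Foundry project would wrap comparable logic in its own transform framework.

```python
# Minimal PySpark sketch of a batch pipeline with basic performance tuning.
# Dataset paths, column names, and partition settings are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("orders-daily-aggregation")
    .config("spark.sql.shuffle.partitions", "200")  # tune to data volume
    .config("spark.sql.adaptive.enabled", "true")   # let AQE coalesce shuffle partitions
    .getOrCreate()
)

orders = spark.read.parquet("s3://example-bucket/raw/orders/")        # hypothetical path
customers = spark.read.parquet("s3://example-bucket/raw/customers/")  # hypothetical path

daily_revenue = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .join(F.broadcast(customers.select("customer_id", "region")), "customer_id")  # small dimension -> broadcast join
    .groupBy("region", F.to_date("created_at").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"), F.countDistinct("customer_id").alias("buyers"))
)

(
    daily_revenue
    .repartition("order_date")          # keep output files grouped by partition key
    .write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/daily_revenue/")
)
```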
Requirements:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
- 6+ years in Big Data Engineering, with at least 1-2 years in a lead (tech/team lead) role.
- Deep hands-on expertise in Apache Spark (PySpark) for large-scale data processing.
- Proficiency in Python and distributed computing principles.
- Experience designing, implementing, and optimizing high-volume, low-latency data pipelines.
- Strong leadership, communication, and stakeholder management skills.
- Experience with Palantir Foundry is a plus, but not required.
- Familiarity with CI/CD and infrastructure as code (Terraform, CloudFormation) is desirable.
About the company N-iX
Employee benefits
- English Courses
- Flexible working hours
- Sports expenses compensation
- Education compensation
- Medical insurance
- Python
- Data lake
- AWS
- AWS Glue
- Kafka
- Redis
- Apache Spark
- Iceberg
- Athena
- AirFlow
- ETL
- Agile
- Docker
- AWS CloudFormation
- Git
Our customer is the leading school transportation provider in North America, owning more than half of all yellow school buses in the United States. Every day, the company completes 5 million student journeys, moving more passengers than all U.S. airlines combined, and delivers reliable, quality services to 1,100 school districts.
N-iX has built a successful cooperation with the client delivering a range of complex initiatives. As a result, N-iX has been selected as a strategic long-term partner to drive the digital transformation on an enterprise level, fully remodeling the technology landscape for 55,000 employees and millions of people across North America.
Responsibilities:
- Design complex ETL processes that bring various data sources into the data warehouse
- Build new and maintain existing data pipelines using Python to improve efficiency and reduce latency
- Improve data quality through anomaly detection by building and working with internal tools that measure data and automatically detect changes (see the sketch after this list)
- Identify, design, and implement internal process improvements, including re-designing infrastructure for greater scalability, optimizing data delivery, and automating manual processes
- Perform data modeling and improve our existing data models for analytics
- Collaborate with SMEs, architects, analysts, and others to build solutions that integrate data from many of our enterprise data sources
- Partner with stakeholders, including data, design, product, and executive teams, and assist them with data-related technical issues
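As a rough illustration of the anomaly-detection work mentioned above, here is a minimal sketch that flags days whose load volume deviates sharply from a rolling baseline. The table, metric, and 3-sigma threshold are illustrative assumptions, not the client's actual tooling.

```python
# Minimal sketch of a row-count anomaly check for a daily-loaded table.
# The metric source, window size, and 3-sigma threshold are illustrative assumptions.
import pandas as pd

def detect_volume_anomalies(daily_counts: pd.DataFrame, window: int = 14, sigmas: float = 3.0) -> pd.DataFrame:
    """Flag days whose row count deviates from the rolling mean by more than `sigmas` standard deviations."""
    df = daily_counts.sort_values("load_date").copy()
    rolling = df["row_count"].rolling(window=window, min_periods=window)
    df["expected"] = rolling.mean().shift(1)   # exclude the current day from its own baseline
    df["stddev"] = rolling.std().shift(1)
    df["is_anomaly"] = (df["row_count"] - df["expected"]).abs() > sigmas * df["stddev"]
    return df[df["is_anomaly"].fillna(False)]

# Example usage with hypothetical data: the last day drops sharply and gets flagged.
counts = pd.DataFrame({
    "load_date": pd.date_range("2024-01-01", periods=30, freq="D"),
    "row_count": [100_000] * 29 + [12_000],
})
print(detect_volume_anomalies(counts))
```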
Requirements:
- Proficiency in Python (7+ years)
- 3-5 years of commercial experience in building and maintaining a Data Lake
- Experience leading a Data Lake team of 3-5 Engineers (2 years)
- Good knowledge of AWS cloud services, including the Glue framework on integration-type projects (2 years)
- Experience maintaining Apache Kafka
- Solid expertise in data processing tools, including Redis, Apache Spark, Apache Iceberg, Athena.
- Knowledge of job scheduling and orchestration using Airflow
- Experience in event streaming
- Well-versed in the optimization of ETL processes
- Experience developing high-load backend services in Python.
- Good understanding of algorithms and data structures
- Excellent communication skills, both written and verbal
Nice to have:
- Experience in schema and dimensional data design
- Collaboration within a scaled team using Agile methodology
- Decent knowledge of CI/CD (Docker, CloudFormation, Git)
About the company N-iX
Employee benefits
- English Courses
- Flexible working hours
- Sports expenses compensation
- Education compensation
- Medical insurance
- AWS
- AWS Glue
- Amazon Redshift
- Amazon S3
- AWS Kinesis
- OpenSearch
- Amazon QuickSight
- Tableau
- Microsoft Power BI
- Athena
- ETL
- ELT
- Apache Spark
- PySpark
- Terraform
- AWS SageMaker
- Amazon Bedrock
- Python
Automat-it is where high-growth startups turn when they need to move faster, scale smarter, and make the most of the cloud. As an AWS Premier Partner and Strategic Partner, we deliver hands-on DevOps, FinOps, and GenAI support that drives real results.
We work across EMEA, fueling innovation and solving complex challenges daily. Join us to grow your skills, shape bold ideas, and help build the future of tech.
We’re looking for a Data Engineering Team Lead to build and scale our Data & Analytics capability while delivering modern, production-grade data platforms for customers on AWS. You’ll lead a team of Data Engineers, own delivery quality and timelines, and remain hands-on across architecture, pipelines, and analytics so the team ships fast, safely, and cost-effectively.
Responsibilities
- Manage, coach, and grow a team of Data Engineers through 1:1s, goal setting, feedback, and career development.
- Own end-to-end delivery outcomes (scope, timelines, quality) across multiple projects; unblock the team and ensure on-time, high-quality releases.
- Lead customer-facing workshops, discovery sessions, and proof-of-concepts, serving as the primary technical point of contact to translate requirements into clear roadmaps, estimates, and trade-offs in plain language.
- Support solution proposals, estimates, and statements of work; contribute to thought leadership and reusable accelerators.
- Collaborate closely with adjacent teams (MLOps, DevOps, Data Science, Application Engineering) to ship integrated solutions.
- Design, develop, and deploy AWS-based data and analytics solutions to meet customer requirements. Ensure architectures are highly available, scalable, and cost-efficient.
- Develop dashboards and analytics reports using Amazon QuickSight or equivalent BI tools.
- Migrate and modernize existing data workflows to AWS. Re-architect legacy ETL pipelines to AWS Glue and move on-premises data systems to Amazon OpenSearch/Redshift for improved scalability and insights (see the Glue sketch after this list).
- Build and manage multi-modal data lakes and data warehouses for analytics and AI. Integrate structured and unstructured data on AWS (e.g., S3, Redshift) to enable advanced analytics and generative AI model training using tools like SageMaker.
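A minimal sketch of the kind of AWS Glue (PySpark) job such a migration might produce: read a raw table from the Glue Data Catalog, apply a simple transformation, and write partitioned Parquet to S3. The database, table, and bucket names are hypothetical.

```python
# Minimal AWS Glue (PySpark) job sketch: read from the Glue Data Catalog,
# apply a simple transformation, and write curated Parquet to S3.
# Database, table, and bucket names are hypothetical.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the raw table registered in the Glue Data Catalog
raw = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="events"
).toDF()

curated = (
    raw.filter(F.col("event_type").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

(curated.write.mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3://example-curated-bucket/events/"))

job.commit()
```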
Requirements
- Proven leadership experience with a track record of managing and developing technical teams.
- Production experience with AWS cloud and data services, including building solutions at scale with tools like AWS Glue, Amazon Redshift, Amazon S3, Amazon Kinesis, Amazon OpenSearch Service, etc.
- Skilled in AWS analytics and dashboard tools – hands-on expertise with services such as Amazon QuickSight or other BI tools (Tableau, Power BI) and Amazon Athena.
- Experience with ETL pipelines – ability to build ETL/ELT workflows (using AWS Glue, Spark, Python, SQL).
- Experience with data warehousing and data lakes - ability to design and optimize data lakes (on S3), Amazon Redshift for data warehousing, and Amazon OpenSearch for log/search analytics.
- Proficiency in programming (Python/PySpark) and SQL skills for data processing and analysis.
- Understanding of cloud security and data governance best practices (encryption, IAM, data privacy).
- Excellent communication and customer-facing skills with an ability to explain complex data concepts in clear terms. Comfortable working directly with clients and guiding technical discussions.
- Fluent written and verbal communication skills in English.
- Proven ability to lead end-to-end technical engagements and work effectively in fast-paced, Agile environments.
- AWS certifications, especially in Data Analytics or Machine Learning, are a plus.
- DevOps/MLOps knowledge – experience with Infrastructure as Code (Terraform), CI/CD pipelines, containerization, and AWS AI/ML services (SageMaker, Bedrock) is a plus.
About the company Automat-IT
Employee benefits
- English Courses
- Team buildings
- Sports expenses compensation
- Education compensation
- Medical insurance
- Laptop provided
- Coworking space costs covered
- Paid sick leave
- Educational programs, courses
- RAG
- OpenSearch
- Elasticsearch
- Python
- LlamaIndex
- LangChain
- Pinecone
- Qdrant
- FAISS
- LLM
- AWS
- Azure
- GCP
- Docker
- Kubernetes
Our mission at Geniusee is to help businesses thrive through tech partnership and strengthen the engineering community by sharing knowledge and creating opportunities. Our values are Continuous Growth, Team Synergy, Taking Responsibility, Conscious Openness and Result Driven. We offer a safe, inclusive and productive environment for all team members, and we’re always open to feedback. If you want to work from home or work in the city center of Kyiv, great – apply right now.
Requirements
- 8+ years of experience in software engineering, with a focus on AI/ML systems or distributed systems;
- Hands-on experience building and deploying retrieval-augmented generation (RAG) systems;
- Deep knowledge of OpenSearch, Elasticsearch, or similar search engines;
- Strong coding skills in Python;
- Experience with frameworks like LlamaIndex or LangChain;
- Familiarity with vector databases such as Pinecone, Qdrant, or FAISS;
- Exposure to LLM fine-tuning, semantic search, embeddings, and prompt engineering;
- Previous work on systems handling millions of users or queries per day;
- Familiarity with cloud infrastructure (AWS, GCP, or Azure) and containerization tools (Docker, Kubernetes);
- Experience with vector search, embedding pipelines, and dense retrieval techniques;
- Proven ability to optimize inference stacks for latency, reliability, and scalability;
- Excellent problem-solving, analytical, and debugging skills;
- Strong sense of ownership, ability to work independently, and a self-starter mindset in fast-paced environments;
- Passion for building impactful technology aligned with our mission;
- Bachelor’s degree in Computer Science or related field, or equivalent practical experience.
What you will do
- Design, build and scale a production-grade inference stack for RAG-based applications;
- Develop efficient retrieval pipelines using OpenSearch or similar vector databases, with a focus on high recall and response relevance (see the retrieval sketch after this list);
- Optimize performance and latency for both real-time and batch queries;
- Identify and address bottlenecks in the inference stack to improve response times and system efficiency;
- Ensure high reliability, observability, and monitoring of deployed systems;
- Collaborate with cross-functional teams to integrate LLMs and retrieval components into user-facing applications;
- Evaluate and integrate modern RAG frameworks and tools to accelerate development;
- Guide architectural decisions, mentor team members, and uphold engineering excellence.
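A minimal sketch of the retrieval step of such a RAG pipeline, assuming an OpenSearch index with a k-NN vector field. The host, index name, field names, and embedding function are placeholders, not the production setup.

```python
# Minimal retrieval sketch for a RAG pipeline: embed the query and run a k-NN
# search against an OpenSearch index. Host, index name, vector field, and the
# embedding function are hypothetical placeholders.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

def embed(text: str) -> list[float]:
    """Placeholder: call your embedding model (e.g., a sentence-transformer) here."""
    raise NotImplementedError

def retrieve(query: str, k: int = 5) -> list[dict]:
    body = {
        "size": k,
        "query": {
            "knn": {
                "embedding": {             # vector field defined in the index mapping
                    "vector": embed(query),
                    "k": k,
                }
            }
        },
        "_source": ["title", "chunk_text"],  # return only the fields the LLM prompt needs
    }
    response = client.search(index="docs-chunks", body=body)
    return [hit["_source"] for hit in response["hits"]["hits"]]
```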
About the company Geniusee
Employee benefits
- English Courses
- Team buildings
- No bureaucracy
- Flexible working hours
- Sports expenses compensation
- Education compensation
- Medical insurance
- Laptop provided
- Coworking space costs covered
- Paid sick leave
- Regular salary reviews
- SQL
- Python
- Java
- Scala
- DWH
- Dataform
- SQLMesh
- Snowflake
- BigQuery
- Amazon Redshift
- AirFlow
- Airbyte
- Kafka
How exactly you can influence the development of the Company:
- developing and maintaining the Data Platform based on dbt and Snowflake to accelerate obtaining high-quality insights from data
- implementing modern approaches to solving tasks
- proposing solutions for optimizing data workflows
- monitoring Data Platform metrics
- participating in code review
- closely collaborating with Data Analysts and Data Scientists
Challenges for three months:
- getting acquainted with the existing data model in the Data Platform and its architecture
- conducting analysis to identify bottlenecks and suboptimal processes
- setting up a process for systematically improving the most suboptimal queries
Certainly, we will teach you, but it’s good to have:
- experience in working with Airflow, Airbyte
- experience in working with Kafka, Kafka Connect
To achieve the results, you will need:
- at least 2 years of experience as a Data Engineer
- practical experience with SQL at a confident level
- experience in developing with one of the programming languages (Python, Java, Scala, etc.)
- experience in creating data models for DWH and their implementation (dimensional modelling, data vault, etc.)
- experience in working with dbt or alternative tools (Dataform, SQLMesh)
- experience in working with cloud data warehouses and understanding their architecture and operating principles (Snowflake, BigQuery, Redshift, etc.)
- English proficiency at Intermediate level
About the company Uklon
Employee benefits
- English Courses
- Work-life balance
- Medical insurance
- Educational programs, courses
- SQL
- ETL
- DataBricks
- Python
- AWS
- Amazon S3
- Amazon Redshift
- Athena
- AWS Glue
- AWS Lambda
ELEKS Software Engineering and Development Office is looking for a Senior Data Engineer in Ukraine, Poland, or Croatia.
About client
The customer is a British company producing electricity with zero carbon emissions.
Requirements
- 5+ years of experience in Data Engineering, SQL, and ETL (data validation, data mapping, exception handling)
- 2+ years of hands-on experience with Databricks
- Experience with Python
- Experience with AWS (S3, Redshift, Athena, Glue, Lambda, etc.)
- Knowledge of the Energy industry (energy trading, utilities, power systems, etc.) would be a plus
- Experience with Geospatial data would be a plus
- At least an Upper-Intermediate level of English
Responsibilities
- Building Databases and Pipelines: Developing databases, data lakes, and data ingestion pipelines to deliver datasets for various projects
- End-to-End Solutions: Designing, developing, and deploying comprehensive solutions for data and data science models, ensuring usability for both data scientists and non-technical users. This includes following best engineering and data science practices
- Scalable Solutions: Developing and maintaining scalable data and machine learning solutions throughout the data lifecycle, supporting the code and infrastructure for databases, data pipelines, metadata, and code management
- Stakeholder Engagement: Collaborating with stakeholders across various departments, including data platforms, architecture, development, and operational teams, as well as addressing data security, privacy, and third-party coordination
About the company Eleks
Employee benefits
- English Courses
- Relocation assistance
- Bicycle parking
- Flexible working hours
- Sports expenses compensation
- Medical insurance
- Paid sick leave
- Educational programs, courses
- Car parking
- Python
- RDBMS
- NoSQL
- AWS
- MLOps
- ETL
- Microservices
- Kubernetes
- Azure
- GCP
We’re looking for a creative and impact-driven Senior Data Engineer eager to design and build powerful data solutions that empower millions of users. If you enjoy solving complex challenges, architecting systems that scale, and turning raw data into actionable insights – this opportunity is for you.
Join a purpose-led team that’s transforming how people in emerging markets access, manage, and grow their financial lives through technology.
Responsibilities
- Build data pipelines that collect and transform data to support ML models, analysis and reporting
- Work in a high volume production environment making data standardized and reusable, from architecture to production
- Work with off-the-shelf tools including DynamoDB, SQS, S3, Redshift, Snowflake, and MySQL, but often push them past their limits
- Work with an international multidisciplinary team of data engineers, data scientists and data analysts
Requirements
- At least 5+ years of experience in data engineering / software engineering in the big data domain
- At least 5+ years of coding experience with Python or equivalent
- SQL expertise, working with various databases (relational and NoSQL), data warehouses, external data sources and AWS cloud services
- Experience in building and optimizing data pipelines, architecture and data sets
- Experience with ML pipelines and MLOps tools
- Familiarity with the data engineering tech stack – ETL tools, orchestration tools, microservices, Kubernetes (K8s), Lambdas
- End to end experience – owning features from an idea stage, through design, architecture, coding, integration and deployment stages
- Experience working with cloud services such as AWS, Azure, Google Cloud
- B.Sc. in computer science or equivalent STEM
- English – Upper-Intermediate+
About the company Newxel
Employee benefits
- Work-life balance
- Accounting support
- Medical insurance
- SQL
- PL-SQL
- Oracle
- PostgreSQl
- API testing
- Postman
- cURL
- Bruno
- AWS
- Azure
- GCP
- Boomi
- Talend
- SnapLogic
- Apache NiFi
- Azure Data Factory
- GraphQL
- Cognos
- Argos
- Tableau
- Metabase
- JavaScript
- Node.js
- Python
- Java
We are looking for a proactive Data Integration Tech Lead to lead and collaborate with a strong, high-performing team of engineers. The role involves owning integration strategy, leading complex data pipeline development, and guiding engineers while ensuring scalable, secure, and high-quality solutions. We seek an active team player with strong leadership skills, deep expertise in integrations, and the ability to align technical solutions with business needs.
Our client is a leading provider of software and technology solutions for higher education. From student information systems and enrollment management to finance, HR, and analytics, their products empower universities and colleges to operate more efficiently and deliver a better experience for students.
Requirements
- Bachelor’s degree in Computer Science, Information Technology, or related field.
- 7+ years of experience in integration development, backend/API engineering, or data engineering roles.
- 2+ years in a lead or senior capacity, with proven ability to guide engineers and projects.
- Demonstrated ability to mentor engineers, conduct code/design reviews, and set technical direction.
- Strong communication and presentation skills; able to interact with business stakeholders and technical teams.
- Experience leading cross-functional integration initiatives.
- Experience building and optimizing data pipelines for both real-time and batch processes.
- Deep SQL and PL/SQL knowledge (Oracle, PostgreSQL).
- Understanding of stored procedures, views, DB packages, and performance tuning.
- Familiarity with API testing and debugging tools (Postman, curl, Bruno).
- Hands-on experience with cloud platforms (AWS, Azure, or GCP).
- Experience migrating on-prem integrations to SaaS/cloud platforms.
- Expertise in iPaaS or ETL platforms (Boomi, Talend, SnapLogic, NiFi, Azure Data Factory, etc.).
- Understanding of GraphQL and modern API access patterns.
- Familiarity with BI/reporting tools (Cognos, Argos, Tableau, Metabase).
- Proficiency in one or more languages: JavaScript (Node.js), Python, Java.
Responsibilities
- Own the integration strategy and architecture for enterprise data flows and APIs.
- Lead design, development, and optimization of complex data integration pipelines.
- Define and enforce standards for API usage, data mappings, error handling, and performance.
- Evaluate new technologies and recommend adoption where appropriate.
- Ensure integrations are scalable, secure, and compliant with enterprise requirements.
- Lead and mentor a team of integration engineers, providing technical guidance and career development.
- Conduct design and code reviews, ensuring high-quality deliverables.
- Champion best practices in integration, testing, and DevOps.
- Coordinate workload distribution and ensure alignment with project milestones.
- Work closely with software engineers, architects, business analysts, and external vendors.
- Translate business requirements into technical solutions and present integration designs to stakeholders.
- Act as primary escalation point for integration-related issues.
- Document integration architectures, workflows, and governance frameworks for transparency and knowledge sharing.
About the company ABCloudz
Employee benefits
- English Courses
- Accounting support
- Sports expenses compensation
- Education compensation
- Medical insurance
- Paid sick leave
- JavaScript
- React
- JSON API
- Rest API
- Java
- AWS
- PostgreSQl
- Oracle
We are seeking a talented and motivated Integration Engineer to join our dynamic team. As an Integration Engineer, you will be responsible for designing, developing, and maintaining integrations between various systems, applications, databases, and reporting tools. You will collaborate closely with cross-functional teams to understand requirements, develop solutions, and support ongoing integration needs.
Requirements
- Bachelor’s degree in Computer Science, Information Technology, or a related field
- Proven experience in one or more of the following areas:
- Integration engineering or a related role
- Languages/technologies such as JavaScript, React, JSON REST APIs, Java, AWS
- Database management with PostgreSQL or Oracle
- Reporting and data visualization tools
- Excellent communication and interpersonal skills
- Strong problem-solving skills and the ability to work in a fast-paced, dynamic environment
Responsibilities
- Design, develop, and implement integration solutions that align with business needs and technical requirements
- Work with APIs to connect various systems and applications
- Develop data mapping and transformation rules to ensure accurate data exchange between different platforms
- Collaborate with software development teams, business analysts, and stakeholders to gather integration requirements and ensure alignment with business objectives
- Conduct performance tuning, optimization, and testing of integration solutions
- Maintain and document integration processes, standards, and best practices
- Develop data visualizations inside a large, first-in-class SaaS enterprise system
- Develop data flows and APIs between different databases (such as PostgreSQL, Oracle, etc.) and cloud storage (flat files)
Will be a plus
- Experience with the AWS cloud
- Experience with data warehousing and business intelligence concepts
About the company ABCloudz
Employee benefits
- English Courses
- Accounting support
- Sports expenses compensation
- Education compensation
- Medical insurance
- Paid sick leave
- AWS
- Amazon S3
- EMR
- AWS Glue
- AWS Lambda
- Amazon Redshift
- Hadoop
- HDFS
- Hive
- MapReduce
- Apache Spark
- PySpark
- Spark SQL
- RDBMS
- NoSQL
- Python
- Java
- Scala
- AirFlow
- AWS Step Functions
- ETL
- ELT
- Kafka
- AWS Kinesis
- Apache Flink
- Docker
- Kubernetes
What you will do
- Design, build, and maintain large-scale data pipelines and data processing systems in AWS;
- Develop and optimize distributed data workflows using Hadoop, Spark, and related technologies;
- Collaborate with data scientists, analysts, and product teams to deliver reliable and efficient data solutions;
- Implement best practices for data governance, security, and compliance;
- Monitor, troubleshoot, and improve the performance of data systems and pipelines;
- Mentor junior engineers and contribute to building a culture of technical excellence;
- Evaluate and recommend new tools, frameworks, and approaches for data engineering.
Must haves
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related field;
- 5+ years of experience in data engineering, software engineering, or related roles;
- Strong hands-on expertise with AWS services (S3, EMR, Glue, Lambda, Redshift, etc.);
- Deep knowledge of big data ecosystems, including Hadoop (HDFS, Hive, MapReduce) and Apache Spark (PySpark, Spark SQL, streaming);
- Strong SQL skills and experience with relational and NoSQL databases;
- Proficiency in Python, Java, or Scala for data processing and automation;
- Experience with workflow orchestration tools (Airflow, Step Functions, etc.);
- Solid understanding of data modeling, ETL/ELT processes, and data warehousing concepts;
- Excellent problem-solving skills and ability to work in fast-paced environments;
- Ability to work in the German time zone (approximately 6-7 am to 2-3 pm Brazil/ART time);
- Upper-Intermediate English level.
Nice to haves
- Experience with real-time data streaming platforms (Kafka, Kinesis, Flink);
- Knowledge of containerization and orchestration (Docker, Kubernetes);
- Familiarity with data governance, lineage, and catalog tools;
- Previous leadership or mentoring experience.
About the company AgileEngine
Employee benefits
- Flexible working hours
- Above-market salary
- Regular salary reviews
- SQL
- Python
- ETL
- BigQuery
- DataFlow
- Apache Beam
- Apache Spark
- PySpark
- dbt
- Dataform
- DWH
- CI/CD
- Terraform
- GCP
- MSSQL
- PostgreSQl
- MySQL
- NoSQL
Big Data & Analytics is the Center of Excellence's data consulting and data engineering branch. Hundreds of data engineers and architects build end-to-end data and analytics solutions, from strategy through technical design and proofs of concept to full-scale implementation. We have customers in the healthcare, finance, manufacturing, retail, and energy domains.
We hold top-level partnership statuses with all the major cloud providers and collaborate with many technology partners like AWS, GCP, Microsoft, Databricks, Snowflake, Confluent, and others.
If you are
- Skilled in SQL, Python, and building/optimizing ETL pipelines
- Experienced with BigQuery, Dataflow (Apache Beam) or Dataproc (Spark/PySpark), dbt/Dataform
- Aware of data modelling and DWH architecture
- Familiar with Data Quality metrics, checks, and reporting
- Comfortable with DataOps practices (CI/CD, Terraform)
- Open to exploring AI tools for Data Engineering (code generation, profiling, etc.)
- Holding (or aiming for) a GCP certification
- Knowledgeable about Dataplex, MSSQL, PostgreSQL, MySQL, or NoSQL (would be a plus)
And you want to
- Be part of a team of data-focused engineers committed to continuous learning, improvement, and knowledge-sharing
- Design and optimize scalable data pipelines in GCP (see the Beam sketch after this list)
- Engage with customers from diverse backgrounds, including large global corporations and emerging startups
- Work with modern data engineering stack (BigQuery, dbt/Dataform, Dataflow/Dataproc)
- Apply best practices in DataOps, CI/CD and Terraform
- Ensure data quality through checks and reporting
- Leverage AI-powered tools to accelerate engineering tasks
- Contribute to data architecture design and innovation in DWH solutions
- Grow your expertise with cloud certifications and advanced GCP tools
- Participate in the entire project lifecycle, from initial design and proof of concepts (PoCs) to minimum viable product (MVP) development and full-scale implementation
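As an example of the pipeline work listed above, here is a minimal Apache Beam (Python SDK) sketch that could run on Dataflow. The bucket, BigQuery table, and field names are hypothetical.

```python
# Minimal Apache Beam sketch of a batch pipeline runnable on Dataflow:
# read JSON events from GCS, keep valid ones, count by type, write to BigQuery.
# Bucket, table, and field names are hypothetical.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run():
    options = PipelineOptions()  # pass --runner=DataflowRunner, --project, etc. on the CLI
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadEvents" >> beam.io.ReadFromText("gs://example-bucket/events/*.json")
            | "Parse" >> beam.Map(json.loads)
            | "KeepValid" >> beam.Filter(lambda e: e.get("event_type"))
            | "KeyByType" >> beam.Map(lambda e: (e["event_type"], 1))
            | "CountPerType" >> beam.CombinePerKey(sum)
            | "ToRow" >> beam.Map(lambda kv: {"event_type": kv[0], "events": kv[1]})
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "example-project:analytics.event_counts",
                schema="event_type:STRING,events:INTEGER",
                write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )

if __name__ == "__main__":
    run()
```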
About the company SoftServe
Employee benefits
- Fitness Zone
- Flexible working hours
- Sports expenses compensation
- Medical insurance
- Paid sick leave
- Azure Data Factory
- Azure Synapse
- Microsoft Azure
- Azure Data Lake Storage
- Azure Databricks
- PySpark
- ETL
- ELT
- SQL
- PostgreSQl
- Azure Cosmos DB
- Informix
- Azure Analysis Services
- Python
- Bash
- Unix
- Elasticsearch
- Docker
- Kubernetes
The primary goal of the project is the modernization, maintenance and development of an eCommerce platform for a big US-based retail company, serving millions of omnichannel customers each week.
Solutions are delivered by several Product Teams focused on different domains - Customer, Loyalty, Search and Browse, Data Integration, Cart.
Current overriding priorities are onboarding new brands, re-architecture, database migrations, and migrating microservices to a unified cloud-native solution without any disruption to the business.
Responsibilities
We are looking for a Data Engineer who will be responsible for designing a solution for a large retail company. The main focus is supporting the processing of big data volumes and integrating the solution into the current architecture.
Must have skills
- Strong, recent hands-on expertise with Azure Data Factory and Synapse is a must (3+ years).
- Experience in leading a distributed team.
- Strong expertise in designing and implementing data models, including conceptual, logical, and physical data models, to support efficient data storage and retrieval.
- Strong knowledge of Microsoft Azure, including Azure Data Lake Storage, Azure Synapse Analytics, Azure Data Factory, Azure Databricks, and PySpark for building scalable and reliable data solutions.
- Extensive experience with building robust and scalable ETL/ELT pipelines to extract, transform, and load data from various sources into data lakes or data warehouses.
- Ability to integrate data from disparate sources, including databases, APIs, and external data providers, using appropriate techniques such as API integration or message queuing.
- Proficiency in designing and implementing data warehousing solutions (dimensional modeling, star schemas, Data Mesh, Data/Delta Lakehouse, Data Vault)
- Proficiency in SQL to perform complex queries, data transformations, and performance tuning on cloud-based data storages.
- Experience integrating metadata and governance processes into cloud-based data platforms
- Certification in Azure, Databricks, or other relevant technologies is an added advantage
- Experience with cloud-based analytical databases.
- Experience with Azure MI, Azure Database for Postgres, Azure Cosmos DB, Azure Analysis Services, and Informix.
- Experience with Python and Python-based ETL tools.
- Experience with shell scripting in Bash, Unix, or Windows shell is preferable.
Nice to have
- Experience with Elasticsearch
- Familiarity with containerization and orchestration technologies (Docker, Kubernetes).
- Troubleshooting and Performance Tuning: Ability to identify and resolve performance bottlenecks in data processing workflows and optimize data pipelines for efficient data ingestion and analysis.
- Collaboration and Communication: Strong interpersonal skills to collaborate effectively with stakeholders, data engineers, data scientists, and other cross-functional teams.
About the company Luxoft
Employee benefits
- Relocation assistance
- Team buildings
- Multinational team
- Large, stable company
- Educational programs, courses
- SQL
- Snowflake
- AirFlow
- Kafka
- AWS
- Terraform
- Docker
- Python
- Playwright
- Selenium
- Puppeteer
CodeIT is a service product development company. We know how to transform business ideas into profitable IT products. We are looking for a skilled and experienced Data Engineer to join our team.
Our customer is building a solution that collects player-level data for our brands, which helps us track revenue across various pages/brands and better understand how our users use our products and what's trending. The vision is for it to become the central source of truth for user journey insights, empowering our company to make smarter, faster, and more impactful decisions that drive commercial growth and product innovation.
Required skills
- 3+ years of experience as a Data Engineer or Software Engineer working on data infrastructure.
- Strong Python skills and hands-on experience with SQL and Snowflake.
- Experience with modern orchestration tools like Airflow and data streaming platforms like Kafka.
- Understanding of data modeling, governance, and performance tuning in warehouse environments.
- Ability to work independently and prioritize across multiple stakeholders and systems.
- Comfort operating in a cloud-native environment (e.g., AWS, Terraform, Docker).
- Python side:
- Must have: experience in pulling and managing data from APIs
- Nice to have: web scraping via browser automation (e.g., Playwright, Selenium, Puppeteer)
Responsibilities
- Design, build, and maintain ETL/ELT pipelines and batch/streaming workflows.
- Integrate data from external APIs and internal systems into Snowflake and downstream tools.
- Own critical parts of our Airflow-based orchestration layer and Kafka-based event streams (see the DAG sketch after this list).
- Ensure data quality, reliability, and observability across our pipelines and platforms.
- Build shared data tools and frameworks to support analytics and reporting use cases.
- Partner closely with analysts, product managers, and other engineers to support data-driven decisions.
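A minimal sketch of an Airflow DAG (TaskFlow API, Airflow 2.x) for the kind of API-to-Snowflake flow described above. The endpoint, schedule, and load step are hypothetical placeholders rather than the customer's actual pipeline.

```python
# Minimal Airflow (TaskFlow API) sketch of a daily job that pulls data from an
# external API and hands it to a load step. The endpoint and Snowflake load are
# hypothetical placeholders.
from datetime import datetime

import requests
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["example"])
def player_revenue_sync():

    @task(retries=3)
    def extract() -> list[dict]:
        resp = requests.get("https://api.example.com/v1/revenue", timeout=30)  # hypothetical endpoint
        resp.raise_for_status()
        return resp.json()["items"]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder: a real DAG would load into Snowflake, e.g. via the
        # Snowflake provider or a COPY INTO from a stage.
        print(f"would load {len(rows)} rows")

    load(extract())

player_revenue_sync()
```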
About the company CodeIT
Employee benefits
- English Courses
- Home office compensation
- Paid sick leave
- Paid vacation
- Educational programs, courses
- Legal support
- Python
- Scala
- SQL
- AWS
On behalf of our Client from Japan, Mobilunity is looking for a Data Engineer for a long-term engagement.
About project:
Our client is a fast-growing fintech company based in Tokyo. The company offers its real-time monthly consolidated credit service all across Japan. Our client started Japan’s first instant post-pay credit service for e-commerce customers in October 2014. The main product is an online payment platform which requires no pre-registration or credit card to use; via this service the customers can purchase products online using only a mobile phone number and email address and settle a single monthly bill for all their purchases, either at a convenience store, by bank transfer or auto debit. Customers can use credit funds during a month and then return the balance to zero without paying credit interests.
This service also supports multi-pay installments and subscriptions. There are currently over 4 million accounts in use. This service got the largest investment to date in the Japanese financial tech industry, including PayPal Ventures investment.
We are looking for a Data Engineer, reporting to the Head of Data Platform Engineering, who will work closely with product engineers and business stakeholders to turn data into information, from which insights can be provided back to product and business leaders to inform our product design and features. Your responsibilities will include conducting full-lifecycle analysis covering requirements, tasks, and design. You will develop analysis on source data systems as well as support reporting activities and capabilities. Our ambition is to be a data-driven firm, and you will assist in this endeavor in part by monitoring the performance of source domain data systems, cleansing data, and identifying improvements.
Key Role and Responsibilities:
- Interpret data, analyze results using statistical techniques and provide ongoing reports.
- Develop and implement systems to analyze, report, and present data for use by other systems and human stakeholders.
- Review, understand, and improve existing reporting systems to optimize performance, add new functionality, and verify correctness.
- Work with the Data Integration team and stakeholders to understand reporting requirements, identify necessary data, and determine additional options for improvement.
- Identify, analyze, and interpret trends or patterns in product domains in source data systems.
- Ensure data quality and correctness, to meet government regulatory standards.
Requirements:
- Over 3 years of experience leveraging data to drive business outcomes, with additional expertise in reporting related to user credit, collections, and finance considered a strong plus.
- Deep domain knowledge in data, with hands-on experience in data layering, data modeling, data mining, and other end-to-end data implementation practices.
- Skilled in designing and developing efficient data processing workflows to extract valuable insights from both structured and unstructured data, enabling data-driven decision-making.
- Familiar with widely used data warehouse tools, with practical experience in working with both relational and non-relational database systems.
- Strong programming skills in Python, Scala, or other languages, combined with solid SQL optimization capabilities; experience with AWS services is a plus.
- Possesses strong data intuition, analytical thinking, and excellent communication and cross-functional collaboration skills.
- Level of English – Upper-Intermediate and higher.
About the company Mobilunity
Employee benefits
- English Courses
- Accounting support
- Medical insurance
- Paid sick leave
- Regular salary reviews
- Python
- Ruby
- AWS
- ECS
- EKS
- AWS Lambda
- Terraform
- AWS SQS
- Kafka
- Apache Spark
- DataBricks
- AirFlow
We’re looking for a Senior Software/Data Engineer to join our client, a leading digital platform in the subscription-based content industry. Their product handles millions of daily interactions, providing seamless access to ebooks, audiobooks, and other media.
This remote position, ideally suited for candidates located in Europe or LATAM, is perfect for someone who enjoys backend development with a strong emphasis on cloud architecture and data processing.
About the project
Client is an American e-book and audiobook subscription service that includes one million titles. The platform hosts 60 million documents on its open publishing platform.
Core Platform provides robust and foundational software, increasing operational excellence to scale apps and data. As a Backend Engineer, your focus will be on building and optimizing scalable backend systems and data pipelines that support real-time processing and event-driven architectures.
Tech Stack: Python, AWS Cloud Services (e.g., ECS, EKS, Lambda), SQS, Kafka, IaC, Terraform, Apache Spark, Databricks, Airflow
Must-have for the position
- 5+ years of combined professional experience in software engineering and data engineering with Python (familiarity with Ruby is a plus).
- Proven track record of designing and delivering complex software systems with minimal supervision.
- Experience building and maintaining high-throughput systems or data pipelines handling millions of requests daily.
- 3+ years of deep experience with AWS cloud services.
- Hands-on experience deploying solutions to production environments using ECS, EKS, or AWS Lambdas.
- Ability to test, profile, and optimize systems for performance and scalability.
- Proficiency with Infrastructure as Code using Terraform.
- Experience working with real-time data processing, queues, or event streams (e.g., SQS, Kafka).
- Bachelor’s degree in Computer Science or equivalent practical experience.
- Strong communication skills and a collaborative mindset.
- Excellent English skills and the ability to engage with both technical and non-technical stakeholders.
Will be a strong plus
- Experience with Apache Spark, Databricks, or similar distributed data processing frameworks.
- Workflow orchestration experience using Apache Airflow.
- Exposure to Machine Learning pipelines or ML-driven applications.
Responsibilities
- Design and deliver complex software systems, ensuring scalability, reliability, and maintainability.
- Develop high-performance backend solutions, primarily using Python; optionally contribute to Ruby-based components.
- Build and maintain robust data pipelines and high-throughput systems processing millions of requests per day.
- Deploy and operate cloud infrastructure leveraging AWS services (ECS, EKS, Lambda).
- Test, profile, and optimize system performance for efficiency and scalability.
- Work with real-time data processing systems, queues, and event-driven architectures (e.g., Kafka, SQS); see the SQS consumer sketch after this list.
- Implement infrastructure as code (IaC) using Terraform for consistent and reliable deployments.
- Collaborate cross-functionally with other engineering and product teams, sharing knowledge and participating in architectural decisions.
- Conduct code reviews and provide mentorship, promoting engineering best practices and continuous team improvement.
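A minimal sketch of an event consumer for the queue-based processing mentioned above, using boto3 long polling against SQS. The queue URL and message handling are hypothetical.

```python
# Minimal sketch of an event consumer polling an SQS queue with boto3.
# Queue URL and message handling are hypothetical placeholders.
import json

import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-events"  # hypothetical

def handle(event: dict) -> None:
    print("processing", event.get("type"))

def poll_forever() -> None:
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long polling to reduce empty receives
        )
        for msg in resp.get("Messages", []):
            handle(json.loads(msg["Body"]))
            # Delete only after successful processing (at-least-once semantics)
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

if __name__ == "__main__":
    poll_forever()
```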
About the company KitRUM
Employee benefits
- English Courses
- Team buildings
- Work-life balance
- Flexible working hours
- Long-term projects
- Paid sick leave
- Educational programs, courses
- Trino
- Python
- Streamlit
- AirFlow
- Pyless
- Generative AI
- RAG
- LLM
We’re the Wix Payments team.
We provide Wix users with the best way to collect payments from their customers and manage their Wix income online, in person, and on-the-go. We’re passionate about crafting the best experience for our users, and empowering any business on Wix to realize its full financial potential. We have developed our own custom payment processing solution that blends many integrations into one clean and intuitive user interface. We also build innovative products that help our users manage their cash and grow their business. The Payments AI team is instrumental in promoting AI based capabilities within the payments domain and is responsible for ensuring the company is always at the forefront of the AI revolution.
As a Data Engineer on the Wix Payments AI Team, you’ll play a crucial role in leveraging data to uncover insights, refine user experiences, and drive strategy within the payments domain. You’ll develop and maintain infrastructure for both generative AI and classical data science applications while pushing the limits of what can be built with emerging AI technologies, often requiring research into new technology stacks and identifying optimal combinations to create solutions.
In your day-to-day you will:
- Develop scalable solutions to address data gaps and process design needs
- Monitor data pipeline performance, troubleshoot issues and guarantee data infrastructure availability
- Engage with various stakeholders, including data engineers, analysts, data scientists and product managers, to support data-driven decision-making
- Take part in designing, developing and maintaining the generative AI infrastructure of the Payments product
- Ensure timely and seamless exchange of information between AI applications, users and our data warehouse
- Maintain existing classical data science infrastructure, involving updating and bug fixing multiple distributed Python libraries
Requirements
- Proficient in Trino SQL, with the ability to craft complex queries (see the query sketch after this list)
- Highly skilled in Python for developing data pipelines, with expertise in Python frameworks (e.g., Streamlit, Airflow, Pyless, etc.)
- Ability to write, test and deploy production-ready code
- Fluent in English with strong communication abilities
- An independent and quick learner
- Open-minded, capable of coming up with creative solutions and adapting to frequently changing circumstances and technological advances
- Experience with Generative AI concepts, including hands-on experience or knowledge of Prompt Engineering, Fine-Tuning, Automatic Evaluation and RAG – an advantage
- Experience serving and/or fine-tuning open-source LLMs – an advantage
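A minimal sketch of querying Trino from Python with the `trino` client, as referenced above; the host, catalog, schema, and table names are hypothetical placeholders.

```python
# Minimal sketch of running a Trino query from Python using the `trino` client.
# Host, catalog, schema, and table names are hypothetical placeholders.
import trino

conn = trino.dbapi.connect(
    host="trino.example.internal",
    port=8080,
    user="data-engineer",
    catalog="hive",
    schema="payments",
)

cur = conn.cursor()
cur.execute(
    """
    SELECT provider, date_trunc('day', created_at) AS day, count(*) AS txn_count
    FROM transactions
    WHERE created_at >= date_add('day', -7, current_date)
    GROUP BY 1, 2
    ORDER BY day, provider
    """
)
for provider, day, txn_count in cur.fetchall():
    print(provider, day, txn_count)
```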
About the company Wix
Employee benefits
- Fitness Zone
- Coffee, fruit, snacks
- Python
- Pandas
- PySpark
- SQLAlchemy
- ETL
- ELT
- DataBricks
- AWS
- CI/CD
- Pytest
Beaconcure is transforming the clinical research space by automating data validation processes with its AI-powered platform, Verify. Their solution accelerates drug and vaccine approval, ensuring the accuracy and integrity of statistical analysis in clinical trials. As a Senior Data Engineer, you’ll be instrumental in building robust data pipelines and infrastructure to support rapid, secure medical advancements. You'll work alongside data scientists, analysts, and engineers to ensure data flows reliably and at scale.
Required skills:
- 3+ years of experience in Python development (or 1+ year in Python and 2+ years in another language)
- Hands-on experience with Pandas, PySpark, SQLAlchemy
- Strong understanding of ETL/ELT pipelines and data architecture
- Proficiency with Databricks or similar platforms
- Experience with AWS cloud services
- Good communication and problem-solving skills
Nice to Have
- Experience with CI/CD and orchestration tools
- Familiarity with MLOps frameworks
- Background in life sciences or pharma
- Experience with PyTest and regular expressions
Scope of work:
- Design and implement scalable ETL pipelines using Python and modern data frameworks (see the sketch after this list)
- Optimize structured and semi-structured data workflows for performance and reliability
- Collaborate with cross-functional teams to support data-driven decision making
- Work with Databricks and distributed data processing platforms
- Ensure high standards of data quality, documentation, and production readiness
- Contribute to the evolution of Beaconcure’s cloud-based data infrastructure (AWS)
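A minimal ETL sketch in the spirit of the stack above (pandas + SQLAlchemy): extract from a source table, apply basic cleaning, and load into a curated table. Connection strings, table names, and columns are hypothetical, not Beaconcure's actual schema.

```python
# Minimal pandas + SQLAlchemy ETL sketch: extract, clean, load.
# Connection strings and table/column names are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

source = create_engine("postgresql+psycopg2://user:pass@source-db/clinical")      # hypothetical DSN
target = create_engine("postgresql+psycopg2://user:pass@warehouse-db/analytics")  # hypothetical DSN

def run_etl() -> None:
    # Extract
    df = pd.read_sql("SELECT subject_id, visit_date, lab_value FROM raw_lab_results", source)

    # Transform: basic cleaning and typing
    df["visit_date"] = pd.to_datetime(df["visit_date"], errors="coerce")
    df = df.dropna(subset=["subject_id", "visit_date"])
    df["lab_value"] = pd.to_numeric(df["lab_value"], errors="coerce")

    # Load
    df.to_sql("lab_results_clean", target, if_exists="replace", index=False, chunksize=10_000)

if __name__ == "__main__":
    run_etl()
```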
About the company AllStarsIT
Employee benefits
- English Courses
- Team buildings
- Work-life balance
- Parental leave
- Psychotherapist support
- Sports expenses compensation
- Medical insurance
- Paid public holidays
- Paid sick leave
- Educational programs, courses
- Kafka
- Apache Spark
- Hadoop
- Yarn
- MySQL
- AirFlow
- Snowflake
- Amazon S3
- Kubernetes
- PySpark
- Python
- AWS
As a Data & Application Engineer, you are responsible for the engineering team and the technology that the team owns. You will not only work as a coach for your team but also as a technical leader, ensuring that the right technical decisions are made when building our data and reporting product(s).
As a data and analytics team, we are responsible for building a cloud-based Data Platform for our client and its stakeholders across brands. We aim to provide our end users from different Finance departments, e.g., Risk, FPA, Tax, Order to Cash, the best possible platform for all of their Analytics, Reporting & Data needs.
Collaborating closely with a talented team of engineers and product managers, you'll lead the delivery of features that meet the evolving needs of our business on time. You will be responsible for conceptualizing, designing, building, and maintaining data services through data platforms for the assigned business units. Together, we'll tackle complex engineering challenges to ensure seamless operations at scale and in (near) real-time.
If you're passionate about owning end-to-end solution delivery, thinking for the future, and driving innovation, and you thrive in a fast-paced environment, join us in shaping the future of the Data and Analytics team.
Responsibilities
- Strategy and Project Delivery
- Together with the business Subject Matter Experts and Product Manager, conceptualize, define, shape and deliver the roadmap to achieving the company priority and objectives
- Lead business requirement gathering sessions to translate into actionable delivery solutions backlog for the team to build
- Lead technical decisions in the process to achieve excellence and contribute to organizational goals.
- Lead the D&A teams in planning and scheduling the delivery process, including defining project scope, milestones, risk mitigation and timelines management including allocating tasks to team members and ensuring that the project stays on track.
- Take full responsibility for ensuring that the D&A teams deliver new products on time, and set up processes and operational plans from end to end, e.g., collecting user requirements, designing, building & testing the solution, and Ops maintenance
- Technical leader with a strategic thinking for the team and the organization. Visionary who can deliver strategic projects and products for the organization.
- Own the data engineering processes, architecture across the teams
- Technology, Craft & Delivery
- Experience in designing and architecting data engineering frameworks, dealing with high volume of data
- Experience in large scale data processing and workflow management
- Mastery in technology leadership
- Engineering delivery, quality and practices within own team
- Participating in defining, shaping and delivering the wider engineering strategic objectives
- Ability to get into the technical detail (where required) to provide technical coach, support and mentor the team
- Drive a culture of ownership and technical excellence, including reactive work such as incident escalations
- Learn new technologies and keep abreast of existing technologies to be able to share learnings and apply these to a variety of projects when needed
Role Qualifications and Requirements:
- Bachelor degree
- At least 5 years of experience leading and managing one or multiple teams of engineers in a fast-paced and complex environment to deliver complex projects or products on time and with demonstrable positive results.
- 7+ years' experience with data at scale, using Kafka, Spark, Hadoop/YARN, MySQL (CDC), Airflow, Snowflake, S3 and Kubernetes
- Solid working experience working with Data engineering platforms involving languages like PySpark, Python or other equivalent scripting languages
- Experience working with public cloud providers such as Snowflake, AWS
- Experience working in complex stakeholder organizations
- A deep understanding of software or big data solution development in a team, and a track record of leading an engineering team in developing and shipping data products and solutions.
- Strong technical skills (Coding & System design) with ability to get hands-on with your team when needed
- Excellent communicator with strong stakeholder management experience, good commercial awareness and technical vision
- You have driven successful technical, business and people related initiatives that improved productivity, performance and quality
- You are a humble and thoughtful technology leader, you lead by example and gain your teammates' respect through actions, not the title
- Exceptional and demonstrable leadership capabilities in creating unified and motivated engineering teams
About the company Luxoft
Employee benefits
- Relocation assistance
- Team buildings
- Multinational team
- Large, stable company
- Educational programs, courses
- Python
- PySpark
- SQL
- NoSQL
- Apache Spark
- Kafka
- Hadoop
- Presto
- DataBricks
- AWS
- GCP
- Azure
- Tableau
- Microsoft Power BI
- CI/CD
We are looking for a Senior Data Engineer to join one of our teams and help us build great products for our clients.
You’ll be part of a high-performance team where innovation, collaboration, and excellence are at the core of everything we do. As a Senior Data Engineer, you’ll have the chance to design and develop optimized, scalable big data pipelines that power our products and applications we work on. Your expertise will be valued, your voice will be heard, and your career will be supported every step of the way.
Does this sound like an interesting opportunity? Keep reading to learn more about your future role!
Customer
Our client is an international technology company that specializes in developing high-load platforms for data processing and analytics. The company’s core product helps businesses manage large volumes of data, build models, and gain actionable insights. The company operates globally, serving clients primarily in the Marketing and Advertising domain. They focus on modern technologies, microservices architecture, and cloud-based solutions.
Requirements
- 4+ years of experience in data engineering, big data architecture, or related fields
- Strong proficiency in Python and PySpark
- Advanced SQL skills, including query optimization, complex joins, and window functions. Experience using NoSQL databases
- Strong understanding of distributed computing principles and practical experience with tools, such as Apache Spark, Kafka, Hadoop, Presto, and Databricks
- Experience designing and managing data warehouses and data lake architectures in cloud environments (AWS, GCP, or Azure)
- Familiarity with data modeling, schema design, and performance tuning for large datasets
- Experience working with business intelligence tools, such as Tableau or Power BI for reporting and analytics
- Strong understanding of DevOps practices for automating deployment, monitoring, and scaling of big data applications (e.g., CI/CD pipelines)
- At least Upper-Intermediate level of English
Personal Profile
- Excellent communication skills
- Ability to collaborate effectively within cross-functional and multicultural teams
Responsibilities
- Design, develop, and maintain end-to-end big data pipelines that are optimized, scalable, and capable of processing large volumes of data in real-time and batch modes (see the streaming sketch after this list)
- Collaborate closely with cross-functional stakeholders to gather requirements and deliver high-quality data solutions that align with business goals
- Implement data transformation and integration processes using modern big data frameworks and cloud platforms
- Build and maintain data models, data warehouses, and schema designs to support analytics and reporting needs
- Ensure data quality, reliability, and performance by implementing robust testing, monitoring, and alerting practices
- Contribute to architecture decisions for distributed data systems and help optimize performance for high-load environments
- Ensure compliance with data security and governance standards
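A minimal Spark Structured Streaming sketch for the real-time side of such pipelines: consume JSON events from Kafka, parse them against a schema, and write micro-batches to object storage. The topic, schema, and paths are hypothetical, and the job assumes the Spark-Kafka connector is available at submit time.

```python
# Minimal Spark Structured Streaming sketch: consume events from Kafka,
# parse JSON, and write micro-batches to a data lake path.
# Topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("occurred_at", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")
    .option("subscribe", "user-events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/stream/user_events/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/user_events/")
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```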
About the company Sigma Software
Employee benefits
- Work-life balance
- Flexible working hours
- Medical insurance
- Educational programs, courses
- Legal support
- Python
- ETL
- SQL
- AirFlow
- Dagster
- Prefect
- Kubernetes
- Apache Spark
- Apache Flink
- Kafka
- dbt
- Delta Lake
- Iceberg
- FastAPI
We are a Data Engineering Service that provides the collection, preprocessing, and storage of analytical data for data-driven decision-making, including improvements to the company's products and processes via automation, all types of data analysis, and machine learning.
We are currently looking for a Junior Data Engineer who is enthusiastic about learning and ready to work in a dynamic data environment.
In this role, you will:
- Develop and maintain data pipelines (Data Streaming Processors, ETL for third-party data sources) and data tools (Anomaly Detection System, AB testing system, etc.) for internal clients with the help of a team
- Implement and propose technical solutions based on those already used in the service, covering all acceptance criteria and other team agreements (e.g., tech documentation, tests, NFRs, etc.)
- Support team in reacting to issues and failures by fixing them according to service agreements and priorities
- Develop and maintain documentation, code, and business logic according to service requirements with the team's help
- Communicate with the service team to clarify implementation details and edge cases, specify input or missing data and possible use cases/flows
- Take part in service duty procedures
Skills you’ll need to bring:
- Experience with Python ETL pipelines
- Strong SQL skills (complex joins, window functions, bulk-loading into a data warehouse)
- Experience with orchestration tools like Apache Airflow or similar: Dagster, Prefect
- Experience diagnosing pipeline issues (Airflow logs, warehouse errors, Kubernetes diagnostics, support‑ticket triage)
- Experience consuming third-party APIs (authentication, pagination, error handling); see the API consumption sketch after these lists
- Knowledge of data engineering fundamentals (data modeling, warehousing concepts, ETL best practices, data quality fundamentals)
- Knowledge of cloud computing fundamentals (computational resources, reading/writing to storage, basic IAM)
- At least an Intermediate level of English & fluent Ukrainian
As a plus:
- Experience with Apache Big Data ecosystem (Spark, Flink, Kafka, etc.)
- Knowledge of dbt
- Knowledge of Open-table formats (Delta Lake, Apache Iceberg)
- Knowledge of data lakehouse concepts
- Knowledge of how to implement APIs (preferably FastAPI)
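A minimal sketch of consuming a paginated third-party API with authentication, retries, and error handling, as mentioned in the skills list above. The endpoint, token handling, and pagination contract are assumptions for illustration only.

```python
# Minimal sketch of consuming a paginated third-party API with auth and
# basic error handling. The endpoint, token, and pagination scheme are
# hypothetical assumptions.
import time

import requests

BASE_URL = "https://api.example.com/v1/events"  # hypothetical endpoint
TOKEN = "***"                                   # normally read from a secret store

def fetch_all(max_retries: int = 3) -> list[dict]:
    session = requests.Session()
    session.headers.update({"Authorization": f"Bearer {TOKEN}"})

    items, page = [], 1
    while True:
        for attempt in range(max_retries):
            resp = session.get(BASE_URL, params={"page": page, "per_page": 100}, timeout=30)
            if resp.status_code == 429:        # rate limited: back off and retry
                time.sleep(2 ** attempt)
                continue
            resp.raise_for_status()
            break
        else:
            raise RuntimeError("rate limited after retries")

        payload = resp.json()
        items.extend(payload["results"])
        if not payload.get("next_page"):       # pagination contract is an assumption
            return items
        page += 1
```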
About the company MacPaw