Data Engineer / Big Data Architect vacancies
- ETL
- Databricks
- Apache Spark
- Python
- Azure Data Factory
- Azure Synapse
- SQL
- Microsoft Azure
- AWS
- Git
Infopulse, part of Tietoevry Create, is inviting a talented professional to join our growing team as a Data Engineer/ETL Developer. Our customer is one of the Big Four companies providing audit, tax, consulting, and financial advisory services.
Areas of Responsibility
- Design, develop, and maintain ETL processes to support data integration and reporting requirements
- Work with Databricks and Spark (Scala or PySpark) to create scalable and efficient data pipelines
- Utilize Azure Data Factory for orchestrating and automating data movement and transformation
- Write, optimize, and troubleshoot complex SQL queries for data extraction, transformation, and loading
- Collaborate with cross-functional teams to understand data requirements and deliver high-quality data solutions
- Monitor and ensure the performance, reliability, and scalability of ETL processes
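For orientation, a minimal PySpark sketch of the kind of ETL step described above (paths, table, and column names are placeholders, not details from the vacancy):

```python
# Minimal PySpark ETL sketch: read raw data, apply a transformation, load a target table.
# All names (paths, tables, columns) are illustrative placeholders; assumes a Databricks runtime.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV landed by an orchestrator such as Azure Data Factory
raw = spark.read.option("header", True).csv("/mnt/raw/orders/")

# Transform: type casting, deduplication, and a simple derived column
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write a Delta table partitioned by date for efficient reporting queries
orders.write.format("delta").mode("overwrite").partitionBy("order_date").saveAsTable("analytics.orders")
```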
Qualifications
- 3+ years of experience in ETL development or data engineering roles
- Strong experience with Databricks and Spark / Python
- Proficiency in Azure Data Factory and Azure Synapse
- Advanced SQL skills, including query optimization and performance tuning
- Solid understanding of data warehousing concepts and best practices
- Strong problem-solving skills and attention to detail
- Excellent communication skills with the ability to work collaboratively in a team environment
Will be an advantage
- Experience with cloud data platforms (e.g., Azure, AWS)
- Familiarity with data governance and security practices
- Experience with version control systems like Git
Company information: Infopulse
- Databricks
- Databricks Unity Catalog
- PySpark
- Scala
- Python
- SQL
- Azure Data Factory
- ETL
Infopulse, a part of Tietoevry Create, is looking for a skilled and experienced Senior Databricks Developer to join our growing team. Our customer is one of the Big Four companies providing audit, tax, consulting, and financial advisory services.
Areas of Responsibility
- Lead the migration of data assets and workloads from legacy Databricks environments to Databricks Unity Catalog
- Design, develop, and maintain scalable ETL processes using Databricks, Spark (Scala or PySpark), and other relevant technologies
- Ensure seamless data integration and compliance with data governance standards during the migration process
- Optimize and troubleshoot complex SQL queries for data extraction, transformation, and loading within the new Databricks UC framework
- Collaborate with cross-functional teams to understand migration requirements and deliver high-quality data solutions
- Monitor the performance, reliability, and scalability of the new Databricks Unity Catalog environment post-migration
- Provide administrative support and configuration management for the Databricks platform, ensuring best practices in security and data governance
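As a rough illustration of the migration work listed above, a legacy hive_metastore table can be promoted into Unity Catalog with a deep clone; the catalog, schema, table, and group names below are assumptions, not details from the vacancy:

```python
# Sketch of migrating a legacy hive_metastore table into Unity Catalog via DEEP CLONE.
# Names are placeholders; assumes a Databricks notebook/job where `spark` is provided.
spark.sql("CREATE CATALOG IF NOT EXISTS analytics_uc")
spark.sql("CREATE SCHEMA IF NOT EXISTS analytics_uc.sales")

# Copy data and metadata from the legacy metastore into the governed UC table
spark.sql("""
    CREATE TABLE IF NOT EXISTS analytics_uc.sales.orders
    DEEP CLONE hive_metastore.legacy_db.orders
""")

# Apply governance: grant read access to an analyst group defined in the workspace
spark.sql("GRANT SELECT ON TABLE analytics_uc.sales.orders TO `data-analysts`")
```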
Qualifications
- 3+ years of experience in Databricks development, including significant experience with Databricks administration
- Proven track record of successfully migrating data environments to Databricks Unity Catalog or similar platforms
- Strong experience with Spark (Scala or PySpark) for data pipeline creation and optimization
- Proficiency in SQL, with advanced skills in query optimization and performance tuning
- Familiarity with Azure Data Factory and other cloud-based ETL tools
- Solid understanding of data warehousing concepts, data governance, and best practices in a cloud environment
- Strong problem-solving abilities and attention to detail, especially in migration scenarios
- Excellent communication skills, with the ability to work collaboratively with technical and non-technical stakeholders
Company information: Infopulse
- SQL
- ETL
- Databricks
- Python
- AWS
- Amazon S3
- AWS Redshift
- Athena
- AWS Glue
- AWS Lambda
ELEKS Software Engineering and Development Office is looking for a Middle/Senior Data Engineer in Poland, Croatia and Ukraine.
About project
The customer is a British company producing electricity with zero carbon emissions.
Requirements:
- 3+ years of experience in Data Engineering, SQL, and ETL (data validation, data mapping, exception handling)
- 2+ years of hands-on experience with Databricks
- Experience with Python
- Experience with AWS (e.g. S3, Redshift, Athena, Glue, Lambda, etc.)
- Knowledge of the Energy industry (e.g. energy trading, utilities, power systems etc.) would be a plus
- Experience with Geospatial data would be a plus
- At least an Upper-Intermediate level of English
Responsibilities:
- Building Databases and Pipelines: Developing databases, data lakes, and data ingestion pipelines to deliver datasets for various projects
- End-to-End Solutions: Designing, developing, and deploying comprehensive solutions for data and data science models, ensuring usability for both data scientists and non-technical users. This includes following best engineering and data science practices
- Scalable Solutions: Developing and maintaining scalable data and machine learning solutions throughout the data lifecycle, supporting the code and infrastructure for databases, data pipelines, metadata, and code management
- Stakeholder Engagement: Collaborating with stakeholders across various departments, including data platforms, architecture, development, and operational teams, as well as addressing data security, privacy, and third-party coordination
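To make the AWS side of this role concrete, a minimal sketch using the awswrangler library (bucket, database, table, and column names are invented for illustration):

```python
# Minimal AWS data-lake sketch: write a dataset to S3 as Parquet, register it in the
# Glue catalog, then query it with Athena. All names are placeholders.
import awswrangler as wr
import pandas as pd

readings = pd.DataFrame({
    "site_id": ["A1", "A2"],
    "output_mw": [412.5, 388.0],
    "reading_date": pd.to_datetime(["2024-05-01", "2024-05-01"]),
})

# Land the data in S3 and register/update the table in the Glue Data Catalog
wr.s3.to_parquet(
    df=readings,
    path="s3://example-datalake/power/readings/",
    dataset=True,
    database="energy",
    table="readings",
    mode="append",
)

# Query the same table through Athena for downstream validation or reporting
daily_totals = wr.athena.read_sql_query(
    "SELECT reading_date, SUM(output_mw) AS total_mw FROM readings GROUP BY reading_date",
    database="energy",
)
print(daily_totals)
```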
Company information: Eleks
Employee benefits
- English Courses
- Relocation assistance
- Bicycle parking
- Flexible working hours
- Sports expenses compensation
- Medical insurance
- Paid sick leave
- Educational programs and courses
- Car parking
- Redmine
- Jira
- Wiki
- Confluence
- SQL
- ETL
- ELT
- Looker Studio
- Microsoft Power BI
- REST API
The Skyvia team is looking for a Business Analyst Lead to strengthen the consulting domain. Skyvia is a universal no-coding cloud platform for data integration, backup, management, and connectivity. Skyvia supports 160+ connectors, including major cloud apps (CRMs, Marketing tools, Support systems, Management platforms, etc.), databases, and data warehouses.
Responsibilities
- Manage a team of analysts (set tasks, monitor implementation)
- Create, describe, and implement business processes
- Track the team's performance
- Collect customer challenges and find solutions related to Skyvia product features
- Provide clients with advice on the technical aspects of Skyvia products during the onboarding process
- Develop and implement strategies to improve interactions with clients
- Ensure the high quality of provided PoC and demos
- Facilitate an optimal PoC preparation process that includes the participation of a technical team
- Build and scale the customer onboarding process: drive its formalization, determine the KPIs, implement the ideas
- Establish a process of finding new product use cases
- Streamline competitor research and market analysis
- Create auxiliary materials: presentations, demos, technical documents, user guides, solution sets
- Collect and provide customer feedback to the product team regularly
Requirements
- 3+ years of experience as a Data Engineer or Business Analyst
- English: Upper-Intermediate or higher
- Experience in team management, coaching, and team development strategies (e.g., drafting PDP)
- Expert management of overall performance indicators
- Experience in creating, describing, and implementing business processes
- In-depth knowledge of task managers (Redmine, Jira, etc.)
- Familiarity with knowledge bases (Wiki, Confluence, etc.)
- Expertise in conducting user interviews and collecting user requirements
- Active listening and feedback provision skills
- Advanced presenting skillset
- Expert knowledge of SQL
- Deep understanding and hands-on experience in ETL/ELT processes
- Practical knowledge of various databases, data warehouses, and cloud platforms
- Basic understanding and experience with BI tools (Looker Studio, Power BI, etc.)
- Experience in using REST API
- Stress resistance, flexibility, and good communication skills
Nice to have
- Previous experience as a Data analyst or Solution Engineer
- Bachelor's degree in computer science, business management, mathematics, or finance
Company information: Devart
Employee benefits
- English Courses
- No overtime
- Team buildings
- Flexible working hours
- Medical insurance
- Paid sick leave
- Elasticsearch
- Golang
- Python
- RDBMS
- SQL
- CI/CD
- Agile
- AWS
As a Senior Search Engineer, you will be a key player in our dynamic product development team, helping our customers reach their goals. This role calls for a deep commitment to crafting high-quality, user-focused solutions at terabyte scale.
You will be working primarily with Python, Go, Elasticsearch, and AWS Services: Aurora PostgreSQL, S3, and SageMaker for vectorization.
Key Responsibilities:
- Engage in back-end development to meet our product's evolving needs, ensuring robust, scalable, and high-performance solutions.
- Contribute to all stages of the product life cycle, from conception to deployment, refinement, and scaling.
- Design, implement, and maintain real-time search environments, aligning with the product’s performance and scalability requirements.
- Work collaboratively with product managers, architects, and other team members to deeply understand and address user needs and business objectives.
- Uphold high standards in software development with a focus on clean, testable, and maintainable code.
- Participate actively in code reviews, architecture discussions, and continuous refinement of our development processes.
- Foster a culture of ownership where team members feel empowered and accountable for their contributions.
Technical Skills:
- In-depth knowledge of building efficient data models for Elasticsearch or related full-text search and vector databases at scale.
- Proven experience in Golang or Python development, and a strong willingness to learn and adapt.
- Strong proficiency in SQL, including experience with complex queries, database design, optimization techniques, and working with large datasets.
- Solid foundation in software development for production systems, including OOP principles, test-driven development, CI/CD, and Agile methodologies.
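For context on the Elasticsearch data-modeling point above, a minimal index-mapping sketch with the official Python client (the index name, fields, and vector dimension are assumptions):

```python
# Minimal Elasticsearch sketch: create an index whose mapping mixes full-text,
# keyword, and dense_vector fields for hybrid search. All names are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.create(
    index="creators",
    mappings={
        "properties": {
            "handle":    {"type": "keyword"},                       # exact-match filtering
            "bio":       {"type": "text"},                          # full-text search
            "followers": {"type": "long"},
            "embedding": {"type": "dense_vector", "dims": 384},     # vector similarity
        }
    },
)

# Index a document; in production this would come from a bulk ingestion pipeline
es.index(index="creators", id="1", document={
    "handle": "example_creator",
    "bio": "Food and travel content",
    "followers": 120000,
    "embedding": [0.01] * 384,
})
```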
Desired Qualifications:
- 2+ years of production-level Elasticsearch or related experience.
- 4+ years of software development experience for production systems.
- 4+ years of relational database (RDBMS) experience.
- Excellent English communication skills.
Company information: CreatorIQ
Employee benefits
- Psychotherapist support
- Sports expenses compensation
- Home office compensation
- Medical insurance
- Paid sick leave
- Paid vacation
- Python
- PySpark
- Apache Airflow
- Pandas
- SQLAlchemy
- MLflow
- SQL
- Hadoop
Our data lake contains over 2.5 petabytes of marketing metrics, game events, and operational parameters. We do everything possible to ensure there's no doubt about the completeness, relevance, and reliability of the data we provide. We pay special attention to processing speed and data quality. This allows us to make the right decisions when developing our games.
We want to boost the gaming and marketing event attribution business and are looking for a Senior Data Engineer to join our Data Core team, which is responsible for organizing the storage, transformation and access to gaming and marketing analytics data.
Tasks
- Automating external source data extraction processes
- Automating quality control and verification of supplied data, configuring monitoring and regulations
- Transforming and uploading data for further analysis within game analytics and marketing in the most client-friendly form
- Developing and maintaining a continuous data delivery pipeline
- Developing and maintaining the integration services for interaction with partners
Our stack
Python is our main language, and data access is via SQL. Our data lake is built on S3 using the Delta Lake format in Databricks. For our DWH, we use Redshift/PostgreSQL. We work in the AWS cloud infrastructure.
We use solutions from mainstream vendors, including Monte Carlo and dbt. We take a serverless approach when working with resources, along with horizontal scaling and predictive models. We pay close attention to both code and architecture refactoring. For CI/CD, we use TeamCity.
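A minimal sketch of what appending game events into this kind of Delta-on-S3 lake might look like (bucket, path, and column names are illustrative, not taken from the actual project):

```python
# Sketch of appending game events to a Delta Lake table on S3 from PySpark.
# Paths and schema are placeholders; assumes a Spark environment with Delta Lake available.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("game_events_ingest").getOrCreate()

events = spark.read.json("s3://example-raw/game_events/2024-05-01/")

clean = (
    events.withColumn("event_ts", F.to_timestamp("event_ts"))
          .withColumn("event_date", F.to_date("event_ts"))
          .filter(F.col("player_id").isNotNull())   # basic quality gate before loading
)

(clean.write
      .format("delta")
      .mode("append")
      .partitionBy("event_date")
      .save("s3://example-lake/delta/game_events"))
```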
Requirements
- 5+ years of general experience in Data Engineering
- 3+ years of development experience in Python: OOP, skills in designing frameworks and libraries, ability to read and analyze code, experience in profiling and performance/scalability optimization
- Skills in working with popular DE/ML frameworks: PySpark, Airflow, pandas, SQLAlchemy, MLflow
- Skills in writing and optimizing SQL queries, the ability to work with query plans, and experience with storage of different architectures: MPP, columnar, relational, Hadoop, in-memory
- Fluency in Russian
Company information: Playrix
Employee benefits
- English Courses
- Relocation assistance
- Work-life balance
- Flexible working hours
- Psychotherapist support
- Sports expenses compensation
- Training compensation
- Medical insurance
- Paid sick leave
- Educational programs and courses
- Covid-19 support
- SQL
- PostgreSQL
- MySQL
- Microsoft SQL Server
- Python
- R
- Hadoop
- Apache Spark
- Kafka
- ETL
- AWS
- GCP
- Microsoft Azure
CommerceCore is looking for an experienced Data Engineer. This position will play a pivotal role in fostering collaboration and defining processes essential to the company's operations.
What you will do:
- Design, develop, and maintain scalable data pipelines and ETL processes to handle large volumes of structured and unstructured data;
- Collaborate with other teams to understand data requirements and ensure the availability and reliability of data;
- Optimize and tune the data infrastructure for performance, cost, and security;
- Implement data quality checks, validation, and monitoring processes to ensure the accuracy and integrity of the data;
- Develop and maintain documentation related to data architecture, processes, and workflows.
- Work with cloud platforms to manage and scale data storage and processing;
- Participate in the design and development of data warehouses and data lakes to support analytics and reporting;
- Stay current with industry trends and emerging technologies to ensure that our data infrastructure remains cutting-edge.
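As an illustration of the data-quality checks mentioned above, a minimal pandas-based validation sketch (the column names, rules, and thresholds are assumptions):

```python
# Minimal data-quality check sketch: validate a batch before it is loaded downstream.
# Column names, rules, and the failure behaviour are illustrative only.
import pandas as pd

def validate_orders(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable rule violations for an orders batch."""
    problems = []
    if df["order_id"].duplicated().any():
        problems.append("duplicate order_id values found")
    if df["amount"].lt(0).any():
        problems.append("negative amounts found")
    null_rate = df["customer_id"].isna().mean()
    if null_rate > 0.01:                      # tolerate at most 1% missing customers
        problems.append(f"customer_id null rate too high: {null_rate:.2%}")
    return problems

batch = pd.DataFrame({
    "order_id": [1, 2, 2],
    "amount": [10.0, -5.0, 7.5],
    "customer_id": [100, None, 102],
})

issues = validate_orders(batch)
if issues:
    raise ValueError("Data quality check failed: " + "; ".join(issues))
```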
What we expect:
- Proven experience as a Data Engineer or in a similar role;
- Strong proficiency in SQL and experience with relational databases such as PostgreSQL, MySQL, or SQL Server;
- Proficiency in programming languages such as Python, R;
- Experience with big data technologies such as Hadoop, Spark, or Kafka;
- Familiarity with data modeling, data warehousing concepts, and ETL processes;
- Experience with cloud platforms such as AWS, Azure, or Google Cloud;
- Knowledge of data governance, security best practices, and compliance;
- Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment;
- Strong communication skills, with the ability to convey complex technical concepts to non-technical stakeholders.
Company information: CommerceCore
Employee benefits
- Gaming room
- Team buildings
- Flexible working hours
- Coffee, fruit, and snacks
- Training compensation
- Laptop provided
- Python
- SQL
- Apache Spark
- Hadoop
- AWS
- GCP
- Microsoft Azure
- Apache Airflow
- Prefect
- Kafka
Our client aims to facilitate access to precise data for companies and tech leaders in marketing, e-commerce, sales, cybersecurity, social media performance, and lead generation. They do so by offering a direct look at the web using over 1 million global IPs.
Today, they're seeking innovative and creative individuals to help provide products and services to an expanding client base. Apply now to join their ambitious team of market leaders.
About The Position
MWDN company is seeking an experienced Data Engineer to join our client's team and play a crucial role in designing, building, and maintaining their data infrastructure. You'll work with large-scale data sets, create efficient data pipelines, and collaborate with cross-functional teams to drive data-informed decision-making across the organization.
Requirements
- 5+ years of professional experience in data engineering or a similar role.
- Bachelor's degree in Computer Science, Engineering, or a related technical field; Master's degree preferred.
- Strong proficiency in Python and SQL.
- Extensive experience with big data technologies such as Spark, Hadoop, or similar.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and their data services.
- Experience with data modeling and dimensional data design.
- Knowledge of data warehousing concepts and technologies.
- Familiarity with workflow management tools like Airflow, Prefect, or similar.
- Strong problem-solving skills and attention to detail.
- Excellent communication skills and ability to work in a collaborative environment.
- Level of spoken English: at least an upper-intermediate.
Preferred Skills:
- Experience in the healthcare or entertainment industry.
- Knowledge of HIPAA, HITECH, or other relevant data privacy regulations.
- Familiarity with real-time data processing and streaming technologies (e.g., Kafka).
- Experience with data visualization tools.
- Contributions to open-source projects or active participation in the data engineering community.
Responsibilities
- Design, architect, and maintain scalable data infrastructure using modern technologies such as Python, Spark, Databricks, and cloud platforms.
- Develop and optimize ETL processes and data pipelines to handle large volumes of data efficiently.
- Create and maintain data models, ensuring data quality, integrity, and accessibility.
- Collaborate with data scientists, software engineers, and product managers to understand data needs and provide solutions.
- Implement data governance processes and ensure compliance with data privacy regulations.
- Optimize query performance and improve data retrieval efficiency.
- Contribute to the development of data architecture strategies and best practices.
- Mentor junior team members and provide technical leadership.
Company information: MWDN
Employee benefits
- English Courses
- Team buildings
- English-speaking environment
- No bureaucracy
- Accounting support
- Flexible working hours
- Business trips abroad
- Coffee, fruit, and snacks
- Paid sick leave
- Paid vacation
- Educational programs and courses
- Legal support
- SQL
- Scala
- PostgreSQL
- MySQL
- Oracle
- MSSQL
- Git
- GCP
- AWS
- Microsoft Azure
- MongoDB
- Kafka
- Apache Flink
Everyone deserves a chance to realize their potential, even a beginner with minimal experience. At NIX we prove this in practice, so we are now looking for a Junior Data Engineer (Scala). Here, this specialist will get every opportunity for a strong start!
What you will do:
- Collaborate with the team on implementing new features to support growing data needs.
- Build and maintain pipelines for extracting, transforming, and loading data from a wide range of data sources.
- Track and anticipate trends in data engineering and propose changes aligned with the organization's goals and needs.
- Share knowledge with other teams on various data engineering and project-related topics.
- Work with the team to decide which tools and strategies to use for specific data integration scenarios.
What you will need:
- English level of at least B2.
- 1+ years of proven software development experience with Scala.
- Good knowledge of SQL.
- Knowledge of one or more of the following databases: PostgreSQL, MySQL, Oracle, MSSQL.
- Knowledge of the core concepts and technologies of streaming data processing (real-time and near-real-time).
- Understanding of data storage principles and modeling concepts.
- Understanding of version control systems (such as Git).
- Team player with excellent collaboration skills.
Will be an advantage:
- Knowledge of the basic concepts of any cloud platform (GCP, AWS, Azure).
- Knowledge of the basic concepts, patterns, and technologies of distributed computing.
- Knowledge of the basic concepts of non-relational databases (MongoDB).
- Hands-on experience with any orchestration tools (Kafka, Flink, etc.).
Company information: NIX Solutions
Employee benefits
- Coffee, fruit, and snacks
- Medical insurance
- Laptop provided
- Educational programs and courses
- Regular salary review
- Python
- dbt
- Snowflake
- Argo Workflows
- Fivetran
- SQL
- AWS
- Microsoft Azure
- Kubernetes
- Golang
What you will do
- Design and implement data pipelines using dbt for transformation and modeling;
- Manage and optimize data warehouse solutions on Snowflake;
- Develop and maintain ETL processes using Fivetran for data ingestion;
- Utilize Terraform for infrastructure as code (IaC) to provision and manage resources in AWS, Snowflake, Kubernetes, and Fivetran;
- Collaborate with cross-functional teams to understand data requirements and deliver scalable solutions;
- Implement workflow automation using Argo Workflows to streamline data processing tasks;
- Ensure data quality and integrity throughout the data lifecycle.
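For reference, loading and querying the Snowflake warehouse from Python might look like the sketch below; the connection parameters and table are placeholders, and in the actual stack ingestion would be handled by Fivetran and the transformations would typically live in dbt models orchestrated by Argo Workflows:

```python
# Minimal Snowflake sketch: connect, stage a small load, and run a transformation query.
# All identifiers and credentials are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account",
    user="etl_user",
    password="***",
    warehouse="TRANSFORM_WH",
    database="ANALYTICS",
    schema="RAW",
)

cur = conn.cursor()
try:
    # In practice ingestion is handled by Fivetran; this insert just illustrates the API
    cur.execute("INSERT INTO RAW.EVENTS (EVENT_ID, EVENT_TYPE) VALUES (%s, %s)", (1, "signup"))

    # A dbt model would normally own this transformation; shown inline for brevity
    cur.execute("""
        SELECT EVENT_TYPE, COUNT(*) AS CNT
        FROM RAW.EVENTS
        GROUP BY EVENT_TYPE
    """)
    for event_type, cnt in cur.fetchall():
        print(event_type, cnt)
finally:
    cur.close()
    conn.close()
```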
Must haves
- Bachelor’s degree in Computer Science, Engineering, or related field;
- 5+ years of experience working with Python;
- Proven experience as a Data Engineer with a focus on dbt, Snowflake, Argo Workflows, and Fivetran;
- Strong SQL skills for data manipulation and querying;
- Experience with cloud platforms like AWS or Azure;
- Experience with Kubernetes;
- Familiarity with data modeling concepts and best practices;
- Excellent problem-solving skills and attention to detail;
- Ability to work independently and collaborate effectively in a team environment;
- Upper-intermediate English level.
Nice to have
- +2 years experience with Golang.
Company information: AgileEngine
Employee benefits
- Flexible working hours
- Above-market salary
- Regular salary review
- ADF
- Databricks
- Python
- SQL
- Azure DevOps
- Git
- Docker
- Kubernetes
- Microsoft Azure
- RDBMS
- PostgreSQL
- MySQL
- Azure Synapse
- Agile
Svitla Systems Inc. is looking for a Data Architect for a full-time position (40 hours per week) in Ukraine. Our client is a leading global provider of Environmental, Social, and Governance (ESG) performance and risk management software, data, and consulting services focusing on Environment, Health, Safety & Sustainability (EHS&S), Operational Risk Management, and Product Stewardship. That means gathering information and reporting to improve the environmental and social impacts on the customer’s business. They are a leading middle-market private equity firm focused on investments in targeted segments of the software, industrial technology, financial services, and healthcare industries. For over 30 years, they have served over 7,000 customers and millions of users in 80 countries to optimize workflows and navigate the complex and dynamic global regulatory structure.
We are preparing for a new project and need a seasoned Data Engineer/Architect.
Requirements:
- 5+ years of experience with data, data lake, and data warehouse architecture
- 5+ years with data-ingestion and compute technologies (ADF and Databricks).
- Experience in architecting modern data lake platforms.
- 5+ years of practical programming experience in a modern language (Python, SQL), with an above-average depth of understanding.
- Understanding of version control (Azure DevOps, Git, etc.) for collaborative code development.
- Understanding of container architecture (Docker, Kubernetes).
- Knowledge of cloud architectures (Azure service).
- Knowledge of Azure Analytics Services (preferred).
- Knowledge of RDBMS systems for OLTP workloads, such as Postgres or MySQL, as well as Synapse.
- Bachelor’s degree in Computer Science or a related field.
- Strong passion for data and architectural patterns.
- Excellent communication skills, including a knack for clear documentation and the ability to work using Agile methodologies.
- Ability to work quickly and collaboratively in a fast-paced, entrepreneurial environment.
- Self-motivated and self-managing skills.
- Strong desire to finish and win.
- Team player.
- High integrity and ethical behavior.
Responsibilities:
Your responsibilities may include, but are not limited to, the following:
- Perform as a part of the Data Services Team under the high-profile platform organization.
- Architect and be the resident expert regarding the highly scalable framework for data pipeline, transforming and enhancing data at scale.
- Help design, architect, implement, and bring data lake to reality.
- Manage design, architect, and implement data services and processes.
- Build data models and create data warehouse solutions for the client’s BI applications.
Company information: Svitla
Employee benefits
- English Courses
- Pet-friendly
- Team buildings
- Work-life balance
- Parental leave
- Flexible working hours
- Coffee, fruit, and snacks
- Sports expenses compensation
- Training compensation
- Medical insurance
- Paid public holidays
- Paid sick leave
- Regular salary review
- Apache Spark
- Iceberg
- Kafka
- NoSQL
- SQL
- Microservices
- Docker
- Kubernetes
- Java
- Python
- C#
We are seeking a highly skilled Data Platform Architect to join our team. The ideal candidate will have extensive experience with Apache Spark, Apache Iceberg, Kafka, NoSQL and SQL databases, data warehouses, lakehouses, and microservices architecture. This role is pivotal in driving the strategic direction of our data platform, ensuring alignment with business goals, and facilitating the integration of new technologies.
Key Responsibilities:
- Collaborate with product managers and stakeholders to understand project objectives, functional and non-functional requirements.
- Research, assess, and recommend technologies, frameworks, libraries, and tools.
- Lead or contribute to Proof of Concept (PoC) initiatives to validate architectural solutions or new technologies.
- Create and review high-level design architectural blueprints defining system structure, components, interactions, and interfaces.
- Review software design documents to ensure adherence to high-level designs and architectural principles.
- Identify areas needing refactoring or architectural adjustments to manage technical debt.
- Stay updated with emerging technologies and trends to make informed recommendations.
- Conduct architecture gap analysis when integrating new software products or migrating solutions.
- Perform system risk analysis and recommend mitigation strategies.
- Conduct code reviews and provide technical guidance to development teams.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- 5+ years of relevant experience in architecture.
- Good knowledge in the conception, design, and construction of data platforms, including ingestion, batch and serving layers.
- Experience implementing Data Mesh architectures is a plus.
- Experience implementing a holistic data governance layer is a plus.
- Extensive experience with Apache Spark, Apache Iceberg, Kafka, NoSQL and SQL databases, data warehouses, and microservices architecture.
- Strong knowledge of the end-to-end data lifecycle across data warehouses, relational databases, operational data stores, BI reporting, and big data analytics.
- Experience creating conceptual and logical data models.
- Experience with container technologies (Docker, Kubernetes).
- Proficiency in at least one programming language like Java, Python or C#.
- Strong ability to clearly and effectively communicate complex design solutions and decisions to a non-technical audience.
Company information: Playtika
- Azure Data Lake
- Azure Data Platform
- SQL
- NoSQL
- ETL
- Microsoft Azure
- Azure Data Factory
- Azure Synapse
- Azure Databricks
- Azure Blob Storage
- Azure Table Storage
- Python
- Scala
- API
- RESTful API
- GDPR
- Git
- CI/CD
Digital transformation project for aftermarket support for Hitachi Global Air Power
Requirements:
- Bachelor’s degree in Computer Science, Engineering or related field
- 5+ years of overall work experience working on Data first systems
- 2+ years of experience on Data Lake/Data Platform projects on Azure
- Strong knowledge of SQL for handling relational databases and familiarity with NoSQL databases to manage structured and unstructured data effectively.
- Understanding of data warehousing concepts, including data storage, data marts, and ETL processes.
- Skilled in using Microsoft Azure services such as Azure Data Lake, Azure Data Factory, Azure Synapse Analytics, and Azure Databricks. These tools are essential for data ingestion, storage, processing, and analytics.
- Knowledge of cloud storage solutions provided by Azure, such as Blob Storage and Table Storage, which are integral for data lakes.
- Familiarity with ETL tools and frameworks for data extraction, transformation, and loading. Skills in Azure Data Factory or similar tools are particularly valuable.
- Ability to perform data cleaning, transformation, and enrichment to ensure data quality and usability.
- Proficient in programming languages such as Python or Scala, which are widely used in data engineering for scripting and automation.
- Skills in scripting to automate routine data operations and processes, improving efficiency and reducing manual errors.
- Understanding of how to develop and maintain APIs, particularly JDBC/ODBC APIs for data querying. Knowledge of RESTful API principles is also beneficial.
- Awareness of data security best practices, including data encryption, secure data transfer, and access control within Azure.
- Understanding of compliance requirements relevant to data security and privacy, such as GDPR.
- Experience with data testing frameworks to ensure the integrity and accuracy of data through unit tests and integration tests.
- Proficiency with version control tools like Git to manage changes in data scripts and data models.
- Familiarity with DevOps practices related to data operations, including continuous integration and continuous deployment (CI/CD).
- Basic skills in data analysis to derive insights and identify data trends, which can help in troubleshooting and improving data processes.
Responsibilities:
- Data Ingestion & Integration:
- Assist in the development and maintenance of data ingestion pipelines that collect data from various business systems, ensuring data is integrated smoothly and efficiently into the data lake.
- Implement transformations and cleansing processes under the guidance of senior engineers to prepare data for storage, ensuring it meets quality standards.
- Data Management:
- Help manage the organization of data within the data lake, applying techniques for efficient data storage and retrieval.
- Assist in managing and configuring data schemas based on requirements gathered by senior team members, ensuring consistency and accessibility of data.
- Support the maintenance of the data catalog, ensuring metadata is accurate and up-to-date, which facilitates easy data discovery and governance.
- API Support & Data Access:
- Support the development and maintenance of JDBC/ODBC-based SQL query APIs, ensuring they function correctly to allow end-users to access and query the data lake effectively.
- Assist in optimizing and supporting simple analytical queries, ensuring they run efficiently and meet user and business requirements.
- Quality Assurance & Testing:
- Conduct routine data quality checks as part of the data ingestion and transformation processes to ensure the integrity and accuracy of data in the data lake.
- Participate in testing of the data ingestion and API interfaces, identifying bugs and issues for resolution to ensure robustness of the data platform.
- Collaboration and Teamwork:
- Work closely with the Principal Data Architect and Principal Data Engineer, assisting in various tasks and learning advanced skills and techniques in data management and engineering.
- Occasionally interact with other business stakeholders to understand data needs and requirements, facilitating better support and modifications in data processes.
- Platform Monitoring & Maintenance:
- Help monitor the performance of data processes and the data lake infrastructure, assisting in troubleshooting and resolving issues that may arise.
- Learning & Development:
- Continuously learn and upgrade skills in data engineering tools and practices, especially those related to Azure cloud services and big data technologies.
- Contribute ideas for process improvements and innovations based on day-to-day work experiences and challenges encountered.
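As a minimal illustration of the ingestion work above, the Azure SDK for Python can land a file in the data lake's raw zone as follows (the connection string, container, and paths are placeholders):

```python
# Minimal Azure ingestion sketch: upload a landed extract into the data lake's raw zone
# using the Blob Storage SDK. All names and the connection string are placeholders.
from azure.storage.blob import BlobServiceClient

connection_string = "DefaultEndpointsProtocol=https;AccountName=exampleacct;AccountKey=***"
service = BlobServiceClient.from_connection_string(connection_string)

container = service.get_container_client("raw")

# Upload today's extract; downstream Azure Data Factory / Databricks jobs pick it up from here
with open("compressor_telemetry_2024-05-01.csv", "rb") as data:
    container.upload_blob(
        name="telemetry/2024/05/01/compressor_telemetry.csv",
        data=data,
        overwrite=True,
    )
```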
Company information: GlobalLogic
Employee benefits
- Relocation assistance
- Beauty services
- Psychotherapist support
- Sports expenses compensation
- Medical insurance
- Educational programs and courses
- SQL
- Python
- GCP
- Apache Spark
- Google Cloud Dataflow
- Apache Beam
- Apache Airflow
- Cloud Composer
- Kafka
- Cloud Pub/Sub
- BigQuery
- dbt
- Dataform
Big Data & Analytics is the Center of Excellence's data consulting and data engineering branch. Hundreds of data engineers and architects nowadays build Data & Analytics end-to-end solutions from strategy through technical design and proof of concepts to full-scale implementation. We have customers in the healthcare, finance, manufacturing, retail, and energy domains.
We hold top-level partnership statuses with all the major cloud providers and collaborate with many technology partners like AWS, GCP, Microsoft, Databricks, Snowflake, Confluent, and others.
If you are
- A Big Data engineer focused on data pipeline creation
- Well-versed in batch or streaming processing
- Proficient in SQL and Python
- Experienced in developing data solutions in GCP
- Skilled in Apache Spark, Cloud Dataflow, or Apache Beam
- Knowledgeable in Apache Airflow or Cloud Composer
- Familiar with Apache Kafka or Cloud Pub/Sub
- Experienced with Google BigQuery
- Used to working with dbt and Dataform
And you want to
- Be part of a team of data-focused engineers committed to continuous learning, improvement, and knowledge-sharing
- Utilize a cutting-edge technology stack, including innovative services from major cloud providers
- Engage with customers from diverse backgrounds, including large global corporations and emerging startups
- Participate in the entire project lifecycle, from initial design and proof of concepts (PoCs) to minimum viable product (MVP) development and full-scale implementation
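To ground the pipeline skills listed above, a minimal Apache Beam sketch; it runs locally on the DirectRunner, and on GCP the same code would typically target Dataflow (all element names are illustrative):

```python
# Minimal Apache Beam sketch: a tiny batch pipeline that counts events per type.
# Runs locally on the DirectRunner; element names are placeholders.
import apache_beam as beam

events = [
    {"type": "click", "user": "u1"},
    {"type": "view", "user": "u2"},
    {"type": "click", "user": "u3"},
]

with beam.Pipeline() as p:
    (
        p
        | "CreateEvents" >> beam.Create(events)
        | "KeyByType"    >> beam.Map(lambda e: (e["type"], 1))
        | "CountPerType" >> beam.CombinePerKey(sum)
        | "Print"        >> beam.Map(print)
    )
```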
Company information: SoftServe
Employee benefits
- Fitness Zone
- Flexible working hours
- Sports expenses compensation
- Medical insurance
- Paid sick leave
- AWS
- EMR
- Databricks
- EC2
- Amazon S3
- AWS Redshift
- Scala
- Apache Spark
- Hive
- Yarn
- Mesos
- SQL
- NoSQL
- Java
- Python
- Kafka
- Presto
We are looking for a Data Engineer for our client, a leading programmatic media company specializing in ingesting large volumes of data, modeling insights, and offering a range of products and services across Media, Analytics, and Technology.
Hybrid work (2-3 days a week from the office in Warsaw).
Responsibilities:
- Follow and promote best practices and design principles for Big Data ETL jobs.
- Help in technological decision-making for the business’s future data management and analysis needs by conducting POCs.
- Monitor and troubleshoot performance issues on data warehouse/lakehouse systems.
- Provide day-to-day support of data warehouse management.
- Assist in improving data organization and accuracy.
- Collaborate with data analysts, scientists, and engineers to ensure best practices in terms of technology, coding, data processing, and storage technologies.
- Ensure that all deliverables adhere to our world-class standards.
Skills:
- 3+ years of experience in Data Warehouse development and database design;
- In-depth knowledge of distributed computing principles;
- Experience with AWS cloud services and big data platforms such as EMR, Databricks, EC2, S3, and Redshift;
- Proficiency with Scala, Spark, Hive, Yarn/Mesos, etc.;
- Strong skills in SQL and NoSQL databases, including data modeling and schema design;
- Proficient in programming languages such as Java, Scala, or Python for data processing algorithms and workflows;
- Familiarity with Presto and Kafka is advantageous;
- Experience with DevOps practices and tools for automating deployment, monitoring, and managing big data applications is a plus;
- Excellent communication, analytical, and problem-solving abilities.
Company information: Ardura Consulting
Employee benefits
- Flexible working hours
- Long-term projects
- Medical insurance
- Python
- PySpark
- Hadoop
- Apache Spark
- Kafka
- BigQuery
- AWS Glue
- Azure Data Factory
- Google Cloud Dataflow
- ETL
- ELT
- PostgreSQL
- MySQL
- NoSQL
- OpenAI
- Docker
- Kubernetes
- Palantir Foundry
- JavaScript
- TypeScript
- HTTP
We are looking for a proactive Middle+/Senior Big Data Engineer to join our vibrant team! You will play a critical role in designing, developing, and maintaining sophisticated data pipelines, and using Foundry tools such as Ontology, Pipeline Builder, Code Repositories, etc. The ideal candidate will possess a robust background in cloud technologies, data architecture, and a passion for solving complex data challenges.
Tools and skills you will use in this role: Palantir Foundry, Python, PySpark, SQL, basic TypeScript.
Responsibilities:
- Collaborate with cross-functional teams to understand data requirements, and design, implement and maintain scalable data pipelines in Palantir Foundry, ensuring end-to-end data integrity and optimizing workflows.
- Gather and translate data requirements into robust and efficient solutions, leveraging your expertise in cloud-based data engineering. Create data models, schemas, and flow diagrams to guide development.
- Develop, implement, optimize and maintain efficient and reliable data pipelines and ETL/ELT processes to collect, process, and integrate data to ensure timely and accurate data delivery to various business applications, while implementing data governance and security best practices to safeguard sensitive information.
- Monitor data pipeline performance, identify bottlenecks, and implement improvements to optimize data processing speed and reduce latency.
- Troubleshoot and resolve issues related to data pipelines, ensuring continuous data availability and reliability to support data-driven decision-making processes.
- Stay current with emerging technologies and industry trends, incorporating innovative solutions into data engineering practices, and effectively document and communicate technical solutions and processes.
- Be eager to get familiar with new tools and technologies
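A compact PySpark sketch of the kind of transformation such a pipeline performs (dataset paths and columns are invented; in Palantir Foundry this logic would typically be wrapped in a Code Repositories transform rather than run as a loose script):

```python
# Sketch of a pipeline step: join two input datasets and produce a curated output.
# Paths and columns are placeholders; in Foundry this would normally be registered
# as a transform in a Code Repository.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_enrichment").getOrCreate()

orders = spark.read.parquet("/data/raw/orders")
customers = spark.read.parquet("/data/raw/customers")

curated = (
    orders.join(customers, on="customer_id", how="left")
          .withColumn("order_month", F.date_trunc("month", F.col("order_ts")))
          .groupBy("order_month", "customer_segment")
          .agg(F.sum("amount").alias("revenue"),
               F.countDistinct("order_id").alias("orders"))
)

curated.write.mode("overwrite").parquet("/data/curated/revenue_by_segment")
```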
Requirements:
- 4+ years of experience in data engineering;
- Strong proficiency in Python and PySpark;
- Proficiency with big data technologies (e.g., Apache Hadoop, Spark, Kafka, BigQuery, etc.);
- Hands-on experience with cloud services (e.g., AWS Glue, Azure Data Factory, Google Cloud Dataflow);
- Expertise in data modeling, data warehousing, and ETL/ELT concepts;
- Hands-on experience with database systems (e.g., PostgreSQL, MySQL, NoSQL, etc.);
- Effective problem-solving and analytical skills, coupled with excellent communication and collaboration abilities;
- Strong communication and teamwork abilities;
- Understanding of data security and privacy best practices;
- Strong mathematical, statistical, and algorithmic skills.
Nice to have:
- Certification in Cloud platforms, or related areas;
- OpenAI/any other LLM API experience
- Familiarity with containerization technologies (e.g., Docker, Kubernetes);
- Basic HTTP understanding to make API calls;
- Familiarity with Palantir Foundry;
- Previous work or academic experience with JavaScript / TypeScript
Company information: N-iX
Employee benefits
- English Courses
- Flexible working hours
- Sports expenses compensation
- Training compensation
- Medical insurance
- Salesforce
- Workday
- JD Edwards
- AWS
- AWS Redshift
- Amazon S3
- AWS Glue
- Snowflake
- SAFe
N-iX is seeking a Lead Data Engineer to join our team in delivering a world-class data platform for one of the largest transportation companies in the U.S. This role is primarily focused on data modeling within a vast data environment, including a data lake and data warehouse, integrating over 30 data sources such as Salesforce, Workday, and other advanced applications.
The initial project is expected to last 2 years, after which the successful candidate will continue working within the CTO office on similar high-impact projects. You'll have the opportunity to collaborate with and learn from a world-class engineering manager and architect, who will provide mentorship and guidance.
Key Responsibilities:
- Lead the design and development of robust data models for a large-scale data platform, encompassing data lakes and data warehouses.
- Integrate and model data from 30+ diverse data sources, including Salesforce, Workday, and other enterprise applications.
- Apply Kimball principles to ensure high-quality data modeling that supports scalable and efficient data solutions.
- Collaborate closely with engineering teams, product managers, and stakeholders to define data requirements and solutions.
- Participate in the ongoing optimization and evolution of the data platform.
- Contribute to the mentorship and development of junior engineers on the team.
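As a reminder of what Kimball-style modeling implies in practice, a minimal sketch of a fact table keyed to a conformed dimension (all table and column names are invented, and slowly changing dimension handling is omitted):

```python
# Kimball-style sketch: build a customer dimension with surrogate keys, then a fact
# table that references it. All names are placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("kimball_sketch").getOrCreate()

shipments = spark.read.parquet("/lake/staging/shipments")     # source extracts
customers = spark.read.parquet("/lake/staging/customers")

# Dimension: one row per natural key, with a generated surrogate key
dim_customer = (
    customers.dropDuplicates(["customer_code"])
             .withColumn("customer_sk",
                         F.row_number().over(Window.orderBy("customer_code")))
)

# Fact: measures at the shipment grain, joined to the dimension's surrogate key
fact_shipment = (
    shipments.join(dim_customer.select("customer_code", "customer_sk"), "customer_code")
             .select("customer_sk", "ship_date", "weight_kg", "freight_cost")
)

dim_customer.write.mode("overwrite").saveAsTable("dw.dim_customer")
fact_shipment.write.mode("overwrite").saveAsTable("dw.fact_shipment")
```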
Must-Have Skills:
- 5+ years of experience in data modeling, particularly within large-scale data platforms.
- Expert knowledge of Kimball data modeling principles.
- Strong proficiency in English, with excellent communication skills.
- Proven experience in integrating and modeling data from multiple sources, including enterprise systems such as Salesforce, Workday, and JD Edwards, as well as a number of in-house systems.
Nice-to-Have Skills:
- Experience with AWS data services, including Redshift, S3, and Glue.
- Experience working with Snowflake or other cloud-based data warehouses.
- Background in working with large enterprise companies, particularly in building extensive data platforms.
- Familiarity with Scaled Agile Framework (SAFe) principles.
Company information: N-iX
Employee benefits
- English Courses
- Flexible working hours
- Sports expenses compensation
- Training compensation
- Medical insurance
- RDBMS
- C++
- Linux
- Debugging
- Database testing
- ClickHouse
Altinity is looking for a great server internals engineer to work on contributions to ClickHouse. As a ClickHouse Open Source Developer, you’ll be responsible for designing, implementing, and supporting features of ClickHouse ranging from encryption to storage to query processing. We’re looking for imaginative engineers with a background in database internals and in high-performance languages like C++.
We have lots of exciting projects underway as we help the community adapt ClickHouse to the cloud and Kubernetes.
Our ideal candidate has:
- Proven experience designing, implementing, and testing high-performance DBMS features in a complex C++ codebase.
- Excellent background in database internals including query languages, access methods, storage, and/or connectivity.
- Demonstrated ability to read and write good C++.
- Good understanding of networking and I/O on Linux.
- Familiar with performance optimization techniques and tools.
- History of getting pull requests vetted and merged in rapidly evolving open-source projects.
- Sound knowledge of database testing, debugging, and low-level performance optimization.
- Enthusiasm to learn more about database technology and data-related applications.
- Good English language reading and writing skills.
- Eager to work with a friendly, distributed team following open-source dev practices.
- MAJOR PLUS: previous development experience on ClickHouse.
A day in your life as a ClickHouse server engineer may include any or all of the following:
- Write good task-specific C++ code and solidify it with tests.
- Debug issues reported by users, fix them and add tests to make sure they won’t happen again.
- Profile existing code and make it faster (either by applying clever algorithms, adding vectorized intrinsics, or by implementing cool tricks), add performance tests.
- Submit your own pull requests and review pull requests from others.
- Help the Support Team investigate customer problems running ClickHouse.
- Help new community members contribute to ClickHouse.
- Attend meetups and make presentations on open-source development.
- Write blog articles and share information about ClickHouse.
Company information: Altinity
Employee benefits
- Team buildings
- English-speaking environment
- Flexible working hours
- Educational programs and courses
- Apache Spark
- Java
- Groovy
- Python
- Hadoop
- AWS
- EMR
- Amazon S3
- AWS Lambda
- EC2
- AWS Glue
- RDS
- SQL
- MySQL
- NoSQL
- Elasticsearch
- Scala
- Apache Zeppelin
- Apache Airflow
- ETL
Our client is a software company providing client management applications to organizations in the healthcare industry. For the past 15 years, DataArt's specialists have been helping them create online intake services for hospitals, medical centres, and private practices in the US.
An application is designed to help people schedule appointments, pay medical bills, find test results, and receive notifications about appointments with doctors.
Position overview
As a Software Engineer – (Data Engineer) on our team, you will contribute to the development of the fastest-growing direct-to-consumer health information platform in the US. Collaborating with a team of seasoned data engineers and architects, you will work with diverse healthcare data from both the US and international sources. This role will provide you with an in-depth understanding of various aspects of the healthcare industry, particularly healthcare data and technology. You will be responsible for building and maintaining complex data pipelines that ingest, process (using our algorithms), and output petabytes of data. Additionally, you will work closely with architects, product managers, and SMEs to develop and maintain algorithms that generate unique insights, aiding patients in finding better care more quickly.
Responsibilities
- Work closely with data engineers, the product team, and other stakeholders to gather data requirements, and design and build efficient data pipelines
- Create and maintain algorithms and data processing code in Java/Groovy
- Implement processes for data validation, cleansing, and transformation to ensure data accuracy and consistency
- Develop Python scripts to automate data extraction from both new and existing sources
- Monitor and troubleshoot the performance of data pipelines in Airflow, proactively addressing any issues or bottlenecks
- Write SQL queries to extract data from BigQuery and develop reports using Google’s Looker Studio
- Participate in daily stand-ups, sprint planning, and retrospective meetings
- Engage in peer code reviews, knowledge sharing, and assist other engineers with their work
- Introduce new technologies and best practices as needed to keep the product up to date
- Assist in troubleshooting and resolving production escalations and issues
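To illustrate the Airflow side of these responsibilities, a minimal DAG sketch (the task logic, schedule, and names are placeholders; assumes a recent Airflow 2.x):

```python
# Minimal Airflow DAG sketch: extract -> validate -> load, run daily.
# Callables are stubs and all names are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull new provider files")        # e.g. download from an SFTP/API source

def validate():
    print("run schema and null-rate checks")

def load():
    print("write curated records to the warehouse")

with DAG(
    dag_id="provider_data_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_validate = PythonOperator(task_id="validate", python_callable=validate)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_validate >> t_load
```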
Requirements
- Bachelor's degree or equivalent programming experience
- 4-5 years of overall experience as a backend software developer, with at least 2 years as a Data Engineer using Spark with Java/Groovy and/or Python
- Strong coding skills, and knowledge of data structures, OOP principles, databases, and API design
- Highly proficient in developing programs and data pipelines in Java/Groovy or Python
- 2+ years of professional experience with Apache Spark/Hadoop
Nice to have
- Work experience with AWS (EMR, S3, Lambda, EC2, Glue, RDS)
- Work experience with SQL (MySQL is a plus) and NoSQL databases
- Experience with Elasticsearch
- Experience with Python
- Experience with Scala (Zeppelin)
- Experience with Airflow or other ETL
- Certification or verified training in one or more of the following technologies/products: AWS, Elasticsearch
Company information: DataArt
Employee benefits
- English Courses
- Fitness Zone
- Gaming room
- Paid overtime
- Team buildings
- Work-life balance
- No dress code
- Parental leave
- Large, stable company
- Bicycle parking
- Flexible working hours
- Long-term projects
- Break room
- Coffee, fruit, and snacks
- Medical insurance
- Paid sick leave
- Educational programs and courses
- Microsoft Power BI
- Snowflake
- Databricks
- Microsoft Fabric
- Azure Synapse
- Microsoft SQL Server
- Python
Project description
You will be part of the Team tasked to transform and equip a large Manufacturer with digital solutions, skills and knowledge to strengthen World Class Manufacturing for competitive advantage. You will be responsible for making sure the standard solutions and services deployed integrate with the existing landscape in a proper and documented way.
You will be developing Digital Manufacturing Data Warehouse product for 100+ factories across the globe.
Tech stack: Data Engineering, Data Visualization, Data Warehousing, Microsoft SQL Server, Azure Data Lake Storage, Azure SQL Database, Azure Databricks/Spark, Azure IoT Hub, Snowflake, Power BI, Time Series Databases.
Responsibilities
- Work with business users to gather requirements.
- Create reports, dashboards, KPIs, and visualizations from data with Power BI.
- Extract and manipulate data from multiple sources, validating database performance, integrity, and security for Power BI reporting.
- Configure Power BI service.
- Test reports for accuracy prior to deployment.
- Continuously improve performance and reliability of reports.
- Recommend additional options for reporting and analytics.
Must have skills
- 4+ years of relevant working experience, preferably in an international environment
- Hands-on experience with Power BI Dashboards, Reports, Scheduling, Publishing.
- Creation of complex semantic data models.
- Ability to enhance data models as needed for custom visualization requests
- Great numerical and analytical skills
- Degree in Computer Science, IT, or similar fields
Nice to have
- Snowflake, Databricks, Microsoft Fabric, Synapse Analytics, SQL Server, Python, Manufacturing
Company information: Luxoft
Employee benefits
- Relocation assistance
- Team buildings
- Multinational team
- Large, stable company
- Educational programs and courses