Data Engineer

Discipline: Information Technology
Job Reference: 31888
Job Title: Data Engineer
Location: Riyadh, KSA
Role Type: Permanent

We are looking for a Data Engineer who will play a crucial role in the design, development, and maintenance of scalable data pipelines and infrastructure for state-of-the-art AI-driven solutions and applications. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up, and who is self-directed and comfortable supporting the data needs of multiple teams, systems, and products.

Responsibilities
• Design, build, and maintain scalable, efficient, and reliable data pipelines for ingesting, processing, and storing large volumes of textual data required for machine learning and NLP-driven solutions, particularly large language models.
• Assemble large, complex datasets that meet functional and non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Optimize data storage and retrieval techniques, leveraging big data technologies and vector databases when appropriate, to support the efficient training and deployment of large language models.
• Collaborate with research scientists, engineers, and other stakeholders to understand their data requirements and ensure the timely availability of high-quality data.
• Build processes that support data transformation, data structures, metadata, dependency management, and workload management.

Qualifications
• Minimum 3 years of experience in data engineering, with a focus on machine learning and NLP-driven solutions.
• Proficiency in programming languages such as Python, Java, or Scala.
• Experience with big data technologies such as Hadoop, Spark, and NoSQL databases.
• Knowledge of data integration tools and frameworks such as Kafka, NiFi, or Talend.
• Experience with stream-processing systems such as Storm or Spark Streaming.
• Experience with RESTful services and microservices: Spring, Spring Boot, REST, JSON, Django, Django REST Framework.
• Experience building and optimizing big data pipelines, architectures, and datasets.
• Familiarity with cloud-based data storage and computing services such as AWS, Azure, or Google Cloud Platform.
• Strong analytic skills related to working with unstructured datasets.