Data Engineer
Posted on March 3, 2026
Basel
English
Temporary
About this role
The Data and Digital Catalyst (DDC) organisation of our pharmaceutical client drives the modernisation of its computational and data ecosystems and the integration of digital technologies across Research and Early Development, enabling stakeholders, powering data-driven science, and accelerating decision-making. The Scientific Content Team is responsible for providing the data platforms for compliant and streamlined access to the company's extensive collection of scientific content, such as literature text data.
The Perfect Candidate: We are looking for a technically strong data professional with solid experience in data engineering, data warehousing, and database management, including hands-on expertise with NoSQL databases such as MongoDB. Strong Python scripting skills and practical experience with ETL orchestration tools are essential. A good understanding of Text Analytics, Text and Data Mining (TDM), and Large Language Models (LLMs) is expected, along with familiarity with applying FAIR principles in data workflows. Experience with scientific or other text-based data, high-performance computing environments, and API development (e.g., GraphQL) is highly desirable. A biomedical background is a plus.
General Information:
- Start Date: ASAP
- Latest Possible Start Date: 01.06.2026
- Planned Duration of Employment: 12 months
- Extension: Likely if budget gets approved
- Workplace: Basel
- Home Office: partially possible
- Team size: 8
Tasks & Responsibilities:
- Designing and implementing the ETL pipeline and MongoDB schema.
- Managing the document database, data curation, and FAIRification (cleaning, parsing, disambiguation, deduplication, data harmonisation, etc.), including ongoing database maintenance.
- Integrating new data sources, including the development of robust parsers and text cleaning workflows.
- Enhancing the user experience (e.g., building a data warehouse and APIs).
- User support for utilizing RoMine datasets (documentation, training, use case support).
Must Haves:
- Data engineering, data warehousing, and database management experience. *****
- Experience with NoSQL databases such as MongoDB. *****
- Good understanding of Text Analytics, Text and Data Mining (TDM), and LLMs.
- Experience with ETL orchestration tools, such as Airflow. *****
- Strong proficiency in Python scripting. *****
- Knowledge of the FAIR principles and associated methodology.
- Familiarity with scientific literature, published content, and/or other text-based data.
- Excellent communication and interpersonal skills in English (fluent); German is nice to have.
- Interdisciplinary teamwork, with the ability to explain technical concepts to non-technical stakeholders.
- Exposure to high-performance computing environments.
- Experience with API development (e.g., GraphQL API) is a plus.
- Biomedical background/education is a plus.
We thank you for your application!