Data Engineer - Spark - Link (Remote)

Oxford, England, GB
Company: Veeva Systems
Category: Computer and Mathematical Occupations
Published on 2021-08-02 18:10:36
Veeva is a mission-driven organization that aspires to help our customers in Life Sciences and Regulated industries bring their products to market, faster. We are shaped by our values: Do the Right Thing, Customer Success, Employee Success, and Speed. Our teams develop transformative cloud software, services, consulting, and data to make our customers more efficient and effective in everything they do.

Veeva is a work anywhere company. You can work at home, at a customer site, or in an office on any given day. Veeva is a Public Benefit Corporation, so you will also work for a company focused on making a positive impact on its customers, employees, and communities.

The Role

Veeva Link helps the life sciences industry connect with key people to improve research and care. It helps professionals find the right people for clinical trials, education programs, or advisory boards, for example. This streamlined access helps to reduce the time to market of important drugs, conduct trials with the most relevant experts in the respective field, and spread information about new treatments to key people in the life science community. You can read more about Veeva Link on our product pages. The product we build approaches parts of this problem through allocation and aggregation of publicly available information in a GDPR-conforming manner, respecting people's privacy.

As a data engineer with a focus on our Apache Spark infrastructure, you take responsibility for a major part of the Link data processing platform. We value end-to-end ownership, which puts you in the sweet spot of finding, designing, and implementing improvements to the product's data pipeline and adjusting it to changing demands in the market. This includes, but is not limited to, implementing, tuning, and maintaining machine learning models at scale in the AWS cloud to create a large number of high-quality scientific leader profiles. In addition, you will work on algorithms to derive insights from our data set and implement tools to ensure frictionless delivery of data to our customers all around the globe.

What You'll Do

  • Implement and integrate new machine learning models into our Spark-based system
  • Operate new machine learning models which are part of our profile generation pipeline
  • Enhance our data processing pipeline by implementing new algorithms on Apache Spark in Java
  • Stay up to date with new technologies that could benefit the Veeva Link data processing platform

Requirements

  • Expert skills in Java and Python
  • Experience with Apache Spark
  • Experience writing software for the cloud (AWS, GCP or Azure)
  • Experience with operating machine learning models
  • Good English oral and written communication skills

Nice to Have

  • Previously worked in agile environments
  • Experience with expert systems

Perks & Benefits

  • Comprehensive benefits package
  • Annual allocations for continuous learning, development & charitable contributions
  • Fitness reimbursement
  • Working from home possible

Veeva’s headquarters is located in the San Francisco Bay Area, with offices in more than 15 countries around the world.
