2022 Archived Content

Data Management

Implement Data Management Solutions to Satisfy Increased Computing Power Demands

18 - 19 October 2022 ALL TIMES CEST

With the increased demand in computing power from life science researchers and scientists tackling big data issues, data storage infrastructure must be able to scale to handle billions of data points and files efficiently. The Data Management track will explore solutions to ensure data is FAIR (Findable, Accessible, Interoperable, and Reusable) to best effect change across the organization. Presentations will discuss FAIR principles and definitions, technology tools and tool differences, as well as application/use cases.

Monday, 17 October

Registration Open (Foyer) 15:00

Tuesday, 18 October

Registration and Morning Coffee (Foyer) 07:30

ROOM LOCATION: MOA 9

PLENARY KEYNOTE PROGRAM

09:00

Chairperson's Remarks

Allison Proffitt, Editorial Director, Bio-IT World

09:15

PLENARY KEYNOTE PRESENTATION: To Unlock the Power of AI, It’s Time To Stop Thinking Human

Richard Law, PhD, Chief Business Officer, Exscientia

We’ve spent more than a decade hearing about the promise of AI for drug discovery and development. While that promise remains, it’s time to evolve our approach to unlock it. For all the buzz about AI, most companies are still using AI-assisted approaches, where humans focus on leveraging “good data” to fuel AI. Yet AI platforms, at heart, are data agnostic and make “decisions” far beyond human comprehension. No human can think in, say, a 2,500-dimensional space. It’s too complicated, requires too much learning, and too much data to be managed by humans. In short, humans are still calling the shots, using AI to problem-solve along the way. To unlock the power of AI for drug development and discovery, it is time to remain patient-centric, but stop thinking like humans and allow platforms to be designed to learn and become increasingly powerful and accurate with each incremental piece of data analyzed. In this talk, we’ll discuss the power of this approach and learn about industry players who are embracing this new “AI First” way of re-engineering drug discovery processes – the leap to full, end-to-end integration of artificial intelligence – to maximize the potential of AI and machine learning to create better medicines faster and smarter.

Grand Opening Coffee Break in the Exhibit Hall with Poster Viewing (Room Location: MOA 11) 10:15

ROOM LOCATION: MOA 5

TECHNOLOGY, TOOLS, AND DATA REPOSITORIES TO PREPARE DATA FAIRNESS FOR USE IN RESEARCH & INNOVATION

11:20

Chairperson's Remarks

Christopher Southan, PhD, Competitive Intelligence Analyst, Data Sciences, Medicines Discovery Catapult

11:25

Open-Source Tools for Research Data Management: A Brief Overview and Evaluation

Eelke Van der Horst, PhD, FAIR Engineering Expert, The Hyve

Pharma companies and academic organizations around the world use different tools for managing their research data. Some of the most popular open-source ones are Gen3, iRODS, COLID, CEDAR and Fairspace. As a service provider in the field of (FAIR) research data management, The Hyve has conducted an in-depth analysis and comparison of these tools based on technical and community criteria. We will present the findings and implications in this talk.

11:55

Towards a Sustainable FAIR Ecosystem for Biomedical Digital Twins

Eric Stahlberg, PhD, Director, Cancer Data Science Initiatives, Frederick National Laboratory for Cancer Research

The number of efforts aiming to develop digital twins in biomedicine is continuing to grow at an increasing rate, enabled by dramatic advances in AI, the availability of essential data, and access to computation to create and compose the elements for biomedical digital twins. With multiple efforts underway globally, questions arise on how to both create and sustain a supporting ecosystem that enables open science, fosters fairness for the community, and embraces FAIR (findable, accessible, interoperable, and reusable) resources. Drawing from experiences and examples of several ongoing efforts, including efforts in cancer, insights into key elements for just such an ecosystem will be presented and discussed.

12:25

Components of Ontology Management in Automated Workflows

Anuj Singh, Senior Manager, Bioinformatics, Excelra Knowledge Solutions Pvt. Ltd.

Many companies are using ontologies for data annotation/integration, NER, etc. However, managing ontologies is challenging. It is even more challenging when ontologies are served to downstream applications in an automated workflow, because in some cases the process might need to consider the requirements of the downstream applications.

Based on our experience, we present the components of effective ontology management & deployment to support integration of ontologies in automated workflows.

Networking Lunch (Room Location: MOA 11) 12:55

Dessert Break in the Exhibit Hall with Poster Viewing (Room Location: MOA 11) 13:55

UTILIZING TECHNOLOGY, TOOLS, AND DATA REPOSITORIES TO ENHANCE DATA FAIRNESS

14:25

Chairperson's Remarks

Christopher Southan, PhD, Competitive Intelligence Analyst, Data Sciences, Medicines Discovery Catapult

14:30

FAIR Obstacles for Curating SARS-CoV-2 M-Protease Inhibitors

Christopher Southan, PhD, Competitive Intelligence Analyst, Data Sciences, Medicines Discovery Catapult

Despite COVID-19 vaccine successes, there is an urgent need for small-molecule antivirals such as the recent Pfizer M-protease (M-prot) inhibitor nirmatrelvir (PF-07321332), published in PMID 34726479 and WO2021250648. Since 2020, the Guide to Pharmacology and BindingDB have collaborated on curating FAIR database entries for clinical candidate M-prot inhibitors from papers and patents. Patterns of FAIR disclosure are patchy, varying from an open science web portal from the COVID Moonshot consortium, to a paper (but no patent) for Shionogi S-217622, to blinding (i.e., neither paper nor patent) of SH-879 from Sosei Heptares. Challenges of extracting and mapping chemical structures between papers and patents for M-prot lead inhibitors for FAIR database curation will be outlined.

15:00

Platforms and Standards for Application of FAIR Principles

Alexander Sherman, Director, Center for Innovation and Bioinformatics, Massachusetts General Hospital

Collaboration among academia, non-profits, and industry is essential for the creation of disease-specific pre-competitive informational resources for applying AI/ML models to identify disease biomarkers and patient subpopulations and to create disease progression and disease staging models. To improve the Findability, Accessibility, Interoperability, and Reuse (FAIR) of digital knowledge, i.e., introduction of precision research, the clinical research community shall agree to operate under similar operational guidelines, from regulatory/legal, to patient identification, to data standards.

15:30

Cost-Benefit Decision Making for Dataset FAIRification in Pharmaceutical R&D

Carole Goble, CBE FREng FBCS CITP, Professor, Computer Science, The University of Manchester

The findable, accessible, interoperable, reusable (FAIR) principles for scientific data management and stewardship aim to facilitate data reuse at scale by both humans and machines. Research and development (R&D) in the pharmaceutical industry is becoming increasingly data driven. However, managing its data assets according to FAIR principles remains costly and challenging. To date, little scientific evidence exists about how FAIR is currently implemented in practice, what its associated costs and benefits are, and how decisions are made about the retrospective FAIRification of data sets in pharmaceutical R&D. This talk reports the results of a study with pharmaceutical professionals who participate in various stages of drug R&D in seven pharmaceutical businesses, and the FAIR-Decide decision making tool based on the study outcomes and Cost Benefit Analysis and Multi-Criteria Analysis. Authors: Ebtisam Alharbi, Rigina Skeva, Nick Juty, Caroline Jay, Carole Goble

16:00

The FAIR Cookbook: A Resource for FAIR Doers

Philippe Rocca-Serra, PhD, Group Co-Investigator, Oxford e-Research Centre & Associate Member of Faculty, Department of Engineering, University of Oxford

The notion that data should be Findable, Accessible, Interoperable, Reusable, according to the FAIR Principles, has become a global norm for good data stewardship, a prerequisite for reproducibility. Nowadays, FAIR guides data policy actions and professional practices in the public and private sectors. However, despite such global endorsements, the FAIR Principles are aspirational, remaining elusive at best, and intimidating at worst. To address the lack of practical guidance, and help with capability gaps, we developed the FAIR Cookbook, an online resource of hands-on recipes for “FAIR doers” in the Life Sciences. Created by researchers and data management professionals in academia, (bio)pharmaceutical companies and information service industries, the FAIR Cookbook covers the key steps in a FAIRification journey, the levels and indicators of FAIRness, the maturity model, the technologies, the tools and the standards available, as well as the skills required, and the challenges to achieve and improve data FAIRness. URL: https://faircookbook.elixir-europe.org/

Session Break and Transition to Plenary Keynote 16:30

ROOM LOCATION: MOA 9

PLENARY KEYNOTE PROGRAM

16:45

Chairperson's Remarks

Allison Proffitt, Editorial Director, Bio-IT World

16:50

Plenary Keynote Introduction

Aneesh Karve, CTO, Quilt Data

Next-generation biopharma workflows require user-driven stewardship of data that gives your enterprise custody and validation of the full "data chain of custody": from instrument, to scientist, to filing. This data chain of custody requires a flexible private-cloud storage system that integrates business documents, large instrument files, and semi-structured metadata into a single, cross-functional storage layer that meets the scale requirements of dry scientists and the usability requirements of wet scientists.

17:00

PLENARY KEYNOTE PRESENTATION: Novartis Institute for Biomedical Research Data Strategy

Philippe Marc, PhD, Executive Director and Global Head, Integrated Data Sciences, Novartis Institutes for BioMedical Research

As part of the larger enterprise digital journey, and as part of the Novartis Research Master Plan, the Novartis Institutes for BioMedical Research (NIBR) defined an updated data and data management strategy. This data strategy falls within a broader digital strategy, which has many additional priority areas: information technology, artificial intelligence, external science, and decision support for drug discovery and early development, to name a few. The NIBR data management strategy is based around four pillars:
1. Data Culture: Treat data as a corporate asset
2. Data Management: Structure and link data
3. Data Science: Develop products and insights based on data
4. Data Enterprise: Lead the enterprise on data

Welcome Reception in the Exhibit Hall with Poster Viewing (Room Location: MOA 11) 17:30

Close of Day 18:30

Wednesday, 19 October

Registration and Morning Coffee (Foyer) 08:30

ROOM LOCATION: MOA 9

PLENARY KEYNOTE PROGRAM

09:00

Chairperson's Remarks

Allison Proffitt, Editorial Director, Bio-IT World

09:05

Plenary Keynote Introduction

Eric Stahlberg, PhD, Director, Cancer Data Science Initiatives, Frederick National Laboratory for Cancer Research

09:15

PLENARY KEYNOTE PRESENTATION: Digital Twins: The Virtual Future of Medicine

Peter Coveney, PhD, Professor of Physical Chemistry, Honorary Professor of Computer Science, Director of the Centre for Computational Science, University College London

The purpose of building digital twins of ourselves is to create an organizational principle for modern predictive and personalized medicine. This talk will discuss the principles on which such digital twins may be constructed and used for clinical and healthcare purposes. The roles of multiscale modelling and simulation, artificial intelligence and uncertainty quantification will be described as essential elements in the drive to making actionable predictions from digital twin simulations.

Best of Show Award Ceremony and Refreshments in the Exhibit Hall with Poster Viewing (Room Location: MOA 11) 10:15

ROOM LOCATION: MOA 5

FAIR Across a Future Ecosystem of Secure Data Environments

11:05

Chairperson's Remarks

Chris Dwan, Independent Consultant, Dwan, LLC

11:10

FEATURED PRESENTATION: FAIR in the Era of Trusted Research Environments (TREs) Providing Access to Data from Whole Health Systems

Tim Hubbard, PhD, Head, Department of Medical & Molecular Genetics, King's College London

In recent weeks the National Health Service (NHS) in England has adopted a far-reaching policy of “Secure Data Environments” (SDEs), also referred to as TREs - https://transform.england.nhs.uk/key-tools-and-info/data-saves-lives/accessing-data-for-research-and-analysis/ . In future, researchers will be able to “access” health data only through SDEs, rather than having an anonymised copy “shared” with them. This builds on a long history of social science researchers only being able to access (small) sensitive datasets through secure environments and recent policy development work (https://www.gov.uk/government/publications/better-broader-safer-using-health-data-for-research-and-analysis ; https://www.hdruk.ac.uk/access-to-health-data/trusted-research-environments/ ).

Thanks to computing advances in virtualisation and cloud, the SDE approach has been shown to be practical for researcher access to (large) sensitive datasets that require a substantial compute infrastructure to support analysis. An example is the research environment of Genomics England, providing access to more than 100,000 whole human genomes and associated clinical data (more than 80 PB) - https://www.genomicsengland.co.uk/research/research-environment

The much stronger protection of privacy that the SDE model enables has been well received in engagements with patients and public. It should facilitate research over whole health systems and even across multiple health systems, such as envisaged by the European Health Data Space (EHDS). However, use of SDEs will require new ways of working that bring code to data and support FAIR principles. Parts of the required functionality for FAIR across a future ecosystem of SDEs are being provided in the UK by Health Data Research UK (HDRUK) through its Alliance of more than 60 data custodians and its searchable metadata catalogue – the HDR Innovation Gateway - https://www.healthdatagateway.org/

Presentation to be Announced 11:40

IMPLEMENTING FAIR PRINCIPLES: PRACTICAL USE CASES AND LEARNINGS FROM THE FRONT LINES

12:10

Preparing Health Data for Secondary Use in Research & Innovation – Some Practical Use Cases

Jan-Willem Boiten, PhD, Senior Program Manager, Lygature/Health-RI

The systematic reuse of health data originating from the clinical care system for research & innovation could become the engine of a truly learning healthcare system, in particular when feeding modern techniques such as machine learning and federated data analysis. Applying these approaches at scale still turns out to be highly challenging due to various practical obstacles: lack of FAIR (Findable, Accessible, Interoperable, Reusable) health data, ethical & legal constraints (real or perceived), unclear incentives for sharing, and lack of common standards, to name a few. Within The Netherlands, government, research institutes, medical centers, and research infrastructures recently joined forces to address these obstacles. Some practical approaches and use cases will be presented to show how we are trying to address these hurdles for data reuse, both on a national scale and in European projects.

Networking Lunch (Room Location: MOA 11) 12:40

Dessert Break in the Exhibit Hall with Poster Viewing (Room Location: MOA 11) 13:40

14:10

Chairperson's Remarks

Chris Dwan, Independent Consultant, Dwan, LLC

14:15

Strategies for Delivering a FAIR System for Pharma Research Analytical Raw Data

Felipe Albrecht, PhD, Senior Scientist, Pharma Research and Early Development Informatics, Roche

Analytical methods are essential for the progression of a molecule in the pharma research value chain. The data generated is often used only once. The primary purpose of analytical methods is to answer (bio-)chemical structural questions. In parallel, we have data scientists applying AI approaches. This requires consistent and suitable data, which are not usually available or accessible. This talk presents our strategy on 1) leveraging analytical data, 2) the analysis and implementation of the workflows for data acquisition, capture, and storage, 3) unlocking the data through conversion, and 4) offering data back to the lab and data scientists through a system built following the FAIR principles.

14:45

Building FAIR Multi-Omics Data Products for Translational Medicine

Magdalena Wienken, PhD, Associate Director Data Operations & Governance, AstraZeneca GmbH

Making meaning out of data at scale largely depends on how FAIR the data is, how good its quality is, and the measures you take to combine it. TICA and the IO resistance strike force (IORSF) are two initiatives at AZ TM where scientists, data, and tech experts are building FAIR multi-omics data products to revolutionize the way we detect prevalence and identify biomarkers and novel resistance mechanisms in a data-centric way. By doing so we have learned how to set up successful data-centric teams, develop workflows and tools to harmonize and FAIRify millions of data points, and spark a new data culture.

15:15

The Real-World Data Store – A FAIR Data Product to Accelerate Evidence Generation

Alexandra Grebe de Barron, PhD, Data Product Owner – Digital Transformation & IT, Pharma Product Platforms - Development, Bayer AG

The Real-World Data (RWD) Store is a human- and machine-friendly self-service shop that facilitates Real-World Evidence generation. It is a place to find and share integrated patient-related data on specific brands, therapeutic areas, and geographies for cross-functional usage, and it brings together scattered silos and disconnected information. The use case will show how the FAIR data principles and data product thinking were applied to provide a product customers love.

Close of Conference 15:45





