Federated Computing: Unlocking Unprecedented Opportunities for Biopharma to Leverage Data
After 20 years in healthcare, supporting hospitals, health systems, payors, medical device and biopharma organizations, I have watched the need for better ways to share data become apparent. Scratch that: for me, it became blatantly obvious and, in many cases, frustrating. The opportunities are limitless, if only we could overcome the challenges of data sharing, which today, even with all the advancements in machine learning (ML) and artificial intelligence (AI), feel insurmountable.
I joined the team at Rhino Federated Computing because they gave me hope. What they’ve already accomplished, and how, is exactly what is needed to shift the tides in our healthcare ecosystem. Our Rhino Federated Computing Platform (FCP), while the product of technical ingenuity, enables something so simple: users of data can have their cake and eat it too. That’s right: using our Rhino FCP, you can now leverage data from disparate databases (including those of completely different entities or organizations), benefit from it, yet never actually engage in any data sharing at all. Wow, right?
Federated computing involves running code on decentralized data, sharing only the code but never moving the data; this derisks collaborative approaches because datasets remain behind owners’ firewalls, stored independently from each other. Ultimately, the algorithm or model can be trained on—or the user can perform (federated) analytics using—a much wider range of data than any one entity has alone, while safely retaining sensitive data within each company’s own infrastructure. It turns out, despite the numerous data sources and collaborations that biopharma already engages in, there’s so much more they could achieve still, by using a federated model. Let’s talk about it.
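To make the "share the code, not the data" idea concrete, here is a minimal Python sketch of a federated analytics query. It is an illustration only and does not use the Rhino FCP API: the `Site` class, the `summarize` function, and the example values are all hypothetical. Each site runs the same function on its own records and returns only aggregate numbers, which a coordinator then combines.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class LocalSummary:
    """Aggregates a site agrees to release; contains no row-level data."""
    count: int
    total: float
    total_sq: float


def summarize(values: List[float]) -> LocalSummary:
    """The 'code' that travels to each site and runs where the data lives."""
    return LocalSummary(
        count=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


class Site:
    """Hypothetical data custodian: keeps records private, returns summaries."""

    def __init__(self, name: str, values: List[float]) -> None:
        self.name = name
        self._values = list(values)  # never leaves this object

    def run(self, code: Callable[[List[float]], LocalSummary]) -> LocalSummary:
        return code(self._values)


def federated_mean_and_variance(sites: List[Site]) -> Tuple[float, float]:
    """Combine per-site summaries into a pooled mean and population variance."""
    summaries = [site.run(summarize) for site in sites]
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records available across sites")
    mean = sum(s.total for s in summaries) / n
    variance = sum(s.total_sq for s in summaries) / n - mean * mean
    return mean, variance


if __name__ == "__main__":
    # Made-up measurements held by two separate organizations.
    sites = [Site("hospital_a", [1.0, 2.0, 3.0]), Site("biotech_b", [4.0, 5.0])]
    print(federated_mean_and_variance(sites))
```

The pooled result matches what a central analysis of all five values would give, yet the coordinator only ever sees three numbers per site. Federated model training follows the same pattern, with model updates taking the place of the sums.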
Data cataloguing—using federation to get the most out of what’s already yours.
After several years of working with biopharma, I can attest that each company has access to a lot of data. But because there are so many types of data, each used for different purposes and by different teams throughout the organization, a comprehensive view of the entire data ecosystem rarely exists. With Rhino FCP, we are changing that; building an enterprise, multimodal data catalogue that can be browsed in a privacy-preserving way is now a reality. Rhino FCP is modality agnostic, supporting a wide range of data: tabular, imaging, waveform, video, and unstructured notes. Beyond that, our data harmonization capabilities, which leverage generative AI, transform all of your data into your desired format (e.g., OMOP, FHIR). (To read an example of the Harmonization Copilot at work, go HERE.) This not only streamlines management of the data, but also accelerates time to insight because users can quickly find and view the specific type of data they need. To account for datasets sourced from around the world, Rhino FCP provides a data search mechanism for viewing metadata, computing descriptive statistics on data, and requesting permission to run code on the data. Ultimately, this increases transparency into internal and partner data assets across geographies without the need for data transfer.
Consortia building—an approach that lets you have your cake and (share) their cake.
The amount and quality of data used to train a model directly impacts the resulting performance of that model. To improve a model, you want to consume more high-quality data. This part’s not rocket science. However, because of the proprietary nature of the data and all that’s at stake with drug development, data sharing among biopharma remains almost taboo. Well, until recently, thanks to the rise of federated learning.
The MELLODDY Project, which kicked off in 2019 and ran for roughly 3 years, included 10 biopharma companies and served as the proof of concept that federated learning can elicit marked improvements in a model’s predictive performance. MELLODDY involved an unprecedented data set of >2.6 billion confidential experimental activity data points, documenting >21 million physical small molecules and >40 thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Obtaining this much training data just isn’t possible for any one biopharma alone, even with publicly available databases. Now, with Rhino FCP, we’re standing up biopharma consortia in a matter of weeks, not years.
Rhino was tasked by one biopharma to build a federated network that met its rigorous security requirements (i.e., protected its intellectual property [IP]), offered a smooth workflow, and could be leveraged to optimize its protein docking models. Our Rhino FCP secured both the model code and architecture and the model weights and parameters, and used strict role-based permissioning to control who had access to them. The biopharma could encrypt its models and parameters (including with homomorphic encryption) to prevent access even by Rhino. Ultimately, implementation of Rhino FCP yielded strong improvements in binding site predictions.
With Rhino FCP, we’re facilitating biopharma collaborations that aggregate insights across sites, fuel rapid analyses, and even develop AI from scratch—all while ensuring data security and privacy. Rhino FCP is even enabling one large biopharma partner to take its existing, internally-built models and improve them using its network of biotech and CRO partners—each with its own unique, proprietary data. For that biopharma, this translates to a more informed corporate strategy (i.e., build, buy, partner), and ideally, selection of the highest-performing assets for further development. Consortia are allowing these biopharmas to reap the benefits of large volumes of diverse data, while avoiding regulatory obstacles and steep licensing fees.
On-demand, RWD network creation—the insights you need, from where you need them.
With our roots in healthcare, Rhino has numerous relationships with provider and payer organizations, including site networks, around the world. This means that Rhino FCP can plug in to light up the data connections you already know you want, and also offer up additional rich sources of RWD that can be made available to you via federation. This has been particularly valuable in cases where key patient populations of interest span different regions of the globe and for diseases associated with a long clinical timeline and high medical burden, making traditional trials challenging. Here, progress hinges on achieving larger datasets that yield better, more representative AI-based predictions.
Biopharma is also leveraging Rhino FCP to establish globally scalable models of local data supplier partnerships (e.g., biopharma-HMO-provider). RWD from these networks powers global evidence generation that in turn supports several key functions: market access, comparative effectiveness, post-market safety, etc. These insights can reveal to a biopharma how effective its therapy is relative to others, where its therapy isn’t yet well represented, and potentially even why. Rhino FCP, with its harmonization capabilities, achieves all this in a private, secure and standardized way, and in your desired format (e.g., OMOP, FHIR).
Observational study orchestration—leveraging federated learning for disease detection.
Prospective observational studies are gaining traction as a powerful source of RWD. These studies involve observing (and collecting data on) a select group of patients over a specified period of time; as a result, they produce real-world insights on exposures and outcomes as they occur. Data is typically collected from multiple, independent sites and analyzed centrally, creating a big opportunity for federation to increase the size and representativeness of the patient population studied. The American Heart Association (AHA) is actively using federated learning for its HCM FLIP (Hypertrophic Cardiomyopathy Federated Learning Implementation Platform) study. The association is targeting inclusion of 10 to 1,000 HCM cases and 30 to 10,000 age/sex-matched controls per institution, with the goal of building a model that detects HCM via electrocardiograms (ECGs) and echocardiograms (ECHOs) and testing that model’s system impact. AHA anticipates that its federated ML model will distinguish patients with HCM from those without HCM in a real-world setting. Perhaps not surprisingly, biopharma is considering federated methods to power its prospective RWD generation, too.
Secure access annotation—maximize your imaging data, minimize the hassle.
Moving data, especially protected health information (PHI), is complicated. Moving very large datasets, such as imaging data, can be prohibitive. Our Rhino FCP provides a solution: Secure Access. This feature allows biopharma to have third-party annotators access and annotate images on its behalf, without any data transfer. Secure Access removes the need for any third party to take possession of any images, ensuring that both the images and the annotations remain behind the custodian’s firewall. One biopharma partner leveraged optical coherence tomography (OCT) studies and RWD to refine its AI model. Rhino FCP connected the biopharma to leading hospitals within the Rhino network, establishing a diverse and high-quality dataset composed of OCT images linked to longitudinal Electronic Health Record (EHR) data and clinical notes (all without needing to move data). Ultimately, this yielded an accessible “data enclave” containing a gold-standard ophthalmic dataset for biomarker development, representing hundreds of patients across multiple geographies, all harmonized to the same data model via Rhino’s data harmonization capabilities.
Federation is not a walk in the park—that’s why we’re here.
We developed Rhino FCP with enterprise data collaborations in mind. Focusing on security and privacy, incorporating critical technologies, and providing an integrated, end-to-end solution are all by design. We have built Rhino FCP into a totally secure, totally extensible collaboration sandbox that allows collaborators to run code on one another’s data. Rhino FCP offers flexible architecture (multi-cloud and on-prem hardware), end-to-end data management workflows (multimodal data, schema definition, harmonization, and visualization), a privacy screen (custom differential privacy budget, custom k-anonymization values), and secure deployment of custom code and third-party applications via persistent data pipelines. Even security, privacy, and legal teams are comfortable with Rhino FCP, a true testament to the high bar we’ve set. Four key tenets underpin Rhino FCP’s unique ability to connect data partners in a comprehensive ecosystem:
- Federated computing and secure data access;
- Data harmonization and traceability;
- Collaboration and workflow integration;
- Compliance and auditability.
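As a rough illustration of what a privacy screen with a differential privacy budget and a k-anonymization value can mean in practice, here is a minimal Python sketch. It is not Rhino FCP’s implementation: the `PrivacyScreen` class, its parameters, and the example counts are hypothetical, and it assumes each patient falls into exactly one group (so each count has sensitivity 1).

```python
import random
from typing import Dict, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class PrivacyScreen:
    """Hypothetical output filter applied before results leave a site."""

    def __init__(self, k: int, epsilon_budget: float,
                 seed: Optional[int] = None) -> None:
        self.k = k                        # minimum group size that may be shown
        self.remaining = epsilon_budget   # privacy budget left to spend
        self._rng = random.Random(seed)

    def release_counts(self, counts: Dict[str, int],
                       epsilon: float) -> Dict[str, Optional[float]]:
        """Suppress groups smaller than k, then add noise to the rest."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise ValueError("privacy budget exhausted")
        self.remaining -= epsilon
        released: Dict[str, Optional[float]] = {}
        for group, count in counts.items():
            if count < self.k:
                released[group] = None    # suppressed: too few patients
            else:
                # Laplace mechanism with sensitivity 1 (assumption stated above).
                released[group] = count + laplace_noise(1.0 / epsilon, self._rng)
        return released


if __name__ == "__main__":
    screen = PrivacyScreen(k=5, epsilon_budget=1.0, seed=0)
    # Made-up cohort counts at one site.
    print(screen.release_counts({"rare_variant": 3, "common_variant": 50},
                                epsilon=0.5))
    print("budget remaining:", screen.remaining)
```

A smaller epsilon adds more noise per query, and once the budget is spent no further results are released. Note that deciding suppression from the true count is a simplification; a production system would need to account for that step in its privacy analysis.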
Federated computing represents a paradigm shift in how data collaborations should work—streamlining the arduous process of consortia building and even enabling use cases that would otherwise be impossible due to data privacy or IP confidentiality concerns.
A more detailed description of federated computing and how our Rhino FCP achieves it can be found in our previous blog post, HERE.
To learn more about accelerating your AI agenda using federated computing, or to see a demo of Rhino FCP, reach out to us HERE.