Challenges and Opportunities for Clinical Registries Collecting Health Information from Patients – December 13, 2022
Video Transcription
Hello, everyone. Welcome to the CMSS webinar series on registry science and research. This is actually the final webinar in a series that has run over the last six months or so, and all the prior sessions can be found at our website, as will this one once it's done. Today's session is a really great opportunity to think, particularly through the lens of collecting data from patients, about doing that in a respectful way and using the best of technology to do it as well. So if I can have the next slide, I think we'll have the pictures of all of the speakers.
Nope, you can just introduce them.
Okay. I wasn't sure the picture would be there. Sorry, Julia. So our speakers for today, our session leaders, are Liz Garrett-Meyer, who's the Vice President of the Center for Research and Analytics at the American Society of Clinical Oncology, and Steve Labkoff, who is the Global Head of Clinical and Healthcare Informatics for Quantory. And then we have a wonderful group of panelists, including Anna McAllister, who's a consultant and patient engagement, use, and governance expert, as well as a very informed patient herself; Samantha Robichaux, who is the Solutions Lead for Registries and Clinical Research Networks at DataVant; and finally, Leon Rosenblatt, the Head of the Registry Practice Center of Excellence at IQVIA. It should be a great session, and I'm going to turn things over to Liz. Thank you.
Great. Thank you, Helen, for that introduction. Next slide, please. So the reason that I've been pulled into this group to discuss registries and access to data from patients is my participation in ASCO's development, over the past several years, of a COVID-19 registry. At the beginning of COVID-19, we decided that it would be an important effort to collect data from cancer patients who contracted the SARS-CoV-2 virus, and we established this registry in April of 2020. Next slide, please.
So our goal in the outset was to help the cancer community learn more about patterns of symptoms and severity of COVID-19 in patients in cancer treatment, and also about COVID-19 and its impact in the delivery of cancer care and patient outcomes. So we had several objectives relating to looking at the distribution of symptoms and severity of COVID-19, examining the impact of COVID-19 on cancer treatment and outcomes, and also documenting adaptations of cancer care during the pandemic. So as noted, we launched this registry in April 2020, and as of when I last looked, which was earlier this month, earlier this week probably, we have over 6,000 patient cases included in the registry from 64 practices across the United States. And each patient either has or will have up to two years of follow-up from the first positive SARS-CoV-2 test that they had. So we launched this very quickly, and one of the decisions we had to make very early on was how we were going to deal with accessing data. We took the approach of not seeking patient consent, which left us somewhat limited in what we could do. So we engaged practices, we signed agreements with practices, and the approach was for practices to submit data on their patients to the COVID registry. And it was IRB approved, and it turns out it was IRB exempt, but just to let you know, we did get IRB approval for the approach that we took. Next slide, please. So the target patient population for our registry were patients that had a positive SARS-CoV-2 test, in addition to having either active cancer at the time of the positive test, or the patient had to be cancer-free for less than one year, but on adjuvant treatment. So for example, if a patient had a mastectomy and then was receiving adjuvant therapy after mastectomy for breast cancer, and they were still on adjuvant therapy six months later and got COVID, they would be eligible to be part of our patient population. 
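The eligibility rule just described can be written down as a small predicate. This is only a sketch of the criteria as stated in the talk, not the registry's actual logic: the `Patient` fields and the `is_eligible` name are hypothetical, and "less than one year" is assumed here to mean fewer than 365 days.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    # All field names are hypothetical, chosen for illustration only.
    first_positive_test: Optional[date]  # first positive SARS-CoV-2 test, if any
    active_cancer_at_test: bool          # active cancer at the time of that test
    cancer_free_since: Optional[date]    # None if the patient was never cancer-free
    on_adjuvant_treatment: bool          # still receiving adjuvant therapy at the test


def is_eligible(p: Patient) -> bool:
    """Apply the inclusion criteria as described in the talk."""
    if p.first_positive_test is None:
        return False  # a positive test is required in every case
    if p.active_cancer_at_test:
        return True
    if p.cancer_free_since is None:
        return False
    days_cancer_free = (p.first_positive_test - p.cancer_free_since).days
    # Assumption: "less than one year" is taken as fewer than 365 days.
    return 0 <= days_cancer_free < 365 and p.on_adjuvant_treatment


# The mastectomy example: cancer-free for about six months, still on adjuvant therapy.
example = Patient(
    first_positive_test=date(2020, 7, 15),
    active_cancer_at_test=False,
    cancer_free_since=date(2020, 1, 15),
    on_adjuvant_treatment=True,
)
print(is_eligible(example))
```

Writing the rule this way makes the edge cases explicit, for example a patient who is cancer-free but no longer on adjuvant treatment is excluded.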
So on the right here, I show you just a snapshot of some of the data that we collect and we have available for public viewing in our data dashboard, and you can see the distribution of cancer types. We have patients from all cancer types, most commonly breast cancer, second most common lung cancer, colorectal cancer, and we also have a relatively large fraction of patients with hematologic malignancies and most notably B-cell malignancies. We unlike some other cancer registries, the largest fraction of our patients have metastatic solid tumors, and that's because we specifically reached out to practices and engage with practices to enroll patients versus having a direct-to-patient registry. However, today I'm going to be talking about some of the limitations that we have by the way that we set up our registry, and it is not direct-to-patient, it really is going through the practices and relying on data that exists in patients' EHRs. So next slide, please. So the way that we set it up is we established that we would collect a limited data set, and what that means is we do not have direct patient identifiers in our data set. So we developed a study design where we did not need consent from patients. That limited us in the data that we could collect. We could collect things like date of birth, but we could not collect things like patient address or social security number. So this created some challenges in our data collection and follow-up of patients. We did overcome them, and I will not give all of the specifics, but it does create challenges when you cannot have direct identifiers in terms of longitudinal tracking of patient information. So unlike some other registries that looked at COVID in various patient populations, we collected data at set intervals over up to two years from the time of the first positive COVID test. So if we're having data submitted in a serial way like this, we have to be able to connect all of the data that is submitted across patients. 
We also were reliant on extracting data from electronic health records from the practices. So we did not have specific case report forms for this registry. Well, we do have report forms from the sense of we're asking care providers to pull data out of the EHRs, but unless the care team at the oncology practice is specifically asking about the questions we were interested in, for example, symptoms of COVID, treatments for COVID, hospitalizations for COVID, and other related things about COVID, we may not have that data then available to enter into our registry. It also leaves us with an inability to easily connect data to other sources because we do not have direct identifiers to identify our patients. Next slide, please. So some other things that we were aware of and became more aware of over time, at the time that we started the COVID registry in April of 2020, none of us knew what was going to happen with the COVID pandemic. So we did our best to anticipate. But what we have found is that the cases that we probably tend to have in our registry are more likely to be those that were more severe COVID cases. Those that were asymptomatic or mild were less likely to be included in the registry. And while it might be what might have been less true early in the pandemic, as the pandemic has gone on over time, and especially, for example, in the last six months, it's quite possible that patients that have COVID, they might not even be noted in EHRs and oncology records. Now, that could be because it was not noted by the care provider, but it could also be that the patient did not share that information with the care provider. And that leads to a lack of representativeness from the perspective of we do not have necessarily good representation of all types of COVID cases in our registry. We're going to be more likely to have those that were more severe and more symptomatic. We also are prone to having missing data among cases. And I mentioned this a moment ago. 
So we're reliant on oncology practice EHRs for data on COVID-19. Cancer care providers are not necessarily as much focused on collecting data on COVID-19 as they are on things to treat and follow their patients for oncology-related services. So we do see that the more severe cases of COVID do tend to have more data in the EHRs, but we do find that a lot of the symptoms we might be interested in that might be somewhat mild, you know, coughing, fever, fatigue, they might not be noted in the EHRs. And some of these symptoms might also be related to cancer and not necessarily COVID. Another interesting thing that we've seen over time, something that we may not have been able to anticipate is the change in missingness over the course of the pandemic. Some of this has to do with the change in COVID severity over time. And this leads to, this is due to changes in the variants. Some of the earlier variants were more severe. The more recent variants caused less severe disease in addition to care teams being more well-prepared for dealing with and treating patients that have COVID. We also have changes in the patient population due to the changes in mitigation of the spread of COVID across our population. And then lastly, and importantly, early in the pandemic, practices were very interested in the COVID registry and submitting data so that all of us could better understand the impact of COVID on cancer patients. Not to say that people are not still interested in that, but we have learned a lot and there is less interest, especially given sort of the crunch of other work that occurs in cancer care teams, there's less interest in taking the time to follow up data on patients with COVID or adding new patients to the registry. Next slide, please. One of the last things I'll mention is just issues of de-identification. We were interested in taking our registry data and making it available for external researchers to do research on. 
And we initially thought we could use Safe Harbor to de-identify the data to share it with external research teams. And what we learned was that we could not do traditional Safe Harbor. And at that time, no accommodations had been made for data on COVID patients. At the time we were interested in doing this, I guess it was either the end of 2020 or early 2021. And to do Safe Harbor de-identification, you cannot have any information that will allow someone to determine the date of diagnosis within a 12-month period. So because COVID had not really been present in the United States for more than 12 months, by simply having patients in the registry, we were already going to be violating that if we put any information about dates in the data. So we had to come up with a different approach, and we had some experience with expert de-identification from other data projects at ASCO. And so we used an expert de-identification approach where we used date shifting, we did masking of geographic information, and then we had to do some actually pretty fun things with the social determinants of health data that we were able to link to our patients to ensure that we did not risk re-identification of patients. Next slide, please. So I've tried to highlight some of the challenges that we've had with the COVID registry. It still is a very valuable data set, and we have a number of publications of our own, and external researchers have also published from this registry. There are some benefits that we would have realized had we been able to have a direct-to-patient approach. We would have been able to have case report forms to directly get information from the patients versus relying on what was already existing in EHRs. We would have had simpler tracking and follow-up because we would have been in control of that at the registry level and not relying on registry practices to do that.
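The date shifting mentioned in the de-identification discussion above can be sketched as follows. This is a generic illustration, not the approach ASCO actually used, which the talk does not describe: the function names, the keyed-hash derivation of the offset, and the 180-day window are all assumptions. The idea is that every date for one patient moves by the same hidden offset, so the intervals between events survive while the calendar dates do not.

```python
import hashlib
import hmac
from datetime import date, timedelta


def patient_offset_days(patient_key: str, secret: bytes, max_shift: int = 180) -> int:
    """Derive a stable per-patient offset in [-max_shift, +max_shift] days.

    Assumption: the offset comes from a keyed hash of a registry-internal
    patient key, so it is repeatable but not guessable without the secret.
    """
    digest = hmac.new(secret, patient_key.encode("utf-8"), hashlib.sha256).digest()
    span = 2 * max_shift + 1
    return int.from_bytes(digest[:8], "big") % span - max_shift


def shift_dates(dates, patient_key: str, secret: bytes, max_shift: int = 180):
    """Shift all of one patient's dates by the same offset."""
    offset = timedelta(days=patient_offset_days(patient_key, secret, max_shift))
    return [d + offset for d in dates]


# Hypothetical example: a positive test and a follow-up visit three months later.
secret = b"registry-held-secret"  # placeholder value for illustration
original = [date(2020, 4, 1), date(2020, 7, 1)]
shifted = shift_dates(original, "patient-001", secret)
print(shifted[1] - shifted[0] == original[1] - original[0])
```

Date shifting on its own is not a complete de-identification method. In an expert determination it would sit alongside the geographic masking and other controls described in the talk.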
We think we would have had some mitigation of the biases that I mentioned. So I think that missing data per case would have been improved. It might not be perfect due to things like recall bias. And then in terms of missing cases, we probably still would not necessarily have a representative patient population. It would depend on our approaches for patient engagement and being able to reach out to identify patients who may have been asymptomatic or had mild disease and who may have been less engaged or interested in contributing to the registry. We probably would have had to address issues of consent, and we still would have likely had a change in the case mix over time due to anxiety about COVID, which was at its peak in early 2020. And while we still have anxiety over COVID at the end of 2022, certainly our anxiety levels have tended to decrease. So I hope that the case of our registry was useful for kicking off the session, and I will now turn it over to our next speaker, Steve Labkoff. And just as a point of interest, there will be time for questions and answers following the presentations. Thank you.
Terrific. Thank you very much. So good afternoon, everybody, and good morning if you're on the West Coast. My name is Steve Labkoff. I'm the Global Head of Clinical and Healthcare Informatics at a bioinformatics consultancy out of Boston called Quantory. I've also recently left a role where I was the Chief Data Officer at the Multiple Myeloma Research Foundation, where I helped construct and lead the implementation of the CureCloud, which is one of the largest registries in myeloma pretty much anywhere in the world. I'm going to be recounting some of the lessons that we took away that actually have very little to do with technology, but technology can have an impact on them. So why don't we go to the next slide? So I do have some disclosures. I usually do these in presentations.
I have a pension with Pfizer, I'm a shareholder at Quantory, and I'm a venture partner at Boston Millennial Partners. You can read the rest. You can go on. Next slide, please. So I like to do these as a case study, and this is sort of what we did with the MMRF. It was a rare disease research foundation. It had a great deal of interest from donors, and some donors donated a great deal of money to help build this registry to get over 5,000 patients. It was a follow-on to a study called the CoMMpass study, which had been run over the preceding 10 years and which garnered an awful lot of new information around what could be found. And so the CureCloud was being rolled out in a way to really increase the size and domain of the numbers of patients who were going to be in there. And the way that it was being crafted was that of a direct-to-patient registry. And I know that it was mentioned a moment ago what that is, and I'm going to go over that in just a minute. After a year, we had some issues. After it was rolled out, the group was extremely homogeneous. It didn't necessarily reflect the true nature of who actually gets myeloma in the real world. Also, we had issues with the registrations. They were very, very strong in year one, but as the program rolled on, registrations started to slow. And while there was tons and tons of enthusiasm, how do you keep that enthusiasm going? How do you keep that enthusiasm alive when the study is meant to be 10 years and you're only at the end of year one? So let's go to the next slide, and we'll talk about many of these different issues. So first of all, what's a direct-to-patient registry? This is from Nancy Dreyer's article that was published by AHRQ. A direct-to-patient registry is one designed so that recruitment and many of the communications and data collection are conducted directly with patients, without the guidance of medical professionals or anyone else in the scientific world. In other words, you're interacting directly with the patients.
You don't have clinical study managers in the same way that you would in a typical clinical study or other registry. While this has some tremendous benefits, it also brings on another whole set of challenges, much like any other trade-off. A typical registry recruits through other means. This one actually, because you're working with patients, has a lot of other things we'll talk about. Next slide, please. So there are some core differences, and there are some similarities. So with a direct-to-patient registry, you still need a protocol. You still need IRB approval. When it comes to recruitment, as opposed to necessarily having as much of a targeted patient base from a community or from a geographic area, in many cases, these tend to be all-comers from all over the country, which brings interesting challenges on a lot of different levels. Enrollment tends to be self-managed. Patients come to you after you sort of alert them that you exist, and it's different from where you're trying to reach out and bring them in, per se. All of the consents and everything need to get managed by the patient. So while you do sign a consent with a typical clinical study or registry, this one needs to be set up in such a way that it's not necessarily going to be handled or explained by a third party. It needs to sort of stand on its own in English that can be read at basically a fairly low reading level. Recruitment strategies tend to be 100% patient-facing, and that means that you need to figure out new ways of advertising the fact that it exists. When you do a typical registry, usually you don't necessarily give the data back to patients. In a direct-to-patient registry, one of the models is to give data back to patients. So when you're going to give complex information back to people who are not trained in science or medicine, that has implications.
And then the issues around loss to follow-up: one of the positive things in a direct-to-patient registry is that oftentimes you create methods to stay in touch with the patients, and that actually can help decrease things like loss to follow-up. Next slide, please. The kinds of data that can be in there really vary. They can be things like abstracted medical records. You can add things like genomics and proteomics. Those are going to be gotten through tissue specimen acquisition, which has different sets of challenges in this type of an arena. Patient-reported outcomes, again, will be self-managed by the patients through survey tools. You'll have outreach notes, potentially, where you have nurse navigators who will reach out to the patients and take notes, which can become another modality. You might even have imaging. Some of these things are going to be mediated by the patients themselves, and others will be mediated by the patients giving consent for those records or those assays to be run on their behalf. Either way, all of this consent and all of the signatures and everything that needs to be done just have to come through the patient and generally are going to be done through your interface to them. Again, that has to stand basically on its own. Next slide, please. What are some of the challenges? I've only listed a handful here, and there are many additional challenges when you bring up a situation like this. We'll just cover a few of them. The first one is going to be representativeness. Then how do you handle things around recontacting, gathering tissue when you're not working in an academic institution or a research institution, dealing with patient literacy, and then long-term follow-up and what we call the WIIFM, the "what's in it for me?" In other words, how do you keep people involved when you've got to give them something back?
Technology is not necessarily the way to approach this, but technology can certainly help with all these things. Next slide, please. So, if you look at multiple myeloma, the American Cancer Society says that African Americans basically make up around 14, in some cases 18, percent of the people who have it. Well, the reality of what happened in our case was that after about a year of running the program, the representativeness was unfortunately just really not happening properly. Now, the initial challenge that we had was how do you recruit, and the initial recruitment strategy revolved around websites, Facebook notifications, chat rooms, and other types of social media. Well, those types of approaches have their own independent biases in terms of who uses them, and as a result, early on, the numbers of African Americans that were represented in our system unfortunately just didn't come up to snuff. We had to come up with a new strategy and work very hard to redeploy resources in order to figure out how we get enrichment for that community, so that, at the end of the day, when the registry is filled, it actually is representative of who gets the disease. It wouldn't be very good if it was skewed to one population or another. So, the lesson learned here, and we lived this one, is that you have to actually watch your recruitment. You have to be diligent about your recruitment styles and how you're doing it, and when you start to see things skewed one way or another, you may need to change things around. Next slide, please. So, another thing that we had to think through around representativeness was: when is enough enough? How do you know how many patients you need to make representative studies down the road? So, one of the analyses that was done was we said, well, would a thousand patients be enough?
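The question of whether a thousand patients would be enough comes down to simple arithmetic: registry size times the frequency of a genomic variant times the fraction of patients at a given disease stage gives the expected number of patients in that cell. The sketch below is a minimal illustration of that calculation. The 4% variant frequency and 25% stage fraction are made-up placeholders, not figures from the CoMMpass study, and the function names are hypothetical; the minimum cohort of 20 follows the threshold mentioned in the talk.

```python
def expected_cohort_size(registry_size: int, variant_freq: float, stage_fraction: float) -> float:
    """Expected number of registry patients with a given variant at a given stage."""
    return registry_size * variant_freq * stage_fraction


def is_feasible(registry_size: int, variant_freq: float, stage_fraction: float,
                min_cohort: int = 20) -> bool:
    """True if the expected cell size reaches the minimum cohort (a 'green' cell)."""
    return expected_cohort_size(registry_size, variant_freq, stage_fraction) >= min_cohort


# Placeholder inputs for illustration only: a variant seen in 4% of patients,
# studied in a disease stage that covers 25% of the registry.
for size in (1000, 2500, 5000):
    cell = expected_cohort_size(size, 0.04, 0.25)
    status = "green" if is_feasible(size, 0.04, 0.25) else "red"
    print(size, round(cell), status)
```

Running the same check over every variant and stage combination of interest, before recruitment starts, gives a target registry size grounded in the first queries the data is meant to answer.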
And if you look at this, on the left side of these two tables you see the genomic variants that are found in myeloma and their frequencies that came out of the CoMMpass study. And if you look at how some of the scientists we were working with wanted to study this disease and the data coming out of the registry, we found that it had different types: newly diagnosed, and relapsed one, two, and three. And if you did the math to see what those percentages look like in the general population, you'd see that anything here that is listed in red (I'm not sure why it's blue at the bottom in the legend, but it's anything in red there) doesn't have enough patients to even do a 20-patient cohort. And if you raise the number of patients up to 2,500, more of them turn green, fewer red. If you go to the next slide, you'll see that when you get to 5,000, most of the grid turns green. And, you know, figuring out a priori what you're trying to do and how you're trying to study it matters, because the minute you do your first query against any population, you're going to cut it back, and this was one of the guiding things that we used in trying to figure out how to pre-load, or at least have a target for, the patients for that study. The lesson learned here is to make sure that before you begin, you have a sense as to what you want to do with your data and what your likely first sets of queries are going to be, so you can actually ensure that you recruit enough patients. Next slide, please. I mentioned the issue around recruiting the African-American population. One of the ways you need to think this through in this particular case is that, instead of reaching out through Facebook or through social media, you might have to go into the community. You might have to figure out ways of gaining access to the religious community. Or in other cases, we actually were able at the time to enlist the help of a myeloma patient. It's common knowledge. We're not disclosing anything.
Colin Powell, the former Secretary of State, was a myeloma patient, and he came and did some speaking on behalf of our organization at the time. That helps bring in folks from that community. It's a work in progress. I can't really report on where things are today. I'm no longer with the organization, but I do know that they put a lot of energy into making this recruitment strategy come back into line. Next slide, please. The recontacting of patients. This is a thing that in the direct-to-patient registry can often be a plus. It does require that you have special staff or specially trained staff who will talk to patients, nurses or patient navigators, and they'll generate additional data for you, too. In fact, they can actually generate narratives and answer surveys in the course of their contact with patients. They also provide an ability to do some checking with patients if something is actually going sideways in their therapy. We enlisted three patient navigators who generated a lot of data. More importantly, they actually had a relationship that formed up with the patients who were in the study by contacting them every several months. By having that relationship, it also worked on engagement, which I think is the next topic. Next slide, please. Sorry, it will come up next after this one. One of the things we were trying to accomplish here was to bring in genomic data around and collect biospecimens to do that. When you're not working in a university setting or a hospital setting where you have access to labs and lab equipment, all of a sudden, collecting biospecimens can turn into a pretty significant challenge. We worked this out by building out a home phlebotomy strategy. Then we found in COVID, people didn't want phlebotomists coming into their homes, and the phlebotomists didn't want to be going out either at the time. 
We eventually had to pivot and bring in partners like Quest and basically brick-and-mortar locations, where the patients would go in and have their blood drawn for this. All of this requires new thinking around how you handle these things in the real world when you're not dealing with the resources that you typically have in a hospital setting. Next slide, please. Patient literacy is another area that really is a significant challenge. In every single thing we did as we built this out, and it wouldn't just be the case for the CureCloud, it would be the case for any direct-to-patient registry, you must take into account patient literacy. If patients don't understand what you're asking, or don't understand what the uses of the data are going to be, or aren't able to get through the consent, it's going to be a significant challenge to being able to help them through their journey. Every single thing we looked at for the program, every webpage, every document, every bit of the consent, was reviewed, and at the very most, it was at the eighth-grade reading level. There were some times we thought we might actually have to lower that down to the fourth-grade reading level. My point here and the lesson learned here is basically that you must take this into account if you're going to be going directly to patients. If they don't understand what you're trying to do and what you need, it's going to be very hard for them to interact with you and for you to help them through their journey. Next slide, please. This also goes to things like visualizations and handing data back. One of the long-term things that we did was that, for these patients who gave of their time and literally their blood, we wanted to make sure we gave them information back.
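A reading-level review like the one described above can be partly automated. The sketch below uses the Flesch-Kincaid grade level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59, as one common way to estimate a U.S. school grade. The talk does not say which readability measure the program used, so this choice is an assumption, and the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the U.S. grade level of a passage with the Flesch-Kincaid formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Two hypothetical consent sentences, written only for this illustration.
plain = "We will ask you some questions. You can stop at any time."
dense = ("Participation necessitates authorization for longitudinal "
         "aggregation of identifiable clinical information.")
print(flesch_kincaid_grade(plain) < flesch_kincaid_grade(dense))
```

A score like this is a screening aid, not a substitute for review by patients themselves; a passage can score at the eighth-grade level and still be confusing.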
Again, giving information back to laypeople as opposed to scientists means you've got to go to great pains to ensure that those findings are understandable. Again, the visualizations had to be vetted. They had to be thought through and reviewed so that they would be at the proper reading level. We also wanted to make sure that we were giving this back. If there were things that we're finding out in the course of the study, we wanted to make sure that the patients actually were being able to get access to their own data as well as the data in the study. That actually has challenges as well in terms of not just the literacy, but also do the patients have the capacity to understand what it all means. That's another thing that would often get spoken to in the course of the care navigators working through the program. Next slide, please. Then long-term engagement. At the end of the day, if they're going to be running something that's going to be out there for five or 10 years, you've got to have ways that really engage the patients in a very meaningful way. The strategy we took was actually we hired a marketing agency, an actual marketing agency that worked with patients to help us deal with the visualizations and help us with planning things for the review of our content. We wanted to make sure that all the things we're giving back were being utilized in a way so that patients could actually digest it. Believe it or not, a marketing agency did that really, really well and frankly, probably better than we could have done it on our own. This was another lesson learned that was a real positive. I would encourage you if you're going to be going this road to consider that if you've got the budget to pull it off. Next slide, please. This is your last slide. The last slide, yes. In review, when you're going into working with patients, you've got to think about a variety of different aspects of it. The ones that I outlined here in this little talk are just the beginning. 
Trust me, if you wanted to chew on my ear for another hour or two, we could go through all the lessons learned because there's lots and lots and lots of them. Thanks very much. I believe next is Leon.
Thanks very much, Steve. Delighted to be here. I'll start introducing myself while the slides are being called up. I'm Leon Rosenblatt at IQVIA. I head the Registry Practice Center of Excellence. In that capacity, our team manages over 85 major registries at this point, primarily for not-for-profits, medical specialty societies, and patient advocacy groups. I myself have worked on over 75 registry programs at this point. What I was hoping to do in our next 12 minutes together is build on the excellent presentations by Liz and by Steve, who gave you a beautifully focused case study view of what registry programs are about. What I want to do is zoom out a little bit and talk at a wider-angle-lens level about how these programs run in general and maybe provide a little bit of a framework for thinking about direct-to-patient. Next slide, please. Next and last, actually. I'm going to manage this with a one-slide outline. Today, I'm going to tell you about direct-to-patient registries, the opportunities they present for medical specialty societies in particular, the challenges they pose, and the solutions that can address these challenges. Before we proceed, let's set some context. We should note that direct-to-patient registries often involve a major shift in perspective from how medical specialty societies usually look at data collection. Most societies are accustomed to looking through the lens of quality improvement research and assume that data from patients takes the form of PROs collected to enrich their QI use cases. Direct-to-patient registries are generally designed and built not only for QI, but also to support broad research uses. In this brave new world of broad research, the patient is the unit of analysis, not the provider.
This patient centricity is not always a natural fit for medical specialty societies. As member organizations representing providers, they're often hesitant to take any steps that position them between the provider and the patient. Doing so is awkward and potentially complex politically. I find it useful to acknowledge that awkwardness and complexity up front, because the question is always going to be: are the benefits of going direct-to-patient worth getting past the resistance you're going to get in your natural role as a medical specialty society? Now let's walk through the biggest opportunities for direct-to-patient. Direct-to-patient registries can deliver new and diverse data streams that are not available in a clinical setting. You saw some of them already. You saw some of the work that Liz presented, and Steve. There are actually exciting data types like molecular data and digital imaging data. The simplest kind to implement and the most common are various kinds of patient surveys or self-reports. These include quality of life, social determinants of health, as well as more traditional PROs and just generic natural-history-of-disease type questionnaires. This year, the most exciting data collection advance in direct-to-patient registries is patient-mediated EMR ingestion using HL7 FHIR endpoints and the USCDI data set. This approach started appearing as an exciting direction on the horizon about four or five years ago. It feels like we've been talking about it longer, but I know we've talked about it at least that long and started working on it. I'm happy to report that it's finally working this year.
You can now actually see a fully functional mobile app where the patient creates an account, selects their healthcare provider from a list, logs into the healthcare provider's EMR portal, like MyChart, authenticates using their MyChart credentials, authorizes a data connection, and then gets their EHR data loaded into the registry without any additional manual effort by the patient or by anyone else. Today, patient-mediated EMR ingestion is probably the most exciting short-term opportunity for rapidly ramping up your clinical data collection from where you are, but I think there's an area with even bigger long-term potential for direct-to-patient registries, and that's at-home and wearable sensors that automatically collect data outside of the clinical setting. These have the potential to enrich and deepen our out-of-clinic data about patient behaviors and experiences, and as we all know, most of life occurs outside of the clinical setting. To spark your imagination just a little bit, I'll offer one brief example that struck me as both achievable and compelling. A patient advocacy group I was speaking to recently, which I won't name, is using their registry to combine a patient-reported outcome on joint pain, so a very standard PRO, with an activity monitor using the patient's phone and an approximate GPS location, again using the patient's phone. You can use the GPS location to look up the weather at the time of the patient's activity. Now, why would you want to do that? Well, you can now investigate how activity and weather interact to increase or decrease joint pain. That's pretty cool, right? And no sci-fi tech is needed at this point. This is fairly, you know, out-of-the-box commercial technology using standard connectivity, at least in the modern world. It was all very sci-fi 20 years ago. 
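As a rough illustration of the joint pain example, the three streams can be lined up by date and summarized. This is only a sketch under stated assumptions: the data values, the 5,000-step threshold, and the function names are all hypothetical, and a real analysis would need a proper statistical model rather than group averages.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical daily records for one patient; every value here is made up.
pain_scores = {  # patient-reported joint pain, 0 (none) to 10 (worst)
    date(2022, 11, 1): 3,
    date(2022, 11, 2): 6,
    date(2022, 11, 3): 2,
    date(2022, 11, 4): 7,
}
step_counts = {  # from the phone's activity monitor
    date(2022, 11, 1): 8200,
    date(2022, 11, 2): 2100,
    date(2022, 11, 3): 9400,
    date(2022, 11, 4): 1800,
}
weather = {  # assumed to be looked up from the approximate GPS location and date
    date(2022, 11, 1): "dry",
    date(2022, 11, 2): "rain",
    date(2022, 11, 3): "dry",
    date(2022, 11, 4): "rain",
}


def activity_level(steps, threshold=5000):
    """Bucket a daily step count; the threshold is an arbitrary assumption."""
    return "active" if steps >= threshold else "sedentary"


def mean_pain_by_condition(pain, steps, weather_by_day):
    """Average pain score for each (activity level, weather) combination."""
    groups = defaultdict(list)
    for day, score in pain.items():
        # Only use days where all three data streams are present.
        if day in steps and day in weather_by_day:
            key = (activity_level(steps[day]), weather_by_day[day])
            groups[key].append(score)
    return {key: mean(scores) for key, scores in groups.items()}


summary = mean_pain_by_condition(pain_scores, step_counts, weather)
for condition, avg in sorted(summary.items()):
    print(condition, avg)
```

The point of the sketch is the join itself: once the PRO, the activity data, and the weather lookup share a key (here, the calendar day), the interaction question becomes an ordinary grouping problem.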
So finally, there's another major opportunity that may be less obvious but may be more important for long-term sustainability: high patient trust and high patient engagement can translate into valuable relationships, such as clinical trial recruitment and follow-on research, and that is one of the most important outcomes of a direct-to-patient program experience. All right, now that we've talked about opportunities, let's talk about the challenges, and there are plenty. First, note that the opportunities I mentioned above involve bringing together novel, complex, and diverse data types. That's not easy, and it implies the ability not only to collect the different data streams, but to integrate them effectively into something usable. Second, notice that with direct-to-patient registries, we're inevitably talking about collecting identified data. So, you know, Liz gave an example of where they were able to steer away from that in the ASCO case, but longitudinal research designs mean your organization is handling identified data whether it wants to or not. Yes, you could de-identify downstream, and it's usually wise to do so, but if you're going to follow a patient over time, you're going to wind up doing things like allowing them to have an account where they can log in and recover their password, right? Otherwise, you're just not going to be able to follow them. Thus, your organization is almost always handling PII, at least at the very front end of the data collection pipe. To make your regulatory life more interesting, you may now also be conducting human subjects research as defined by 45 CFR 46, Subpart A of which is known as the Common Rule, and unlike your traditional QI registry, it's probably not exempt from IRB review, right? 
While this may not be an insurmountable obstacle, it does mean you need to think carefully about the regulated research enterprise and have a plan in place for handling any additional requirements that have now been imposed, like IRB approval of protocol changes and consent management. Consent management, by the way, becomes challenging when you have multiple data sources and multiple potential data uses for each patient, as well as regulatory mandates, like enabling withdrawal of consent. With multimodal registries, consent management goes well beyond managing a single ICF for a single study, and the content of the ICF also has a much more multifaceted long-term impact with direct-to-patient. When designing your ICF language, don't forget the three most important things that you want the patient to consent to in the long term. The first is broad data use, or data uses. Number two is the ability to recontact for future research, and three, some clear ethical limits on withdrawal of consent, to the extent that your local jurisdiction allows those limits. So I think in an ideal case, a patient in a long-term registry study can withdraw from future data collection activities and any future data distributions, but they shouldn't be able to require that you retract all of their data from any datasets that have already been distributed, released, or published. All right, so your next set of challenges revolves around representativeness and sampling bias, and they've come up in both of the conversations so far, and they're quite serious. With eCRF-based data collection, you're often performing a census of all patients from a clinical site that meet your inclusion criteria. Now, there are other methodological issues. You're actually sampling from a population of sites. You don't have to worry about that, but with direct-to-patient registries, you must recruit and retain patients in a process that most methodologists would describe as convenience sampling. 
Unfortunately, you can't avoid sampling bias with convenience samples, but you can work to reduce it, and Steve talked about a few ways that MMRF has done so. Note that the problem is sometimes discussed under diversity and inclusion, as the sample demographics often stray quite far from the gender, ethnic, and racial makeup of the population of interest. The problem is made even more difficult where the long-term follow-up durations stretch over a decade, like with MMRF, and it becomes even more acute when you have one to two decades, such as with post-approval safety monitoring for cell and gene therapies. We now don't even know what to call it. We're just starting to report on these very, very, very long-term follow-up studies, so the durations pose their own special challenges. All right, now let's talk about solutions to address the challenges we've mentioned so far. All of these challenges can be addressed with the right planning, the right processes, and the right tools. First, let's talk about sampling bias. You can reduce sampling bias by improving recruitment and retention, and by making it reactive, right? So, as Steve described, for example, having a dynamic process can really make a big difference. You improve recruitment and retention through proactive patient engagement. Engagement drives retention. Return of value is what drives engagement. What does return of value look like? I know you're all thinking gift cards, but it's not just gift cards. Thank you, Steve. So, investing in your story and your branding, providing back genuinely useful information (very thoughtful data displays, for example), giving patients and families personal attention, respect, and belonging, and building a sense of community can all offer value that keeps the patients engaged, even over long intervals. Gamification and good UI/UX design can also help and can reinforce other engagement paths that you've taken. 
You'll also want to include the right consent management process and tools. Remember, patients are consenting for broad use, and you'll want to reconsent them for supplemental follow-on research activities. I recommend that you think of your consent management as a bit more than that. It's not just consent management. It's a patient privacy and data rights brokerage. You want to empower patients to manage what rights to their data they want to transfer, to whom, for how long, and for what purpose. Why? Empowerment and transparency in this regard can help maintain the high levels of trust you need for long-term engagement. At the organizational level, you'll want to establish centralized research data governance. If your organization doesn't yet have centralized governance for the research function, planning a direct-to-patient registry is a really good time to create one. Note that this function may or may not fit under the medical quality umbrella, where a lot of registry projects originate in medical specialty societies. The broad research function often needs its own leadership and sometimes needs specialized staffing, specialized tooling, and specialized vendor relationships. Finally, you'll want to support your centralized data governance with a systematic research data management process deployed on a robust infrastructure. Now that you're dealing with identified data and are regulated as a human subjects research enterprise, you'll want to insulate research data from other business data, like marketing surveys, membership records, donor lists, etc. They may look like research data, but they ain't. You'll want to run some systematic and predictable processes to integrate data across multiple types and multiple time points, and to ensure an appropriate level of data quality for the intended data uses. 
This implies you need a platform that can help streamline and automate integration-related data management functions, like data harmonization, data linking, curation, and transformation. Since your projects are also going to be dynamic, they will change. You want to be able to add new studies, new data types, and new data outputs gracefully without having to rebuild from scratch or reinvent all of the tooling and all of the processes. And finally, you need a platform that can manage rapid and extensive metadata evolution while maintaining high levels of data quality. This is a lot, but you can develop a crawl-walk-run plan. You can start small and build up incrementally. More importantly, if expanding to direct-to-patient registries presents a real opportunity in your specialty or your disease area, you can manage the challenges, as you've heard from both Steve and Liz, and as I know you'll hear from our other panel participants. So with that, thank you very much. And we're going to turn it back over to the panel.
Great. And I think I'm up next here. Thanks so much. So just as a quick introduction, I'm Samantha Robichaud. I am from DataVant, where I lead our solutions for registries and clinical research networks. I'm thrilled to be part of this panel to talk about new solutions for privacy-preserving record linkage. Julia, if you can jump to the next slide, please. So I wanted to start this section with a quote, which I believe is one of the first mentions of record linkage, way back in the 1940s. "Each person in the world creates a book of life. This book starts with birth, ends with death. Record linkage is the name of the process of assembling the pages of this book into a volume." And just like in the 1940s, and even more so today, we need record linkage because health data is fragmented. Next slide, please. 
When we think of data fragmentation, we often think of these disconnected first-party data sources like EHR data, claim data, lab data, pharmacy data, but part of this melange are registries. However, I strongly believe registries are in a very unique position to improve on data fragmentation with data linkage or record linkage. Next slide, please. Many of you have probably... Oh, Julia, next slide, please. Thank you. Many of you have probably heard these numbers, but just to further underscore data fragmentation, rare disease patients see on average seven doctors over seven years to get a diagnosis. About 35% of patients are switching doctors every two years and the average person sees 18 physicians across their lifetime. And this causes inherent fragmentation and it begs the question, what could we do if data were linked together? Next slide, please. So this slide has a number of use cases for linking data, many of which involve patient registries. So take here at the top complex disease management, take a disease that impacts many different organ systems and understanding the disease requires a multi-modal approach. A good example could be multi-system inflammatory syndrome in children. The NIH themselves has said that advances in scientific discovery and of course, making the most of their research investments can come from linking data across research efforts on MIS-C. Another example looks at rare disease. Take a rare disease that's treated by a specialist, very specialty focused, and there's a specialty registry that tracks outcomes and treatment. However, that registry lacks contextual information about comorbidities, which could potentially be identified through claims. Understanding a global burden of disease can inform treatment recommendations and future areas for research and investment. 
A third example of a use case, when going direct to patients, collecting information about their experience, say with a rare or common disease through a patient reported outcomes registry, this information, if linked back to EHR data, can corroborate things like their treatment plan and the medication regimen and build out that longitudinal view of the natural history of disease, which will be much more accurate and impactful by combining that clinical and PRO data. And lastly, take a registry that focuses on quality of care and adherence to evidence-based care guidelines and linking that registry to social determinants of health. That could uncover some of the social and behavioral underpinnings of patient health, allowing for that whole person care, informing care delivery, and highlighting potentially unaddressed health disparities. Next slide, please. And it's worth mentioning that we are in a really unique position today, even more so than 10 years ago, given the complete digitization of patient data. Linkage is more important and provides more possibilities than ever before. Next slide, please. And luckily, there is a real-world data ecosystem out there with de-identified data ready to be linked. Claims aggregators were some of the first organizations to make data available, but there's a growing body of entities that support linkages with their first-party and third-party data. So next slide, please. And let's move on now to approaches to link patient data, whether it's in a registry or a raw data source. So at a high level, there are two approaches. For simplicity, we can call them traditional and privacy-preserving. So traditional linkages take personally identifiable information like a social security number and looks for a match across two different data sources. And that really means that that PII, that personally identifiable information, is being exposed downstream to someone else who is doing that matching. 
Privacy-preserving record linkage, on the other hand, takes an approach where PII is never exposed. So PII elements are combined and a hashed code is created, which is an irreversible token that can further be encrypted and compared for possible matches. And it's those resulting matches from tokens that can be used to link records across different disparate data sources. And this is often called tokenization, which you may have heard before. So again, under this privacy-preserving model, PII never has to leave or be shared with anyone outside of an institution or an organization that houses the data. Next slide, please. PPRL, privacy-preserving record linkage, has actually been around for a number of decades. Back in the early 1990s, the NIH created one of the first versions of PPRL, a fairly simple, straightforward hash called the Global Unique Identifier, or the GUID. Many of you, I'm sure, are familiar with this. It takes six PII elements, which are on the screen, and creates a single token: first name, middle name, last name, sex, date of birth, and city of birth. I'm sure many of you can see immediately some of the limitations of an approach that requires all six of these data elements. Things like middle name or city of birth may not be readily available in many data sets, or they might be sparsely populated. Modern PPRL approaches are built to handle this type of messy data and actually build out multiple types of tokens based on different combinations. So take different spellings, like John and Jon or Allison and Alison. It handles a lot of that messiness, and it acknowledges that data sources are collecting different combinations of PII, and so different tokens can be created. Before moving on, I did want to note that tokenization is not the same thing as de-identification. 
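To make the idea of multiple token designs concrete, here is a minimal sketch in Python. It is not the GUID algorithm or any vendor's scheme: the design names, field names, key, and normalization rules are all assumptions chosen for illustration, and it uses a keyed SHA-256 hash (HMAC) from the standard library as the irreversible step.

```python
import hashlib
import hmac

# Hypothetical token designs: each one is a different combination of PII fields.
TOKEN_DESIGNS = {
    "name_dob_sex": ("first_name", "last_name", "dob", "sex"),
    "dob_email": ("dob", "email"),
    "first_name_email": ("first_name", "email"),
}


def normalize(value):
    """Lowercase and strip all whitespace so trivial formatting differences vanish."""
    return "".join(value.split()).lower()


def make_tokens(record, secret_key):
    """Create one token per design for which the record has every required field."""
    tokens = {}
    for design, fields in TOKEN_DESIGNS.items():
        values = [record.get(field) for field in fields]
        if any(v is None or not v.strip() for v in values):
            continue  # skip designs this data source cannot support
        message = "|".join(normalize(v) for v in values).encode("utf-8")
        tokens[design] = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return tokens


key = b"shared-demo-key"  # hypothetical; a real key would be managed securely

# Two made-up sources holding different subsets of PII for the same person.
source_a = {"first_name": "Jane", "last_name": "Smith", "dob": "1980-02-03",
            "sex": "F", "email": "Jane.Smith@example.org"}
source_b = {"first_name": " jane", "dob": "1980-02-03",
            "email": "jane.smith@example.org"}

tokens_a = make_tokens(source_a, key)
tokens_b = make_tokens(source_b, key)
shared = {d for d in tokens_a if tokens_b.get(d) == tokens_a[d]}
print(sorted(shared))
```

Here the second source lacks last name and sex, so it cannot produce the first design, but the two email-based designs still line up. Note that this sketch only absorbs case and whitespace differences; handling spelling variants such as John and Jon would need an additional step (for example, phonetic encoding or nickname tables) that is not shown.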
Tokenization, when you hear that term, the way to think of it is it's a strategy to apply when de-identifying records, so it remains linkable despite eventual redaction of that PII. Next slide, please. So let's look under the hood. So this is just an example of different token design combinations that can yield high-precision matches. These token designs are balancing data uniqueness, data entry patterns, and data availability across different data sets. So even taking first name email as a combination or date of birth email as a combination, there's a lot of different options for different data sources. Let's jump to the next slide, please. So what would a basic data flow look like? Again, very high level. Let's say two registries want to share data and see where they have overlapping individuals. These two registries create tokens based on the PII that they have available. Notice here, Jane Smith exists in both registries, first name, last name, social security, and date of birth. And then notice here, the tokens generated don't match yet. You can see in blue and gray, these are not matches here. And this is intentional and by design so that each group must first agree that they want to share data before a match can be made. Additionally, it's helpful in the event of say a data breach or a bad actor within one organization that it doesn't expose or compromise identifiers in a different data source. So once they've agreed to work with each other, they're then able to decrypt and match their tokens together and find high fidelity linkages across two data sources that previously were not linked together. And they did not have to share Jane Smith. They didn't have to share a social security number or any of that PII with each other. And I'd be remiss in not mentioning that any time two de-identified data sets come together, those added patient elements can increase the risk of re-identification. 
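The two-registry flow described above can be sketched as follows. This is a toy model under loud assumptions: the function names and keys are hypothetical, and the site-specific layer is a simple XOR with a key-derived pad, used only because it is reversible and needs no third-party library. It is not secure and stands in for real encryption; it does not describe how any actual product works.

```python
import hashlib
import hmac


def master_token(pii, hash_key):
    """Irreversible keyed hash of normalized PII values."""
    message = "|".join(p.strip().lower() for p in pii).encode("utf-8")
    return hmac.new(hash_key, message, hashlib.sha256).digest()


def site_wrap(token, site_key):
    """Toy reversible site-specific layer: XOR with a pad derived from the site key.

    Applying it twice with the same key restores the input. NOT secure;
    a placeholder for real symmetric encryption.
    """
    pad = hashlib.sha256(site_key).digest()[: len(token)]
    return bytes(t ^ p for t, p in zip(token, pad))


HASH_KEY = b"demo-hash-key"                          # hypothetical shared hashing key
KEY_A, KEY_B = b"registry-a-key", b"registry-b-key"  # hypothetical site keys

# Made-up identifiers for a person present in both registries.
jane = ("Jane", "Smith", "000-00-0000", "1980-02-03")

token_a = site_wrap(master_token(jane, HASH_KEY), KEY_A)  # stored by registry A
token_b = site_wrap(master_token(jane, HASH_KEY), KEY_B)  # stored by registry B

# Before any agreement, the stored tokens do not match, by design.
print(token_a == token_b)

# After both registries agree to link, the site layers are removed and compared.
is_match = site_wrap(token_a, KEY_A) == site_wrap(token_b, KEY_B)
print(is_match)
```

The design choice this illustrates is the separation of concerns: the hash keeps PII from being recoverable, while the site-specific layer keeps tokens from being comparable across organizations until both sides opt in.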
So it's paramount to ensure that any combined data set remain de-identified through one of the two methods outlined in the HIPAA Privacy Rule, safe harbor or expert determination. Next slide, please. So what lessons learned and opportunities are there for registries, especially registries interested in linkages? In my conversations with groups who are looking to do this, there are three topics that have consistently come to the top. So first, collecting robust PII at the outset. Many groups, maybe with the best of intentions, maybe don't feel comfortable collecting a lot of PII elements like first name, last name, date of birth. I've seen instances where data could be collected anonymously. I would say it's a little bit short-sighted as it can limit data linkages down the line. And as we heard earlier, even from Leon, there's other uses for PII, such as recontacting patients for a long-term follow-up. Second is data ownership. So some registry data is stored on a data platform. And on some of these platforms, data may not always be readily accessible. So a bespoke linkage opportunity could present itself, but the data cannot be accessed by someone outside of that data platform. So another thing to think about. And lastly, for projects that comply with an IRB protocol, these protocols may have provisions for de-identified data sharing, as many do, but it's important to include an option to link de-identified tokenized data to avoid possible roadblocks down the line. So it's good to be thinking about this upfront at the start and make those amendments wherever possible. And so to wrap up my last slide here, I wanted to just throw out, there's some great articles that were published fairly recently, if you're really curious to learn more about privacy-preserving record linkage. The first is from the National Institute of Child Health and Human Development. It actually discusses multiple modern PPRL approaches. 
And the second is a study that looked at patient overlap across a clinical research network, actually PCORnet, using a PPRL method. And with that, thank you for your time. Looking forward to any questions that you may have at the end. And with that, I will hand it over to Anna.
Hi there. I do not have slides, so you just get to stare at me the whole time, but I did want to commend those who came before me for some excellent presentations. I must say that I'm pretty impressed with some of the thoughtfulness that's gone into their presentations and how far we have progressed in developing registries for patients. I'll give you a little bit of background. I used to do other things; I started in economic policy and foreign policy, but over time, because I live with type 1 diabetes, which is very complex and has always been very difficult to control, and I have all the complications from the disease, I became more and more focused on healthcare. About 11 or 12 years ago, I was so frustrated with the stagnation of research in type 1 diabetes, and encouraged by the emergence of all of these new forms of what we now call real world data, that I co-founded a company to do big data analytics through a visual interface. And that sort of launched me into the world of health data, health registries, and a variety of other health IT policy activities, which has been fun and engaging and frustrating all throughout the process. Within the context of that, I've done a lot of stuff in the type 1 diabetes space as it relates to medical devices, data standardization, and policy with FDA, and helped kick-start a patient hacker movement in the type 1 diabetes space that's now gone global. And that real world data collected through medical devices, continuous glucose monitors, and insulin pumps is now fueling its own sort of ecosystem of real world data analysis from the devices that this hacker group has created, and is now fueling its own peer-reviewed research, all of which is driven by patients. 
In some cases, there are physician co-PIs, but the stuff is now getting published in peer-reviewed high profile medical journals. So type 1 diabetes in many respects is an outlier because we are a disease community that lives with lots of data. But I do think a lot of the things that we have learned along the way can be very helpful for informing approaches for real world data collection, registry development for other disease states. Some of the key things that I have observed over time, both in the type 1 space as well as in other spaces where I've worked, I now work as an independent consultant focused primarily on data use access and governance and have been working for the past three and a half years with one of the big genetic testing companies and a few other technology companies, helping them think through how do you engage patients in aspects of data governance? How do you build trust through the activities that you undertake to build out your data governance, management and access systems? And how do you use your processes as a way not just to build trust for the organization, but also to help patients understand the important role that they have in contributing to the advancement of science. So this is now what I do with not just my spare time, but also my work time. So one thing that I've noticed that seems to be missing from the awareness and this may be changing, but as you think about how you're going to design all aspects of your registry, especially if it's one where you anticipate trying to get data from patients over a prolonged period of time, you need to realize from the outset that every engagement is an opportunity to build trust and credibility or to lose it. And that equation is something that nobody keeps, very few people I would imagine would ever like keep a tally or a score, but in your head over time, it's just like any relationship. Is this person primarily acting in a trustworthy way? Are they credible? 
Do I want to devote my time to them? Or is it, are they through their actions, what they're asking of me, in essence, disrespecting me, my time, and do they seem deserving of trust? And that engagement isn't just a request via an email, it's everything about the user experience and user design of your app or your interface. It's about the type of data that's being collected, how relevant that data is to things that patients care about versus things that other people care about. I was in a clinical study, I don't qualify for many because the complexity of my disease and RCTs tend to screen out complexity, but I was in one a few, a decade ago. And as somebody who had worked in the space, I understood that a lot of the questions they were asking were really all about reimbursement, but the way that they did it and the workload of activities that they placed on those of us who were in the study was just so burdensome that it made me angry. And it clearly did not consider how these questions met the needs of individuals. They made no attempt whatsoever to actually explain why they were asking these questions. And it just, everything about the way that study was designed was done in a way that just did not acknowledge how that fit into the workload of patients. As a result, they had an incredibly high dropout rate and I don't believe they were able to keep the numbers that they had to move forward. And that's an extreme case, it just happens to be the one that I was part of and it was a randomized controlled trial, not a clinical, not a registry, but it's just indicative of if you're asking patients to provide data, you need to help them understand what that data means and how that data helps you by context that is meaningful. 
And make sure that if it's not an end point that has direct relevance to that individual that's easy to grasp, they have the bigger picture of how that's going to contribute to advancing our understanding of that particular disease or treatment, et cetera. It's incredibly critical, as it relates to trust and credibility, to make it as easy as possible. And passive data collection is getting easier and easier. When I first started in this business, it was not as easy as it is today. Now, with the proliferation of API standards such as FHIR and the use of platforms such as HealthKit, there are ways to get access to data very passively that can be incredibly helpful. And what we have seen in the type 1 diabetes space (I haven't seen a study, actually, this is just from my discussions with some of my fellow patient data nerds and hackers) is that if it can be collected passively, people will know that it could be collected passively and they will expect that. And if you choose to not do that, then that loses trust and credibility. You're essentially communicating that you don't value their time. You have to think about user experience and user interface: make it engaging, make it pretty, make it easy to follow and understand, and to the extent to which it's possible, make it customizable. If you're using patient-reported outcomes measures, particularly some of the validated measures, they can be extremely cumbersome and long, so find ones that are easy to do that are still valid, or come up with your own. Because if you're engaging with somebody through their mobile phone, the attention span doesn't necessarily change because it's a disease. You still have pings coming in from your friends and text threads and social media pings or emails as you're engaging with the app. 
One of the things, this is not necessarily focused on registry, but type 1 diabetes has been a disease where everybody's from the very beginning of mobile health, people have discussed how diabetes, particularly type 1, is sort of like the be-all end-all use case for apps and for the promise of mobile health. To my knowledge, I know of exactly two apps that people use on a regular basis. One is for their continuous glucose monitor and the other is for insulin pump or some sort of, I'm running an open-source, crowd-sourced artificial pancreas system. It's all been community-developed and designed. I look at that app and my CGM data and those are incredibly, those are very life-dependent sources of data for me or for anybody with a disease. I don't touch any of the others unless there's a specific reason. So I have lots of other health apps that I use, but if the data isn't very specifically relevant to what I'm doing, I and nobody that I know will use it. So think about the additional burden that you have when you are collecting data for a long-term registry. If you're going to do it through a phone or even through a laptop or desktop, you have to do it in such a way that it's easy, that considers and respects the life and workflow of the individual. Registries are for people who have diseases. Some of those diseases are incredibly complex and draining. It takes a lot of time and effort to focus and collect data. So you really need to think about how does this fit within the workflow of individuals. Importantly, you have to really give and continue to reinforce the reasons for individuals to participate. It's a very high bar for use and engagement, particularly if you're trying to keep people engaged over time. So you need to be very specific about what the benefits are and you need to keep reinforcing those messages over time, both through the interface tools that you're using as well as other types of messages. 
One of the things that I work with my clients on, and actually one of my clients just released a big report last week, is transparency about how the data has been used. Whether it's de-identified or identified data, what are all of the different ways that you have used that data or allowed it to be used by other entities or companies or whatever over the past year? So we just issued for my client, in this case my client was Invitae, a genetic testing company, a data use transparency and impact report that not only covers all of the ways that the company and the company's partners used data during the past year and is transparent about that, but also, as you look at all of the different uses, provides an incredibly compelling list of the different ways that patient data has impacted and advanced science. And that's part of the message that I don't think gets out there enough. You know, I've never been involved in a registry, necessarily, because most registries are tied to specific clinical practices. I've been involved in the periphery of different registries, but I know that my data is being used, my de-identified data is being used and sold, and I'm fine with that, but I would like to know how that data is being used. And if you're asking me to contribute over time, or anybody to contribute over time, it would be really helpful to know how that data that you're using is advancing science and helping patients such as me and everybody in my family who is at risk of the disease, or everybody in my community, both in the short term and in the long term. And I don't think that industry or researchers or medical associations are particularly good at helping people understand that they are driving science forward by allowing and enabling their data to be used. And then importantly, I mean, one of the things that I work on is helping companies develop frameworks and processes for embedding patients in all aspects of data governance, use, and access policies. 
Everybody thinks that they can understand and empathize with the perspective of the patient. Oftentimes, especially in healthcare, that's true, but you don't really, you don't get it until you get it. And there are sensitivities and concerns that are very easy to overlook. And there are a lot of different individual perspectives and diversity in many different ways in the patient community. And you need to think very actively about how can we pull those perspectives into how we set up our data governance policies and how do we evolve those policies over time to consider the concerns and needs of patients. And then, you know, one part of that that I like to include when I work with clients on this is, you know, you really need to engage patients and enable them to be part of the research and guide the research questions. A lot of patients aren't going to be interested in participating in a quote-unquote research project, but chances are, if you ask them, there are some questions that they have about their disease that might or might not be able to be answered through the registry. And I feel like if you're asking individuals to contribute their data, it's on you and on all of us that are involved in the research enterprise to think about how can we make sure that this data that we're collecting through these individuals is actually answering the questions that they have about their disease, regardless of whether or not it's going to contribute to, you know, an article in JAMA or NEJM or some of the major medical publications. So those are my primary comments. I guess I would just encourage you all to think about everything that you're asking of patients through the eyes of the individual whom you're asking to provide the data and to participate in this research. You got to give them reasons. You need to earn their trust consistently, reliably, over time. 
And you need to think about, you know, how does this meet their needs? How do we help them understand how they're benefiting science? And how are we keeping them engaged respectfully over time? And I'm happy to take questions. So this is Steve. We're going to be opening up for questions for the whole audience, and I think we should say thank you very much to all the speakers. I think these have been some really interesting presentations, and I think they dovetailed together pretty well. I think they actually demonstrated a variety of perspectives around this really complicated issue of, you know, direct-to-patient registries and how interacting with patients is different when you're working with them directly as opposed to in a university setting or in a typical clinical research setting and what some of those challenges can be. Let's open it up for questions, and I believe Liz and I will be co-officiating the Q&A. And yeah, thanks again to all the speakers, especially Anna. Thank you for rounding it out at the end with your perspective. So there is a Q&A box. It would be ideal if you put your questions in that box. We do have one question that's already there that was addressed to me, and it is: how did you define a representative population for the ASCO COVID registry? So as a statistician, I'll go back to sort of first principles of what does it mean to have a representative population, and what that should mean is that the patients in the registry or in the study that you're performing reflect the patient population from whom you are trying to generalize your results, and not just that they are represented, but that they are represented in the appropriate fraction that reflects the population. So to use an analogy, in the cancer clinical trials world, we know that Black patients are not usually represented appropriately in cancer clinical trials. 
The burden of cancer is such that about 13% of patients with cancer are Black, and yet they are only represented at about 3 or 4% in cancer clinical trials. So that they are not, although trials are open to them, they are not well represented. So how would I define a representative population for the ASCO COVID registry? Ideally, what we would like is for the COVID registry patients to be representative in multiple ways, but just two would be that it accurately reflects the cancer population in terms of the types of cancer and the severity of cancer, meaning early, middle, late stages of cancer, and also that the representativeness would adequately reflect the severity of COVID across those cancer patients that would be eligible for the registry. We think that, and we know that a lot of cancer patients have early stage disease, and the way that we have constructed our registry, for better or worse, really, it does tend towards patients with metastatic disease. And you could argue that that is a bad thing because it's not representative of all cancer patients, excuse me, patients with cancer. But a lot of the most serious questions and the impact of COVID is among patients with metastatic disease, because they're the ones who are most in need of continuing on with systemic therapies, in many cases to ensure that they do not have disease progression. Of course, it's important for all patients to not stop therapies, but the metastatic patients are, in addition, oftentimes immunocompromised. And so they're at risk of both progressive cancer and also potentially more severe COVID outcomes. So it's a little bit tricky. And the other thing I would mention in terms of it being important to be representative is that you have to be really careful when you know that the population in the registry is not representative, that the questions that you ask, the research questions that you pose are not going to end up being biased by the analyses that you do. 
So for example, a lot of times in registries, like ours, prevalence is not, you can't really estimate prevalence necessarily very well. You can oftentimes, though, get estimates of associations or risk factors. Thank you for that. We have a couple of other questions. There's one in the main chat that spoke to uses of gamification in registries. I actually know a lot about gamification, but actually of all things, I don't do it in registries. Leon, I think may have a thing or two to say to that one. Yeah, thanks, Steve. I do. The question was also about incentives and where do you start? There's actually a good psychological literature and research psychology on incentive models for participating in research. But I think my general observation is that the results are going to be very heterogeneous across disease areas. There's just a vast difference between a serious pediatric disease and a chronic adult disease in terms of what incentives are necessary and what works. So I would say that the best wisdom is use your knowledge of your population to decide where to start and then adjust. That's incentive. So for gamification, I love gamification as a former research psychologist. It's an application of operant conditioning principles to get the behavior you want. Those of you who read on behavioral economics, you can think of it in that framework. It's generally a very nice add-on. So what I'll say is that it's unfortunately too easy to do it badly because I've seen people just add it on as like, oh, and we'll add gamification and get silly badges and silly points. It really requires thoughtful design and it has to be consistent with your brand. You have a very serious brand. If you're a cancer organization, you don't want goofy colors and sparkles. This is a serious business. So it just has to be a part of a thoughtful story. And I think what Steve said about bringing marketing people into it is pretty smart. 
They tend to know about the stuff and know how to make it play out. In terms of technology, you just need a technology platform to support standard reward models. There's some good ones with points, with badges, et cetera. Those tend to be fairly straightforward. So I think it makes a difference at the edges. I'll give you guys my impression. But again, it has to be part of a thoughtful design. Yeah, I'll just add to that. The concept of gamification also can have some untoward effects as well. It can actually introduce bias into data collection. So you've got to be careful, as someone said earlier, about the questions that are going to be asked, so that you don't inadvertently skew or change the nature of the information you're collecting. We have another question in the chat from Hinoos Yazdani. Business models for clinical registries are challenging. Can you comment on what you think the future funding models should be like or should look like? I have a few thoughts about that since I worked on that particular front. But before I chime in, let me toss it out to the rest of the panel. I'll go last. Leon, it looked like you wanted to say something. I got lots to say, but I don't want to hog the floor. So I'm happy to start. I'll go quick. So yes, sustainability and monetization is very challenging with registries. Direct-to-patient improves the situation somewhat because you're now bringing together multiple data streams, usually to create a unique data asset that doesn't exist elsewhere. So sometimes if you have a single data source, like a clinical registry, you're competing with claims data. You're competing with commercial data resources that already exist and they're not expensive. Or if they are, you have competition. 
So I think here, a lot of the models that I've seen that are successful are really about integrating data that's a little bit hard to bring together, investing in data collection, investing in the passive data acquisition from either sites or sensors, et cetera. And then you have to identify who your buyers are. And I mean, Steve has good practical experience with that. Most often your best source of funding at this point is going to come from life sciences, or ultimately it's going to come from life sciences. And there's sponsorship models where you can get funding before you have any data or before you build the registry on a promise that the sponsor will participate in the benefit of getting the data somehow. There are pure data monetization plays where you sell the data you collect, those could work. But frankly, again, I think a lot of my answers are going to be depends, right? It's going to depend on your disease. And if your disease is a high-demand area where there's a scarcity of data, that's great. But if you're working on something like diabetes, where there's a lot of data, it's hard to know if the additional data you create is going to be worth anything as an asset in and of itself. The third model is where I think life sciences gets very interesting. So there's sponsorship, data monetization, and the third is network building and relationships. That right now is the fastest path to monetization that I'm seeing. Helping recruit patients into clinical trials in the right space is a very, very fast way to get interest from life sciences. So I'll leave it at that because I know other people have stuff to say, but it's a great question. So thanks for asking. Yeah, I'll add to that. So I've worked much of my career in life science companies, and I will add a word of caution to what Leon has said. 
And the caution is simply this: everybody and their brother thinks that they can sell stuff to pharma and pharma will just open their wallets and give money because of course they need the data. Well, it turns out that it's not necessarily of course, and it's not necessarily the data. Pharma companies have very specific requirements for data and those requirements look like things that will help them in understanding how their drugs work in large populations or how diseases fare in large populations. It's not just anybody can put something together and pharma will like it. You've got to be able to a priori have a conversation with them and find out, is this something that will be of interest to them? Will this be something that will help them? Because at the end of the day, this is science, but it's also a business and you've got to, there have to be two sides to the business equation, a buyer and a seller. And the buyer has to be able to know that what they're getting is going to be of value to them. And the seller has to be able to sell something of value to the buyer. And it sounds very, very simplistic to say it that way, but it turns out people make that misassumption all the time that pharma will just be there. Having been in pharma and been on the data buy side for many, many years, I can tell you, well, pharma will buy data, but it will only be on their terms and the data that they want, need, and will find useful to their needs. It just won't buy things directly for no good reason, just because you've happened to build it. Any other thoughts about that from the panel? You know, I forgot to make one important connection to what Samantha was saying. Actually, the privacy-preserving patient linkages create another business model opportunity for folks who are doing de novo data collection and aggregation. 
If you find a partner that has very large scale data assets, like huge claims repositories, you can actually package your data while maintaining privacy and connect it up and have that sold as a package. And that's a very interesting model. I only came across it recently, but I'm just seeing a lot of movement in that space. So again, without giving up your patients' names or anything, right? Using the privacy-preserving approaches, you can package together the data you generate with a large data seller. Can I just add one other thing from the ASCO perspective? So our registry was a COVID registry and we got funding through our foundation, the Conquer Cancer Foundation, and it was supported. The Conquer Cancer Foundation COVID fund was supported by pharma, but what we're finding, I think, in the cancer world is that, while cancer is still very prevalent, cancer is becoming more and more like many rare diseases because of precision medicine, because of the understanding of the way cancer behaves. There are numerous subtypes based on genomic signatures of cancer. So we're finding that across the country and across the world, we're having people interested in some of these really rare subtypes of disease that are trying to connect with each other to develop relatively simple registries at their institutions, just to understand how common these diseases are, how they're treated, what people are doing. Like I said, especially in these contexts of these rare diseases where there's just not a lot of understanding and guidance to collect data to understand outcomes. Yeah. And that's exactly what happened when we were at the Myeloma Foundation, that once myeloma was unpacked and realized that it was actually more like 12 diseases than one, all of a sudden to be able to get to those subtypes became the critical feature as to why people wanted the data. 
So if they wanted a t(4;14) or a t(12;14) or some other translocation and they only occurred in the myeloma population at 2%, that was the grid that I showed in my presentation about how to get to cohorts of a large enough size. That was exactly why I presented that grid. Anyway, we're at time at this point. Heidi is back and I think she's going to tell us we've got to end. Thanks, everybody, for participating. Thanks to the panel. Maybe I'm stealing Heidi's thunder here. No, you're fine. I just also wanted to express my thanks to the panelists today. I think it was a wonderful conversation. I think you could have gone for another half hour. And just to everyone who has been tracking these presentations, everything, including the sessions from the annual meeting, will be loaded on the website in the next week. So thank you again to everyone and have a great day.
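Two of the numeric points raised in the Q&A above lend themselves to a quick calculation: the representation gap (about 13% of patients with cancer are Black, versus about 3 or 4% of trial participants) and the cohort-size problem for a subtype seen in roughly 2% of the myeloma population. The Python sketch below is only an illustration of that arithmetic, not anything the speakers presented. The function names, the 3.5% midpoint, and the target cohort of 100 patients are assumptions made for the example.

```python
import math


def representation_ratio(share_in_study: float, share_in_population: float) -> float:
    """Ratio of a group's share in a study to its share in the target population.

    A value of 1.0 means proportional representation; values below 1.0
    mean the group is under-represented.
    """
    if not 0.0 <= share_in_study <= 1.0:
        raise ValueError("share_in_study must be between 0 and 1")
    if not 0.0 < share_in_population <= 1.0:
        raise ValueError("share_in_population must be in (0, 1]")
    return share_in_study / share_in_population


def registry_size_for_cohort(target_cohort: int, subtype_frequency: float) -> int:
    """Smallest registry size whose expected subtype count reaches target_cohort."""
    if target_cohort < 0:
        raise ValueError("target_cohort must be non-negative")
    if not 0.0 < subtype_frequency <= 1.0:
        raise ValueError("subtype_frequency must be in (0, 1]")
    # Subtract a tiny tolerance so floating-point noise does not add an extra patient.
    return math.ceil(target_cohort / subtype_frequency - 1e-9)


if __name__ == "__main__":
    # Assumption: 3.5% is used as the midpoint of the "3 or 4%" quoted in the Q&A.
    print(representation_ratio(0.035, 0.13))
    # Assumption: a hypothetical target of 100 patients with a 2% subtype.
    print(registry_size_for_cohort(100, 0.02))
```

Under these assumptions the ratio comes out at roughly 0.27, and a 100-patient cohort of a 2% subtype implies a registry of about 5,000 patients. The second figure is an expected value, so a real registry would likely need a margin above it.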
Video Summary
This enlightening CMSS webinar wraps up a six-month series focusing on registry science and research, particularly emphasizing patient data collection using technology. Hosted by Helen, the session featured several experts: Liz Garrett-Meyer (ASCO), Steve Labkoff (Quantory), Anna McAllister (Consultant), Samantha Robichaux (DataVant), and Leon Rosenblatt (IQVIA).

Liz started by discussing the ASCO COVID-19 registry's development. Launched in April 2020, it collects data on over 6,000 cancer patients who contracted COVID. Challenges faced include missing data, over-representation of severe cases, and difficulties in longitudinal tracking due to a lack of direct identifiers.

Steve shared insights from building the CureCloud for multiple myeloma. He presented the importance of representative sampling, patient engagement, and collecting broader clinical data. Gamification and regular updates can sustain long-term patient interest.

Leon expanded the discussion to direct-to-patient registries, emphasizing novel, diverse data streams like patient-mediated EMR ingestion and at-home sensors. Despite challenges in data integration and consent management, direct engagement can streamline clinical trial recruitment and research follow-ups.

Samantha highlighted privacy-preserving record linkage (PPRL) techniques. PPRL using tokenization ensures PII is never exposed, fostering secure, fragmented data integration. Sam advised collecting detailed PII upfront to facilitate future linkages.

Anna McAllister closed by stressing patient-centeredness in data collection. Ensuring user-friendly interfaces, contextualizing data relevance, and maintaining transparency in data usage build patient trust and engagement, which are crucial for long-term participation.

The session concluded with a Q&A on the complexities and potential business models for sustaining registries, including engaging patients and leveraging privacy-preserving technologies to link data sources securely.
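As a rough illustration of the tokenization idea mentioned in the summary, the Python sketch below derives a keyed hash from normalized identifiers, so that two data holders sharing the same secret key can match records without exchanging names or birth dates. This is a generic keyed-hash example and should not be read as the method of any vendor or speaker. The function names, the choice of fields, and the demo key are all assumptions.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Lowercase and drop accents, spaces, and punctuation so trivial variants match."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def make_token(first: str, last: str, dob_iso: str, secret_key: bytes) -> str:
    """Return a hex token derived from the identifiers via HMAC-SHA256.

    The token cannot be reversed to the identifiers without the key, and the
    same person yields the same token wherever the same key is used.
    """
    message = "|".join([normalize(first), normalize(last), normalize(dob_iso)])
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Hypothetical key for demonstration only; a real key would be managed securely.
    demo_key = b"demo-key-not-for-production"
    registry_token = make_token("Ana", "García", "1970-01-31", demo_key)
    claims_token = make_token(" ANA ", "garcia", "1970-01-31", demo_key)
    print(registry_token == claims_token)
```

Exact-match tokens like these only link records whose identifiers agree after normalization, which is one reason the summary's advice about collecting detailed PII upfront matters: more fields allow several token variants to be built for later linkage.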
Keywords
CMSS webinar
registry science
patient data collection
technology
ASCO COVID-19 registry
CureCloud
direct-to-patient registries
privacy-preserving record linkage
patient engagement
data integration
clinical trial recruitment
Helen