COVID-19: Accelerating Real-time Electronic Data Capture for Tracking, Learning & Improvement (Webinar)
Video Transcription
Hi, everyone, and welcome to our webinar today. My name is Helen Burstin. I'm the CEO of the Council of Medical Specialty Societies, and I'll be your host for today's presentation. This is the second in a series of six webinars that we're doing in collaboration with our friends at AAMC, funded by the Gordon and Betty Moore Foundation, on Advancing Clinical Registries to Support Pandemic Treatment and Response. We began this series a couple of months back at our spring meeting on May 1st; that recording is available on our website. The goal of the series is really to see how we can identify and transform clinical registries and clinical research through rapid-cycle learning and development. We have a number of great sessions planned for the rest of the summer, so please visit our website as they're finalized, and we encourage you to register to attend. We also encourage you to continue this conversation on Twitter using the hashtag #COVIDRegistries, and please follow @CMSSmed for frequent updates.

Before we get to the great presentations lined up for today, just a few housekeeping details. We will have a Q&A session at the close of the presentation. Questions should be submitted using the question box on the right-hand side of your screen; there are too many folks to do a live Q&A. If you don't see the question box, please click the white arrow in the orange box located on the right side of your screen. These sessions will be recorded and posted on the CMSS website, where you as a registrant can access them anytime, and you can also immediately access a PDF of the presentation slides using the GoToWebinar dialog box on the right side of your screen. At the conclusion of the webinar, you can access the evaluation via GoToWebinar, and contact us at CMSS.org if you need any assistance. All right. Julia, next slide. I don't seem to be advancing. Thank you.

So, today's presentation, which we're really excited about, is COVID-19: Accelerating Real-Time Electronic Data Capture for Tracking, Learning, and Improvement. It'll be moderated by Dr. Atul Butte, who is the Priscilla Chan and Mark Zuckerberg Distinguished Professor at UCSF and Director of the Bakar Computational Health Sciences Institute at UCSF. He wears many hats; importantly, he is also the Chief Data Scientist at UC Health, and he'll share with you some of what they've been able to do at UC. I'm going to hand the presentation over to Atul, and he will introduce the session and our panelists. Thank you so much for joining us. Atul.

Great. Thank you, Helen, and a special thanks to our participants today. I'm going to introduce them in a moment, but these are challenging times for all of us, so I really appreciate the time they're putting in to help educate all of us on registries. Registries in general are patient care data sets used for a variety of reasons. Some are society-driven, specialty-driven, physician-driven, pharma- and biotech-driven. Sometimes we want to build registries, sometimes we are forced to build registries, but I think registries mean something completely different when you're in the middle of a pandemic, where everyone is starving for accurate, timely data and decisions are going to be made in a hurry. So it's interesting to think about the future of registries through the operational views that we all have of the current pandemic. So I'm going to be introducing our panelists here.
Myself and the other four will be giving 10-minute presentations, and then we've left a lot of time open for questions and answers and discussion. So, in order after me — and I'll introduce all four first: I have with me Dr. Andrew Ip, who is a physician data scientist in the Outcomes and Value Research Division at the John Theurer Cancer Center at Hackensack Meridian Health. He's representing a health system. He's going to be followed by Dr. Subha Madhavan, who's the chief data scientist at Georgetown University Medical Center; she's representing an academic medical center. Then Jessie Tenenbaum, the chief data officer for North Carolina's Department of Health and Human Services, representing a state government body. And then Dr. Tell Bennett, who's the section head in informatics and data science and an associate professor in the Department of Pediatrics, giving the perspective of a national consortium, specifically one run out of NCATS and the National Institutes of Health. So I hope you'll join me in thanking these four for joining me and Helen for today's panel.

So with that little bit of stumbling, let me get into my presentation first, and then I'll be turning it over to the others. Let's get to the next slide. All right. So I'm going to give you the perspective of the University of California Health System, which is multiple academic medical centers. Just to introduce or reintroduce the University of California to you all: we have 10 campuses and three national labs. We have 200,000 employees and a quarter million plus students a year. And like probably all of you, we're trying to figure out what to do with all 200,000 employees and a quarter million students; August and September are coming quickly, and we're all still trying to figure this out. We have 19 health professional schools, of which six are medical schools. We also have veterinary, dental, nursing, pharmacy, and public health schools, and you can see we train a whole bunch of medical residents and medical students in California. Several of our academic medical centers are in the top 10, and you can see the rest here. 5,000 doctors get a paycheck, but 100,000 doctors write orders on our patients. And in the middle of all this, now we're trying to figure out what to do with COVID.

Now, over the past two years, because of directives from the National Quality Forum and from payers, we've been centralizing our data. Each campus has their own version of their data. Of course, we conveniently all run Epic, but we also have a copy centrally of all the clinical data in a central data warehouse, the UC Health central data warehouse. And you can see, I wouldn't be an IT guy if I couldn't make boxes point to boxes, so here's the one slide like that. So now in comes COVID. We had been happy getting data monthly from each of our health systems — monthly — and COVID turned this into a daily operation. As many of you know who run Epic, at midnight starts the big dump out of Epic into Clarity and then Caboodle, to try to make tables out of all the operational data in Epic. Then at 6 a.m., these get sent centrally to us, so that we harmonize all the data elements and put them all together for the data release, usually an email that goes out to about 500 people around the country and around the state, and then a tweet, which I'll talk about in a moment. We tweeted ours out just about 30 minutes ago today.
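To make that nightly flow concrete, here is a minimal sketch of a daily extract-and-harmonize job. Everything in it — the file layout, the site codes, and the result-mapping table — is hypothetical; it is not UC Health's actual pipeline, just an illustration of the pattern of pulling per-site extracts, normalizing local vocabularies, and publishing one combined summary.

```python
# Minimal sketch of a daily multi-site extract-harmonize-publish job.
# Paths, site codes, and the result map are invented for illustration.
import pandas as pd
from datetime import date

CAMPUSES = ["ucsf", "ucla", "ucsd", "uci", "ucd"]  # illustrative site codes

def load_site_extract(site):
    # Each site drops a 6 a.m. extract of the prior day's test results.
    return pd.read_csv(f"/incoming/{site}/covid_tests_{date.today()}.csv")

def harmonize(df, site):
    # Map site-local result strings onto a shared vocabulary.
    result_map = {"DETECTED": "positive", "NOT DETECTED": "negative",
                  "POS": "positive", "NEG": "negative"}
    df = df.copy()
    df["result"] = df["result"].astype(str).str.upper().map(result_map)
    df["site"] = site
    return df.dropna(subset=["result"])

def daily_release():
    frames = [harmonize(load_site_extract(s), s) for s in CAMPUSES]
    combined = pd.concat(frames, ignore_index=True)
    summary = combined.groupby(["site", "result"]).size().unstack(fill_value=0)
    summary.to_csv(f"/outgoing/uc_covid_summary_{date.today()}.csv")
    return summary
```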
So this is older data, because I had to submit the slides in advance, and you can see we've run 135,000 tests on patients, with results on the right there, of which 4,419 are positive. And the numbers obviously keep growing; we're not done with this at all. You can see the age and sex breakdown of the positive cases on the left, and you can see all the cases and their home zip codes in the past seven days. Cases are coming in from all over the state, as well as all over the country. Let's get to the next slide here.

All right. Those are all the tested patients. Of course, most of those folks are not admitted or have anything to do with coming into the hospital, but these are the inpatients. We're at a record 228 patients now. We were down to 100 about five weeks ago, so it's really back in California. We count the inpatient census on the right, and you can see the trend line on the left. Again, this is all running off of our central data repository, which I'll give a little bit more detail on here. So all of our campuses now take their data out of Clarity and generate what's called OMOP. We'll probably talk about OMOP a little bit; it's an academic-medical-center-ish standard that came out of Columbia University and the FDA about 15 years ago, but it's a vendor-neutral standard. Each campus maps their data to OMOP, and then we copy all that data together and harmonize the data elements centrally if we need to. Otherwise, we have common translation tables that are pushed to each of the campuses.

So from OMOP, we generate all these dashboards. You can see the ICU patients, ventilators, ECMO. So far, we've sent home 763 patients, and unfortunately, 102 have passed away since the start of this pandemic for us in 2020. Oh, here are the numbers here. About 3.27% of our tested patients are positive, 29% of the positives are admitted, and 7% of the admitted ones have passed away, or about 2.31% of the positives. So that gives you a rough, crude case fatality rate of 2.3%. This is still much more than the flu, if anyone has any lingering doubts on that.

Now, one of the things I'll talk about is that I think it's important to really disseminate this data to the public, because for most people, I believe COVID is still an invisible disease. Unless you know someone with COVID, or have a family member with it, you never see this. You only see the restrictions, the lockdowns, the masks, and you can see the public is not the most satisfied with the current rules and regulations and policies. So I thought it was important to put that data out there. We take these panels and put them on Twitter; about 1.8 million people have seen our tweets in the last 90 days, and routinely they get 10,000 to 30,000 views now. So every afternoon or so we put this out on a Twitter feed. Again, the data is the data; you can build facts or opinions around it, but I think it's important to put some raw numbers out there. So when we're talking about registries of the future, it means putting numbers all the way out to patients and the public as well now, not just to specialties. You can see some of the numbers highlighted here.

Another thing we've been doing is generating data and evidence for the Food and Drug Administration, FDA. And because we have this data centralized, we can start to look at all the different drugs we've been talking about, whether it's heparin or hydrocortisone or hydroxychloroquine.
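As a sketch of the kind of drug-utilization trend Atul goes on to describe (a calendar plot showing hydroxychloroquine use stopping in late April), something like the following works against a simple drug-exposure extract. The column names and the choice to count distinct patients per week are my assumptions for illustration, not the UC CORDS schema.

```python
# Sketch of a week-by-week drug exposure trend like the calendar plot
# described below. Assumes an extract with patient_id, drug_name, and
# exposure_date columns; all names are illustrative.
import pandas as pd

DRUGS = ["heparin", "enoxaparin", "azithromycin", "hydroxychloroquine"]

def weekly_drug_trend(exposures):
    exposures = exposures.copy()
    exposures["week"] = pd.to_datetime(exposures["exposure_date"]).dt.to_period("W")
    counts = (exposures[exposures["drug_name"].str.lower().isin(DRUGS)]
              .groupby(["week", "drug_name"])["patient_id"]
              .nunique()
              .unstack(fill_value=0))
    # counts.plot() draws the trend; in the data described here,
    # hydroxychloroquine drops to zero after late April.
    return counts
```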
This is a slide made by Rohit Vashisht, which basically shows the percent of our patients so far who've been treated with any of these. So 15% of our patients have been treated with heparin, and all the rest of these come out of the NIH treatment guidelines — yes, there are treatment guidelines put up by NIH. And you can just see roughly what percent of our patients are on any of these types of drugs. Rohit's name is at the bottom there. Another way we can do it is to look by calendar plot, and this is neat. There are four drugs here: heparin, enoxaparin, azithromycin, hydroxychloroquine. And you can see, we were using hydroxychloroquine until suddenly we were not. Towards the end of April, you can see, was the last patient that got a dose of hydroxychloroquine from us. You know why; we can all see the press and the science and the preprints. Whereas the other drugs have continued, we've stopped using that one, at least for now. We'll see how the new evidence changes things there as well. So Rohit's able to come up with some of this.

But it shouldn't just be us doing this. What we've done, and what I'll end with, is that we've created what we call UC CORDS, the UC Health COVID Research Data Set, to open this up to all of our UC Health research faculty, staff, and students. Faculty and students have to sign a data use agreement, in which they agree not to share it with others, so we keep this within our sandbox. They can't download data, but they can use and upload their own tools, and each campus then uses their own secure research environment. While we generate this centrally, we work with many people across the campuses to harmonize data — all the ventilator settings, all the rest — and then we push the data back to their secure environments. This is technically a HIPAA limited data set, so it's fully de-identified, except we put the dates back; obviously, dates are relevant in a pandemic. We started with the UCSF IRB, and then all the UC Health IRB directors agreed that this is a limited data set and qualifies as non-human-subjects research, which means no further IRB approvals or submissions are needed for any University of California end users. We regenerate this every Wednesday and then transfer it every Thursday and Friday. And that's as close to real time as we get. I know we put real-time in the panel title here, but real time is not usually anything faster than daily, if we're lucky, and here we generate this weekly. We put all these instructions out, including sample code — registries of the future should come with sample code, so that programmers can really quickly get to this kind of data. So we give this out.

I'm going to stop there; that's probably more than my 10 minutes. That gives you our perspective, and of course we'll get to questions in a moment after each of the speakers, but I'm going to throw this over to... I think Andrew's talking next. Oh, sorry. Andrew's next. Yeah, I was thinking Tell, but Andrew's next.

I wish I had a cooler name. So, we only had one REDCap administrator to begin with... I think I lost you guys. My screen turned off, but... Yeah, thank you. It's fine. It's fine. There. That was basically... okay. Okay. Just thank you to my team, Stuart Goldberg, the statisticians from Georgetown, and COTA as well. My computer literally just turned off on me, but thank you. That was all I had. Excellent. All right.
I think we go next to Subha. I see your webcam is on. Go for it, Subha.

Yeah. Thank you, Atul. Atul and Helen, thank you for inviting me, and I really want to add my thanks to all the webinar participants for joining to discuss this very important topic of how we can use data to really accelerate research, learning, and improvement. As Atul mentioned, I'm representing the Georgetown University Medical Center perspective. We are a tertiary care academic medical center in the Washington, D.C. area; we work with a number of clinical partners and really are at the forefront of COVID care, research, and education. I'm really delighted to share some of our experiences with COVID today. Julia, can we advance the slide? I'm having a little bit of difficulty here. Perfect. Thank you so much.

So what are we facing as academic medical centers today? The first box here shows some of the data challenges as an academic medical center. We work with regional, national, and global partners for a variety of research activities, and we see a lot of geographic and political divides, with different laws and regulations for data sharing, et cetera. There is a lot of burden from emergency orders, and I think, as Andrew mentioned, everything had to be done yesterday, right, and there's a limitation in terms of the resources that are available to work on projects. So how do we deal with these emergency orders? There are social, ethical, legal, and trust issues. Many of us working in the area of data science and informatics really think about the trust and verification framework, and we're now questioning or editing our current trust and verification frameworks because the timelines have really shrunk: in a different research situation, you would have taken two or three months to complete a project; now we have to do them in two days. So the trust and verification process is really failing in a lot of the COVID-related projects. And of course, data collection standardization continues to be a challenge, with all of us trying to grapple with how best to collect data, what systems to use, what coding systems to use, data standards, and so on.

So many of us are thinking about the best pathways forward. In order to be immediately responsive, we have to be use case driven, right? Okay, so I have all this data, I have all this infrastructure available, but how am I going to use that to solve this problem at hand? Being use case driven really helps us focus on what resources are needed. Also, reusing existing infrastructure: I think as innovators and researchers and data scientists, we always want to build the next cool gadget or the next cool digital technology, but now is not the time to do it. Now is the time to reuse existing infrastructure to solve the problems at hand. Training and education: this is really close to my heart. Unless we have really well-trained folks who know how to access electronic medical records or public health databases and how to use them for analytic purposes, we have an intelligence gap that we cannot cross. There could be all kinds of data, but if we don't have talented, intelligent people who are trained, all that data is useless. Convening and coordinating activities: I'll tell you a little bit about one research consortium that we're part of; more and more are emerging almost on a daily basis. The only way for us to really succeed in this global pandemic is by working together.
And I want to give you one example of this, the TERAVOLT consortium. And last but not least, security, privacy, and compliance issues, as they relate to the different regional, national, and global data sharing efforts.

So I'll share with you today three projects. First is the TERAVOLT registry, a thoracic cancer registry where we are looking at cancer patients with COVID and how outcomes are impacted in these cancer patients with a COVID diagnosis. Second, I want to tell you a little bit about an artificial intelligence approach to organizing massive amounts of scientific data. At my last count, there were more than 65,000 peer-reviewed articles on COVID — various research, diagnostic, vaccine-related, et cetera. How can we stay on top of it? Can artificial intelligence methods really help us stay on top of the scientific progress? And finally, I'd like to share a little bit about an immunogenomic registry that we're organizing to compare data from COVID, which is SARS-CoV-2, one type of coronavirus, with another type of coronavirus that also affects humans, looking at how the immune profiles differ between SARS-CoV-2 and this other coronavirus. Can you give me the next slide, please? Thank you.

So the TERAVOLT registry is a thoracic cancers registry. As you can see, the time period we looked at was, again, very short. This is collecting data from eight different countries, and we used data from 200 patients in a span of just 18 days. I just want to emphasize that: the timeframes for research studies are really shrinking. So in a span of 18 days, we were able to look at the outcomes in these 200 patients with thoracic cancers and COVID. These were the types of data collected; I'm not going to read that out. You can see the different demographics and outcomes and treatment data that were collected.

I just want to put a fine point on the data standardization here. If you look at the World Health Organization's coding of COVID patients, we used the ICD code U07.1 for when a diagnostic test is available for a confirmed COVID case. But there's also U07.2, which is a probable case of COVID. So you can imagine just plowing through the EHRs — because clinicians are just getting used to this coding system, they don't always use these ICD codes. It's extremely difficult to ascertain patients with COVID without actually reading through the charts and seeing whether the symptoms were there or not. So I just wanted to highlight the challenge here.

The median age was 68. The ECOG status — many of you may know this — tells you how good the activity level of a patient is, right? An ECOG status of 0 or 1 means they were very active; they were able to do all of their pre-disease activities. So they had a cancer diagnosis, but they were going through treatment, and they were fully active individuals. The majority of patients were ECOG status 0 to 1. And the majority had stage 4 disease, so most had advanced lung cancers. Over the last few years, there has been a lot of improvement in outcomes for advanced lung cancers because of various targeted therapies. So these were patients who were potentially doing well, and their quality of life was improving, and then COVID hit, right? So let's go to the next slide, please. Thank you.
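As an aside on Subha's coding point above: the U07.1 versus U07.2 distinction is easy to operationalize as a first pass, even though, as she notes, real ascertainment still requires chart review because the codes are applied inconsistently. A minimal sketch:

```python
# First-pass COVID ascertainment from WHO ICD-10 codes, per the
# distinction described above. Chart review is still needed in practice.
def classify_covid_status(icd_codes):
    if "U07.1" in icd_codes:   # COVID-19, virus identified (lab-confirmed)
        return "confirmed"
    if "U07.2" in icd_codes:   # COVID-19, virus not identified (probable)
        return "probable"
    return "not coded"

assert classify_covid_status({"U07.1", "J12.89"}) == "confirmed"
assert classify_covid_status({"U07.2"}) == "probable"
assert classify_covid_status({"J12.89"}) == "not coded"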
So of these 200 patients, within that 18-day time window, 76% were hospitalized and 33% died. And from the outcomes data that the research consortium collected, we were able to show that 79% of the deaths were due to COVID-related complications — not cancer-related complications, but COVID-related complications. Only 10% met the criteria and were admitted to the ICU, so essentially only a very small number of these patients were admitted to the ICU; the majority of the deaths were on the general ward in the hospital, not in the ICU. We found that a few factors were associated with high risk of death: being older than 65 — I think this is not surprising — being a current or former smoker, currently receiving chemotherapy treatment, and the presence of comorbidities. Again, this is COVID-related death, not cancer-related death. Whether the mortality could be reduced by actually having more of these patients in the ICU is a question that is yet to be answered. An update to the TERAVOLT registry is coming in September at ESMO, the European Society for Medical Oncology. Thank you.

The second project I quickly wanted to share here is Flattening the Curve. This was a data visualization challenge. We participated in a number of data challenges, and we organized some of them. Here we partnered with the QED Group, Amazon, Tableau, and others to organize the Flattening the Curve challenge. We had 850 participants from around the country. This was specifically a data visualization challenge for COVID: can you take a publicly available data set and organize the data in a useful fashion so that people can utilize that information — not just in tables or tab-delimited files, but in a nice dashboard that people can utilize? We had 90 different entries for this data challenge. If you go to the next slide, I can show you the first prize winner. This group is called Primer.ai; they are from San Francisco. They built a visualization tool called COVID Primer to stay on top of peer-reviewed publications and ongoing social media discussions. They update this on a daily basis. They use natural language processing, so basically you can do topical modeling: if you are interested in vaccines for COVID, you can select that as a category and see the top publications or the top discussions in social media for that particular topical category you are interested in. As a researcher, this tool is extremely helpful for me. I don't have to read the hundreds of papers on a topic of interest; I can go to COVID Primer and visualize and quickly get a summary of that information.

Finally, one last project I would like to share with you is an immunogenomic analysis. This is happening in my laboratory at Georgetown. We have downloaded RNA sequencing data from lung lavage samples — samples taken out of the lungs of patients infected with different types of coronaviruses. These are SARS-CoV-2 patients, and these are NL63, which is a completely different coronavirus that also infects humans. We are studying how the immune cells are quite different in SARS-CoV-2 versus NL63. You might have seen a number of publications around cytokine storm and how patients' host immune systems are reacting to SARS-CoV-2. This is all going to help us build better drugs and better vaccines against SARS-CoV-2. I would just like to point out here the M2 macrophages — and this is a surprising story. There is a huge increase in M2 macrophages.
If your mind is in the space of immunology, you know the whole double-edged sword of M1 versus M2 macrophages: one is protective and one is more kill-and-attack. Very interesting patterns are emerging. The red ones that you see here on the NL63 side are T-cells, memory T-cells. On the SARS-CoV-2 side, of course, we see no memory T-cells at all, because the human immune system does not have any memory of SARS-CoV-2. These immune cell profile comparisons are very revealing, and they can be helpful in advancing research around SARS-CoV-2 and how we combat this disease with better drugs and vaccines. Can we go to the next slide, please? There you go. Thank you very much.

I mentioned training; this is really close to my heart. I just wanted to point out that, if you go to healthinformatics.georgetown.edu, we actually teach a lot of this stuff — how do you organize data, how do you analyze data, et cetera. It's a one-year master's program in health informatics and data science. Our second cohort is starting this fall amidst the pandemic. As Atul mentioned, we're all trying to figure out the hybrid versus online situation for our undergraduate and graduate students. You'll learn more about that at healthinformatics.georgetown.edu. This is our fabulous team. Go to icbi.georgetown.edu to learn about our other work. Thank you very much, and I'll stop there. Thank you.

Excellent. Then I think we turn it over to Jessie next. Jessie, are you there?

I'm here, and I'm even going to turn on my camera, if that works. Yes, perfect. Apologies for the background; I'm in the middle of a house move today, and the moving truck should be here any moment. An empty room, but we'll go with it. I'm Jessie Tenenbaum, faculty at Duke University, on leave for a couple of years to serve as Chief Data Officer for North Carolina's Department of Health and Human Services. The area I'm going to talk about is quite different from the clinical registries that we've been discussing. From the state perspective, as you can imagine, we're gathering a lot of data, partly from the epidemiology side — the people reporting COVID cases — but then also some different aspects, like medical surge data. Do I have control? Yes. Medical surge data: you can think of this as stuff, staff, and space. This is getting at things like how many beds are full, how many beds are available, looking at ICU beds in particular, looking at ventilators. How many COVID-positive people are in the hospital, in the ICU in particular? How many were admitted in the past 24 hours with COVID? How many patients under investigation were admitted in the past 24 hours? Then some variation of that for how they are doing on staffing of doctors and nurses. Over time, as priorities have changed, we've asked some other questions: how many people are in the waiting room, how many people are on ECMO, how many spaces do you have available in your morgue, how many labs are being done? We've been changing over time what the important things to collect are, but this is not just one hospital or even one system; this is all 120 or so hospitals around the state of North Carolina. For the last several months, what's been going on is that we have this survey that goes out daily to these 120 hospitals, and a person there has to fill it out manually. Data validation is difficult because we can't just use a fixed range: 45 may be a reasonable answer for one hospital, whereas it's a totally low or high answer for another hospital, so you'd need complex logic, like whether it was X percent different from the day before, et cetera.
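A sketch of the per-hospital validation logic Jessie is describing: fixed ranges fail because 45 beds is normal for one hospital and implausible for another, so instead flag values that move more than some threshold from the same hospital's prior day. The 30% threshold and the dictionary shapes are my own illustration, not NC DHHS's actual rules.

```python
# Flag hospitals whose reported value jumped more than `threshold`
# relative to that hospital's own prior-day value. Threshold is invented.
def flag_suspect_entries(today, yesterday, threshold=0.30):
    suspect = []
    for hospital, value in today.items():
        prior = yesterday.get(hospital)
        if prior and abs(value - prior) / prior > threshold:
            suspect.append(hospital)
    return suspect

# 90 occupied beds after 45 yesterday gets flagged; 44 after 45 does not.
assert flag_suspect_entries({"A": 90, "B": 44}, {"A": 45, "B": 45}) == ["A"]
```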
There are a lot of data entry errors that we find. The results are highly visible, so people are paying attention to this, and it's published on our website, the DHHS website, every day, so we need to get it right. On any given day, some hospitals are not reporting, which is problematic, and one of my colleagues ends up having to text the people there and hunt them down. Then there still are a lot of problems with the data. Part of the team spends an hour or more each night cleaning the data, looking at it, and saying, hmm, something's not right with this, we think they messed up — and having to hunt down the people to ask, is this what you meant to enter?

We had been doing this for a few months when the National Healthcare Safety Network came out with national, or federal, guidelines for what to collect. It mostly matched what we were already collecting, and so we tried not to change the survey too much from what people were already familiar with. We did want to align with standards, so we tweaked it a bit, but we had been told that what we were reporting from North Carolina was acceptable, and so we didn't want to mess with things too much — though there is something to be said for being in alignment. That was the manual process we were dealing with.

Appriss is the company that runs North Carolina's Controlled Substance Reporting System, CSRS. We already had a partnership with them, and it's been a good relationship so far. They had purchased a company called OpenBeds, which was built for the mental health space and enabled identifying where resources were available in that space, and so they decided to try to repurpose this technology for COVID medical surge. To automate that, Appriss developed some APIs to ingest the resource use — the number of beds available, the number of ICU beds taken up, all of those — from the HL7 feeds in the individual hospitals, and hospitals, system by system, are being onboarded to extract those required data elements and feed them into the Appriss system. Appriss then pushes that data on an hourly basis to DHHS, and the best thing about this is that as each hospital goes live with this process, they're able to stop filling out the nightly survey that's taking up staff time. And by sending it to the state, which then passes it on to the feds, it meets the requirements both to report to the state and to report to federal authorities.

So there are four different ways that the hospitals can do this. One, the preferred one, is an automated API — an API that gathers those HL7 feeds and just sends them to Appriss. They can also do an automated CSV file: an extract on their end that pushes a CSV file to Appriss. They can do a manual CSV, collecting that data and then pushing it manually to Appriss. And the last option is manual direct entry into a form within Appriss's system. So for the hospitals that don't quite have the technical chops — probably the smaller rural ones who aren't quite scaled up to be able to do APIs — as we move toward this one system, we hope to get them entering the data manually into this form, so that we can have all the data coming from one source and not two different sources, which is what we have now as we onboard different hospitals.

It is surprisingly difficult. You would think a bed is a bed, but we have many different terms that we use for these beds. So you have staffed beds, the beds available in a given hospital.
Of those, a subset are your ICU beds, which tend to be the ones that are particularly important in our evaluation; those are the ones we're worried about running short on. Of those, some portion of both the staffed inpatient beds and the staffed ICU beds are occupied. But then, beyond that, you have physically available beds: there's a bed, but they didn't staff it because they didn't think they needed it. Then you have another set, the licensed inpatient beds — what that hospital would be allowed to operate if they wanted to. Then you have these additional outpatient and observation beds that, if push comes to shove, you could use as inpatient beds if absolutely necessary. Then there's this concept of surge capacity, which in North Carolina, the way we have been defining it, is a certain sort of above and beyond: if things go bad, here is the tennis court next door that we could expand into, or here are the number of spaces in a hallway where we could put a gurney and call it a bed. The feds defined it slightly differently — they would ask for all your beds including surge beds. So it's really hard to align on these and get the reporting to match what North Carolina, the state, was doing, what the feds are asking for, and the way the hospitals themselves define some of these terms. That's taken a lot of conversation.

But what this has enabled is automated tracking of the availability of these resources across the state. You can see — and this is fake data — but you can see over time, for a given hospital or regionally or statewide, how we are doing on ICU availability, how we are doing on ICU census, and then how many deaths and hospitalizations we're having over time, and the nurses and doctors in the system. Of the hospitals in North Carolina today, we have onboarded about 35 of them, and that includes some large systems that come with more than one hospital. Another 47% are in progress, and 21% haven't started yet, but we hope to get there soon.

So, a bunch of challenges in this, as you can imagine. As I mentioned, it's surprisingly hard to settle on data element definitions, even when you think you understand them really well. There is a tension, as I mentioned, between complying with what's written and what we were told is sufficient: the federal guidelines came out and said one thing, but we had been doing another thing, and they had said, you're good to go, you don't need to change anything. And we don't want to change things and mess people up in what they've been doing. So we want to comply with standards and yet not do more work than is necessary. As we onboard these systems, we're getting data from two different sources — one is automated and one is the surveys — and having to merge those and make sure that we're not doing anything silly is a challenge to be aware of. And some elements have been dropped over time: some because we just said this is going to be too difficult to figure out, the feds aren't complaining, and North Carolina feels like this isn't as important anymore, and some where we have had Appriss sort of jump through hoops to do additional work.
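One way to pin down the bed vocabulary Jessie walks through above is to make each category an explicit field, so definitional disagreements become visible in code review. The field names here are my own labels for the categories she lists, not NC DHHS's or the federal schema.

```python
# Illustrative data structure for the bed categories described above.
# Field names are the author's labels, not an official reporting schema.
from dataclasses import dataclass

@dataclass
class HospitalBedCounts:
    staffed_inpatient: int      # beds with staff assigned to them
    staffed_icu: int            # subset of staffed beds that are ICU
    occupied_inpatient: int
    occupied_icu: int
    physically_available: int   # beds that exist but were not staffed up
    licensed_inpatient: int     # what the hospital is licensed to operate
    surge: int                  # above-and-beyond capacity (hallways, etc.)

    def icu_available(self):
        return self.staffed_icu - self.occupied_icu
```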
So, in summary, this has been a really nice, successful public-private partnership between North Carolina's DHHS and Appriss. It has enabled North Carolina hospital personnel to spend their time on patient care and not just on data entry. And the state is able to get accurate, up-to-date data and a snapshot of available resources, which can help with planning, allocation, and routing of patients — as people get sick, you may need to route them to a different hospital or even a different region. And this saves on state resources as well. When things started up with COVID a few months ago — I mean, no one had ever done this, and the people who were handling it were the people who usually handle hurricanes, et cetera. So there were a lot of manual processes involved, and it took a lot of time every day from many people; not a lot of sleep has been going on. I just want to acknowledge that I was a skeptic who thought this wouldn't even work. Charles Carter had a vision, and he really pushed it and made it happen, and Appriss — Lauren Whistle in particular — has been a great partner. So a lot of different people, and hundreds more from the different hospitals, are helping with this. And I'll pass it back. Thank you.

Thanks, Jessie. And I think we now turn to Tell for his presentation.

Great, thanks very much for the opportunity to speak here. My name's Tell Bennett. I'm at the University of Colorado, where I'm the Informatics Director for our CTSA. I'm a clinician; I practice as a pediatric intensivist. But I'm here as a representative of the National COVID Cohort Collaborative, or N3C, which is led by Melissa Haendel and Chris Chute and really represents the work of a very large group of people across the country; I'll show you many of their names towards the end of the talk. Next slide, please.

So the COVID-19 pandemic really highlighted urgent needs, as others have mentioned: for example, machine learning algorithms for diagnosis, for triage, for prediction, for selecting optimal treatment pathways. In addition, best practices for resource allocation, as the last speaker mentioned; incorporating external knowledge with clinical data, for drug discovery for example; as well as coordination of our efforts to maximize efficiency. All of these things really drove the need for a comprehensive multicenter clinical data set. Next slide.

A lot of folks at this stage ask, well, aren't we participating already in multicenter data sharing? And here's the distinction. With a federated network like the OHDSI network or pieces of PCORnet, on the left side of the slide, you have a federated query that goes out to institutions; they send back their answers, and then the answers are amalgamated. But if you really want to iteratively build, test, and refine algorithms and tools, or perhaps incorporate external knowledge and partner it with a large amount of clinical data in order to generate, for example, biological hypotheses, you really need the data in a centralized location. And that's really what drove the creation of the N3C. Next slide, please.

So, an overview of the N3C. On the left side of the slide, you will see that the participating sites, many of which are CTSA hubs, have their data in any of the four data models shown there: ACT, TriNetX, PCORnet, or OMOP. We partner with the sites to get the regulatory approvals in place. The data is then ingested by a phenotyping team; some simple logic is run over it to create the N3C case phenotypes. It's then harmonized to a single data model — the OMOP data model that Atul mentioned early on in this session.
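To give a flavor of what "harmonized to a single data model" means in practice, here is a toy source-to-OMOP mapping step. The crosswalk and concept IDs are placeholders for illustration (OHDSI maintains the real vocabularies); this is not N3C's actual mapping table.

```python
# Toy source-to-OMOP harmonization step. Codes and concept IDs below are
# illustrative placeholders, not N3C's actual crosswalks.
SOURCE_TO_OMOP = {
    # (source_vocabulary, source_code) -> OMOP standard concept_id
    ("LOINC", "94500-6"): 706163,   # SARS-CoV-2 RNA PCR (IDs illustrative)
    ("LOCAL", "COVIDPCR"): 706163,  # a site-specific lab code, same concept
}

def to_omop_measurement(row):
    # Translate one source lab row into an OMOP measurement record.
    concept_id = SOURCE_TO_OMOP.get((row["vocab"], row["code"]), 0)  # 0 = unmapped
    return {
        "person_id": row["patient_id"],
        "measurement_concept_id": concept_id,
        "measurement_date": row["date"],
        "value_source_value": row["result"],
    }
```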
And then the data is put into a very secure analytic enclave, which I'll talk a little bit about soon, where nearly anyone can access and analyze it. Next slide, please.

The take-home point from this slide is that this all happened very fast, despite the many barriers to getting it done from a governance perspective. On the left side, you can see that the central IRB was established at Johns Hopkins in mid-April, and by the middle to the end of May, there were data in the system and some very early test machine learning models had been built. I'm happy to talk about the different pieces of the timeline in the question session, but I think the next few slides will give people a better perspective on the scope. Next slide, please.

I ran this this morning: there are now 49 sites that have executed data transfer agreements, 27 sites that have IRB protocols approved — most by reliance on the central IRB — 24 with both, and nine that have deposited data. And you can see the distribution of the data models they've used, where the data has originated in their own electronic data warehouses. Next slide, please.

This is a slide from the middle of May, and the goal here is to get across just how much data is going to be present when the enclave is fully built out. With only two sites' worth of data, and only through the middle of May, it's really an enormous number of rows of data — procedures, laboratory results, nearly the full EHR granularity, really dense medication data, et cetera — and so I think this is going to be a unique resource going forward. Next slide, please.

So, the key points relating to governance, and then we'll get to data access. N3C is built for COVID-related research only. It's an open platform to all credentialed researchers, and credentialing is going to have a very lightweight definition: a member of some institution — health system, university, industry partner — and then just onboarding to the N3C group. All the activities are going to happen on the secure enclave that I'll talk a little more about. The activities there are recorded and can be audited, and then, importantly, any research results of what's done there are required to be disclosed for the public good. The data cannot be downloaded, and there will be a data access committee; it's in formation right now. Next slide, please.

The take-home point here is that, as others have mentioned in this session, the goal is to update these data frequently. The stated goal is twice per week, and sites vary a little bit in what they're able to do locally, but this is how the phenotyping and acquisition process runs: ideally, a couple of times per week, the phenotype code runs in a site's electronic data warehouse. There are some simple data quality checks. Packages have been built by which the data warehouse teams can extract the necessary data, and those data are zipped up into flat files and securely transmitted. So we're really trying to make it a white-glove process that is easy for participating sites. Next slide, please.
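The site-side flow Tell describes — run the phenotype, apply simple quality checks, zip flat files for secure transmission — might look roughly like this sketch. The file layout and the single quality check are invented for illustration; N3C's actual exporter packages are more involved.

```python
# Hypothetical sketch of a site-side submission step: quality-check each
# extracted table, write CSVs, and zip them for secure transmission.
import csv
import zipfile
from pathlib import Path

def quality_check(rows):
    # Example check only: drop rows missing a patient ID or a date.
    return [r for r in rows if r.get("person_id") and r.get("visit_date")]

def package_extract(tables, out_zip):
    out_zip = Path(out_zip)
    with zipfile.ZipFile(out_zip, "w") as zf:
        for name, rows in tables.items():
            rows = quality_check(rows)
            if not rows:
                continue  # skip empty tables rather than writing headers only
            tmp = out_zip.parent / f"{name}.csv"
            with tmp.open("w", newline="") as fh:
                writer = csv.DictWriter(fh, fieldnames=rows[0].keys())
                writer.writeheader()
                writer.writerows(rows)
            zf.write(tmp, arcname=tmp.name)
    return out_zip  # handed off to the secure transmission step out of band
```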
Once the data comes up into the NCATS Secure Cloud — NCATS is the NIH institute that funds the CTSA network and also the Center for Data to Health, or CD2H, from which N3C arose — it's then harmonized into the OMOP data model. Next slide. And then those harmonized site databases are merged onto that secure enclave, which is run by a company called Palantir; there's a very high level of federal security certification. And then analytics are performed on a robust platform. Next slide, please.

This is a little bit busy, but I think most folks want to know how they can get access to the data. If you look at the bottom of this slide — data users, that's our community — folks would come in, again with a lightweight registration, and make a data use request. There would need to be an institutional data use agreement; so if I, as an investigator at the University of Colorado, want access, Colorado would need to say, yes, you're allowed to go and work on the N3C platform. If you only want aggregated data, or counts, or safe-harbor-definition HIPAA data, that request goes to a data access committee. There's no scientific screening, more just checking that you're following the established processes, and then you would be granted access to the data. If you want access to the limited data set — as others have mentioned on this call, we have real dates in the system, as well as five-digit zip codes — then you do need to have a separate IRB protocol. We do not have a specific date for the data use agreement being available, but it is anticipated within the next couple of weeks. Next slide, please.

As I mentioned, this is a really robust platform that Palantir runs. There's a great system for creating reproducible pipelines, largely R-, Python-, and SQL-based. And the N3C community is working to port in tools from the CTSA community, from the OHDSI community, and others, to make that process even easier, so folks can use the tools they already like to use. Next slide, please.

This slide just shows the span of N3C and gives some links where people can get more information and onboard if they would like to. I mentioned the data partnership and governance team, which leads to the phenotype and acquisition team, then the ingestion and harmonization team. The collaborative analytics team has several subgroups; I co-lead the clinical scenarios and analytics subgroup within that larger group. And then there is a synthetic data pilot that, if successful, we would hope to make available to the community as soon as it's in place. Next slide, please.

So, as I mentioned, this is the work of an enormous number of people, and I hope that everybody is here. If anyone is not, I apologize — my fault, not theirs — but this is just to give a sense of how much work this took. Next slide, please. Thank you very much, and I look forward to the discussion in the question session.

Excellent. Well, thank you, Tell, and thanks to all of our speakers. I'm going to get into some Q&A. I just want to highlight that the Q&A section of the webinar is open, so please submit your questions in the questions tab. Of course, I have a whole bunch ready, and I learned a lot watching these presenters. I think we can all see that COVID has forced all of us to quickly move from theoretical discussions on registries of the future to standing up, right now, the new registries of the present. Now we're counting bed use, PPE, waiting room times, hundreds of drugs and vaccines in development. As Tell has just said, we're already conceptualizing interoperating the interoperated, right? You've got entire consortia like OHDSI becoming part of another grouping. We've always known, or at least suspected, that clinical data could be used for research, for real-world evidence. But now we're talking about health system data, supply chain data, government data. So let me just start with a round robin. We can go in order, I guess, starting with Andrew and then the others.
What do you think was most required to accelerate and stand up the data sources you brought to your constituents, especially this COVID data? And what would have made things easier for you if you had had it way back in February? Andrew, some thoughts first.

Yeah, so I think what let us stand it up very quickly was really the sheer determination and willingness of our cancer center. Like I mentioned, there were over 60 people who were willing to work nights and weekends, or even at home if they were quarantined for whatever reason. And that was the only way — I think Subha mentioned she had 18 days; we had, I think, 21 days, and we stood up 3,000 patients manually. And that, to me, was amazing. But that does lead into what I wish we had had: a little bit more help with verifying our data quality and dealing with some of the missing data as well.

Subha, what do you think? You had a few fewer days, I suppose, but what do you think would have helped back then, and what was really required?

Yeah, in terms of the positives, I think just people coming together around these problems, like Andrew said, was just fabulous. In my 20 years of a career in informatics, I have never seen people coming together so quickly to solve problems. And also, I think one of the speakers mentioned this was record time to get an IRB approval, right? You never see the Institutional Review Board review a research protocol so quickly. So those were definitely some of the positives. In terms of challenges, I always wish that we could have much better standardized, organized data in electronic medical records, right? And I think that is always a challenge: how do we better utilize EHR data for secondary purposes such as research? Another big positive is that we were able to rapidly bring in the public health data sets — the regional, county-level, and national data sets — and there are large registries, like the dashboards Hopkins quickly built up, and other efforts that were extremely helpful in looking at your data in the context of some of these larger initiatives.

That's great. Jessie, you talked about standards, and I think your illustration of just counting beds was brilliant. What do you think worked, and what could have been better?

I think one of the biggest challenges we had was just conveying the vision that it could be done better and differently, especially in public health. I mean, EHRs, we complain about them, but in public health the standard MO is faxing things, filling out PDFs, emailing things all around. Even when people use Tableau, what they do is manually export a static data set and push a CSV file to a place, and then Tableau just reads that, and it's all very manual. So just trying to give people a sense of what is possible and what kind of work it would save them has been key.

That's great. And Tell, I think you're thinking about standards to try to get all of these data sources together. What do you think really has been helping so far? I know it's still early in your effort. What do you think could have been different to make things better?

Yeah, ironically, I think that the key feature was non-technical. Folks knew how to do this, but it was the sense of purpose that drove the regulatory and leadership commitment to get it done, down to the health system and CTSA site level.
And I think we have ongoing challenges aligning what is possible with the data — and understanding what is possible with the data — with the most critical clinical questions. And in a big coalition, I think that's more of a social challenge and a communication challenge than a technical one.

I think this is great. I know each of your situations is very different, but at the same time, it's clear you all got commitments and resources from your organizations to really enable you to bring these data sets together. And clearly your organizations get it — many still don't, but yours do. At the same time, I'm a little worried. I think, Subha, you mentioned training and students, and we talked to Andrew, who was saying not all health executives understand how important some of this data is or why we want to bring it together. How do you communicate this kind of need to bring data together more effectively within one's own organization? Are there tricks that you use to convince your own internal support to get the buy-in? Why don't we go backwards. Tell, you're working at the University of Colorado, trying to get a large number of academic medical centers to share data, as well as other consortia. Maybe some of them get it, but even in your own organization, your health system, do they all get it? What works to convince people to willingly share this data?

Yeah, first of all, this was an entity led by others, and I'm merely a representative, so I don't want to say that this was my work at all. From the Colorado perspective, I think honestly our leaders got it very quickly when the COVID-19 pandemic emerged and the idea of N3C was brought forth, and I give them credit. Maybe we had been softening them up over years, trying to impress upon them the importance of informatics and harmonization and data sharing, but I think they very quickly understood it and committed their own time, and also local resources, to get it done.

That's great. Jessie, I thought it was amazing that you're working with people who model hurricanes and getting them into health data so quickly, yet you're using standards and APIs to interchange data in ways many other states aren't. What worked in your organization to get the right leadership to adopt your way of doing things, or your desire to do it this way? Oh, you're muted.

It probably comes down to North Carolina being ready for this change. The chief data officer position was only created a few years ago, so they had recognized the importance of data. And then there's that culture shift to say, we can't just wing this; there are these things called data standards, and we want to use them. It wraps around a bunch of different areas. Before COVID hit, we were looking at using FHIR for Medicaid claims and doing bulk FHIR for that. I talked about the med surge data, but another thing we're trying to automate is getting the epi surveillance data — getting away from faxing PDFs one at a time and having to hand-enter them, toward a standardized CSV file. It's not elegant, it's not an API per se, but even just a CSV file with line-level data helps. We are pretty good at knowing the demographics of the people who are getting COVID in our state, and we know it's very disproportionate.
We're not nearly as good at knowing the demographics of who's being tested, because historically, if you're positive, you get followed up on and we get that data. For the negative tests, they haven't necessarily collected the race, the ethnicity, the age, all of that — and part of it is because someone had to do the work to enter it. So we're trying to get that automated and to use standards, so that we don't end up down the road with MS, male, female, one, zero, whatever it is. So we're getting there.

That's actually very impressive. Subha, you mentioned training, and my biggest worry is that we are creating many different repositories of data within our own organizations and then a lot of national consortia. Do we have enough people who even know how to use any of this data? I'm a little worried that we might at some point have more data sets than trained analysts. We're not there yet, I know. But how do we get more of these folks trained, specifically on COVID data too?

Yeah, and thank you, Atul, for that question — which also leads to the answer. I mean, we definitely don't have enough people who are skilled and knowledgeable to deal with all the problems that we're facing. One thing we think about a lot is that you can have well-intentioned bad actors, right? People who actually want to do the right thing, but they don't know how, or they don't have the right policies or the right guidance or the right training. So I think training the next generation on how to deal with all of these data challenges, thinking about data ethics, et cetera, is absolutely critical. At last count, when we were assembling our master's program at Georgetown, there were over 30 such programs around the United States, and I don't think that we collectively are producing enough people who can do this. And in some ways, I think about the people coming out of these master's programs not so much as technicians — you don't want them to just take somebody's guidance and run the R code or the Python code. You want them to be at the table, really designing the experiments, designing the studies, asking the right questions, and aligning with the business goals.

To your question, if I may, I just wanted to add something about how you get buy-in — not relating to training, but how you get buy-in. I think it's a daily conversation. Tell talked about data governance, and I can't emphasize enough how important this is, because every leader has a different goal. One person may be thinking only about sending the data to CMS or to another federal agency; there might be a research leader whose goal is to raise money for the university for research. So governance — and continuous discussion, talking with the leaders to understand what their agendas and goals are and aligning with them — is so important to move our work forward with regard to COVID.

No, that's good. You got a lot of nods from the panelists with what you just said. At the same time, it's kind of ironic: we've gone 70 minutes and only just now used the word governance. It should have been in the first minute, no doubt. Andrew, you had another kind of unique battle, trying to get buy-in at an incredibly busy medical system. Any thoughts?

Well, like I said, the human resource buy-in was there at our cancer center.
And then from the very top executive level, there was actually incredible buy-in, because, as I mentioned, Hackensack Meridian is traditionally more of a community center, not a true academic center, although they're striving to be more like one. And so when we brought them this idea and then showed them we could create this database, there was a lot of, I guess, hype around what we could produce. But then it really goes to Subha's point: we really had to put our foot down, because there were really not that many data scientists, and we did not want bad data to go out there. Everyone's heard the stories in the last couple of months. We were trying to publish, but we wanted to put out results that were actually scientifically sound and ethical, that had all the right measures. And there were definitely times when I was challenged with, oh, Andrew, maybe you should put this out now, when it wasn't ready to be put out. So we almost had too much buy-in, honestly, from that standpoint. We're a very, very large corporation, with a lot of doctors here, but we don't have enough data scientists or statisticians in our current network. We're trying to work on that.

That's great. All right, so we've got a bunch of questions from the audience — we have well over 100 attendees here — and I've sorted them a little bit. I'll pick on two of you for each, but the others can join in too. This first one I'm going to direct to Tell and Jessie. The question is from Janouse Yazani: can you comment on the sensitivity and specificity of COVID-19 definitions themselves? Now, she asked about the NCATS data, but I think it's a general question. How do we even define a COVID patient from the data point of view? What are the challenges? I know we can't even agree on a name, but can we even yet agree on what's a COVID patient? The positives and negatives keep coming and going across time. Tell first — any thoughts on how to define a COVID patient?

That's a great question, thank you. We've had a lot of discussion about that in starting to write about the N3C data. First of all, I want to recognize that the phenotype group has done a ton of work to try to make a broad set of case and control definitions, of which there are four; it sounds like the question asker may have already found them on GitHub. There's a confirmed positive — PCR positive; I think there's less debate about that. A confirmed negative — PCR negative; and of course there are temporal issues there: you could be negative and then positive, or the reverse. And then these suspected or possible positives. I think the goal of having those was to have a really inclusive data set and then let the analysts and data scientists in the enclave decide what the right cohort is for their scientific and clinical questions.

That's great. Jessie, how do you define a COVID patient?

I don't know. From the public health perspective, it comes down to it being a reportable disease: if it was reported to us as a positive case, it's a case. Unfortunately, to your point, I suspect that different organizations may be defining that differently, but there's not a lot we can do about that. It is an important caveat, though, if you're doing research using all of the data together.

That's great. All right, the second question comes from Bill Hersh. I'm going to direct it to Andrew and Subha.
All right, the second question comes from Bill Hersh, and I'm gonna direct it to Andrew and Subha. He said, great informatics talks, which I already knew. How do we know that the data isn't being skewed towards academic medical centers? And how do we know that what we're collecting is generalizable? Andrew and Subha, given that your medical centers take very distinct sets of patients, do you believe what you're learning is generalizable to others across the country, across the world? I can start. That's definitely our hope, right? We hope that everything we find in our research studies is generalizable, but of course we have to work within the framework of the data sets that are available. And there are, of course, a number of statistical approaches, and I'm preaching to the choir here, a number of statistical approaches to ensure that the cases and controls we select for any study are in fact the appropriate groupings for that particular analysis. So for example, for the SARS-CoV-2 versus NL63 analysis that I showed, we're pretty much just going with the data sets that are available, because the pipeline only utilized RNA sequencing data. And Jessie was just saying this is a reportable disease, but you don't have RNA sequencing data on all of those patients, so it's hard to generalize to the entire population that has COVID. But we are hoping this is one of many steps that can get us to the right policies and the right medications for COVID. Excellent. Andrew? Yeah, for our network, we actually did have a data field where we would distinguish academic from community centers, and to count as academic, a center had to have quaternary care but also fellowship training and residency training. Really only one of our 13 centers fit that, so most of our hospitals are community hospitals, at least by how we defined it. Personally, I think our data set is a little bit hard to generalize from, because it truly represents the very beginning of the epidemic, at least within the country, and really for North Jersey, when we were at capacity and at surge. And care definitely varied over time, so it will be more interesting as more data sets are reported out in different settings. But I think you do have to take it with the caveat that, yeah, there are a lot of different variables you have to pay attention to when you look at the data. Sorry, just to add to that, I apologize: there are a large number of community health centers, including many in highly affected regions, who are in the pipeline to submit data to N3C. That's a great point. I was gonna say exactly the same thing. You're right, most academic medical centers, and also the consortia you're connected to, Tellen, do include community hospitals, affiliates, and practices. That is the modern medical center now; I think it includes outlying hospitals as well. Tellen, that's a great point, thank you. All right, so Steve Blackhoff, of course, mentioned standards, so I'm gonna bring that up: what about standards? Everyone knows the aphorism that the best thing about standards is that there are so many to choose from. We have Epic and Cerner, proprietary data sources or vendor-neutral formats. Already during this panel we've talked about HL7 and OMOP, we mentioned FHIR, but there's also RxNorm for drugs, including drugs we haven't even invented yet, LOINC codes for the testing, UMLS concept codes. What do you think about standards? Were there enough of them?
I mean, Jessie, I'm not sure I've seen an ontology of beds like the one you just showed in your graphic. I wanna throw this over to Jessie and maybe Subha. Jessie, you mentioned the standards here. What do you think: do we still need more standards, or do we have plenty of them? We don't need more. I sometimes show a slide that says, friends don't let friends develop new standards. We don't need more; we need people to use and improve and maintain the ones that we have. It's tough, because everyone has their favorite and everyone has a different reason for using it, and even experts don't always know everything about them, so it's really hard to evaluate which standard is the right one for me. Melissa Haendel and Susanna Sansone and I wrote a paper a few years ago specifically on omics data standards, but it applies here as well: it depends on the use case you're looking at, and doing research might have a different use case than doing public health surveillance. And I hope I don't offend anyone, but I was hoping to see ONC and CDC step up and give a little more guidance on what people should do. I should say, I feel like I'm in some ways living under a rock, so if they have done that recently and I just haven't seen it, then I apologize. But I think there are standards out there we can and should be using, and it would be great to get everyone on the same page with that. Yeah, I would just very briefly add another standards joke: standards are like toothbrushes, right? Everybody has one, but nobody wants to use somebody else's. Jokes apart, there are too many standards, and there are now a lot of mapping efforts going on; Tellen mentioned some of them, and the FDA has some mapping efforts as well. What I hope for as a data scientist is that there's enough mapping that humans don't have to worry about which standard they're using: you use one, and you're automatically mapped to another system that uses a different standard. And that mapping should be transparent and fully trustworthy to the human. I think that's the ideal stage we wanna be at. I don't think we're there yet, but we're getting there. I do wanna say kudos to... Oh, go ahead. Sorry, I just wanna say kudos to LOINC. They really jumped on it and got the needed codes out there and publicized, and that was great to see. Great point. LOINC really stepped up, including with manufacturer-specific codes, and tracking the manufacturers is getting more and more important. I was just gonna ask Tellen as well, since you have to deal with a lot of standards too: do you have enough of them, or too many, or not enough? No, I agree with Jessie and Subha, and I wish I could remember those jokes when questions like that come up; I'm gonna try to next time. I will admit some tension between my informaticist brain and my clinician-scientist brain, and I always try to remind my informatics colleagues that the end goal is how soon we can make things that help people, not just to standardize and standardize and harmonize and harmonize. So yeah, that's an internal tension. Excellent, all right.
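[Editor's note: a minimal sketch of the transparent local-to-standard mapping Subha hopes for, assuming hypothetical local lab codes; 94500-6 is one of the SARS-CoV-2 RNA PCR codes LOINC published early in the pandemic, but the mapping table itself is an illustration, not a real crosswalk.]

```python
# Map hypothetical local lab codes to LOINC, failing loudly on gaps so an
# unmapped code halts the pipeline instead of silently passing through.
from typing import Tuple

LOCAL_TO_LOINC = {
    "LAB-COV-PCR": ("94500-6", "SARS-CoV-2 RNA Resp Ql NAA+probe"),
    # ...one row per local code, maintained and versioned like source code
}

def to_loinc(local_code: str) -> Tuple[str, str]:
    """Return (LOINC code, display name) for a local lab code."""
    try:
        return LOCAL_TO_LOINC[local_code]
    except KeyError:
        raise ValueError(f"No LOINC mapping for local code {local_code!r}")

code, name = to_loinc("LAB-COV-PCR")
print(code, "->", name)
```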
So we've got about six minutes left before we have to turn it back over to Helen, so we're gonna get to quicker answers and quicker questions. The next two have to do with the people, the teams actually. John Matthew asked: do any of your teams include cognitive psychologists or human factors experts for making sure the processes work well? I don't know about you all, but our data highways in our setup are constantly getting backed up; we need detours, there are traffic jams, and it's hard to automate this completely. Every morning is a new adventure. Do any of your teams include cognitive psychologists or human factors experts? Just chime in; I'm not seeing too many nods here. We have someone who does user experience kinds of things, but I would say she's been underutilized in this. She's been part of things and trying to chip in here and there, but everyone is doing things they don't normally do. Everything is moving at breakneck speed. We were having conversations at one point where we'd say, no, I'm thinking really long-term here, like two days from now. Thankfully that has slowed down a bit, but everything just needs to be done now, and user experience and usability are often the first things to go when that happens. Any other thoughts on that one? Go ahead, Subha. I completely agree. On my other projects, we routinely engage the MedStar Health National Center for Human Factors Engineering, and they're a great group of people who think about usability, eye-tracking studies, et cetera, and how best to present information. With the COVID projects, there's no time to engage human factors and usability experts. We're just putting stuff out there and hoping that it's the right format and that we'll collect the right data and do the right analysis. We would love to engage the experts, but not now. I have a related question, because many of us stood up teams quickly. Susannah Fox asks: how are you supporting and/or renewing your workforce, your teams, as the pandemic grinds on? I've been saying this a lot, but this started as a sprint, and with whatever energy you have left, now you're starting a marathon. Any thoughts? Anyone want to chime in on how you're going to renew or support your teams? These are great questions. Well, yeah, we struggle with this. Our team was already small to begin with, but we lost our project manager for personal reasons, which was really tough for me because I had to pick up some of that slack. But in general, at least for Stuart and me, we really had to take a step back and stop working weekends, which we had been doing for two months straight, because we're the key investigators for this data and we couldn't afford to burn out. And we've kind of staggered things, trying to give breaks to our data abstractors and not overwhelm them, so that when we go back and ask for more data, they're not completely burnt out. Your questions are really timely. Any other thoughts? There are pieces of the N3C that are going to have a little slowdown next week. Not all pieces, because things are happening, but people are aware of avoiding burnout as much as we're able. Yeah, especially when we're talking real time. On our teams here in California, we said we don't necessarily need this tweeted on the weekends. You know, we've still gotta have a life outside of here. Go ahead, Jessie, you were about to say something. I keep hearing people talk about the people who work at DHHS working tirelessly, and I keep wanting to say, what makes you think we're not tired? The people I work with: Secretary Mandy Cohen is just phenomenal.
She and Governor Cooper and our Chief Medical Officer, Dr. Betsy Tilson, these incredibly dedicated people, make you want to work hard, but there is definitely that tension. And I try to say to my team, I'm not going to promise we'll never work evenings and weekends, sometimes we need to, but I try very hard to make it only when we need to. For a while, though, it was all the time. And as I've mentioned, I'm off this week because I'm moving, and that was partly a mental health thing; I just needed to step away a little bit. Definitely, mental health is going to be important here, because this is going to keep going on. All right, we're entering the lightning round. So, Shikha Kothari asked about telehealth data. We have a ton of telehealth data in our setup; we're just trying to look at which clinics, which specialties, successes, failures, race, ethnicity, social determinants. Anyone want to say a couple of seconds' worth about telehealth data in any of your setups? Go ahead, Jessie. From public health, we have surveillance that typically hits just the EDs, the emergency departments. And we're trying to make a push to get not just that, but telehealth, outpatient, ambulatory, because a lot of people are trying to steer clear of the EDs, and we need to understand where the outbreaks are happening without just the emergency department data. Telehealth is an important part of that. Yeah, I think we're all going to learn a lot about telehealth from all of this data that's been captured. We've run registries before in some ways, but telehealth data, maybe nobody really had before. A quick answer, Tellen, on synthetic data, if you can. Shikha Kothari also asks about machine learning to generate synthetic data. I know I'm going to be working with you on synthetic data, but do you want to put in a sentence or two on what synthetic data is and where you think it would be useful? Yeah, the goal of synthetic data is to have data that can be shared more broadly, or even very broadly, that would generate the same or equivalently accurate and useful models, without the governance and regulatory barriers to accessing real data. There are a variety of very complicated technical approaches to generating it; I don't think the field has settled on one. And the synthetic data pilot, led by Philippine and you and others, is hoping to get a really workable set of synthetic data from parts of the N3C that could be shared more broadly. It's a pilot, so we're not there yet. That's great.
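[Editor's note: a deliberately naive sketch of the idea Tellen describes, assuming that fitting simple per-column distributions and sampling from them is enough to illustrate the concept; real synthetic-data methods, such as generative models with privacy guarantees, are far more sophisticated, and independent sampling like this destroys the cross-column correlations that make the problem hard.]

```python
# Fit trivial per-column distributions to a few (fake) patient rows, then
# sample new rows that correspond to no actual patient. Toy example only.
import random

real_rows = [
    {"age": 34, "pcr_positive": True},
    {"age": 71, "pcr_positive": True},
    {"age": 52, "pcr_positive": False},
]

def synthesize(rows, n, seed=0):
    rng = random.Random(seed)
    ages = [r["age"] for r in rows]
    mean_age = sum(ages) / len(ages)
    p_pos = sum(r["pcr_positive"] for r in rows) / len(rows)
    return [
        {
            "age": max(0, int(rng.gauss(mean_age, 12))),  # 12 = assumed SD
            "pcr_positive": rng.random() < p_pos,
        }
        for _ in range(n)
    ]

print(synthesize(real_rows, 2))
```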
All right, we've got about two minutes left, so we're just going to do a lightning round, and we'll go in the original order. Final thoughts, and specifically: do you feel like your efforts, your career, really took off because of COVID? And in that vein, a sentence or two or three on what life is going to be like after COVID for you and the infrastructure you set up. I know at the University of California we're certainly going to keep this going past COVID, but we've still got to muddle through this. Andrew, let's start with you. Final thoughts on what life is going to be like for you and your infrastructure after COVID? Yeah, I'm probably the most junior person here, by far it sounds like. I'm an oncologist by training, and this has taken up most of this year already, so it seems like I'm hopefully going to make a mark doing research in COVID. We're trying to do more and more oncology, what we were already meant to do as a cancer center, oncology research, and that's what I'm hoping to transition more and more to, along with outcomes work there. But there's still a lot of work to be done with our COVID database and the collaboration I mentioned at the end of my presentation. So we're still very excited to keep going here. That's great. Subha, a sentence or two on what's next after COVID for you. Yeah, I think COVID has really brought a lot of things into focus for me personally: who are the right people to rely on, what is the right infrastructure to rely on, what are the right processes. Because, as we talked about, the timeline has shrunk, and you can now experiment with everything within a short period of time because of the needs of this global pandemic. So what I'm hoping is that post-pandemic, we can leverage the teams we have assembled to solve other bigger, greater problems. That's really my hope, because things have become much clearer in terms of how to make progress, and let's make that progress in other domains as well. Thank you. Jessie, you somehow got all the North Carolina hospitals to share data. What's next for you after COVID? So, before COVID hit, we were all about Medicaid transformation, and the General Assembly in North Carolina had seemed near hopeless, but just this week they passed the bill that says we're going to do Medicaid transformation. COVID is not going away, but we'll get back to that as well. And there's been a lot of focus, again with the visionary leadership of Secretary Cohen, on social determinants of health, which turn out to be important both in Medicaid and in COVID. So there are going to be some interesting problems to solve. That's great. And Tellen, I think you're building a repository that's gonna stick around for many years, but can you imagine what's gonna come after COVID for you and this whole infrastructure? Do you see it expanding to other disorders and conditions? Well, I'll echo Jessie's comments about the visionary leadership of NCATS and other pieces of the NIH, and of Melissa and Chris and others at CD2H, in getting this done. I think there really is a hope that it's an exemplar of the impact that a multicenter, really dense data source like this can have on people's lives and their health. So my hope is that it does sustain itself. That's great. Well, I hope all of you in the audience, all you attendees, can join me in thanking these panelists. I learned an incredible amount in the last 90 minutes. Thank you all for your wisdom, especially on busy days. And let's throw it back to Helen for closing comments and an introduction to next week's talk. Thank you. Wow, thank you all. That was an incredible amount of data; my fingers hurt from trying to tweet all of it, but it was just an extraordinary amount of information. Special thanks, and thanks to the Moore Foundation for funding this work. We're delighted that we're going to be able to continue with four more of these over the summer. And if you could actually just go to the next slide, I'm not sure I have the controls back yet, Julia. These are the upcoming webinars we've got listed out; at least the next two dates are already identified. And if you can go to the next slide, since I see we're running out of time.
I just wanna introduce the topic for our next session next week, which the organizers have interestingly titled Reflecting on Our COVID-19 Failures: A New Vision for Integrated Registries, and which will include several of our CMSS members with clinical registries: Cliff Ko from the American College of Surgeons, Greg Martin, President-elect of the Society of Critical Care Medicine, and Liz Garrett-Mayer from ASCO. We're also hoping Michael Howell will join us and give us his vision from Google of how an integrated structure can really move forward. So, staying on the positive note of Atul's last question, it's very clear that so much of what you've described today should be looked upon as a reusable rocket, whether that's for the next phase of COVID-19, for addressing disparities, or whatever the case may be. Thank you for your leadership and for all your remarkable ideas here, and we'd love you, if you could, to complete an evaluation at the end of this in the GoToWebinar box. And again, thank you, Atul, for your masterful invitations to this incredible panel and your masterful facilitation of the Q&A. Thank you all so much, and we look forward to seeing you next time. Thanks again. Thanks, everyone. Bye.
Video Summary
The webinar "COVID-19: Accelerating Real-Time Electronic Data Capture for Tracking, Learning & Improvement" featured a series of presentations and discussions on the critical role of clinical registries during the COVID-19 pandemic. Hosted by Helen Burstyn, CEO of the Council of Medical Specialty Societies, the session was the second in the series "Advancing Clinical Registries to Support Pandemic Treatment and Response," funded by the Gordon and Betty Moore Foundation in collaboration with the AAMC.

Key presentations included:

1. **Dr. Atul Butte** (UCSF) discussed the transformation of the University of California's health data systems to respond to COVID-19, including centralizing clinical data across multiple campuses, updating data daily instead of monthly, and making results publicly accessible via Twitter.

2. **Dr. Andrew Ip** highlighted efforts at the John Theurer Cancer Center in New Jersey, including the creation of a comprehensive COVID-19 clinical registry, with an emphasis on manual data collection and the rapid deployment of a large database.

3. **Dr. Subha Madhavan** (Georgetown University Medical Center) shared insights on multiple data-driven projects, including the Teraworld Registry for thoracic cancer patients, an AI tool for organizing COVID-19 scientific data, and an immunogenomic registry comparing immune responses between different coronaviruses.

4. **Jessie Tenenbaum**, Chief Data Officer for North Carolina's Department of Health and Human Services, spoke about automated medical surge data collection, emphasizing the adoption of standards and public-private partnerships to streamline data reporting from hospitals.

5. **Dr. Tellen Bennett**, representing the National COVID Cohort Collaborative (N3C), detailed the rapid development of a multicenter clinical data repository leveraging federated data models and partnerships across academic and community health centers.

Discussions during the Q&A segment touched on data standardization challenges, the importance of including community health data, data governance, and the sustainability and post-pandemic application of these newly developed infrastructures.

Overall, the webinar underscored the critical need for rapid data collection, standardization, and collaboration across health systems to respond effectively to health crises, and it highlighted the importance of supporting and scaling these efforts for future healthcare improvements.
Keywords
clinical registries
COVID-19 pandemic
health data systems
data centralization
manual data collection
AI tools
immunogenomic registry
data standardization
public-private partnerships
multicenter data repository
data governance
health crises response
future healthcare improvements