Why not use ARDS as the target variable?

What follows is a question posed to our CEO about the Vulnerability Index in a forum for a different COVID-19 project. It’s a great question, so I am repeating it here along with the full response. To the person asking the question, let me begin by saying THANK YOU. If you want to identify yourself later, that would be great (but far be it from me to violate privacy…!)

The Original Question

We’re assuming the best proxy is ARDS and have been using ARDS as our target variable. I’m more familiar with ICD-10 than CCSR. Why choose a (seemingly) more broad definition of the target variable?

My (rather long-winded) Answer…

Background on Choosing Diagnosis Codes for Outcomes
The choice takes several considerations into account. Thus far (and these can change as more data becomes available), they are the biological and medical dynamics of COVID, the availability of a common denominator with respect to data (and the limitations therein), and the potential use cases for the results.

Without COVID-19 data, we clearly need proxy endpoints. This could change as COVID data becomes available but for now, it’s Hobson’s choice.

The final endpoints had to satisfy several criteria before we used them. Before these criteria were applied, our choices were constrained by the nature of the data we had and what that data was capable of reliably saying.

Data Limitations and Realities
We wanted data that was widely accessible, readily available, and commonly understood. That meant claims. Claims data has been and continues to be extremely useful. Its role with ACOs, in Blue Button efforts, in research, and in myriad other CMS initiatives has further extended its reach.

But claims data has limits, particularly in acute care settings with CDS needs that demand real-time (or near real-time) feedback loops. Claims data simply cannot tell us which of the ten patients just admitted to the hospital will need to be intubated in the next 5 hours.

Because of this, we decided our value lay in creating models that help teams prevent such admissions in the first place. As with similar PHM efforts, teams could use the models for targeted outreach to help safeguard people vulnerable to complications from respiratory infections.

Biological and Medical Dynamics
The C-19 Vulnerability Index predicts ‘close proxy’ events. COVID-19 is new, but it is a viral illness, and it has recognizable similarities to other respiratory infections that scientists can exploit to anticipate its mechanisms and pathophysiology.

Our criteria were that the choice must:

  • Be consistent with COVID-19 infection (i.e. respiratory infection)
  • Be supported by WHO guidelines and research on COVID-19 impacts
  • Have the same risk factors as those highlighted by the CDC
  • Be validated by physicians and medical officers from partner organizations

Specific Outcome Definition
Essentially, we are predicting which people are vulnerable to a serious respiratory infection in the near term. We chose not to distinguish between admissions and ICU stays and to focus instead on whether the infection was severe enough for someone to be admitted to the hospital and possibly the ICU. The distinction between these two scenarios is clearly central to what hospitals are facing right now. It is less relevant for teams trying to prevent admissions altogether (i.e. we don’t want to skip calling your grandmother simply because her admission was predicted to be ‘less bad’).

  • The outcome currently being predicted is a composite endpoint that includes pneumonia, influenza, acute bronchitis, and other respiratory infections
  • It is identified using ICD-10-CM codes grouped into the AHRQ Clinical Classifications Software Refined (CCSR) categories RSP002, RSP003, RSP005, and RSP006

The choice of codes was shaped by certain considerations:

  • Code accuracy (accuracy of the ICD-10 codes in reflecting the desired outcome)
  • Code position (principal diagnosis versus secondary position)
  • Code outcome rate (how common or rare the event is)

We confirmed that these codes have a high sensitivity and specificity. We also wanted these infections to be the reason for admission. Specifically, we wanted to avoid infections that were either present on admission or acquired during the stay.
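
To make the definition concrete, here is a minimal sketch (not our production code) of how an admission could be flagged. The ICD-10-CM to CCSR mapping below is a tiny illustrative subset that should be checked against the full mapping published by AHRQ, and the function names and the idea of passing in a single principal diagnosis code are assumptions made for this example.

```python
# CCSR categories that make up the composite endpoint (from the post above).
TARGET_CCSR = {"RSP002", "RSP003", "RSP005", "RSP006"}

# Illustrative subset only (an assumption for this sketch); a real pipeline
# would load the full ICD-10-CM -> CCSR mapping file from AHRQ.
ICD10_TO_CCSR = {
    "J189": "RSP002",  # pneumonia, unspecified organism
    "J111": "RSP003",  # influenza with other respiratory manifestations
    "J209": "RSP005",  # acute bronchitis, unspecified
    "J069": "RSP006",  # acute upper respiratory infection, unspecified
}


def normalize(code: str) -> str:
    """Drop the dot, surrounding whitespace, and case so 'J18.9' matches 'J189'."""
    return code.replace(".", "").strip().upper()


def is_outcome_admission(principal_dx: str) -> bool:
    """True when the principal diagnosis maps to one of the target CCSR categories.

    Only the principal diagnosis is checked, which reflects the requirement
    that the infection be the reason for admission.
    """
    return ICD10_TO_CCSR.get(normalize(principal_dx)) in TARGET_CCSR
```

Codes that are missing from the mapping simply come back as not flagged, so secondary diagnoses and unrelated admissions never count toward the endpoint in this sketch.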

With all of this now said, we finally come to your question about Acute Respiratory Distress Syndrome.

Finally, to your question…
Acute respiratory distress syndrome (ARDS) is an outcome that is clearly relevant to COVID-19. It is marked by acute respiratory failure that often requires mechanical ventilation, occurs often in cases of pneumonia, sepsis, aspiration of gastric contents or severe trauma, and is present in ~10% of all ICU patients.

We had considered respiratory distress codes early on. The difficulties we saw were:

  • For people admitted for respiratory infections, ARDS codes tend to appear in a secondary position, reflecting that progression to ARDS happens after admission.
  • Not all ARDS is relevant. Cases not associated with respiratory infections (e.g. sepsis, trauma, etc.) need to be excluded.
  • ARDS is a fairly rare event (even before excluding cases). Using it as the primary endpoint would make prediction quite difficult and wasn’t consistent with the use case we have been targeting.
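
The first two difficulties can be sketched in code. This is only an illustration under stated assumptions, not how we handle ARDS: J80 is the ICD-10-CM code for ARDS, the prefix list is a simplified stand-in for "respiratory infection" (J09 to J18 for influenza and pneumonia, plus J20 for acute bronchitis), and the function name and the split into one principal code plus a list of secondary codes are hypothetical.

```python
ARDS_CODE = "J80"  # ICD-10-CM code for acute respiratory distress syndrome

# Simplified stand-in for "respiratory infection" (an assumption for this
# sketch): J09-J18 cover influenza and pneumonia, J20 is acute bronchitis.
RESP_INFECTION_PREFIXES = tuple(f"J{n:02d}" for n in range(9, 19)) + ("J20",)


def normalize(code: str) -> str:
    """Drop the dot, surrounding whitespace, and case so 'J18.9' matches 'J189'."""
    return code.replace(".", "").strip().upper()


def has_infection_related_ards(principal_dx: str, secondary_dx: list[str]) -> bool:
    """True when ARDS appears in ANY position alongside a respiratory infection.

    ARDS is searched for in every position because it tends to be coded as a
    secondary diagnosis; requiring an infection code drops ARDS cases that
    stem only from causes such as trauma.
    """
    codes = [normalize(principal_dx)] + [normalize(c) for c in secondary_dx]
    has_ards = ARDS_CODE in codes
    has_infection = any(c.startswith(RESP_INFECTION_PREFIXES) for c in codes)
    return has_ards and has_infection
```

A stay with both sepsis and pneumonia codes would still pass this filter, so a real exclusion rule would need more care than this. The third difficulty, rarity, is not something code can fix: the stricter the filter, the fewer events remain to learn from.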

We’ll Be Updating the Models
Your question is wonderful! In talking about it further, while we would not choose to replace our current endpoints, we think there is value in adding ARDS to the existing endpoints. By doing so, we may capture cases not already flagged. Better yet, we may be able to create a way to distinguish between cases that are less versus more severe. And that’s really exciting!

We plan to include this change in an upcoming release. Stay tuned and thanks for your ideas!
