Why Class IIa Software Needs More Than Rule 11 Compliance
I see it in almost every MDSW submission. The manufacturer checked Rule 11. The software is classified. The clinical evaluation report opens with a statement about software as a medical device. Then it follows the same structure as any Class IIa device. Same appraisal depth. Same literature template. Same equivalence logic. And the Notified Body sends it back with a list of twenty gaps.
The problem is not that Rule 11 was applied incorrectly. The problem is that manufacturers think Rule 11 defines the clinical evaluation strategy.
It does not.
Rule 11 defines classification. Classification defines conformity assessment route. But classification does not define what clinical evidence looks like or how it should be generated. That is where the confusion begins. And that is where most MDSW clinical evaluations fail.
What Rule 11 Actually Does
Rule 11 gives you a classification outcome. If your software provides information used to take diagnostic or therapeutic decisions, or drives and influences those decisions, you land in Class IIa or higher, with the class rising according to the severity of the potential impact. The rule is clear on boundaries.
What it does not tell you is how to build clinical evidence for software that has no physical interaction, no direct tissue contact, and no conventional performance metrics.
Manufacturers assume the clinical evaluation framework for physical devices applies directly to software. That assumption drives most of the deficiencies I review.
The typical result is a clinical evaluation report that treats MDSW like a physical device with software components: the appraisal focuses on clinical outcomes of the medical condition, not on the performance and safety of the software itself.
So what changes when the device is software?
The Evidence Question Shifts
For a physical device, clinical evidence often centers on biocompatibility, mechanical performance, and clinical outcome in the anatomical site of use. The literature appraisal searches for comparable devices in similar indications.
For MDSW, the evidence question is different. You are not asking whether the clinical condition can be treated. You are asking whether the software performs its intended function safely and effectively in the hands of the intended user.
That means the appraisal must address software-specific risks. Algorithm accuracy. User interaction errors. Data input variability. Integration with other systems. Version control and updates. These are not secondary considerations. They are the core of clinical safety and performance for MDSW.
But I still see clinical evaluation reports that dedicate pages to the clinical background of the disease and two paragraphs to software performance. That structure will not pass review.
Why Equivalence Becomes Problematic
Equivalence is already difficult for physical devices. For software, it becomes nearly impossible.
To claim equivalence, you need to demonstrate that the clinical, technical, and biological characteristics are equivalent. For software, the technical characteristics include the algorithm, the data it was trained or validated on, the user interface, the intended user, and the clinical workflow.
If your software uses a different algorithm, processes different data types, or integrates into a different clinical pathway, equivalence cannot be claimed. Even if the intended purpose sounds similar.
I have reviewed files where manufacturers claimed equivalence between two diagnostic support tools because both addressed the same disease. But one used rule-based logic and the other used machine learning. One was designed for specialists and the other for general practitioners. One required manual data entry and the other integrated with hospital IT systems.
None of that is equivalent.
For MDSW, equivalence is not about the medical condition. It is about the software function, the algorithm logic, the data characteristics, and the user interaction model. If any of these differ, you need your own clinical evidence.
Which brings us to what that evidence actually looks like.
What Clinical Evidence Means for MDSW
MDCG 2020-1 provides guidance on the clinical evaluation of medical device software. Its principles are clear: clinical evidence must demonstrate safety and performance in the intended use, built on a valid clinical association, demonstrated technical performance, and demonstrated clinical performance.
For software, that means you need data that shows the software performs correctly under real-world conditions. Not just that the algorithm was validated in a lab. Not just that the clinical condition is well understood. But that your specific software, with your specific algorithm and interface, performs as intended when used by your intended users.
This is where clinical investigations or performance studies become necessary. You cannot rely entirely on literature that describes the medical condition or even other software tools. You need evidence specific to your device.
The Role of Performance Studies
For many MDSW products, especially those in Class IIa or higher, you will need a performance study. That study should evaluate the software in a setting that reflects real use.
If the software is intended to support diagnostic decisions, the study should assess diagnostic accuracy, including sensitivity, specificity, positive and negative predictive values. If it provides treatment recommendations, the study should evaluate the appropriateness and safety of those recommendations.
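To make the accuracy side concrete, here is a minimal sketch of how those metrics are computed from a two-by-two table of software output against the reference standard. The counts and the function name are purely illustrative and not taken from any particular study.

```python
# Minimal sketch: diagnostic accuracy metrics from a 2x2 contingency table.
# The counts below are hypothetical and only illustrate the calculations
# a performance study report would typically present.

def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common diagnostic accuracy metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }

# Example: software output vs. reference standard for 1,000 hypothetical cases.
metrics = diagnostic_accuracy(tp=180, fp=40, fn=20, tn=760)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The point is not the arithmetic. It is that the study protocol must define, up front, which of these measures the software is claiming and against which reference standard they will be judged.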
The study should also capture usability issues. Software safety is not just about the algorithm. It is about how users interact with the output. Misinterpretation of results, over-reliance on the software, or failure to recognize limitations are all safety risks.
I see studies that test algorithm performance in ideal conditions but ignore user behavior. That is not sufficient. The software does not operate in isolation. It operates in a clinical workflow with real users under time pressure and cognitive load.
A recurring deficiency is the performance study that evaluates the algorithm on controlled datasets but never assesses how the software performs in real clinical use with the intended users. Usability and human factors are treated as separate from clinical evidence, when they are central to it.
What About Literature?
Literature still plays a role. You can use it to establish the clinical background, the current standard of care, and the clinical validity of the underlying method.
For example, if your software uses a well-established clinical score or guideline, you can reference the literature that supports that score. If your software applies a diagnostic method that is already validated, you can appraise that evidence.
But you still need to bridge the gap between the general method and your specific implementation. If the literature describes a diagnostic method performed by clinicians, and your software automates that method, you need to show that the automation does not introduce error or reduce accuracy.
That bridging cannot be done by assumption. It must be done with data.
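In its simplest form, that bridging data is a paired comparison of the software's output against the clinicians' assessment on the same cases. The sketch below is a hypothetical illustration of such an analysis; the case data and the agreement target are invented for the example.

```python
# Minimal sketch of a bridging comparison, assuming paired readings exist for
# the same cases: the software's call and the clinicians' call. The data and
# the informal agreement check are illustrative only.

def agreement(software: list[bool], clinician: list[bool]) -> tuple[float, int]:
    """Overall percent agreement and number of discordant cases."""
    assert len(software) == len(clinician), "paired readings required"
    discordant = sum(1 for s, c in zip(software, clinician) if s != c)
    return 1 - discordant / len(software), discordant

software_calls = [True, True, False, False, True, False]
clinician_calls = [True, True, False, True, True, False]

rate, discordant = agreement(software_calls, clinician_calls)
print(f"agreement: {rate:.1%}, discordant cases: {discordant}")
```

A real protocol would go further, with a pre-specified acceptance criterion and a review of every discordant case, but the principle stands: the comparison has to be made on data, not assumed.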
Classification Defines Scrutiny, Not Evidence Structure
Here is the part that manufacturers misunderstand. They think that because the software is Class IIa, the clinical evidence requirements are lighter than for Class III.
That is true in terms of regulatory pathway. Class IIa goes through a less intensive conformity assessment than Class III. But it does not mean the clinical evaluation can be shallow.
What changes with classification is the level of regulatory scrutiny and the depth of independent review. But the clinical evaluation must still answer the same fundamental question: does the device perform safely and effectively in its intended use?
For MDSW, that question is often harder to answer than for physical devices. Because the risks are less visible. They are not mechanical failure or biocompatibility. They are incorrect outputs, user misinterpretation, and integration errors.
So while the pathway may be less burdensome, the evidence burden is not necessarily lower. It is just different.
Classification determines the conformity assessment route. It does not reduce the need for robust clinical evidence. For MDSW, the evidence must address software-specific risks, even if the device is Class IIa.
What Reviewers Look For
When I review an MDSW clinical evaluation, I look for a clear line from the intended use to the evidence.
I want to see that the manufacturer understands what the software does and what could go wrong. I want to see that the literature appraisal is focused on the software function, not just the clinical condition. I want to see performance data that reflects real use.
I also look for evidence of ongoing performance monitoring. Software is not static. It can be updated. It interacts with other systems that change. The clinical environment evolves. PMCF for MDSW must capture those dynamics.
That means your PMCF plan should include mechanisms to track software performance in the field, not just adverse events. It should monitor diagnostic accuracy over time. It should capture user feedback on usability and workflow integration. It should track software updates and assess their impact on clinical safety.
Most MDSW PMCF plans I see are generic. They describe complaint handling and literature monitoring. That is not enough. For software, PMCF must be active and data-driven.
The typical finding is a PMCF plan for MDSW that includes no specific methods to monitor software performance, such as diagnostic accuracy tracking, user interaction analysis, or assessment of the impact of software updates.
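As a sketch of what active, data-driven PMCF can mean in practice, consider a routine check of field performance per software version. Everything here is hypothetical: the data model, the field names, and the 0.90 sensitivity threshold stand in for whatever performance claim the clinical evaluation actually makes.

```python
# Minimal sketch of an active PMCF check, assuming the manufacturer collects
# field cases with the software version, the software's output, and the
# confirmed diagnosis (reference outcome). Names and thresholds are
# illustrative, not prescribed by any MDCG guidance.

from collections import defaultdict
from dataclasses import dataclass

@dataclass
class FieldCase:
    software_version: str
    predicted_positive: bool
    confirmed_positive: bool

def sensitivity_by_version(cases: list[FieldCase]) -> dict[str, float]:
    """Per-version sensitivity from post-market field data."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0})
    for c in cases:
        if c.confirmed_positive:
            key = "tp" if c.predicted_positive else "fn"
            counts[c.software_version][key] += 1
    return {
        version: v["tp"] / (v["tp"] + v["fn"])
        for version, v in counts.items()
        if (v["tp"] + v["fn"]) > 0
    }

# Hypothetical field data: flag any version whose sensitivity drifts below
# the value claimed in the clinical evaluation (0.90 here as an example).
cases = [
    FieldCase("2.1.0", True, True),
    FieldCase("2.1.0", False, True),
    FieldCase("2.2.0", True, True),
    FieldCase("2.2.0", True, True),
]
for version, sens in sensitivity_by_version(cases).items():
    status = "OK" if sens >= 0.90 else "INVESTIGATE"
    print(f"version {version}: sensitivity {sens:.2f} -> {status}")
```

The mechanism matters more than the tooling. What reviewers want to see is that field performance is measured against the claims in the clinical evaluation, version by version, and that a drop triggers a defined response.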
Moving Beyond Compliance Thinking
The manufacturers who succeed with MDSW clinical evaluations are the ones who stop thinking in terms of compliance checkboxes.
They do not ask what the minimum evidence is to satisfy a Class IIa classification. They ask what evidence demonstrates that their software performs safely and effectively in the hands of its intended users.
Frequently Asked Questions
What is a Clinical Evaluation Report (CER)?
A CER is a mandatory document under MDR 2017/745 that demonstrates the safety and performance of a medical device through systematic analysis of clinical data. It must be updated throughout the device lifecycle based on PMCF findings.
How often should the CER be updated?
The CER should be updated whenever significant new clinical data becomes available, after PMCF activities, when there are changes to the device or intended purpose, and at a frequency aligned with the device's post-market surveillance cycle: at least annually for Class IIb and Class III devices, and at least every two years for Class IIa.
What causes CER rejection by Notified Bodies?
Common reasons include inadequate equivalence demonstration, insufficient clinical data for claims, poorly structured SOTA analysis, missing gap analysis, and lack of clear benefit-risk determination. Structure and logical flow are as important as the data itself.
Which MDCG guidance documents are most relevant for clinical evaluation?
Key documents include MDCG 2020-5 (Equivalence), MDCG 2020-6 (Sufficient Clinical Evidence), MDCG 2020-13 (CEAR Template), MDCG 2020-7 (PMCF Plan), and MDCG 2020-8 (PMCF Evaluation Report). For software specifically, MDCG 2019-11 (Qualification and Classification of Software) and MDCG 2020-1 (Clinical Evaluation of MDSW) are essential.
Need Expert Help with Your Clinical Evaluation?
Get personalized guidance on MDR compliance, CER writing, and Notified Body preparation.
✌
Peace, Hatem
Your Clinical Evaluation Partner
Follow me for more insights and practical advice.
Deepen Your Knowledge
Read Complete Guide to Clinical Evaluation under EU MDR for a comprehensive overview of clinical evaluation under EU MDR 2017/745.





