Why Your Sample Size Justification Gets Rejected
You submit a clinical investigation plan with 40 patients. The Notified Body comes back: insufficient sample size justification. You revise, add more detail, resubmit. They reject again. The problem is not the number. It is that you never answered the question they are actually asking.
In This Article
- What the Regulation Actually Requires
- The Questions Sample Size Must Answer
- The Role of Device Risk Classification
- Statistical Power and Clinical Relevance
- What Happens When the Sample Is Too Small
- Equivalence and Non-Inferiority Studies
- Adjusting for Dropouts and Loss to Follow-Up
- When Pragmatic Constraints Limit Enrollment
- How to Structure the Justification
- What Reviewers Look For
- Final Reflection
Sample size is not a number you decide. It is a conclusion you reach after defining what you need to demonstrate and what uncertainty you can tolerate. Most rejections happen because the justification works backward from a convenient number instead of forward from the clinical objective.
I have reviewed investigation protocols where the sample size section is reduced to a formula and a result. No context. No reasoning. No connection to the device risk profile or the claims being tested. The manufacturer thinks they provided justification. The reviewer sees a calculation without substance.
What the Regulation Actually Requires
MDR Annex XV, Chapter I, Section 3.2 states that the clinical investigation plan must include a justification for the number of subjects. The word is justification, not calculation. The difference matters.
A calculation is mathematical. A justification is clinical and regulatory. It explains why this number of subjects, with this design, under these conditions, will generate evidence sufficient to address the clinical questions and support the conformity assessment.
The regulation does not specify a formula because no single formula applies to all devices. A low-risk device with well-established technology requires a different approach than a high-risk implant with a novel mechanism of action. The justification must be built from the clinical evaluation work that preceded it.
Sample size justification is not a standalone calculation. It is the endpoint of your clinical evaluation logic. If your clinical evaluation report is weak, your sample size justification will collapse under review.
The Questions Sample Size Must Answer
Before you write the justification, identify the clinical questions the investigation must resolve. These questions come directly from the gaps in your clinical evaluation.
What safety outcomes must be observed to confirm acceptable risk? What performance endpoints must be met to support the intended use? What is the expected frequency of adverse events, and how many subjects are needed to detect them with reasonable confidence?
Each question has a statistical requirement. If you need to demonstrate that a complication occurs in fewer than 5% of cases, you cannot base that conclusion on 20 patients. If you claim equivalence to a device with a known success rate, your sample must be powered to demonstrate non-inferiority within a defined margin.
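As a quick sanity check (a sketch, not a substitute for a formal statistical plan): when zero events are observed in n patients, the one-sided 95% upper confidence bound on the true event rate is 1 − 0.05^(1/n), the exact version of the familiar "rule of three" (≈ 3/n). It shows why 20 event-free patients cannot rule out a 5% complication rate:

```python
def upper_bound_zero_events(n, alpha=0.05):
    """One-sided (1 - alpha) upper confidence bound on an event rate
    when 0 events were observed in n subjects: 1 - alpha**(1/n)."""
    return 1 - alpha ** (1 / n)

# With 20 patients and no complications, the true rate could still be ~14%
print(f"{upper_bound_zero_events(20):.3f}")
# Roughly 59 complication-free patients are needed before 5% is ruled out
print(f"{upper_bound_zero_events(59):.3f}")
```

The patient counts here are illustrative; the point is that the bound shrinks only as fast as 1/n, so "ruling out" a rate demands far more subjects than merely observing none.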
Reviewers do not reject sample sizes because they are too small in principle. They reject them because the number provided cannot reliably answer the clinical questions the device must address.
Manufacturers list primary and secondary endpoints but never explain how the sample size relates to each one. The calculation addresses one endpoint, and the rest are ignored. Reviewers see this immediately.
The Role of Device Risk Classification
Sample size expectations scale with risk. A Class I non-invasive device with decades of clinical history does not require the same evidentiary threshold as a Class III active implant with limited prior use.
But risk classification alone does not dictate the number. You must demonstrate how the sample size accounts for the specific risks identified in your risk management file. If your device has a risk of perforation, infection, or thrombosis, the sample must be large enough to observe or rule out these events with adequate confidence.
This is where many justifications fail. They reference the device class, state a sample size, and assume the connection is obvious. It is not. The justification must show that the proposed sample can address the risks with sufficient statistical and clinical confidence.
For higher-risk devices, this often means designing the study to detect rare but serious events. If a complication has an expected incidence of 2%, roughly 150 patients are needed for a 95% chance of observing even a single case; 50 patients give only about a 64% chance. If you cannot enroll that many, you must explain how you will manage the uncertainty.
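To put numbers on this (an illustrative sketch; the incidence rates and detection probability are assumptions, not regulatory thresholds): the chance of seeing at least one event in n subjects is 1 − (1 − p)^n, so the enrollment needed for a given detection probability follows directly:

```python
import math

def n_to_observe_one_event(incidence, probability=0.95):
    """Smallest n such that P(at least one event) >= probability,
    i.e. 1 - (1 - incidence)**n >= probability."""
    return math.ceil(math.log(1 - probability) / math.log(1 - incidence))

print(n_to_observe_one_event(0.02))  # 2% incidence -> 149 patients
print(n_to_observe_one_event(0.05))  # 5% incidence -> 59 patients
```

Note that this only answers "will we see the event at all?"; estimating its rate with a usefully narrow confidence interval requires more subjects still.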
Statistical Power and Clinical Relevance
Power calculations are essential, but they must be grounded in clinically meaningful thresholds. A study powered to detect a 10% difference in performance is only useful if 10% is the threshold of clinical relevance.
I have seen protocols where the power calculation is technically correct but clinically meaningless. The device is tested against a standard of care, and the study is powered to detect superiority with 80% power. But the margin of superiority chosen is arbitrary. No clinical rationale supports it. No connection to patient outcomes justifies it.
The threshold you choose must reflect what patients, clinicians, and regulators would consider a meaningful benefit or an acceptable difference. This is not a statistical decision. It is a clinical one. If you cannot defend the threshold, your sample size calculation is built on sand.
A sample size justified by a power calculation is only valid if the underlying assumptions are clinically defensible. Reviewers will challenge the expected effect size, the variance, and the clinical relevance of the margin.
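To illustrate how sensitive the result is to those assumptions, here is the standard normal-approximation sample size for comparing two proportions. All success rates below are invented for illustration; a real protocol would justify them from published data:

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided test of two proportions
    (normal approximation, unpooled variance)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_a + z_b) ** 2 * var / (p1 - p2) ** 2)

# Assumed control success 85% vs device 95%: ~138 per group
print(n_per_group(0.85, 0.95))
# Halve the assumed difference and n increases several-fold
print(n_per_group(0.85, 0.90))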
What Happens When the Sample Is Too Small
Small samples are not inherently wrong. They are wrong when they cannot answer the clinical questions being asked.
If your investigation includes 30 patients and the primary endpoint is device success, the study may be sufficient if the expected success rate is high and the confidence interval is acceptable. But if a secondary endpoint is to rule out a 5% complication rate, 30 patients cannot provide that assurance.
The problem is not the size. The problem is the mismatch between the sample and the objectives. Many manufacturers design studies with multiple endpoints, some of which require large samples and others that do not. The justification must address each endpoint explicitly or acknowledge that some questions will remain unanswered.
When a sample is insufficient, reviewers will ask how you plan to manage the residual uncertainty. This is where post-market surveillance and PMCF come in. If the clinical investigation cannot fully characterize rare events, the PMCF plan must be designed to close that gap. This must be stated in the justification.
Equivalence and Non-Inferiority Studies
If your device relies on equivalence to support safety and performance, the sample size calculation becomes more demanding. Non-inferiority studies typically require larger samples than superiority studies because the margin to be ruled out is usually smaller than the difference a superiority trial is designed to detect.
The justification must define the non-inferiority margin and explain why it is appropriate. Reviewers will scrutinize this. If the margin is too wide, the study becomes uninformative. If it is too narrow, the sample size may be impractical.
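The trade-off is quadratic: halving the margin roughly quadruples the required sample. A minimal sketch under assumed conditions (one-sided alpha of 2.5%, 80% power, an assumed 90% success rate in both arms; all figures illustrative):

```python
import math
from statistics import NormalDist

def n_non_inferiority(p, margin, alpha=0.025, power=0.80):
    """Per-group n for a non-inferiority test of two proportions,
    assuming both arms share success rate p (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return math.ceil(z ** 2 * 2 * p * (1 - p) / margin ** 2)

print(n_non_iferiority_demo := None) if False else None
print(n_non_inferiority(0.90, 0.10))  # 10% margin -> 142 per group
print(n_non_inferiority(0.90, 0.05))  # 5% margin  -> 566 per group
```

This is exactly why a margin chosen for feasibility rather than clinical relevance is so tempting, and why reviewers probe it so hard.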
I have reviewed equivalence studies where the margin is chosen to make the sample size feasible, not because it reflects clinical relevance. This is backwards. The margin must be clinically justified first, and the sample size must follow.
If the required sample size is not feasible, you cannot solve the problem by widening the margin. You must reconsider the study design or accept that equivalence cannot be demonstrated through direct comparison alone.
Manufacturers cite an equivalent device and claim equivalence, but the investigation is powered to detect superiority, not to demonstrate non-inferiority. The reviewer sees the inconsistency immediately. The justification and the statistical plan must align.
Adjusting for Dropouts and Loss to Follow-Up
Sample size calculations often assume perfect enrollment and complete data. Real trials do not work that way. Patients withdraw. Follow-up is lost. Data are incomplete.
The justification must account for this. If you expect a 10% dropout rate, you must inflate the enrollment target accordingly. If you do not, the final evaluable sample will be smaller than planned, and the study may lose statistical power.
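The adjustment itself is simple arithmetic (the evaluable target and dropout rate below are placeholders): divide the evaluable sample by the expected retention rate and round up.

```python
import math

def enrollment_target(n_evaluable, dropout_rate):
    """Inflate the evaluable sample to an enrollment target:
    n_enrolled = n_evaluable / (1 - dropout_rate), rounded up."""
    return math.ceil(n_evaluable / (1 - dropout_rate))

# 138 evaluable patients needed, 10% expected dropout -> enroll 154
print(enrollment_target(138, 0.10))
```

The justification should state the dropout assumption and its source (prior trials, site experience), since an optimistic figure here silently erodes the study's power.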
Reviewers look for this adjustment. If it is missing, they will question whether the study can meet its objectives. If dropouts are higher than expected, the investigation may fail not because the device failed, but because the sample size was never adequate to absorb the loss.
The justification should also explain how missing data will be handled. Will the analysis be per-protocol or intention-to-treat? How will incomplete follow-up affect the interpretation of endpoints? These are not statistical technicalities. They determine whether the study results will be credible.
When Pragmatic Constraints Limit Enrollment
Not every manufacturer can enroll 200 patients. Budget, timeline, and access to sites impose real limits. Reviewers understand this. But they will not accept a small sample without acknowledgment of the limitation.
If your sample size is constrained by feasibility, the justification must state this openly and explain how the limitation will be managed. Will the study design be adjusted to focus on the most critical endpoints? Will PMCF be expanded to address the residual uncertainty?
A transparent justification that acknowledges constraints and proposes mitigation is stronger than a justification that pretends the sample is ideal when it is not. Reviewers value honesty. They reject evasion.
A small sample size with a clear mitigation strategy is more defensible than a small sample size with no explanation. The justification must show that you understand the limitation and have a plan to address it.
How to Structure the Justification
A strong sample size justification follows a logical sequence. Start with the clinical questions derived from the clinical evaluation. Identify the primary and secondary endpoints that address those questions. Define the expected outcomes and the thresholds of clinical relevance.
Then, present the statistical rationale. What power level is appropriate? What confidence level is required? What assumptions underlie the calculation? Cite published data or prior studies that support those assumptions.
Next, account for practical factors. What is the expected dropout rate? How will missing data be handled? What is the planned enrollment timeline, and is it realistic given site capacity?
Finally, connect the sample size back to the clinical evaluation and risk management. Explain how the proposed investigation will close the evidence gaps and support the conformity assessment. If the sample cannot fully address all uncertainties, state what will be deferred to PMCF.
This structure makes the justification reviewable. The logic is visible. The assumptions are stated. The limitations are acknowledged. Reviewers can follow the reasoning and either accept it or challenge specific points.
What Reviewers Look For
When I review a clinical investigation plan, the sample size justification is one of the first sections I scrutinize. I look for coherence between the clinical evaluation, the investigation objectives, and the statistical plan.
Does the sample size reflect the device risk profile? Are the endpoints clinically meaningful? Are the assumptions realistic? Is there a plan to manage uncertainty if the sample is small?
If these questions are not answered, the justification fails. It does not matter if the calculation is mathematically correct. If the clinical logic is missing, the reviewer cannot accept it.
The best justifications I have seen are those where the manufacturer shows they understand what they need to prove, what they can prove with the proposed study, and what they will prove later through PMCF. The reasoning is complete. The gaps are acknowledged. The plan is credible.
Manufacturers copy sample size justifications from previous submissions or templates without adapting them to the specific device and clinical context. Reviewers recognize template language immediately.
Final Reflection
Sample size justification is not a formality. It is the bridge between your clinical evaluation and your investigation design. If the bridge is weak, the entire clinical evidence strategy collapses.
Reviewers do not expect perfection. They expect transparency, logic, and connection to the clinical reality. A well-justified sample size shows that you understand what you are trying to demonstrate and how the proposed study will get you there.
If your justification has been rejected, do not just add more detail. Go back to the clinical questions. Make sure the sample size answers them. Make sure the assumptions are defensible. Make sure the limitations are acknowledged.
That is what reviewers are waiting to see.
Peace,
Hatem
Clinical Evaluation Expert for Medical Devices
Follow me for more insights and practical advice.
Frequently Asked Questions
What is a Clinical Evaluation Report (CER)?
A CER is a mandatory document under MDR 2017/745 that demonstrates the safety and performance of a medical device through systematic analysis of clinical data. It must be updated throughout the device lifecycle based on PMCF findings.
How often should the CER be updated?
The CER should be updated whenever significant new clinical data becomes available, after PMCF activities, when there are changes to the device or intended purpose, and at minimum during annual reviews as part of post-market surveillance.
What causes CER rejection by Notified Bodies?
Common reasons include inadequate equivalence demonstration, insufficient clinical data for claims, poorly structured SOTA analysis, missing gap analysis, and lack of clear benefit-risk determination. Structure and logical flow are as important as the data itself.
Which MDCG guidance documents are most relevant for clinical evaluation?
Key documents include MDCG 2020-5 (Equivalence), MDCG 2020-6 (Sufficient Clinical Evidence), MDCG 2020-13 (CEAR Template), MDCG 2020-7 (PMCF Plan), and MDCG 2020-8 (PMCF Evaluation Report).
– Regulation (EU) 2017/745 (MDR), Annex XV, Chapter I, Section 3.2
– MDCG 2020-6: Regulation (EU) 2017/745: Clinical evidence needed for medical devices previously CE marked under Directives 93/42/EEC or 90/385/EEC
– MDCG 2020-13: Clinical evaluation assessment report template