Tested Agilent multiplex hybrid capture (up to 8 libraries pooled per reaction) against the singleplex standard.
47 samples across 3 DNA types (FFPE, gDNA Male, gDNA Female) at 6 conditions (8x, 6x, 4x, 3x, 1x@325ng, Normal).
Each multiplexed library was paired with its singleplex counterpart (same library, standard mass ~520ng) for direct comparison.
What changes with multiplexing? Capture efficiency (Cov/M reads) is the primary metric affected.
For gDNA, on-target rate drops modestly from 78% (Normal) to 71-76% at 3-8 plex — a manageable loss.
Coverage efficiency decreases from 1.11-1.18 (Normal) to 0.72-0.90, with 3-4 plex retaining 77-81% of normal efficiency.
Duplication rises from 37% (Normal) to 46-54%, with 3-plex at 46-50% (well within limits) and 8-plex at 54% (approaching threshold).
Insert size and GC content remain stable across all conditions.
For FFPE, the pattern is similar but with tighter margins. On-target rate drops from 80% to 69-76%.
Coverage efficiency goes from 0.97 (Normal) to 0.62-0.74, with 3-plex at 0.74 (above 0.7 threshold)
but 4-plex at 0.69 (below threshold). This means FFPE is viable at 3-plex only; higher plex levels fail the 0.7 hard cutoff.
Importantly, the 1-plex@325ng control shows efficiency of 0.80 — confirming that lower input mass
(325ng vs 520ng) accounts for roughly half the efficiency loss, with multiplexing contributing the other half.
Biological signal is preserved. Variant calling concordance shows
VAF correlation r² = 0.83-0.85 across all plex levels — allele frequencies are not distorted.
Ti/Tv ratio remains stable at 2.1-2.2 (normal baseline: 2.3). SNP precision holds at 0.65-0.77 (multiplex calls are real).
CNA breakpoint concordance: 75-83% for gDNA, 64-68% for FFPE (FFPE noise is inherent, not from multiplexing).
Sensitivity variation is depth-driven, not capture quality — more reads recover more variants.
Cost implications. The probe set costs $81.65/reaction (BG WES V8 + V8 Supplement);
per-sample library prep is $52.94. Sequencing on the 25B flow cell: $0.640/M reads.
For gDNA (65M base reads), singleplex costs $176/sample. At 3-4 plex, probe savings outweigh extra sequencing,
yielding 27-32% savings ($48-56/sample). For FFPE (125M base reads), singleplex costs $215/sample.
3-plex saves 12% ($26/sample) — the extra 44M reads cost $28, partially offsetting $54 in probe savings.
Coverage efficiency improvements (e.g., mass optimization, bead cleanup tuning) would directly increase
FFPE savings by reducing the extra reads needed.
Every metric vs lab thresholds. Green = PASS, Yellow = borderline, Red = FAIL, Gray = N/A. Median coverage failures are solvable by adding reads; Cov/M reads failures are not.
Key reading of this scorecard: The only hard, unfixable QC failure is FFPE 8-plex Cov/M reads at 0.62 — below the 0.7 threshold. FFPE 4-plex at 0.69 is borderline.
Median coverage shortfalls (red cells in the coverage column) are addressable by increasing reads at pooling; the cost of those extra reads is accounted for in the cost-benefit analysis (Section 11).
Off-target rate, insert size, deduplicated %, mapping rate, and total reads all pass across every condition for all sample types.
GC content at gDNA Male 3-plex (56.8%) slightly exceeds the 55% upper bound, potentially indicating GC-biased capture at higher plex.
2
Median Coverage vs Thresholds
Hard threshold: 142.5X (includes 5% acceptable deviation). gDNA normal threshold: 95X. These can be met by adjusting total reads at pooling.
FFPE
gDNA Male
gDNA Female
Median Coverage (X) by Condition
Bar heights with QC threshold lines overlaid
% of Normal Coverage Retained
How much coverage each multiplex condition preserves relative to its singleplex baseline
Coverage is adjustable, efficiency is not. At current read levels, FFPE Normal singleplex reaches 144.8X (passes 142.5X with 1.6% margin), while all FFPE multiplex conditions fall short (67-110X).
For gDNA, 3-6 plex all meet the 95X threshold; only 8-plex Female (85.8X) falls short.
The critical question is not whether these values pass — they can always be raised by sequencing more reads.
The question is: how much does that extra sequencing cost? That depends on coverage efficiency (Section 5) and drives the cost-benefit analysis (Section 11).
For FFPE 3-plex, achieving 142.5X requires ~193M reads (1.31x Normal); for 4-plex, ~207M reads (1.41x Normal).
3
Coverage and Duplication Dynamics vs Total Reads
Scatter plots reveal how sequencing depth interacts with multiplexing. Each point = one condition x DNA type. Hover for details.
FFPE
gDNA Male
gDNA Female
Large circle = Normal | Cross = 1x@325 | Triangle = 3-plex | Square = 4-plex | Diamond = 6-plex | Star = 8-plex
Median Coverage vs Total Reads
How efficiently reads convert to coverage. Higher slope = better capture. Dashed lines = thresholds
Total Deduplicated % vs Total Reads
Library complexity vs sequencing depth. Low dedup at high reads = wasted sequencing
Two distinct regimes visible. In the coverage plot, Normal singleplex samples cluster at high coverage-per-read (steep implied slope), while multiplex samples systematically shift down — same reads yield less coverage.
FFPE points sit lower than gDNA at every plex level, confirming FFPE-specific efficiency penalty.
The dedup plot is the key diagnostic: Normal and 1-plex@325ng both maintain ~62% deduplicated regardless of read count,
while all 3-8 plex conditions drop to 46-55% deduplicated even when they have more total reads.
This proves multiplexing inherently reduces library complexity — it is independent of sequencing depth or input mass, and is the primary driver of the efficiency penalty.
4
On-Target Rate and Off-Target %
Capture specificity. Off-target threshold: less than 35%. All conditions pass. The on-target pattern reveals that input mass, not multiplexing, drives most of the loss.
On-Target Rate (%) by Condition
Higher is better. Implied threshold at 65% (off-target below 35%)
Delta On-Target vs Matched Normal
Percentage point drop from each library's singleplex counterpart
Surprising and important pattern: 8-plex gDNA Male is nearly identical to Normal (-0.1pp), and gDNA Female is only -2.2pp.
Meanwhile, lower plex levels show worse on-target than 8-plex — 1-plex@325ng drops 7-11pp.
This proves input mass (325ng vs ~520ng) is the primary on-target driver, not multiplexing itself.
FFPE is more sensitive at all plex levels: 8-plex loses 3.6pp, 1-plex@325ng loses 11pp.
All conditions comfortably pass the off-target below 35% threshold (worst case: FFPE 1-plex@325ng at 31%).
For both gDNA and FFPE, on-target rates remain within acceptable operational range at 3-4 plex.
5
Coverage Efficiency — The Key Decision Metric (Cov / Million Reads)
This metric determines how much extra sequencing you need, and therefore the true cost of multiplexing. Hard threshold: 0.7 Cov/M reads. Below 0.7 = not viable.
Coverage / Million Reads
Higher = more efficient use of sequencing. Red dashed = 0.7 hard threshold
% Efficiency Loss from Normal
Relative performance degradation drives the extra sequencing cost
This metric drives the entire cost equation. Every % loss in efficiency = proportionally more reads needed = higher sequencing cost.
For gDNA: efficiency drops ~24% at 3-plex, ~28% at 6-plex, ~35% at 8-plex. All gDNA conditions pass the 0.7 threshold.
For FFPE: drops ~24% at 3-plex (0.74 — passes), ~29% at 4-plex (0.69 — borderline), ~28% at 6-plex (0.70 — at threshold), ~36% at 8-plex (0.62 — fails).
Note that 1-plex@325ng also loses 13-22% vs Normal, confirming input mass contributes ~13-18% of the total efficiency loss, with multiplexing adding ~10-17% on top.
At $0.640/M reads (25B), a 24% efficiency loss at 3-plex means ~$10 extra per sample; a 36% loss at 8-plex means ~$20 extra — progressively eating into the probe savings.
6
Duplication Rate and Library Complexity
Threshold: total deduplicated at least 40% (duplication at most 60%). All conditions pass, but the duplication trend directly explains the efficiency loss.
Duplication Rate (%) by Condition
Lower is better. Dashed = 60% max (deduplicated at least 40% threshold)
Extra Duplication vs Normal
Absolute pp increase — the true multiplexing penalty, same for gDNA and FFPE
Duplication is the clearest multiplexing artifact and the root cause of efficiency loss. All 3-8 plex conditions show +8 to +16pp more duplication than Normal.
Critically, 1-plex@325ng matches Normal (~37-39%), proving this is a pure multiplexing effect, not a mass effect.
The duplication penalty is remarkably similar across gDNA and FFPE: both reach ~54% at 8-plex, ~46-50% at 6-plex, ~45-51% at 4-plex.
Although all conditions still pass the 40% deduplicated threshold, 8-plex at ~46% deduplicated has only 6pp margin from failure.
Each duplicate read is wasted sequencing spend — this is why coverage efficiency degrades and extra reads are needed.
7
Coverage Uniformity — x20 Completeness
Fraction of target bases at 20X or above. Threshold: at least 90%. Only available for gDNA (normal) samples. FFPE data not available for this metric.
x20 Completeness (%) — gDNA Only
Red dashed line = 90% threshold
x20 Pass/Fail by Condition
Count of gDNA sample types meeting at least 90%
gDNA Female performs well:passes 90% at 6-plex (90.1%), 4-plex (90.1%), 3-plex (93.2%), and 1-plex@325ng (93.9%). 8-plex is borderline at 89.4%.
gDNA Male is below 90% at nearly all conditions (75-84%), and even the Normal singleplex only reaches 88% — this appears to be a sample-specific limitation rather than a multiplexing effect.
The 3-plex Male anomaly at 75.4% is concerning but based on n=1 for this condition, so statistical significance is uncertain.
This metric is expected to improve proportionally when more reads are allocated to meet median coverage thresholds.
8
Bead Cleanup Impact Analysis
4-plex and above required bead concentration. 3-plex, 1-plex, and Normal did not. This is perfectly confounded with plex level in this experiment.
Bead Cleanup vs No Cleanup — Key Metrics (Averaged Across Types)
Cannot separate cleanup effect from plex effect in this design
Confounded design — cannot isolate cleanup effect. Bead cleanup perfectly correlates with higher plex (4-8 = Yes, 3/1/Normal = No).
The cleanup group shows worse metrics, but this is expected from higher plex alone. If bead cleanup itself degrades quality, the true 4-6 plex performance may be slightly better than measured.
Practical implication: 3-plex avoids bead cleanup entirely, keeping the workflow simpler and cheaper. At 4-plex and above, cleanup adds ~$5-10 and extra handling time.
A follow-up experiment testing 3-plex with and without cleanup at fixed plex level would isolate this variable.
9
FFPE vs gDNA — Multiplexing Sensitivity by Sample Type
How each DNA source responds to multiplexing. FFPE degrades more, but the difference is manageable at 3-4 plex with extra reads.
Relative On-Target Loss by Sample Type
% change from each type's normal baseline
Relative Coverage Efficiency Loss
% change in Cov/M reads — the metric that drives cost
FFPE is more sensitive but not disqualified at lower plex. At 3-plex, FFPE loses 24% efficiency vs gDNA Male's 24% and gDNA Female's 23% — the gap is small at this level.
At 8-plex, FFPE loses 36% efficiency vs gDNA Male's 35% and gDNA Female's 35% — similar magnitude but FFPE starts from a lower baseline (0.97 vs 1.11-1.18), so it crosses the 0.7 threshold first.
On-target rate shows a larger FFPE-specific penalty: FFPE loses 4.6% at 8-plex vs gDNA Male's 0.1%, reflecting FFPE DNA's fragmented, crosslinked nature.
Bottom line: FFPE 3-plex maintains Cov/M at 0.74 (above 0.7). At $0.640/M reads (25B), extra sequencing costs $28/sample vs $54 probe savings = net $26 saved (12% reduction).
10
Additional QC Metrics (Insert Size, GC, Total Reads)
Secondary metrics that remain stable across conditions, confirming multiplexing does not disrupt fundamental library characteristics.
Median Insert Size
Threshold: at least 120bp
Mean GC Content (%)
Threshold: 40-55%
Total Reads (Millions)
Threshold: at least 47.5M
All stable across conditions.Insert size (120-139bp), total reads (92-167M), and mapping rate (99.95%) are unaffected by multiplexing for all sample types.
FFPE insert sizes are slightly shorter (120-130bp vs 130-139bp for gDNA), as expected for fragmented DNA, but all pass the 120bp threshold.
GC content is mostly within 40-55%: gDNA Male at 3-plex (56.8%) and 8-plex (55.6%) slightly exceed the 55% upper bound.
This GC shift in male gDNA multiplex samples may indicate a GC-biased capture artifact at higher plex, worth monitoring in follow-up experiments.
gDNA (65M base reads, singleplex = $176/sample):
3-plex: saves $48 (27%) — no bead cleanup needed.
4-plex: saves $56 (32%) — best balance of savings and quality.
6-plex: saves $59 (33%) — highest savings but requires bead cleanup.
8-plex: saves $54 (31%) — diminishing returns, high duplication.
FFPE (125M base reads, singleplex = $215/sample):
3-plex: saves $26 (12%) — modest but real; probe savings of $54 minus $28 extra sequencing.
6-plex: saves $34 (16%) — but Cov/M at 0.70 is exactly at threshold, risky.
At 1000 gDNA/year: 4-plex saves $56K (32%). At 1000 FFPE/year: 3-plex saves $26K (12%).
12
Per-Sample Distributions — Boxplots with Jitter
Individual sample values across conditions. Box = IQR (n≥3), line = range (n=2), dots = individual samples. Normal n=8/type, 8-plex n=2-3, 6-plex n=2, 3-4-plex n=1/type.
Filter:
FFPE
gDNA Male
gDNA Female
13
Reproducibility — Coefficient of Variation (CV%)
CV = (SD / mean) × 100% within each condition × type. Green ≤10%, amber 10-20%, red >20%. Only calculated where n≥2.
Normal (n=8) provides the reproducibility baseline. Most metrics show CV below 5% for on-target, insert size, and GC —
excellent reproducibility. Total reads and median coverage show higher CV (10-15%) reflecting deliberate variation in sequencing depth across replicates.
Multiplex conditions with n=2-3 show similar CV ranges where calculable, suggesting multiplexing does not degrade run-to-run consistency.
Conditions with n=1 (3-plex, 4-plex, 1x@325) cannot be assessed — expanding replicates is recommended for validation.
14
Recommendations
IMPLEMENT: 3-4 Plex for gDNA Samples
Net savings: 4-plex saves $56/sample (32%). 3-plex saves $48 (27%). No bead cleanup needed at 3-plex
Cost breakdown: Probe drops from $81.65 to $20.41 (4-plex) or $27.22 (3-plex). Extra sequencing: only $5-10/sample
Coverage efficiency: 0.85-0.90 Cov/M reads — well above 0.7 threshold
Variant fidelity confirmed: VAF r-squared = 0.83-0.85, Ti/Tv stable at 2.1-2.2, SNP precision 0.65-0.77
Filtered consensus approach: LDT_PASS + target_filter=PASS applied to all MAFs. Consensus truth built from ≥6/8 filtered Normal replicates per gender. Validates multiplexing effect on germline variant calling across 32 gDNA samples.
Samples
32
gDNA Male + Female
Conditions
6
Normal → 8-plex
Truth — Male
23,151
SNPs + Indels filtered
Truth — Female
23,751
SNPs + Indels filtered
x20 → Sens
r=0.84
strong predictor
Summary
Multiplexed hybrid capture (3- to 8-plex) was benchmarked against singleplex on gDNA Male and Female reference material. Variant calls were filtered with LDT_PASS and target_filter=PASS, and compared to a consensus truth set built from 8 singleplex Normal replicates per gender.
Outcome: at 3-plex and 4-plex, SNP sensitivity remains ≥97.7% and GT concordance ≥99.9%, statistically indistinguishable from singleplex baseline. 6- and 8-plex show a measurable but bounded sensitivity loss (~3 pp) concentrated in regions with reduced x20 completeness. Phase 2 validation on COLO829 and clinical FFPE at 3- and 4-plex is proposed (see end of report).
Methodology
Three analysis approaches with filtered MAFs (LDT_PASS + target_filter=PASS).
Truth set construction
The reference is not externally validated. Per gender, a consensus truth set was built from 8 singleplex Normal replicates of the same gDNA: a variant is included if observed in ≥6/8 replicates (filtered MAFs). Consensus genotype is the majority vote. This yields ~23,000 high-confidence variants per gender used as the reference for all multiplex comparisons.
1. MAF Filtering
Every MAF processed with LDT_PASS == True AND target_filter == PASS. Same filter applied to truth-set construction and test samples.
2. Consensus Truth Set
Per gender, a variant is included if present in ≥6/8 filtered Normal replicates. Consensus GT = majority vote across replicates.
Approach 1 — vs Consensus
Each sample (multiplex or Normal) compared against its gender's consensus truth set. Used for headline sensitivity, precision, GT, and all correlation analyses.
Approach 2a — Pairwise Reproducibility
Each Normal × every other Normal (8×8 per gender). Measures within-condition consistency and noise floor.
Approach 2b — Normal × Multiplex
Each Normal as truth × each Multiplex sample (8×8 per gender). Shows how multiplex deviates from any individual Normal baseline.
Metrics
SNP and Indel sensitivity/precision computed separately. GT concordance on shared sites with het/hom breakdown by truth GT.
Understanding the metrics: TP, FN, FP
Sensitivity and GT concordance answer different questions with different denominators on the same data.
Terminology
TP (True Positive) — variant found by BOTH sample and truth set, at the same position with the same allele. FN (False Negative) — variant present in truth set but NOT found by sample. Usually a low-coverage region where the caller couldn't make a confident call. FP (False Positive) — variant called by sample but NOT in truth set. In the filtered consensus setup, these are calls outside the ≥6/8-supported consensus.
Why sensitivity drops but GT stays flat: Sensitivity counts against all 22,623 truth variants, including 1,082 that the sample didn't call at all (missed in low-depth regions). GT concordance only counts the 21,541 shared sites — of which ~99.9% have matching genotype. Missed variants don't hurt GT. That's why multiplexing shows a small sensitivity drop but flat GT concordance.
Key Findings
Approach 1 results — each sample compared against the gender-matched consensus truth set.
Metric definitions:SNP sensitivity — fraction of truth-set SNPs recovered by the sample (TP / truth).
SNP precision — fraction of sample calls confirmed in truth (TP / called).
GT concordance — agreement on genotype (het / hom) at shared positions.
Het / Hom concordance — same metric, stratified by truth-set genotype class.
Normal Male SNP Sens
98.9%
baseline
8-plex Male SNP Sens
95.7%
−3.2 pp
Normal Female SNP Sens
99.5%
baseline
8-plex Female SNP Sens
96.8%
−2.7 pp
GT Concordance
99.9%
flat across all conditions
Summary by Condition (Approach 1)
Averaged across replicates. All metrics on filtered calls (LDT_PASS + target_filter=PASS).
SNP calling: small sensitivity drop with plex; precision stable
SNP Sensitivity by Condition
Normal (n=8) is baseline. Multiplex conditions n=1–3.
SNP Precision by Condition
Stable 94–99% across all conditions. No FP increase.
Indel calling: similar pattern, slightly more variation
Indel Sensitivity by Condition
Drops ~6 pp at 8-plex. Indels are more sensitive to coverage.
Indel Precision by Condition
Stable ~92% throughout. No multiplexing effect on precision.
GT concordance remains at ~99.9% across all conditions
GT Concordance by Condition
Near-perfect in all conditions. Quality of calls preserved.
Het vs Hom Concordance (averaged)
Both 99.8–100%. No systematic bias by genotype type.
Approach 1 — Per-Sample Metrics Heatmap
Each row = one sample vs its gender's consensus truth. Color-coded across eight key metrics. Condition shown explicitly per sample.
Approach 2a — Pairwise Normal × Normal (Reproducibility baseline)
Each Normal replicate used as truth for every other replicate. Shows within-condition reproducibility — the noise floor for any concordance metric.
Pairwise — Female Normals (8×8)
Diagonal = self (100%). Off-diagonal = Normal × Normal.
Pairwise — Male Normals (8×8)
Diagonal = self (100%). Off-diagonal = Normal × Normal.
Each Normal replicate used as truth (rows) vs each Multiplex sample (columns). Columns ordered by plex level. Shown alongside the prior hybridization-time optimization study (rejected on the same metric) for comparative context.
Prior study on HCC1395BL_FPB. Pairwise SNP concordance across 3 sample groups × 4 conditions (CTRL, 6 hr, 6 hr TD, 16 hr). The 16 hr condition (highlighted) was rejected based on the concordance drop visible against CTRL columns. Same metric used in Approach 2b above.
Comparative interpretation. The hybridization-time study rejected the 16 hr condition based on pairwise SNP concordance falling to ~95.3–95.6% vs CTRL (red-tinted block, bottom-left of each ref-group panel), against a CTRL × CTRL across-sample noise floor of ~97.9%. In Approach 1 above, 3-plex and 4-plex against the Normal baseline yield ~97.7–99% — within or above the CTRL × CTRL noise floor of the prior study, and clearly above the rejected 16 hr threshold. 6-plex and 8-plex sit closer to the rejected 16 hr level (~95–96%), consistent with their classification as caution / not-recommended in the risk table above. The methodology change proposed here is therefore more conservative than what was previously evaluated for hybridization time.
Hypothesis validation: x20 completeness drives sensitivity, not median coverage
Sensitivity values from Approach 1 (vs consensus). Dot color = gender, dot shape = condition. Hover for sample details.
x20 completeness vs median coverage: these capture different aspects of sequencing depth. Median coverage describes typical depth across the panel; x20 completeness describes the percentage of target bases covered at ≥20×, the threshold required for confident heterozygous variant calling. A sample can hold high median coverage while losing target uniformity — pockets of regions falling below 20×. Multiplexing primarily affects this uniformity, which is why x20 (not median) tracks variant recovery.
Shape: Normal 1x@325 3-plex 4-plex 6-plex 8-plexColor: Male Female
Median Coverage vs x20 Completeness
Pearson r = 0.17 — weak. Median doesn't predict x20.
x20 Completeness vs SNP Sensitivity
Pearson r = 0.84 — strong. x20 is the dominant predictor.
Hypothesis validated. The meeting hypothesis stated: multiplexing affects coverage completeness (x20) in significant regions, which drops germline variant calling sensitivity even though median coverage stays above threshold. The data confirms this exactly:
Median coverage drops ~44% Normal → 8-plex (Male 175× → 98×), but remains far above the 20× calling threshold.
x20 completeness drops only ~8 pp (Male 88% → 80%) — small absolute change, but the consequence is that some capture regions fall below 20× and become uncallable.
Sensitivity tracks x20, not median coverage — r=0.84 vs r=0.17. When x20 drops, calls are lost in those regions even while median looks healthy.
GT concordance barely moves — because where calls DO happen (above 20×), genotypes remain accurate.
Supporting evidence: median coverage is a weaker predictor than x20
Median Coverage vs SNP Sensitivity
Pearson r = 0.45 — moderate. Weaker than x20.
x20 Completeness vs Indel Sensitivity
Pearson r = 0.60 — moderate. Indels even more coverage-sensitive.
Risk Assessment by Plex Level
Per-condition implication for clinical-context variant calling. Sensitivity ranges from 8-plex condition; cost figures are per-sample probe cost relative to singleplex baseline.
3-plex — Recommended
SNP sensitivity ≥97.7% · GT concordance 99.9%
Het concordance 100% · Hom concordance ≥99.9%
Performance statistically equivalent to singleplex baseline. Probe cost: ~33% of singleplex per sample.
4-plex — Recommended
SNP sensitivity ≥97.7% · GT concordance 99.9%
Het concordance 100% · Hom concordance ≥99.9%
Indistinguishable from 3-plex within replicate variability. Probe cost: ~25% of singleplex per sample.
6-plex — Acceptable with QC monitoring
SNP sensitivity ~97.4% · GT concordance 99.9%
Het concordance 100% · Hom concordance ~99.8%
Sensitivity loss concentrated in regions where x20 falls below 90%. Mitigation: require x20 ≥ 90% as release criterion.
8-plex — Not recommended for primary diagnostic
SNP sensitivity ~95.7-96.8% · GT concordance 99.9%
Het concordance 100% · Hom concordance ~99.8%
~3-4 pp sensitivity loss; calls remain accurate but coverage gaps measurably reduce recovery. Use case: screening / discovery contexts only.
Clinical-context note: the variants lost at higher plex levels are not random — they cluster in regions where x20 falls below the QC release threshold. Such regions would already be flagged in current workflow and, where clinically relevant, re-sequenced. The practical impact on reported findings is therefore lower than the headline sensitivity figure suggests, provided x20 is enforced as a release gate.
Conclusions
1. Multiplexing up to 8-plex preserves germline variant calling quality. GT concordance is 99.9% in all conditions (Normal through 8-plex). No degradation from multiplexing. Het and hom concordance both ~99.9%, no systematic bias.
2. SNP sensitivity drop is modest. Male: 98.9% → 95.7% (Normal → 8-plex, −3.2 pp). Female: 99.5% → 96.8% (−2.7 pp). At 3-4 plex, sensitivity stays above 97.5% for both genders.
3. Precision is stable across all conditions. SNP precision 94–99%, Indel precision ~92%. Multiplexing does not introduce false positives.
4. Indel calling shows slightly larger drop. Male indel sensitivity: 97.4% → 92.0% at 8-plex. Female: 98.2% → 91.8%. Indels are more coverage-sensitive than SNPs.
5. Approach 2b confirms multiplex degradation against any individual Normal baseline. The pattern of decreasing SNP sensitivity from 1x@325 to 8-plex is consistent regardless of which Normal is chosen as truth — the choice of baseline does not change the conclusion.
6. Hypothesis validated: multiplexing impacts coverage completeness, not variant accuracy. x20 completeness is the strongest predictor of sensitivity (r=0.84). Median coverage is a weaker predictor (r=0.45) and is uncorrelated with x20 (r=0.17). Monitoring only median coverage misses the real effect of multiplexing.
7. Recommendation: 3-4 plex is safe for germline calling (<3 pp sensitivity loss). 6-8 plex is acceptable if x20 completeness is monitored alongside median coverage. The bottleneck is read distribution uniformity across the capture target, not capture quality per se.
Phase 2 Validation Proposal
Phase 1 (this study) established multiplex non-inferiority on gDNA reference material. Phase 2 extends the validation to the next two sample classes in routine workflow.
Proposal
Validate 3-plex and 4-plex against singleplex on COLO829/COLO829BL and clinical FFPE material. Acceptance is contingent on meeting predefined SNP sensitivity, GT concordance, and x20 thresholds, with COLO829 providing the orthogonally validated somatic truth set absent in Phase 1.
Scope
Sample matrix
COLO829 / COLO829BL: 3 replicates × 3-plex, 4-plex, singleplex (paired controls). Clinical FFPE: 6 archived diagnostic samples per plex level (3- and 4-plex), with paired singleplex controls. De-identified, covering the typical input-mass and DIN range encountered in routine workflow. Total: ~36 libraries, ~12 hybrid capture reactions.
Acceptance criteria (per sample)
SNP sensitivity ≥ 97% against paired singleplex.
GT concordance ≥ 99.9%.
Het concordance: 100%.
x20 completeness ≥ 90%.
On-target rate ≥ 75% (gDNA) / ≥ 70% (FFPE).
Coverage / M reads ≥ 0.7.
Read-outs
Approach 1, 2a, 2b framework as in Phase 1.
COLO829 somatic variant recovery against orthogonal truth set.
VAF correlation between paired multiplex / singleplex.
Insert size and dedup metrics for FFPE-specific analysis.
Final report in dual-deck format (QC + Germline + COLO829 somatic).
Decision gate
Go: ≥ 90% of samples meet all criteria; no systematic bias in clinically actionable variant classes. Conditional go: gDNA passes; FFPE shows input-mass dependent failures. Proceed with revised input-mass guidelines. No-go: > 3 pp sensitivity loss on FFPE, or recurrent missed variants in actionable regions.
Economics & throughput
Per-sample probe cost (Phase 1 estimate)
Singleplex baseline: 100%.
3-plex: ~33% of singleplex per sample.
4-plex: ~25% of singleplex per sample.
Phase 2 will refine for FFPE-specific over-sequencing requirements.
Throughput & timeline
Wet-lab capture throughput: ~3× current (per capture reaction).
Phase 2 wet-lab: ~3 weeks (library prep, capture, sequencing).
Phase 2 dry-lab: ~2 weeks (pipeline validated; rerun on new MAFs).
End-to-end to decision: ~5 weeks.
Required from the lab
Wet lab:
Allocation of ~36 library preps and ~12 hybrid capture reactions.
Selection of 6 representative clinical FFPE samples (de-identified) covering typical input-mass and DIN range.
COLO829 / COLO829BL stocks (~200 ng per replicate, ~6 replicates).
Dual-deck report (QC + Germline + COLO829 somatic) within 2 weeks of MAF availability.
Bottom line
At 3-plex and 4-plex, multiplex hybrid capture is non-inferior to singleplex for germline variant calling on gDNA. Phase 2 closes the open question of FFPE and somatic performance. If Phase 2 confirms the Phase 1 result — and the data trajectory supports this — the lab gains ~65-75% probe cost reduction and ~3× wet-lab throughput on hybrid capture, with no compromise on reported variant content.