AOI to Action: LLMs that Explain Defects


AOI is great at catching potential defects, but it is also notorious for flooding reviewers with false calls. In 2025, the winning plants pair 3D AOI with explainable LLMs: images become structured tokens, codes become plain-English reasons, and fixes are linked to IPC standards and SOPs. Less triage, more action, higher FPY.

KEY TAKEAWAYS

• Most AOI pain comes from context-poor alarms. Explanations anchored to measurements convert noise into action.

• An explainable pipeline (features → tokens → RAG over IPC/SOPs) slashes review time and debate.

• Track FCR, review time, and FPY. Drop any metric that doesn’t drive real process changes.

Why AOI raises too many alarms: the false-call problem

False calls slow lines and drain expert time. Typical culprits: lighting shifts, board warpage, dust or flux residues, fiducial drift, and ambiguous thresholds. Even when AOI is technically correct, images can be uninformative for decisions, forcing humans to click through alarms. Vendors now combine 3D metrology and ML to curb this noise, but process context still matters. Reality check: if reviewers are simply reclassifying images without changing upstream settings, you’re moving pixels, not yield.

  • Stabilise optics and golden-board references before “tuning the AI”.
  • Segment by defect family (lifted lead, insufficient solder, skew) with class-specific thresholds.
  • Feed AOI with upstream data (paste height, reflow profile) to explain why a flag occurred.
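The last bullet can be sketched as a simple context join. Everything here is an illustrative assumption, not a vendor schema: the field names (`paste_height_pct`, `peak_temp_c`, and so on) and the example values are made up for the sketch.

```python
# Hypothetical context join: enrich a raw AOI alarm with upstream SPI and
# reflow data so a later explanation can say WHY the flag occurred.
# All field names are illustrative assumptions, not a real vendor schema.

def enrich_flag(flag: dict, spi: dict, reflow: dict) -> dict:
    """Attach paste-print and reflow context to an AOI alarm."""
    return {
        **flag,
        "paste_height_pct": spi.get("height_pct"),    # SPI paste height vs. nominal
        "paste_area_pct": spi.get("area_pct"),
        "reflow_peak_c": reflow.get("peak_temp_c"),   # reflow profile summary
        "time_above_liquidus_s": reflow.get("tal_s"),
    }

flag = {"ref": "QFN5", "pad": 3, "defect": "insufficient_solder", "volume_pct": 42}
context = enrich_flag(flag,
                      spi={"height_pct": 78, "area_pct": 81},
                      reflow={"peak_temp_c": 243, "tal_s": 62})
```

With the alarm and its process context in one record, the explanation can point at a low paste height instead of just a suspicious image.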

AI-assisted AOI is viable when it reduces manual review and drives meaningful upstream adjustments. Otherwise, it's just a reporting tool: expensive, shiny, and ineffective.

From images to tokens: building an explainable pipeline

To get past "black-box" AOI, convert raw images into interpretable features that LLMs can reason over. A practical path: (1) image capture and 3D measurement, (2) segmentation and defect classification, (3) feature extraction (coplanarity, volume, skew, pad coverage), (4) tokenisation of findings plus line context, (5) LLM explanation with guardrails. Explanations should cite measurement evidence and show why a part fails acceptability rules. Use saliency maps or heatmaps to anchor text to pixels; evidence beats persuasion. Not magic: better data, clearer context.

• Store features and images; keep the model lineage for audit.
• Run “challenge sets” (hard cases) to test drift and overfitting.
• Prefer explanations that propose a single next action (tighten stencil, adjust reflow, retrain class X).
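A minimal sketch of step (4), turning extracted features into compact tokens an LLM prompt can cite. The function names, feature keys, and prompt wording are all assumptions made for illustration:

```python
# Minimal sketch of "tokenisation of findings": serialise measurements into
# compact key=value tokens the LLM must reason over and cite.
# Feature names and prompt wording are illustrative assumptions.

def to_tokens(features: dict) -> list:
    """Turn a feature dict into sorted, compact key=value tokens."""
    return ["{}={}".format(k, v) for k, v in sorted(features.items())]

def build_prompt(ref: str, defect: str, features: dict) -> str:
    """Assemble a prompt that forces evidence-anchored, single-action answers."""
    tokens = " ".join(to_tokens(features))
    return (
        "Component {} flagged as {}. Measurements: {}. "
        "Explain pass/fail against the cited rule and propose ONE next action."
    ).format(ref, defect, tokens)

prompt = build_prompt("QFN5", "insufficient_solder",
                      {"volume_pct": 42, "coplanarity_um": 18, "skew_deg": 1.2})
```

Because the measurements travel inside the prompt as explicit tokens, the model's rationale can be checked line by line against the numbers it was given.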

Studies show explainable aids improve expert task performance; make that benefit repeatable with versioned datasets and reproducible prompts.

Explainable AI as a decision aid improves task performance of domain experts.

Johanna Senoner et al., Scientific Reports (Nature), 2024

RAG over IPC and SOPs for root-cause clarity

Reviewers don't need prose; they need the exact rule and step. Retrieval-Augmented Generation (RAG) pulls the relevant IPC-A-610 paragraph and the plant's SOP/8D snippet, then drafts a short rationale: "Insufficient solder (Class 2, section X.Y). Next: verify paste aperture Z; inspect stencil wear; run reflow profile check." Enrich RAG with a knowledge graph that maps parts, pads, defects, and actions; it boosts retrieval precision and keeps answers grounded. Guardrails: only allow citations from whitelisted sources (IPC, J-STD-001 extracts, internal SOPs). If an answer lacks a citation, it should fail closed.

  • Index IPC sections, FMEA tables, and past NCs with embeddings.
  • Attach AOI features to nodes (e.g., “QFN5 pad3 volume 42%”).
  • Return a one-screen brief: rule, reason, action, owner, due-by.
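The fail-closed guardrail described above can be sketched in a few lines. The whitelist entries, citation format, and placeholder section "X.Y" are illustrative assumptions:

```python
# Fail-closed citation guardrail: an answer without a whitelisted citation is
# never shown to the reviewer. Source prefixes and formats are illustrative.

WHITELIST = ("IPC-A-610", "J-STD-001", "SOP-")

def check_citations(answer: str, citations: list) -> str:
    """Return the answer only if every citation is whitelisted; else fail closed."""
    if not citations:
        raise ValueError("No citation: failing closed, route to human review")
    for c in citations:
        if not c.startswith(WHITELIST):  # str.startswith accepts a tuple of prefixes
            raise ValueError("Non-whitelisted source: " + c)
    return answer

brief = check_citations(
    "Insufficient solder, Class 2. Next: verify paste aperture; check stencil wear.",
    citations=["IPC-A-610 section X.Y", "SOP-SMT-014"],
)
```

Raising instead of returning a degraded answer is the point: an uncited draft should reach a human, not the line.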

This shifts triage into root-cause dialogues and reduces the back-and-forth that kills throughput.

Stop alarms, start real fixes

LLMs turn AOI flags into cited, step-by-step actions using IPC and SOPs, cutting reviews and boosting first-pass yield.

KPIs to watch: false-call rate, review time, FPY

Measure impact where it hurts: False-Call Rate (FCR), Review Time per Board, and First-Pass Yield (FPY). FCR falls when AOI learns class-specific boundaries and reviewers trust explanations. Review time shrinks as LLMs pre-fill defect rationales and next steps. FPY rises when upstream changes (stencil, placement, reflow) are actually made, so tie every explanation to a corrective ticket. If a KPI can’t change staffing, stencil settings, or programming, it’s a vanity number.

  • FCR: false alarms / total AOI alarms (track by defect class and lot).
  • Review Time: median minutes from flag to disposition (OK/repair/scrap).
  • FPY: units passing without rework; pair with AOI escape rate.
  • Action Latency: time from explanation → corrective change in process.
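The first three KPIs can be computed straight from a per-board disposition log. The record fields and example values below are illustrative assumptions, not a real MES export:

```python
# KPI sketch from per-board disposition records; field names are assumptions.
from statistics import median

def kpis(boards: list) -> dict:
    """Compute FCR, median review time, and FPY from disposition records."""
    alarmed = [b for b in boards if b["alarm"]]
    false_calls = sum(1 for b in alarmed if b["disposition"] == "false_call")
    fcr = false_calls / len(alarmed) if alarmed else 0.0
    review = median(b["review_min"] for b in alarmed) if alarmed else 0.0
    # FPY here counts boards needing no rework; track AOI escapes separately.
    no_rework = sum(1 for b in boards if b["disposition"] not in ("repair", "scrap"))
    fpy = no_rework / len(boards) if boards else 0.0
    return {"fcr": round(fcr, 3), "median_review_min": review, "fpy": round(fpy, 3)}

boards = [
    {"alarm": True,  "disposition": "false_call", "review_min": 2.0},
    {"alarm": True,  "disposition": "repair",     "review_min": 6.5},
    {"alarm": False, "disposition": "pass",       "review_min": 0.0},
    {"alarm": True,  "disposition": "false_call", "review_min": 1.5},
]
result = kpis(boards)  # fcr 0.667, median review 2.0 min, fpy 0.75
```

Slicing the same function by defect class and lot gives the per-class FCR trend the first bullet asks for.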

Suppliers report lower false calls and tighter process control when 3D AOI is paired with ML and M2M feedback; make those gains visible on your dashboards.

FAQ

How do LLMs help with AOI false calls?

They translate defect codes into evidence-backed explanations and next actions, reducing manual triage.

What standards should RAG cite for acceptability?

IPC-A-610 (and J-STD-001 where relevant) for workmanship and acceptability classes.

Can this work with 3D AOI systems?

Yes. 3D metrology provides features (coplanarity, volume) that make explanations more reliable and actionable.

Which KPIs prove value to leadership?

False-call rate, review time per board, and first-pass yield, plus action latency from explanation to process change.


About the Author

Liam Rose

I founded this site to share concise, actionable guidance. While RFID is my speciality, I cover the wider Industry 4.0 landscape with the same care, from real-world tutorials to case studies and AI-driven use cases.