Machine Learning–Based Neuroimaging in Epilepsy: Improving Detection of Epileptogenic Lesions
Machine learning tools are transforming the detection of epileptogenic lesions that are frequently missed on conventional MRI, with several pipelines now validated across international epilepsy centers.
Approximately one-third of people with epilepsy have drug-resistant epilepsy.1 For these individuals, surgical resection of the epileptogenic zone remains the most effective treatment. Despite advances in technology and surgical techniques over the past 2 decades, postoperative seizure freedom rates have remained modest, at 50% to 70%.2 A major limitation to improving surgical outcomes is the accurate identification and localization of epileptogenic lesions.
Focal cortical dysplasia (FCD) and hippocampal sclerosis (HS) are common etiologies for drug-resistant focal epilepsies. However, they are not accurately identified on conventional (1.5T or 3T) brain MRI in an estimated 30% to 50% of individuals.3 When no clear surgical target is identified after comprehensive presurgical evaluation, people with these disorders may be deemed unsuitable surgical candidates or undergo surgery without clinical benefit, which may result in persistent uncontrolled seizures.
Recent advances in machine learning (ML), including deep learning pipelines and multicenter-validated platforms, may improve the detection of epileptogenic lesions on neuroimaging and thereby help identify the epileptogenic zone. Automated tools for detecting FCD on anatomic MRI have evolved from single-center research prototypes to multicenter-validated, open-source platforms being deployed clinically worldwide.
A novel deep learning pipeline using advanced neural network architectures now enables the detection of HS in cases previously deemed MRI-negative, including at the hippocampal subfield level, thereby enhancing diagnostic accuracy and treatment planning. Several independent direct comparisons of competing detection algorithms have been published.4,5 In addition, unsupervised ML has identified potential distinct disease subtypes within temporal lobe epilepsy (TLE).6 We highlight the computer-based tools most likely to affect clinical practice and review ML approaches in neuroimaging, illustrating key capabilities and limitations relevant to neurologists.
Artificial Intelligence–Based FCD Detection
Established Tools: MAP and deepFCD
Huppertz et al’s7 Morphometric Analysis Program (MAP), introduced in 2005, represented one of the first systematic approaches to automated FCD detection. MAP employs voxel-based MRI post-processing to compare individual patient scans with a normal control database. This generates feature maps that highlight abnormalities in gray matter distribution and blurring of the gray-white matter junction.7 MAP established the feasibility of computational lesion detection, but its broader clinical adoption has been limited by susceptibility to false-positive findings, reduced sensitivity for subtle abnormalities such as FCD type I lesions (Palmini classification), and dependence on expert interpretation. While the detection rates for more visually apparent lesions are high, performance for subtle or “occult” abnormalities is more variable. However, in selected MRI-negative cases, MAP has been shown to improve lesion detection.8
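The comparison of a patient scan with a normal control database can be illustrated with a voxel-wise z-score. The following is a minimal sketch of that general idea, not the MAP implementation: the function names, the flat list representation of a feature map, and the threshold of 3.0 are all assumptions made for illustration.

```python
from statistics import mean, stdev


def zscore_map(patient, controls):
    """Per-voxel z-scores of a patient feature map against a control database.

    patient:  list of floats, one value per voxel (hypothetical feature,
              e.g. a junction-blurring measure).
    controls: list of control maps, each the same length as `patient`.
    """
    z = []
    for i, value in enumerate(patient):
        column = [c[i] for c in controls]
        mu = mean(column)
        sd = stdev(column)  # sample standard deviation; needs >= 2 controls
        z.append((value - mu) / sd if sd > 0 else 0.0)
    return z


def flag_voxels(z, threshold=3.0):
    """Indices of voxels whose z-score exceeds the (assumed) threshold."""
    return [i for i, v in enumerate(z) if v > threshold]


# Toy data: three controls, three voxels; the patient deviates at voxel 2.
controls = [
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 2.9],
    [0.8, 1.9, 3.1],
]
patient = [1.0, 2.0, 4.0]

z = zscore_map(patient, controls)
flagged = flag_voxels(z)
print(flagged)
```

In this toy example only the third voxel is flagged. A lower threshold would flag more voxels, which mirrors the trade-off between sensitivity for subtle lesions and false-positive findings described above.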
Gill et al’s9 deepFCD, introduced in 2021, addressed several of these limitations by using a 3-dimensional convolutional neural network (3D CNN) with Bayesian uncertainty estimation, trained on T1-weighted and fluid-attenuated inversion recovery (FLAIR) MRI sequences.9 The Bayesian uncertainty estimation provides a clinically useful confidence measure for each detection, enabling more informed triage of flagged regions. In a multicenter validation study across 9 centers involving 148 participants with epilepsy with histologically verified FCD, 51% were initially MRI-negative for a lesion.9 deepFCD achieved 93% overall sensitivity for lesion detection using leave-one-site-out cross-validation and 85% sensitivity in MRI-negative cases, with an average of 6 false-positives per participant.9 The true lesion was among the highest confidence clusters in 73% of participants. In an independent cohort of 23 individuals, the model achieved 83% sensitivity for lesion detection and a specificity of 89% to 90% defined by the absence of false-positive detections in healthy controls and disease controls (individuals with temporal lobe epilepsy and HS).9
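Leave-one-site-out cross-validation holds out all participants from one center in turn, so each center is tested by a model that never saw its data. The sketch below shows the splitting logic and a per-fold sensitivity calculation under simple assumptions. It is not the deepFCD code, and the site labels, helper names, and detection flags are hypothetical.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx) for each distinct site.

    sites: list of site labels, one per participant.
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = [i for i, s in enumerate(sites) if s != held_out]
        yield held_out, train_idx, test_idx


def sensitivity(detected):
    """Fraction of lesional participants whose lesion was detected."""
    return sum(detected) / len(detected)


# Hypothetical cohort: six participants from three centers.
sites = ["A", "A", "B", "C", "C", "C"]
# Hypothetical per-participant outcome (True = lesion detected when held out).
detected = [True, False, True, True, True, False]

folds = list(leave_one_site_out(sites))
for held_out, train_idx, test_idx in folds:
    fold_sens = sensitivity([detected[i] for i in test_idx])
    print(held_out, len(train_idx), len(test_idx), round(fold_sens, 2))
```

Reporting sensitivity per held-out site, rather than only the pooled value, makes between-center variability visible.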

Figure 1. Multicentre Epilepsy Lesion Detection (MELD) Graph results in an individual with proven focal cortical dysplasia showing correct identification of focal cortical dysplasia in sagittal (A), magnetization-prepared 2 rapid acquisition gradient echo (MP2RAGE) (B), and 3-dimensional edge-enhancing gradient echo (3D-EDGE) (C) sequences (arrows). In another individual, the MELD Graph identifies a false-positive in the right frontal lobe (D) but fails to correctly identify a base of sulcus dysplasia identified on expert review of MP2RAGE (E) and 3D-EDGE (F) sequences (arrows).
MELD Graph: A 2025 Breakthrough
The Multicentre Epilepsy Lesion Detection (MELD) Project represents the largest multicenter collaborative initiative for artificial intelligence (AI)–based FCD detection. Its original 2022 algorithm used a neural network trained on individuals from 22 centers (618 participants with FCD and 397 controls) using surface-based morphologic features, achieving sensitivity of 59% (67% when including a border zone around lesions to account for uncertainty in manual delineation). Like MAP, it had a high number of false-positive predictions.10
In early 2025, the MELD Graph algorithm addressed this limitation by applying graph neural networks to the surface-based framework.11 Trained on data from 1185 participants across 23 international centers, including both adult and pediatric participants, the MELD Graph reduced the maximum number of false-positive clusters from 7 to 2 per individual in the independent test cohort, thus improving its potential for integration with clinical workflow.11 The algorithm generates individualized reports that show the predicted lesion locations (Figure 1), confidence scores, and the specific imaging features underlying each detection, thereby also providing the interpretability necessary for clinical decision-making. MELD Graph is freely available as an open source tool.
First Head-to-Head Comparisons
Until recently, each ML tool targeting FCD detection had been largely validated on separate datasets, precluding meaningful cross-tool comparisons. Two key publications bridged this gap. Kersting et al4 conducted the first independent head-to-head comparison of 6 AI models across 4 epilepsy centers. The 2 models using 3D input data performed best, achieving detection rates of up to 82%, with the newly trained 3D nnU-Net demonstrating the best balance between precision and sensitivity. No single established tool demonstrated clear superiority, although the MELD Graph provided the most consistent performance across centers.4
A separate meta-analysis of 41 studies investigating AI algorithms for FCD detection found pooled sensitivity of 81% and specificity of 92% on internal validation. However, when applied to external datasets, these values dropped to 73% and 66%, respectively.5 These findings highlighted limited generalizability as the main challenge to clinical translation. To date, none of these tools have received regulatory approval, and they remain designated for research use only (Table).
A comprehensive review by Bernasconi et al3 provides further discussion of automated detection approaches for HS and FCD, including federated learning and other strategies to improve generalizability across centers.
Despite encouraging results, several important limitations remain. First, many AI-based FCD detection studies are affected by case selection and histopathologic spectrum bias. International League Against Epilepsy (ILAE) type I FCD is both more difficult to detect on MRI and less likely to proceed to surgery, and therefore often lacks histopathologic confirmation. As a result, type I FCD is commonly underrepresented in neuroimaging studies, in contrast to type II lesions, which predominate in training and validation cohorts. In the original multicenter MELD study, for example, detection was substantially lower for type I FCD than for type IIA or IIB lesions (50.0% vs 64.6% and 76.8%, respectively). The highest reported sensitivity was achieved in a restricted gold standard subgroup of seizure-free individuals. These participants had histologically confirmed type IIB FCD (the most visible FCD subtype) and both T1-weighted and FLAIR imaging available.10 The detection rate was higher for type I FCD with the MELD Graph vs MELD, but participants with type I FCD only made up 5% of the cohort, again highlighting their underrepresentation. This raises a concern that published performance metrics may overestimate clinical benefit in the very individuals for whom adjunctive detection is most needed, particularly those with subtle, MRI-occult, and histologically less conspicuous lesions.
Second, the incremental value of these tools has not yet been well established against newer epilepsy MRI approaches that may improve lesion conspicuity beyond conventional T1-weighted and FLAIR imaging at 1.5T or 3T. Advanced techniques such as 3D edge-enhancing gradient echo (3D-EDGE) and ultra-high-field 7T MRI have shown improved visualization of subtle dysplastic features in selected cohorts, including additional lesion detection in individuals previously considered MRI-negative. In the authors’ (unpublished) experience, the MELD Graph detected fewer than 50% of FCDs that were subsequently identified on 7T MRI by expert review. Therefore, the current ceiling on AI performance may reflect not only limitations of the algorithms themselves, but also limitations of the input data. Overall, AI-based FCD detection remains promising. Current systems appear to perform best for individuals with conspicuous lesions, especially type II FCD, and may underperform for the lesions where diagnostic assistance is most needed. Future progress will likely depend not only on better models but also on richer imaging inputs, including advanced structural sequences, ultra-high-field MRI, and multimodal integration.

Hippocampal Sclerosis: Detection, Subfield Analysis, and Disease Redefinition
HS is a major surgical pathology in epilepsy, with an estimated 30% to 50% of cases initially reported as normal on conventional MRI. ML algorithms have demonstrated high accuracy (often >90%) in lateralizing HS in patients with TLE. Caldairou et al12 demonstrated that a classifier combining surface-based morphologic (T1-weighted) and intensity-driven features (T2-weighted and FLAIR/T1) could lateralize HS with 93% accuracy, including 76% to 90% accuracy in MRI-negative cases. However, the most significant recent advances extend beyond lateralization to subfield-level analysis and automated detection in radiologically occult cases.
HippUnfold, an open-source deep learning tool developed by DeKraker et al,13 uses a self-configuring neural network (nnU-Net) to segment the hippocampus and derive subfield-level metrics (subiculum, CA1 through CA4, and dentate gyrus). It also extracts surface-based morphologic features, including thickness, curvature, and gyrification, from standard structural MRI. Unlike conventional whole-hippocampal volumetric tools, HippUnfold generates topologically constrained hippocampal surfaces analogous to what FreeSurfer provides for the neocortex, enabling vertex-wise morphometric analysis at the subfield level. This can be clinically relevant. Whole hippocampal volumetrics are insufficient for detecting focal subfield damage because subtypes of HS exist that may comprise only a small fraction of the total hippocampal volume. For example, type 3 HS preferentially affects the hippocampal hilum, which makes up less than one-fifth of the total hippocampal volume and cannot be accurately captured by whole-hippocampal volumetric measurements alone. Hippocampal subfield volumetrics can provide better histologic correlations and an improved clinical biomarker.
Building on HippUnfold-derived features, Ripart et al14 developed the automated and interpretable detection of HS (AID-HS) pipeline. AID-HS compares surface-based hippocampal features against normative growth charts and computes within-subject asymmetry scores between left and right hippocampi to classify HS presence and laterality. Validated across 18 international epilepsy centers including 815 participants, 426 with unilateral HS, it achieved a sensitivity of 90.1% for lesion detection and 97.4% for lateralization in people with unilateral HS. In one of the most clinically consequential subgroups—individuals with histologically confirmed HS that was missed on standard radiologic review—AID-HS was able to detect 79.2% of cases. The pipeline input requires only a 3D T1-weighted MRI scan, age, and sex, and generates individualized, interpretable reports suitable for clinical integration. Performance was consistent across pediatric and adult populations and across both 1.5T and 3T MRI scanners.
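A within-subject asymmetry score of the kind described above can be sketched with a commonly used normalized difference between left and right measurements. This is an illustrative assumption, not the formula published for AID-HS: the function names, the example volumes, and the 0.1 threshold are hypothetical.

```python
def asymmetry_index(left, right):
    """Normalized left-right difference: (L - R) / mean(L, R).

    Negative values mean the left measurement is smaller than the right.
    """
    return (left - right) / ((left + right) / 2)


def lateralize(left, right, threshold=0.1):
    """Return the side with the smaller measurement if asymmetry is large.

    The threshold is an assumed illustrative cutoff, not a validated one.
    """
    ai = asymmetry_index(left, right)
    if ai <= -threshold:
        return "left"
    if ai >= threshold:
        return "right"
    return "none"


# Hypothetical hippocampal volumes in cm^3.
ai = asymmetry_index(2.4, 3.0)
side = lateralize(2.4, 3.0)
print(round(ai, 3), side)
```

Because each individual serves as their own control, a score like this is less sensitive to scanner and age effects than a raw volume, which is one reason asymmetry features are often combined with normative comparisons.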
A further advancement has been application of data-driven approaches to characterize heterogeneity within TLE. Jiang et al6 applied an unsupervised ML algorithm, Subtype and Stage Inference (SuStaIn), to structural MRI data from 296 individuals with TLE, and identified 4 biologically distinct disease subtypes. Two subtypes exhibited predominant hippocampal atrophy, with surgical effectiveness of ~70%. A third subtype featured cortex-predominant atrophy, with the hippocampus affected later in the disease course and surgical effectiveness of ~63%. The fourth subtype showed no cortical atrophy but featured amygdala enlargement, with ~45% surgical effectiveness. These findings support the concept that TLE comprises biologically distinct subtypes rather than a single entity. If validated prospectively, this subtype classification could help guide surgical candidacy, which would spare individuals with the amygdala-predominant subtype from interventions that are unlikely to succeed. In addition, it would prioritize those with hippocampal-predominant patterns for early surgery.
Seizure Onset Zone Localization and Other Emerging Applications
ML has also been applied to functional neuroimaging for noninvasive localization of the seizure onset zone. Luckett et al15 demonstrated that a deep learning model trained on resting-state functional MRI scans could lateralize the hemisphere of seizure onset in TLE with >90% accuracy. This could potentially provide an advantage for individuals with cognitive impairment, who may be unable to cooperate with task-based paradigms. However, these findings were derived from a small cohort of 32 participants with TLE. Larger prospective studies will be needed before this approach can be considered for routine clinical use.
Using magnetoencephalography, Sun et al16 developed a personalized deep learning electromagnetic source imaging approach. In a cohort of 29 participants with drug-resistant focal epilepsy, their method achieved 93% sublobar concordance with intracranial EEG-defined seizure onset zones, with a mean spatial dispersion of ~8 mm relative to the surgical resection and a mean localization error of ~16 mm.

Figure 2. Recent machine learning applications in epilepsy neuroimaging. Abbreviations: 3D, 3-dimensional; CNN, convolutional neural network; FCD, focal cortical dysplasia; FLAIR, fluid-attenuated inversion recovery; fMRI, functional MRI; HS, hippocampal sclerosis; MAP, morphometric analysis program; MEG, magnetoencephalography; ML, machine learning; RS-fMRI, resting-state functional MRI; SOZ, seizure onset zone; TLE, temporal lobe epilepsy.
Summary
The key practical message for neurologists is that an MRI reported as normal should no longer end the diagnostic workup for individuals with drug-resistant focal epilepsy undergoing presurgical evaluation. Tools such as the MELD Graph and AID-HS have demonstrated automated detection of both FCD and HS, including in cases previously deemed MRI-negative, and have been used across diverse clinical settings. Early head-to-head comparisons and meta-analyses have begun to establish performance benchmarks. However, achieving generalizability across scanners and patient populations remains a challenge.
Referral of MRI-negative individuals with drug-resistant epilepsy to comprehensive epilepsy centers, where ML tools can complement expert neuroradiologic review, is increasingly justified by the evidence supporting improved lesion detection. Regulatory approval and prospective validation studies remain necessary before clinical deployment becomes routine. Nonetheless, the trajectory is clear: AI-augmented neuroimaging is poised to become a standard component of presurgical epilepsy evaluation. The emergence of distinct TLE biotypes further suggests that the field is moving toward a precision model of epilepsy care in which treatment decisions are guided not only by lesion detection but also by disease subtype (Figure 2). However, although these tools support interpretation, gaps in detection persist among AI-based interpretations, expert review, and advanced MRI acquisitions; therefore, none of these approaches should be used in isolation.