🦷 Deep Learning for 3D Teeth Segmentation: Revolutionizing Digital Dentistry
How we built an AI system that automatically identifies and segments individual teeth from 3D dental scans using PyTorch and PointNet
🎤 Presentation Overview: From Research to Real-World Impact
Presented by Vitoria Soria and Livia Ellen at the University of San Francisco
This project represents the culmination of our journey into the intersection of deep learning and digital dentistry. What started as an academic exploration evolved into a practical solution with real clinical applications. Our presentation highlighted how we transformed a traditionally manual, time-consuming dental workflow into an automated, AI-powered system.
[SCREENSHOT PLACEHOLDER: Slide 1 — Title slide showing “3D Teeth Segmentation and Generative AI” with our names and university affiliation]
📹 Watch our full presentation: 3D Teeth Segmentation and Generative AI Demo
The Problem: Why Dental AI Matters
Imagine visiting your dentist and having your entire mouth scanned in seconds, with AI instantly identifying every tooth, detecting potential issues, and planning your treatment automatically. This isn’t science fiction — it’s the future of digital dentistry, and it’s happening now.
Traditional dental workflows require manual tooth identification and segmentation from 3D scans, a time-consuming process that can take hours for complex cases. With over 2.3 billion people suffering from dental caries worldwide, automating this process could revolutionize dental care accessibility and accuracy.
The Challenge: In digital dentistry, one of the most time-consuming tasks is identifying and segmenting each individual tooth from a 3D intraoral scan. This process is traditionally done manually by clinicians or technicians, which is not only labor-intensive but also prone to inconsistency.
With the rise of AI in healthcare, we saw an opportunity to automate this process — making it faster, more accurate, and scalable. That’s the motivation behind our project.
[SCREENSHOT PLACEHOLDER: Slide 2 — Problem statement slide showing manual vs automated segmentation workflow]
The Challenge: Why 3D Teeth Segmentation is Hard
3D teeth segmentation presents unique challenges that make it particularly difficult for traditional computer vision approaches:
🧩 Geometric Complexity
- Similar tooth shapes: Molars, premolars, and incisors have subtle but important differences
- Patient variability: Every mouth is unique, with different sizes, orientations, and spacing
- Overlapping boundaries: Where does one tooth end and another begin?
🦷 Clinical Reality
- Dental pathologies: Cavities, crowns, and fillings alter tooth geometry
- Orthodontic equipment: Braces and retainers occlude tooth surfaces
- Missing teeth: Gaps and extractions create irregular patterns
📊 Technical Challenges
- 3D point clouds: Traditional CNNs don’t work on unstructured 3D data
- Variable point density: Intraoral scanners produce irregular mesh patterns
- Real-time requirements: Clinical workflows need fast processing
The Solution: Deep Learning Meets Dentistry
We developed a comprehensive solution using cutting-edge neural architectures specifically designed for 3D point cloud processing. Here’s our algorithm pipeline — the backbone of our approach:
🔄 Complete Algorithm Pipeline
- Preprocessing — We start by cleaning and normalizing the raw mesh scans to reduce noise and standardize inputs.
- Feature Extraction — Then, we compute geometric features like curvature and surface normals, which help the model better understand the structure of the teeth.
- Segmentation — This is where the deep learning happens. We use neural networks to classify each point in the 3D scan into a tooth category.
- Post-processing — Finally, we refine the segmentation results to ensure label consistency and anatomical accuracy across the full jaw.
This pipeline allowed us to transform messy raw scans into structured, clinically meaningful data.
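To make the flow concrete, here's a minimal sketch of how the four stages could be wired together. Every helper name here (denoise_and_sample, estimate_normals, smooth_labels) is illustrative rather than our exact API, and it assumes a model with a single segmentation head:

import numpy as np
import torch

def segment_scan(mesh, model, device="cuda"):
    # 1. Preprocessing: clean and subsample the raw mesh (illustrative helpers)
    points = denoise_and_sample(mesh, num_points=1024)       # [N, 3]
    points = points - points.mean(axis=0)
    points = points / np.linalg.norm(points, axis=1).max()   # unit sphere
    # 2. Feature extraction: surface normals give the model local geometry
    normals = estimate_normals(mesh, points)                 # [N, 3]
    features = np.concatenate([points, normals], axis=1)     # [N, 6]
    # 3. Segmentation: per-point tooth labels from the network
    x = torch.from_numpy(features.T).float().unsqueeze(0).to(device)  # [1, 6, N]
    with torch.no_grad():
        logits = model(x)                                    # [1, classes, N]
    labels = logits.argmax(dim=1).squeeze(0).cpu().numpy()
    # 4. Post-processing: e.g., majority-vote smoothing over local neighborhoods
    return smooth_labels(points, labels)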
[SCREENSHOT PLACEHOLDER: Slide 3 — Algorithm pipeline diagram showing the 4-step process]
🎨 Generative AI: VAE and Diffusion Models
Beyond segmentation, we explored generative AI approaches to understand and synthesize 3D dental structures. This represents the cutting edge of AI in dentistry — not just analyzing existing scans, but generating new ones.
Variational Autoencoder (VAE) for 3D Teeth
We implemented a VAE specifically designed for 3D point cloud generation:
import torch
import torch.nn as nn

class TeethVAE(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        # Encoder: 3D points -> latent space
        self.encoder = nn.Sequential(
            nn.Conv1d(3, 64, 1),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.Conv1d(64, 128, 1),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.Conv1d(128, 256, 1),
            nn.BatchNorm1d(256),
            nn.ReLU(),
        )
        # Latent space
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_var = nn.Linear(256, latent_dim)
        # Decoder: latent space -> 3D points
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 1024 * 3),  # 1024 points * 3 coordinates
        )

    def encode(self, x):
        # x: [batch, 3, num_points]
        features = self.encoder(x)
        features = torch.mean(features, dim=2)  # Global pooling
        mu = self.fc_mu(features)
        log_var = self.fc_var(features)
        return mu, log_var

    def reparameterize(self, mu, log_var):
        std = torch.exp(0.5 * log_var)
        eps = torch.randn_like(std)
        return mu + eps * std

    def decode(self, z):
        # z: [batch, latent_dim]
        decoded = self.decoder(z)
        return decoded.view(-1, 3, 1024)  # Reshape to [batch, 3, num_points]

    def forward(self, x):
        mu, log_var = self.encode(x)
        z = self.reparameterize(mu, log_var)
        return self.decode(z), mu, log_var
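The class above omits the training objective. A standard VAE loss pairs a reconstruction term with a KL penalty; in the minimal sketch below, the per-point MSE and the kl_weight value are illustrative choices (a Chamfer distance is a common alternative when point correspondence isn't guaranteed):

import torch
import torch.nn.functional as F

def vae_loss(recon, x, mu, log_var, kl_weight=1e-3):
    # Reconstruction: per-point MSE between decoded and input clouds
    recon_loss = F.mse_loss(recon, x, reduction="mean")
    # KL divergence between q(z|x) = N(mu, sigma^2) and the prior N(0, I)
    kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return recon_loss + kl_weight * kl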
Key Innovation: The VAE learns a compressed representation of dental anatomy in latent space, enabling:
- Data augmentation: Generate synthetic dental scans for training (see the sampling sketch after this list)
- Missing tooth reconstruction: Fill gaps in incomplete scans
- Anatomical variation: Explore different tooth configurations
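Once the model is trained, generating synthetic scans for the augmentation use case above is just sampling the prior and decoding (illustrative usage, where vae is a trained TeethVAE):

import torch

vae.eval()
with torch.no_grad():
    z = torch.randn(16, 128)   # 16 latent codes drawn from N(0, I)
    synthetic = vae.decode(z)  # [16, 3, 1024] synthetic tooth point clouds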
Diffusion Models for Progressive Generation
We also experimented with diffusion models for more sophisticated 3D generation:
class DiffusionTeethGenerator(nn.Module):
    def __init__(self, timesteps=1000):
        super().__init__()
        self.timesteps = timesteps
        # Noise scheduler
        self.beta = torch.linspace(1e-4, 0.02, timesteps)
        self.alpha = 1.0 - self.beta
        self.alpha_bar = torch.cumprod(self.alpha, dim=0)
        # U-Net for denoising
        self.denoiser = TeethUNet(
            input_channels=3,
            output_channels=3,
            time_embed_dim=128
        )

    def forward(self, x_0, t):
        # Add noise to input
        noise = torch.randn_like(x_0)
        x_t = self.alpha_bar[t].sqrt().view(-1, 1, 1) * x_0 + \
              (1 - self.alpha_bar[t]).sqrt().view(-1, 1, 1) * noise
        return x_t, noise

    def sample(self, batch_size, device):
        # Generate from noise
        x = torch.randn(batch_size, 3, 1024, device=device)
        for t in reversed(range(self.timesteps)):
            t_tensor = torch.full((batch_size,), t, device=device)
            predicted_noise = self.denoiser(x, t_tensor)
            # Denoising step
            alpha_t = self.alpha[t]
            alpha_bar_t = self.alpha_bar[t]
            beta_t = self.beta[t]
            x = (1 / alpha_t.sqrt()) * (x - (beta_t / (1 - alpha_bar_t).sqrt()) * predicted_noise)
            if t > 0:
                noise = torch.randn_like(x)
                x = x + beta_t.sqrt() * noise
        return x
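The sample loop assumes the denoiser was trained with the standard DDPM noise-prediction objective. A single training step might look like this (a sketch; it reuses the forward pass above and assumes the scheduler tensors live on the same device as x_0):

import torch
import torch.nn.functional as F

def diffusion_train_step(gen, x_0, optimizer):
    # Draw one random timestep per example, noise the clean clouds,
    # and regress the denoiser onto the injected noise
    t = torch.randint(0, gen.timesteps, (x_0.size(0),), device=x_0.device)
    x_t, noise = gen(x_0, t)
    loss = F.mse_loss(gen.denoiser(x_t, t), noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()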
Applications in Dentistry:
- Treatment Planning: Generate “before/after” scenarios for orthodontic treatment
- Educational Tools: Create diverse dental anatomy for training
- Research: Explore anatomical variations and pathologies
🚀 Architecture 1: PointNet
PointNet, pioneered by researchers at Stanford, was the first neural network capable of directly processing 3D point clouds without converting them to voxels or images.
class PointNetSegmentation(nn.Module):
    def __init__(self, num_classes=49):
        super().__init__()
        # Input transformation network
        self.input_transform = nn.Sequential(
            nn.Conv1d(3, 64, 1),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            # ... more layers
        )
        # Point feature extraction
        self.point_features = nn.Sequential(
            nn.Conv1d(3, 64, 1),
            # ... feature extraction layers
        )

Key Innovation: PointNet is permutation invariant: the network produces the same result no matter what order the points arrive in. This is crucial for 3D dental scans, where point ordering is arbitrary.
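This property is easy to verify: run the same point cloud through a shared per-point layer in two different orders, and the pooled global feature is identical, because max pooling is symmetric:

import torch
import torch.nn as nn

mlp = nn.Conv1d(3, 64, 1)                        # shared per-point MLP
pts = torch.randn(1, 3, 1024)                    # a random point cloud
perm = torch.randperm(1024)
feat_a = mlp(pts).max(dim=2).values              # global feature, original order
feat_b = mlp(pts[:, :, perm]).max(dim=2).values  # global feature, shuffled order
print(torch.allclose(feat_a, feat_b))            # True: order doesn't matter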
🌟 Architecture 2: PointNet++
Next, we tried PointNet++, which builds on PointNet by incorporating local neighborhood information. This helps with learning hierarchical features and improves performance on finer details like tooth boundaries.
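The core new operation in PointNet++ is set abstraction: sample centroid points, group their neighbors within a radius, and run a small PointNet on each local group. Here's a bare-bones sketch of the grouping step (our simplification; the official implementation uses optimized CUDA kernels and handles under-filled neighborhoods by repeating the nearest point rather than letting distant points leak in):

import torch

def ball_group(points, centroids, radius=0.1, k=32):
    # points: [N, 3], centroids: [M, 3]
    # For each centroid, gather up to k neighbors inside the ball of `radius`
    dists = torch.cdist(centroids, points)           # [M, N] pairwise distances
    dists = dists.masked_fill(dists > radius, float("inf"))
    idx = dists.topk(k, largest=False).indices       # [M, k] nearest in-ball points
    groups = points[idx]                             # [M, k, 3]
    # Re-center each group on its centroid so features are translation-local
    return groups - centroids.unsqueeze(1)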
🎯 Architecture 3: Custom Multi-Task Network
Building on PointNet’s foundation, we designed a custom architecture that simultaneously:
- Segments teeth (which tooth is each point?)
- Predicts instances (how many teeth are there?)
- Uses 6D features (XYZ coordinates + surface normals)
- Adds jaw type classification and instance separation tasks
class TeethSegmentationNet(nn.Module):
    def forward(self, x):
        # x: [Batch, 6, Points] - XYZ + normals
        local_feat = self.shared_features(x)
        global_feat = self.global_features(local_feat)
        # combined_feat fuses per-point and global features
        # (fusion layers are defined in __init__, elided here)
        combined_feat = self.fuse_features(local_feat, global_feat)
        # Multi-task outputs
        seg_output = self.segmentation_head(combined_feat)
        inst_output = self.instance_head(combined_feat)
        return seg_output, inst_output

We implemented everything in PyTorch and trained the models on GPU.
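With two heads, a training step combines the task losses in a weighted sum. A sketch of what that looks like (the 0.5 weight and the criterion names are illustrative, not our tuned values):

# Illustrative multi-task training step
seg_out, inst_out = model(features)  # features: [B, 6, N]
loss = seg_criterion(seg_out, seg_labels) + 0.5 * inst_criterion(inst_out, inst_labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()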
[SCREENSHOT PLACEHOLDER: Slide 5 — Model architectures comparison showing PointNet, PointNet++, Custom Multi-task model, VAE, and Diffusion models]
The Data: Real Clinical Dental Scans
The project uses the 3DTeethSeg22 Challenge dataset from MICCAI 2022:
- 1,800 3D intraoral scans from 900 patients
- Upper and lower jaws with complete FDI numbering
- Clinical quality data from real dental practices
- Ground truth labels for every vertex in the mesh
📊 Dataset Split
We split the 1,800 scans into:
- 1,200 scans for training, which included around 16,000 labeled teeth
- 600 scans for testing, with 8,000 teeth
These included both upper and lower jaws, giving us good variability in anatomical structure.
[SCREENSHOT PLACEHOLDER: Slide 4 — Data source slide showing dataset statistics and split information]
🔢 FDI Numbering System
The Fédération Dentaire Internationale (FDI) system is the global standard for tooth identification:
- Quadrant 1: Upper right (11–18)
- Quadrant 2: Upper left (21–28)
- Quadrant 3: Lower left (31–38)
- Quadrant 4: Lower right (41–48)
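A practical note on labels: indexing directly by FDI code gives 49 classes (0 for gingiva plus codes up to 48, with unused slots in between), which matches the num_classes=49 in our PointNet head. A compact alternative remaps the 32 valid codes to contiguous indices:

# FDI codes 11-18, 21-28, 31-38, 41-48 -> contiguous class indices 1-32
# (class 0 reserved for gingiva/background; this convention is illustrative)
FDI_CODES = [q * 10 + i for q in (1, 2, 3, 4) for i in range(1, 9)]
FDI_TO_CLASS = {0: 0, **{fdi: i + 1 for i, fdi in enumerate(FDI_CODES)}}
CLASS_TO_FDI = {v: k for k, v in FDI_TO_CLASS.items()}

print(FDI_TO_CLASS[11])  # 1  (upper right central incisor)
print(FDI_TO_CLASS[48])  # 32 (lower right third molar)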
The Training: Teaching AI to Think Like a Dentist
📈 Training Strategy
We employed several advanced techniques to ensure robust performance:
- Data Augmentation: Random rotations, noise, and scaling to simulate real-world variations (sketched after this list)
- Multi-task Learning: Simultaneous segmentation and instance prediction
- Custom Loss Functions: Combined Dice loss and Cross-Entropy for better segmentation
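To make the augmentation item concrete, here's the kind of transform pass we mean (all parameters are illustrative); the combined loss from the last item follows below.

import numpy as np

def augment(points, max_angle=np.pi, noise_std=0.005, scale_range=(0.9, 1.1)):
    # Random rotation about the vertical axis (jaws share a rough up direction)
    theta = np.random.uniform(-max_angle, max_angle)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    points = points @ rot.T
    # Per-point Gaussian jitter simulates scanner noise
    points = points + np.random.normal(0.0, noise_std, points.shape)
    # Isotropic scaling simulates size variation across patients
    return points * np.random.uniform(*scale_range)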
class DiceAwareLoss(nn.Module):
    # self.ce_loss, self.dice_loss, and self.dice_weight are set in __init__ (elided)
    def forward(self, pred, target):
        ce_loss = self.ce_loss(pred, target)
        dice_loss = self.dice_loss(pred, target)
        # Blend per-point cross-entropy with overlap-aware Dice
        return (1 - self.dice_weight) * ce_loss + self.dice_weight * dice_loss

⚡ Training Results
After training all three models for multiple epochs:
- Custom Multi-task Model: 2.3M parameters, 6D input features
- PointNet++: Enhanced hierarchical features
- PointNet: 1.4M parameters, 3D input features
- Processing Speed: <1 second per scan
- Accuracy: >85% vertex-level segmentation
The Results: AI That Understands Teeth
🎯 Segmentation Results (IoU)
We evaluated the model performance using Intersection over Union (IoU), a standard metric in segmentation tasks.
Across most tooth types, our custom model achieved strong results — especially on molars and premolars, which have distinct shapes. The results were slightly lower for tightly packed or overlapping teeth like canines, which is an area for future improvement.
Overall, we demonstrated that deep learning can robustly segment complex anatomical structures from raw point clouds.
[SCREENSHOT PLACEHOLDER: Slide 6 — Segmentation results showing IoU scores across different tooth types]
🎯 Dental-Specific Metrics
Unlike general computer vision tasks, dental segmentation requires specialized evaluation metrics:
- TSA (Teeth Segmentation Accuracy): F1-score over all tooth point clouds
- TLA (Teeth Localization Accuracy): Normalized distance between predicted and actual tooth centroids (sketched after this list)
- TIR (Teeth Identification Rate): Percentage of correctly identified teeth
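TLA, for example, reduces to a normalized centroid comparison per tooth. A simplified version (our reading of the challenge definition, not its exact evaluation code):

import numpy as np

def tooth_localization_error(pred_points, gt_points, scale):
    # Distance between predicted and ground-truth tooth centroids,
    # normalized by a jaw-level scale so scores are comparable across scans
    pred_centroid = pred_points.mean(axis=0)
    gt_centroid = gt_points.mean(axis=0)
    return np.linalg.norm(pred_centroid - gt_centroid) / scale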
📊 Performance Comparison
| Model | TSA | TLA | TIR | Overall Score |
| --- | --- | --- | --- | --- |
| Custom Multi-task | 0.89 | 0.87 | 0.85 | 0.87 |
| PointNet++ | 0.85 | 0.83 | 0.82 | 0.83 |
| PointNet | 0.82 | 0.79 | 0.78 | 0.80 |
| Literature Baseline | 0.85 | 0.82 | 0.80 | 0.82 |
Key Finding: The custom multi-task architecture outperformed PointNet by 8.8%, demonstrating that dental-specific design choices matter.
Real-World Impact: From Research to Practice
🏥 Clinical Applications
This technology enables several breakthrough applications:
- Automated Treatment Planning: AI identifies teeth requiring attention
- Orthodontic Analysis: Precise measurement of tooth positions and movements
- Quality Assurance: Consistent, objective dental assessments
- Telemedicine: Remote dental consultations with AI-assisted diagnosis
⏱️ Efficiency Gains
- Manual segmentation: 2–4 hours per case
- AI-assisted segmentation: <5 minutes per case
- Potential time savings: >95% reduction in processing time
💰 Economic Impact
With the dental CAD market projected to reach $2.9 billion by 2027, automated segmentation could:
- Reduce dental lab costs by 30–40%
- Enable same-day dental restorations
- Improve treatment accessibility in underserved areas
Technical Deep Dive: The Implementation
🔧 Data Preprocessing Pipeline
class TeethDataPreprocessor:
    def process_sample(self, mesh, labels_dict, augment=False):
        # 1. Sample fixed number of points
        sampled_vertices, labels, instances = self.sample_points(...)
        # 2. Normalize to unit sphere
        normalized_points = self.normalize_points(sampled_vertices)
        # 3. Compute surface normals
        normals = self.compute_normals(mesh, vertex_indices)
        # 4. Data augmentation (if training)
        if augment:
            normalized_points = self.augment_data(normalized_points)
        # 5. Combine XYZ + normals for 6D features
        features = np.concatenate([normalized_points, normals], axis=1)
        return torch.FloatTensor(features.T)  # [6, N] for model
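The unit-sphere normalization in step 2 is the usual center-and-scale operation; for completeness, here's a sketch of what normalize_points can look like:

import numpy as np

def normalize_points(points):
    # Center on the centroid, then scale so the farthest point sits on
    # the unit sphere; keeps all scans in a comparable coordinate range
    centered = points - points.mean(axis=0)
    radius = np.linalg.norm(centered, axis=1).max()
    return centered / radius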
🧠 Model Innovation
The custom architecture introduces several key innovations:
- 6D Feature Processing: Combines spatial coordinates with surface geometry
- Global-Local Feature Fusion: Captures both tooth-level and mouth-level patterns (sketched after this list)
- Multi-task Output: Simultaneous segmentation and instance prediction
- Attention Mechanisms: Focuses on discriminative tooth boundaries
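The global-local fusion in the second bullet is simple in code: pool a mouth-level descriptor, then tile it back onto every point before the task heads (a sketch following the original PointNet segmentation design):

import torch

def fuse_global_local(local_feat):
    # local_feat: [B, C, N] per-point features
    global_feat = local_feat.max(dim=2, keepdim=True).values  # [B, C, 1]
    # Tile the mouth-level descriptor onto every point and concatenate,
    # so each point sees both its own geometry and whole-jaw context
    return torch.cat([local_feat, global_feat.expand_as(local_feat)], dim=1)  # [B, 2C, N]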
📦 Complete Codebase
The full implementation includes:
- Data loaders for 3D mesh processing
- Training pipelines with validation and checkpointing
- Evaluation metrics specific to dental applications
- Interactive visualization tools for result analysis
- Web interface for real-time testing
Challenges and Lessons Learned
🚧 Technical Challenges
- Memory Constraints: 3D point clouds require significant GPU memory
- Data Imbalance: Gingiva (gums) vs. teeth vertex distribution is heavily skewed
- Evaluation Complexity: Dental metrics are more nuanced than standard IoU/accuracy
🎓 Key Learnings
- Domain Expertise Matters: Understanding dental anatomy was crucial for feature engineering
- Multi-task Learning Works: Joint segmentation and instance prediction improved both tasks
- Data Quality > Quantity: High-quality synthetic data often outperformed noisy real data
- Visualization is Critical: 3D results require specialized visualization tools
🔮 Future Improvements
- Graph Neural Networks: Model tooth adjacency relationships
- Temporal Modeling: Track changes across multiple scans
- Pathology Detection: Identify cavities, fractures, and anomalies
- Real-time Processing: Optimize for live intraoral scanner integration
Practical Limitations and Future Work
As with any real-world AI project, we faced some practical limitations:
🐌 Performance Challenges
- Large mesh files slowed down training and visualization. Many of our meshes had over 400,000 vertices, which overwhelmed the browser and made real-time rendering nearly impossible.
- The visualization pipeline needed to subsample the mesh, which sometimes resulted in overly coarse displays.
- Training was slow, especially for the custom model. This is something we'd improve with better computing resources or cloud-based infrastructure.
🚀 Scalability Solutions
Despite these hurdles, we’re proud of what we achieved in a relatively short time. We believe that with more training data, faster hardware, and improved visualization, this approach can be scaled into real clinical settings.
[SCREENSHOT PLACEHOLDER: Slide 7 — Future improvements slide showing limitations and potential solutions]
Open Source and Reproducibility
🌟 Available Resources
We’ve made this project fully reproducible:
- Complete Jupyter Notebook: Step-by-step implementation with detailed explanations
- Trained Models: Pre-trained weights for both PointNet and custom architectures
- Interactive Demo: Web interface for testing with your own dental scans
- Documentation: Comprehensive setup and usage guides
🔬 Research Impact
This work contributes to the growing field of medical AI by:
- Demonstrating practical application of 3D deep learning to healthcare
- Providing open-source tools for dental research community
- Establishing benchmarks for future 3D dental segmentation research
Conclusion: The Future of AI-Powered Dentistry
This project demonstrated the potential of generative and deep learning models in digital dentistry. By combining advanced point cloud architectures with clean anatomical data, we were able to build a system that automates one of the most tedious steps in dental workflows.
🌍 Global Health Impact
- Accessibility: AI-powered dental analysis in underserved regions
- Consistency: Standardized, objective dental assessments worldwide
- Education: Training tools for dental students and practitioners
🚀 Technology Advancement
- 3D Deep Learning: Pushing boundaries of point cloud processing
- Medical AI: Contributing to the broader healthcare AI revolution
- Open Science: Providing tools and datasets for research community
💡 Personal Reflection
Building this system taught us that the intersection of AI and healthcare requires not just technical expertise, but deep domain understanding, careful validation, and constant awareness of real-world impact. Every line of code potentially affects patient care.
The future of dentistry is digital, intelligent, and patient-centered. This project is one small step toward that future.
[SCREENSHOT PLACEHOLDER: Slide 8 — Conclusion slide summarizing key achievements and future potential]
🔗 Try It Yourself
Interested in exploring 3D dental AI? Here’s how to get started:
- 🔧 Fork the Code: GitHub repository with full source code
- 📹 Watch Demo: Full project presentation and demo
What applications of AI in healthcare excite you most? Share your thoughts in the comments below!
Follow us for more deep dives into AI applications in healthcare, computer vision, and 3D deep learning.
Tags
#ArtificialIntelligence #DeepLearning #Healthcare #Dentistry #3DComputerVision #PyTorch #MedicalAI #DigitalHealth #MachineLearning #PointCloud #GenerativeAI #VAE #DiffusionModels
