Resource: Introducing SNAC-DB: A New Resource for Antibody & NANOBODY® VHH-Antigen Modeling
Submitted by Abhinav Gupta; posted on Friday, July 25, 2025

Submitter
Predicting antibody and NANOBODY® VHH-antigen complexes remains a notable weakness of current AI models, limiting their utility in drug discovery. We present SNAC-DB, an open-source, machine-learning-ready database and pipeline developed by structural biologists and ML researchers to address this challenge.

Key features of SNAC-DB include:
  • Expanded Coverage: 32% more structural diversity than SAbDab, capturing overlooked assemblies such as antibodies/nanobodies as antigens, complete multi-chain epitopes, and weak CDR crystal contacts.
  • ML-Friendly Data: Cleaned PDB/mmCIF files, atom37 NumPy arrays, and unified CSV metadata to eliminate preprocessing hurdles.
  • Transparent Redundancy Control: Multi-threshold Foldseek clustering for principled sample weighting, ensuring every experimental structure contributes.
  • Rigorous Benchmark: An out-of-sample test set comprising public PDB entries released after May 30, 2024 (disclosed) and confidential therapeutic complexes.
Using this benchmark, we evaluated six leading models (AlphaFold2.3-multimer, Boltz-2, Boltz-1x, Chai-1, DiffDock-PP, GeoDock). We found that success rates rarely exceed 25%, that built-in confidence metrics and rankings often misprioritize predictions, and that all six models struggle with novel targets and binding poses.
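
To illustrate the redundancy-control idea from the feature list above, here is a minimal sketch of how cluster assignments at several thresholds could be turned into per-structure sample weights. This is an assumption about one reasonable scheme (inverse cluster size, averaged over thresholds), not necessarily the weighting SNAC-DB itself uses; the function name, threshold labels, and cluster IDs are all hypothetical.

```python
from collections import Counter


def cluster_weights(cluster_ids_by_threshold):
    """Return one weight per structure, averaged over clustering thresholds.

    cluster_ids_by_threshold maps a threshold label to a list of cluster IDs,
    one per structure, in the same structure order for every threshold.
    (Hypothetical input layout, not the SNAC-DB metadata schema.)
    """
    if not cluster_ids_by_threshold:
        raise ValueError("need at least one clustering threshold")
    n = len(next(iter(cluster_ids_by_threshold.values())))
    weights = [0.0] * n
    for label, ids in cluster_ids_by_threshold.items():
        if len(ids) != n:
            raise ValueError(f"threshold {label!r} has {len(ids)} entries, expected {n}")
        sizes = Counter(ids)  # number of structures in each cluster
        for i, cid in enumerate(ids):
            # Each cluster contributes total weight 1, shared among its members,
            # so redundant structures are down-weighted but never dropped.
            weights[i] += 1.0 / sizes[cid]
    k = len(cluster_ids_by_threshold)
    return [w / k for w in weights]


# Four made-up structures clustered at two made-up thresholds.
clusters = {
    "loose": ["A", "A", "A", "B"],
    "strict": ["a1", "a1", "a2", "b1"],
}
weights = cluster_weights(clusters)
for name, w in zip(["s1", "s2", "s3", "s4"], weights):
    print(f"{name}: {w:.3f}")
```

Every structure keeps a strictly positive weight, so all experimental data still contributes to training; structures that sit in large clusters at every threshold simply count for less.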

I had the opportunity to present this work last week at the ICML 2025 (International Conference on Machine Learning) workshop DataWorld: Unifying Data Curation Frameworks Across Domains (https://dataworldicml2025.github.io) in Vancouver. I hope SNAC-DB will accelerate the development and evaluation of more accurate models for antibody/NANOBODY® VHH complex prediction.

We welcome comments on any deficiencies and on how to make this resource more user-friendly.

Best,
Abhinav Gupta
Senior AI/ML Scientist
Large Molecule Research, Sanofi
Email: abhinav.gupta[at]sanofi.com

© 1998-2025 Scilico, LLC. All rights reserved.