AtomDisc: An Interpretable Atom-level Tokenizer that Boosts Molecular LLMs and Reveals Structure–Property Relationships
Submitted to Nature Machine Intelligence
Recent advances in large language models (LLMs) have spurred growing interest in their application to molecular modeling and property prediction. However, existing molecular LLMs either rely solely on SMILES strings—thus ignoring rich at...