Training
Modern experimental and computational methods, such as high-throughput experimentation and data mining, can rapidly generate large datasets. However, most organic chemists are not trained to analyze such datasets quantitatively. C-CAS trains a new generation of “data chemists” who seek careers applying computational and data science to synthesis. Through co-mentoring and workshops for center participants, C-CAS bridges the gap between chemistry and data science in both academia and industry.
Training Resources
A short course by the Sigman Lab
Short Course in Multivariate Linear Regression Models
1.0 Introduction to the Short Course
1.1 What are Linear Free Energy Relationships (LFER)?
2-0 Why is Conformational Searching Important?
2-1 Conformational Searches Using Molecular Mechanics
2-2 Conducting a Conformational Search in MacroModel
2-4 Submitting a QM Calculation through Utah's CHPC
3-1 Using Python to Parameterize Molecules
4-0 Intro to Statistical Modeling Strategy
4-1 Interpreting Statistical Models
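As a rough illustration of the kind of model the short course builds, the sketch below fits a two-descriptor linear model by ordinary least squares. It is not taken from the course materials: the function name `fit_linear_model`, the "electronic" and "steric" descriptor values, and the responses are all hypothetical, and the responses are generated from known coefficients so that the fit can be checked.

```python
def fit_linear_model(descriptors, responses):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    rows = [[1.0] + list(d) for d in descriptors]  # prepend intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, responses))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# SYNTHETIC data for illustration only: (electronic, steric) descriptor pairs.
descriptors = [(0.0, 1.0), (0.2, 1.5), (-0.3, 2.0), (0.8, 1.2), (0.5, 3.0)]
# Responses built from a known model so the recovered coefficients can be verified.
responses = [0.5 + 2.0 * e - 1.0 * s for e, s in descriptors]

intercept, coef_electronic, coef_steric = fit_linear_model(descriptors, responses)
```

In practice, a statistics or machine learning library would usually replace the hand-written solver, and the model would be validated on held-out data before its coefficients are interpreted.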
C-CAS training videos on the C-CAS YouTube channel
Introduction to Bayesian Optimization
Part 1: Introduction to Bayesian Optimization
Part 2: Applications to "over-the-arrow" optimization
In these videos, Ben Shields from the Doyle group explains the basics of Bayesian optimization and its application to finding the best reaction conditions. The work explained in these videos is published in a recent Nature paper by the Doyle group.
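The general idea of a Bayesian optimization loop can be sketched in a few lines of Python. This is a minimal illustration, not the implementation presented in the videos or the paper: it assumes a one-dimensional search over temperature, a Gaussian process surrogate with a fixed squared-exponential kernel, and expected improvement as the acquisition function. The function `run_experiment` is a hypothetical stand-in for a real reaction, and all numerical settings are arbitrary.

```python
import math


def rbf(a, b, length=15.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(m[k][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 0.0))


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    if std == 0.0:
        return 0.0
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf


def run_experiment(temp_c):
    """HYPOTHETICAL reaction: yield fraction peaking at 80 degrees C."""
    return math.exp(-((temp_c - 80.0) / 25.0) ** 2)


candidates = list(range(20, 141, 10))  # candidate temperatures, 20 to 140 C
tested = [20, 140]                     # two initial experiments
yields = [run_experiment(t) for t in tested]

for _ in range(5):                     # budget of five further experiments
    offset = sum(yields) / len(yields)
    centered = [y - offset for y in yields]  # center data for the zero-mean GP
    best = max(centered)
    scores = {}
    for t in candidates:
        if t not in tested:
            mean, std = gp_posterior(tested, centered, t)
            scores[t] = expected_improvement(mean, std, best)
    next_temp = max(scores, key=scores.get)  # most promising untested condition
    tested.append(next_temp)
    yields.append(run_experiment(next_temp))

best_temp = tested[yields.index(max(yields))]
```

Real reaction optimization typically involves many categorical and continuous variables at once (ligand, base, solvent, concentration, and so on), so the surrogate model and the encoding of conditions are considerably more involved than in this one-variable sketch.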
Conformational Searching
Part 1: Introduction to Conformational Searching
Part 2: Conformational Searching in MacroModel
In these videos, Liliana Gallegos and Guillian Luchini from the Paton group, together with Jessica Wahlers and Kevin Koh from the Wiest group, explain different approaches to conformational searching of small molecules.
Generating Potential Energy Surfaces in Python
Liliana Gallegos from the Paton group explains how information on potential energy surfaces can be extracted from Gaussian outputs using a set of Python scripts.
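A minimal sketch of the underlying idea, not the scripts shown in the video: read the final `SCF Done` energy from each Gaussian log and convert the energies to relative values in kcal/mol. The species names and the energy values in the sample text are made up for illustration, and the helper names are hypothetical.

```python
import re

HARTREE_TO_KCAL = 627.5095  # kcal/mol per Hartree

# Matches lines such as:
#  SCF Done:  E(RB3LYP) =  -155.046210000     A.U. after   10 cycles
SCF_PATTERN = re.compile(r"SCF Done:\s+E\(\S+\)\s+=\s+(-?\d+\.\d+)")


def last_scf_energy(log_text):
    """Return the final 'SCF Done' energy (in Hartree) found in a Gaussian log."""
    matches = SCF_PATTERN.findall(log_text)
    if not matches:
        raise ValueError("no SCF energy found")
    return float(matches[-1])


def relative_energies(energies):
    """Convert absolute energies (Hartree) to kcal/mol relative to the lowest."""
    lowest = min(energies.values())
    return {name: (e - lowest) * HARTREE_TO_KCAL for name, e in energies.items()}


# ILLUSTRATIVE log excerpts with invented energies; real logs are read from files.
sample_logs = {
    "reactant": " SCF Done:  E(RB3LYP) =  -155.046210000     A.U. after   10 cycles",
    "ts": (
        " SCF Done:  E(RB3LYP) =  -155.010000000     A.U. after   12 cycles\n"
        " SCF Done:  E(RB3LYP) =  -155.014340000     A.U. after    9 cycles"
    ),
    "product": " SCF Done:  E(RB3LYP) =  -155.052580000     A.U. after   11 cycles",
}

energies = {name: last_scf_energy(text) for name, text in sample_logs.items()}
profile = relative_energies(energies)
```

Electronic energies alone ignore zero-point and thermal corrections, so a production workflow would usually extract free energies from frequency calculations as well.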
Graph Neural Networks: Basics and Applications
Part 1: Representing molecules as Graph Neural Networks (GNN)
Part 3: Heterogeneous Knowledge Graphs
Part 4: Property Prediction using GNNs
Mandana Saebi, Zhichun Guo and Chuxu Zhang from the Chawla group explain what graph neural networks are and how they can be used to represent and predict chemical properties and reactions.
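To make the graph picture concrete, the sketch below represents a molecule as nodes (atoms) and edges (bonds) and performs one round of neighbor aggregation, the basic operation behind message passing in a graph neural network. It is a simplified illustration and not the models discussed in the videos: there are no learned weights or nonlinearities, and the three-element feature vocabulary is an arbitrary choice.

```python
# Ethanol (CH3-CH2-OH) as a heavy-atom graph: nodes are atoms, edges are bonds.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

ELEMENTS = ["C", "N", "O"]  # illustrative feature vocabulary


def one_hot(symbol):
    """Encode an element symbol as a one-hot feature vector."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def build_adjacency(n_atoms, bond_list):
    """Return a neighbor list for an undirected molecular graph."""
    adjacency = [[] for _ in range(n_atoms)]
    for i, j in bond_list:
        adjacency[i].append(j)
        adjacency[j].append(i)
    return adjacency


def message_pass(features, adjacency):
    """One aggregation round: each atom adds its neighbors' features to its own."""
    updated = []
    for i, own in enumerate(features):
        new = list(own)
        for j in adjacency[i]:
            new = [a + b for a, b in zip(new, features[j])]
        updated.append(new)
    return updated


def readout(features):
    """Sum atom features into a single molecule-level vector."""
    return [sum(column) for column in zip(*features)]


node_features = [one_hot(a) for a in atoms]
adjacency = build_adjacency(len(atoms), bonds)
updated_features = message_pass(node_features, adjacency)
molecule_vector = readout(updated_features)
```

After one round, each atom's vector encodes its own element plus those of its bonded neighbors. A trained GNN replaces the plain sum with learned transformations and stacks several such rounds before the readout feeds a property predictor.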
Data Scrubbing
Bozhao Nan from the Wiest group explains workflows to prepare real-world datasets for application in machine learning.
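A few common cleaning steps can be sketched as follows. This is an assumed, generic workflow rather than the one presented in the video: the record fields `smiles` and `yield`, the 0 to 100 percent validity range, and the sample records are all hypothetical.

```python
def scrub(records):
    """Drop incomplete, unparseable, out-of-range, and duplicate records."""
    seen = set()
    clean = []
    for rec in records:
        smiles = (rec.get("smiles") or "").strip()  # normalize whitespace
        raw_yield = rec.get("yield")
        if not smiles or raw_yield is None:
            continue                                # missing structure or yield
        try:
            value = float(raw_yield)
        except (TypeError, ValueError):
            continue                                # non-numeric entry such as "n/a"
        if not 0.0 <= value <= 100.0:
            continue                                # physically implausible yield
        key = (smiles, value)
        if key in seen:
            continue                                # exact duplicate
        seen.add(key)
        clean.append({"smiles": smiles, "yield": value})
    return clean


# INVENTED example records showing typical problems in raw data.
raw_records = [
    {"smiles": "CCO", "yield": "85"},
    {"smiles": " CCO ", "yield": 85.0},      # duplicate once whitespace is stripped
    {"smiles": "c1ccccc1", "yield": None},   # missing yield
    {"smiles": "CC(=O)O", "yield": "n/a"},   # unparseable yield
    {"smiles": "CCN", "yield": 120},         # out of range
    {"smiles": "CCC", "yield": "42.5"},
]

clean_records = scrub(raw_records)
```

Comparing SMILES strings as plain text misses duplicates written in different but equivalent forms, so a real pipeline would normally canonicalize structures with a cheminformatics toolkit before deduplicating.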
Synthesis Planning using Synthia
Melissa Hardy and Brandon Wright from the Sarpong group explain the concepts and application of computer-aided synthesis planning using Synthia®.
Modern Steric Parameters
Guillian Luchini from the Paton group demonstrates the use of Python scripts to generate a series of modern steric parameters for the featurization of molecules.
The Center is building up a curated resource library of videos and publications that members find useful:
- Andrew Ng’s online lectures on machine learning have often been described as a “Rite of Passage” for many interested in this topic.