Estimator

May 25, 2020

Estimator is a machine learning platform that predicts the critical chain length of a polymer, that is, the chain length at which its solubility parameter saturates.

Check out the product - Estimator

Cover Image Credits - Science-In-Chains

Introduction

Researchers now use machine learning extensively to accelerate the development of novel materials. The development of novel polymers can be accelerated in the same way, for example by:

Predicting the properties of a novel polymer before synthesizing it in the laboratory.
Tuning the design of a polymer to obtain a desired set of properties.
Extracting valuable information from failed experiments by training ML models on them.
Accelerating material simulation and materials design models (CGM, PSO, etc.).

In short, whenever enough data is available and we need to extract information from it or make predictions, an appropriate machine learning algorithm can help.

Problem Statement

This project addresses the problem of predicting the critical chain length of a given polymer, the chain length corresponding to its saturated solubility parameter.

Critical Chain Length & Saturated Solubility Parameter

The solubility parameter of a polymer depends on its chain length. Beyond a certain chain length it saturates, meaning it no longer changes significantly; that length is the critical chain length. More details can be found in the literature.
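To make the idea of saturation concrete, here is a minimal sketch of how a critical chain length could be read off a series of computed solubility parameters. The function name, the relative tolerance of 1%, and the stopping rule are all assumptions made for illustration; the post does not define "significant change" numerically.

```python
def critical_chain_length(chain_lengths, solubility, rel_tol=0.01):
    """Return the first chain length after which every successive
    relative change in the solubility parameter stays below rel_tol.

    rel_tol is a hypothetical threshold, not a value from the project.
    Returns None if the series never saturates within the data given.
    """
    if len(chain_lengths) != len(solubility) or len(chain_lengths) < 2:
        raise ValueError("need two equally long series with at least 2 points")

    # Relative change between each pair of consecutive points.
    changes = [
        abs(solubility[i + 1] - solubility[i]) / abs(solubility[i])
        for i in range(len(solubility) - 1)
    ]

    # Saturation starts at the first index from which all later
    # changes are below the tolerance.
    for i in range(len(changes)):
        if all(change < rel_tol for change in changes[i:]):
            return chain_lengths[i]
    return None
```

In practice each point in the series would come from a molecular simulation, which is exactly the expensive step the machine learning model is meant to avoid.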

The difficulty is that calculating this value requires molecular simulations, which are very slow, computationally expensive, and inefficient. Our approach is to use machine learning instead, so that we can directly predict the chain length at which the solubility parameter of a polymer saturates.

Machine Learning based solution

We are training a machine learning model on a dataset of polymers and their critical chain lengths. The first issue is building such a dataset, because all of the data must be in numerical form. We used group contribution theory and developed a Python program that extracts features from the structure of a repeating unit, assembles them into feature vectors, and uses those vectors as a fingerprint that represents the polymer numerically. The processed dataset therefore contains a numerical representation of each polymer together with its critical chain length.
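As a rough illustration of a group-based fingerprint, the sketch below counts how often each functional group occurs in a repeating unit and returns the counts in a fixed order. The group vocabulary and the function name are hypothetical, and the sketch assumes the groups have already been identified; the project's actual program extracts them from the structure itself, which is not shown here.

```python
from collections import Counter

# Hypothetical group vocabulary; the project's real feature set is not
# given in the post. The order fixes the layout of the feature vector.
GROUP_VOCABULARY = ("CH3", "CH2", "CH", "C6H4", "COO", "OH")


def fingerprint(groups, vocabulary=GROUP_VOCABULARY):
    """Turn the list of groups in one repeating unit into a count vector."""
    counts = Counter(groups)

    # Refuse groups outside the vocabulary rather than silently dropping them.
    unknown = set(counts) - set(vocabulary)
    if unknown:
        raise ValueError(f"groups not in vocabulary: {sorted(unknown)}")

    # Counter returns 0 for groups that do not occur in this unit.
    return [counts[group] for group in vocabulary]
```

For example, a polyethylene repeating unit (two CH2 groups) and a polypropylene repeating unit (one each of CH2, CH, and CH3) map to different vectors of the same length, which is what lets a regression model treat them as points in one feature space.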

Next, we are developing a machine learning model from scratch to learn how the critical chain length depends on these features. We first used a linear hypothesis, because the task requires continuous outputs. We are now working on a support vector regression model to get better performance, and we are also working on self-evolving machine learning algorithms.
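A linear hypothesis written from scratch might look like the sketch below, which fits h(x) = theta_0 + theta_1 x_1 + ... + theta_n x_n by batch gradient descent on the mean squared error. The training procedure, function names, learning rate, and epoch count are assumptions for illustration; the post does not say how the project's model is trained.

```python
def predict(theta, x):
    """Linear hypothesis: theta[0] is the bias, theta[1:] the feature weights."""
    return theta[0] + sum(weight * value for weight, value in zip(theta[1:], x))


def fit_linear(X, y, learning_rate=0.1, epochs=2000):
    """Fit the linear hypothesis with batch gradient descent.

    X is a list of feature vectors (fingerprints), y the matching targets
    (critical chain lengths). The hyperparameter defaults are illustrative.
    """
    m = len(X)        # number of training examples
    n = len(X[0])     # number of features
    theta = [0.0] * (n + 1)

    for _ in range(epochs):
        # Prediction error for every example under the current parameters.
        errors = [predict(theta, x) - target for x, target in zip(X, y)]

        # Gradient of the cost (1 / 2m) * sum(error^2) for each parameter.
        gradient = [sum(errors) / m]
        for j in range(n):
            gradient.append(sum(e * x[j] for e, x in zip(errors, X)) / m)

        # Update all parameters simultaneously.
        theta = [t - learning_rate * g for t, g in zip(theta, gradient)]

    return theta
```

With real fingerprints, features on very different scales would likely need normalizing or a smaller learning rate for the descent to remain stable.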

Challenges

Finding an appropriate numerical representation of a polymer (using an optimization algorithm).
Collecting a sufficient amount of data (using data augmentation techniques).
Optimizing the number of features used to train the model (literature survey ongoing).
Finding a mapping function from feature vectors to the desired property (literature survey ongoing).

Machine Learning Polymers