
Want to learn more? Book a presentation with our team
FAST, a full-search-based tool for rapid feature selection, is the first piece of software in a suite designed to guide computational chemists, labs and companies through the whole QSAR modelling process.

Learn more about QUEEN and the whole ALChemy Suite



When building a machine learning model, identifying a proper subset of features plays a crucial role. In QSAR modelling, where the number of molecular descriptors available in the initial dataset is enormous, this step is the one that most affects the final performance. Since the literature hasn’t identified a criterion for determining a priori the best approach among the many available, we’ve developed a tool that systematically evaluates several of them and picks the one that, case by case, shows the best performance.
FAST is the first tool of the suite that Kode Chemoinformatics developed to support computational chemists, labs and organisations in the chemistry, pharma, food and biotechnology fields. The client is thus led through the whole process of creating, managing, and deploying QSAR models, from feature selection to prediction.
To guarantee the best feature subset, that is, the one that minimises the out-of-sample error, FAST approaches feature selection in three sequential steps. Each step is stricter and more computationally expensive than the previous one, and each progressively reduces the size of the dataset. This sequential approach concentrates the computational effort of each step on a dataset of suitable size and makes the reduced dataset affordable for the following step. Thanks to the time saved, testing a large number of different solutions becomes feasible even on a standard laptop. Finally, FAST measures the quality of each feature set in terms of prediction performance, using one or more machine learning models.
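As an illustration of how candidate feature sets might be compared by prediction performance, here is a minimal Python sketch using scikit-learn. It is not FAST's implementation: the helper `rank_subsets`, the two models (ridge regression and k-nearest neighbours), the R² metric, and the synthetic data are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor


def rank_subsets(X, y, subsets, models, cv=5):
    """Score each candidate subset with every model and sort best first.

    Hypothetical helper: the score of a subset is the cross-validated R^2,
    averaged over the folds and then over the models.
    """
    results = []
    for name, columns in subsets.items():
        scores = [
            cross_val_score(model, X[:, columns], y, cv=cv, scoring="r2").mean()
            for model in models
        ]
        results.append((name, float(np.mean(scores))))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Synthetic data (assumption): with shuffle=False the first three columns
# are the informative ones and the remaining columns are pure noise.
X, y = make_regression(
    n_samples=150, n_features=10, n_informative=3,
    shuffle=False, noise=1.0, random_state=0,
)
subsets = {
    "informative": [0, 1, 2],
    "noise_only": [7, 8, 9],
    "all": list(range(10)),
}
models = [Ridge(alpha=1.0), KNeighborsRegressor(n_neighbors=5)]

ranking = rank_subsets(X, y, subsets, models)
for name, score in ranking:
    print(f"{name}: mean cross-validated R^2 = {score:.3f}")
```

Averaging over several models is just one possible way to obtain a single index per feature set; a per-model ranking would be an equally reasonable design.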

FAST identifies a set of solutions, each accompanied by an accuracy index, which makes it possible to determine, in the end, which set performs best. This result is achieved through a three-step method:
Removal of features that are redundant or that exhibit characteristics incompatible with modelling.
Removal of all the features with low or insufficient importance within the models.
Cross-validated selection of features that ensures the optimal bias-variance trade-off.
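
A three-step reduction of this kind can be sketched in Python with scikit-learn. This is only an illustrative sketch, not FAST's actual algorithm: the function names `drop_correlated` and `three_step_selection`, the 0.95 correlation threshold, the median importance cutoff, and the choice of a random forest and a linear regression are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV, VarianceThreshold
from sklearn.linear_model import LinearRegression


def drop_correlated(X, threshold=0.95):
    """Return positions of columns kept after dropping near-duplicates.

    A column is kept only if its absolute Pearson correlation with every
    previously kept column is below the (assumed) threshold.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return np.array(keep)


def three_step_selection(X, y, importance_quantile=0.5, random_state=0):
    """Hypothetical three-step selection; returns surviving column indices."""
    idx = np.arange(X.shape[1])

    # Step 1 (cheap): drop constant columns, then redundant (correlated) ones.
    constant_filter = VarianceThreshold(threshold=0.0).fit(X)
    idx = idx[constant_filter.get_support()]
    idx = idx[drop_correlated(X[:, idx])]

    # Step 2 (moderate): drop features with low model-based importance.
    forest = RandomForestRegressor(n_estimators=100, random_state=random_state)
    forest.fit(X[:, idx], y)
    cutoff = np.quantile(forest.feature_importances_, importance_quantile)
    idx = idx[forest.feature_importances_ >= cutoff]

    # Step 3 (expensive): cross-validated recursive feature elimination.
    rfecv = RFECV(LinearRegression(), step=1, cv=5, scoring="r2")
    rfecv.fit(X[:, idx], y)
    return idx[rfecv.support_]


# Synthetic data (assumption): 20 descriptors, plus one exact duplicate
# (column 20) and one constant column (column 21).
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=1.0, random_state=0
)
X = np.hstack([X, X[:, [0]], np.ones((X.shape[0], 1))])

selected = three_step_selection(X, y)
print("Selected descriptor columns:", selected.tolist())
```

Each step runs only on the columns that survived the previous one, so the most expensive procedure (the cross-validated elimination) sees the smallest dataset.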
Questions about our products?
Drop us a line via email!