InsilicoΣ
Drug Discovery, Cheminformatics & Bioinformatics

About QSAR-X

Advanced QSAR Modeling Platform

Cite Us
How to Cite

QSAR-X, the Multi-Dimensional QSAR Modelling Suite, is part of the InsilicoSigma computational platform. A formal platform-level citation is in preparation; until it is published, please attribute results as follows:

  • Tool name: QSAR-X
  • Platform: InsilicoSigma (insilicosigma.com)
  • Access date: the date you ran your analysis (the platform is under active development; results may differ between releases).

Example acknowledgement: "Analyses were performed using QSAR-X on the InsilicoSigma platform (insilicosigma.com), accessed [DATE]."


Underlying methodology you should also cite:

RDKit (rdkit.org), scikit-learn (Pedregosa et al., JMLR 2011), CoMFA (Cramer et al., JACS 1988), CoMSIA (Klebe et al., J Med Chem 1994).

To request a citable preprint or DOI once one has been issued, please contact the authors.

Overview

QSAR-X is a comprehensive platform for Quantitative Structure-Activity Relationship modeling, offering advanced capabilities for 2D, 3D, and 4D-QSAR studies.

The platform integrates state-of-the-art machine learning algorithms, molecular descriptor calculation, feature selection methods, and model interpretability tools to enable robust and reproducible QSAR model development.

System Architecture
Core Components
  • Data Processing Engine

    Handles molecular data validation, cleaning, and standardization

  • Descriptor Calculation Module

    Computes 1D, 2D, and 3D molecular descriptors using RDKit

  • Feature Selection Framework

    Multiple algorithms for optimal feature subset selection

  • ML Training Pipeline

    Automated model training with hyperparameter optimization

  • Model Validation Suite

    Rigorous cross-validation and external validation tools

  • Explainability Engine

    SHAP, LIME, and feature importance analysis for model interpretation
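
As a rough illustration of the first two components, the sketch below validates SMILES strings, removes duplicates by canonical form, and computes a handful of RDKit descriptors. The function name `describe` and the four-descriptor subset are assumptions made for this example; they are not taken from QSAR-X itself, whose internal implementation is not documented here.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

# Illustrative subset only; Descriptors.descList holds RDKit's full descriptor list.
DESCRIPTORS = {
    "MolWt": Descriptors.MolWt,
    "MolLogP": Descriptors.MolLogP,
    "TPSA": Descriptors.TPSA,
    "NumHDonors": Descriptors.NumHDonors,
}

def describe(smiles_list):
    """Return (rows, rejected): descriptor dicts for valid, unique molecules
    and the list of input strings that RDKit could not parse."""
    rows, rejected, seen = [], [], set()
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)  # None when the SMILES is invalid
        if mol is None:
            rejected.append(smi)
            continue
        canonical = Chem.MolToSmiles(mol)  # canonical form for duplicate detection
        if canonical in seen:
            continue
        seen.add(canonical)
        row = {"smiles": canonical}
        row.update({name: fn(mol) for name, fn in DESCRIPTORS.items()})
        rows.append(row)
    return rows, rejected

# "CCO" and "OCC" are the same molecule (ethanol); the last entry is not valid SMILES.
rows, rejected = describe(["CCO", "OCC", "c1ccccc1", "not_a_smiles"])
```

Keeping the rejected inputs rather than silently dropping them makes it possible to report data-quality issues back to the user.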

Key Principles
Design Philosophy
  • Multi-Dimensional Modeling

    Support for 2D, 3D, and 4D-QSAR approaches

  • Best Practices Compliance

    Follows OECD QSAR validation principles

  • Transparency & Interpretability

    Explainable AI for understanding model predictions

  • Reproducibility

    Complete tracking of all modeling steps and parameters

  • Applicability Domain

    Built-in tools for assessing prediction reliability

Modeling Workflow
ML-QSAR Workflow
1. Dataset Upload & Quality Check

Upload molecular data with activity values. Automated quality checks identify issues and provide recommendations.

2. Descriptor Calculation

Compute comprehensive molecular descriptors (physicochemical, topological, electronic, etc.).

3. Feature Selection

Apply feature selection algorithms (Genetic Algorithm, Recursive Feature Elimination, etc.) to identify an optimal descriptor subset.

4. Model Training

Train models using various ML algorithms (Random Forest, XGBoost, SVM, etc.) with automated hyperparameter optimization.

5. Validation & Assessment

Comprehensive validation including cross-validation, Y-scrambling, and applicability domain assessment.

6. Interpretability Analysis

Generate SHAP values, feature importance plots, and explainability reports for model understanding
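
The feature selection, training, and validation steps above can be sketched with scikit-learn. This is a minimal example under stated assumptions, not the platform's actual pipeline: the data are synthetic (`make_regression` stands in for a real descriptor matrix), and the choice of Recursive Feature Elimination, Random Forest, the tiny hyperparameter grid, and all numeric settings are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a descriptor matrix X and an activity vector y.
X, y = make_regression(n_samples=120, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)

# Feature selection sits inside the pipeline so it is refit on every
# training fold and never sees the held-out fold.
pipe = Pipeline([
    ("select", RFE(RandomForestRegressor(n_estimators=30, random_state=0),
                   n_features_to_select=8, step=4)),
    ("model", RandomForestRegressor(n_estimators=50, random_state=0)),
])

cv = KFold(n_splits=3, shuffle=True, random_state=0)

# Small grid search as a placeholder for hyperparameter optimization.
search = GridSearchCV(pipe, {"model__max_depth": [3, None]}, cv=cv, scoring="r2")
search.fit(X, y)
cv_r2_real = search.best_score_  # mean cross-validated R^2 of the best setting

# Y-scrambling: repeat the cross-validation with permuted activities.
# A model that has learned real structure should score far better on the true y.
rng = np.random.default_rng(0)
cv_r2_scrambled = [
    cross_val_score(search.best_estimator_, X, rng.permutation(y),
                    cv=cv, scoring="r2").mean()
    for _ in range(5)
]
```

If the scrambled scores come close to `cv_r2_real`, the apparent performance is likely due to chance correlation rather than a genuine structure-activity relationship.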

3D-QSAR Workflow
1. 3D Structure Generation

Generate and optimize 3D molecular structures with conformational analysis

2. Molecular Alignment

Align molecules in 3D space using pharmacophore or shape-based methods

3. Field Calculation

Calculate molecular interaction fields (steric, electrostatic, hydrophobic)

4. 3D-QSAR Model Building

Build CoMFA/CoMSIA-like models using calculated interaction fields
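
To make the field calculation step concrete, here is a minimal NumPy sketch of CoMFA-style steric and electrostatic fields: a probe with charge +1 is placed at each point of a regular grid, and its Lennard-Jones and Coulomb interaction energies with the molecule are summed over atoms. The function name `interaction_fields`, the atom-type-independent `eps` and `r_min` values, the 2 Å grid, and the two-atom toy molecule are all assumptions for illustration; hydrophobic fields and the platform's actual parameters are not covered.

```python
import numpy as np

COULOMB_K = 332.06  # approximate Coulomb constant in kcal*Angstrom/(mol*e^2)

def interaction_fields(coords, charges, grid, eps=0.1, r_min=3.4, cutoff=30.0):
    """Steric (Lennard-Jones 12-6) and electrostatic (Coulomb) energies of a
    +1 probe at each grid point.

    coords: (n_atoms, 3) in Angstrom; charges: (n_atoms,); grid: (n_points, 3).
    eps and r_min are placeholder parameters shared by all atoms (an assumption).
    """
    # Distance from every grid point to every atom, shape (n_points, n_atoms).
    d = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=-1)
    d = np.maximum(d, 1e-3)  # guard against division by zero on top of an atom
    ratio = r_min / d
    steric = (eps * (ratio**12 - 2.0 * ratio**6)).sum(axis=1)
    electro = (COULOMB_K * charges[None, :] / d).sum(axis=1)
    # Truncate extreme energies near nuclei, as is customary for CoMFA fields.
    return np.clip(steric, -cutoff, cutoff), np.clip(electro, -cutoff, cutoff)

# Regular lattice from -4 to 4 Angstrom with 2 Angstrom spacing (5 x 5 x 5 points).
axis = np.linspace(-4.0, 4.0, 5)
grid = np.array(np.meshgrid(axis, axis, axis, indexing="ij")).reshape(3, -1).T

# Toy two-atom "molecule" with opposite partial charges.
coords = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]])
charges = np.array([-0.4, 0.4])

steric, electro = interaction_fields(coords, charges, grid)
```

For a set of aligned molecules, each molecule's field values on the shared grid form one row of the descriptor matrix; the original CoMFA method then relates these columns to activity with partial least squares.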

4D-QSAR Workflow
1. Conformational Ensemble Generation

Generate multiple conformations for each molecule considering flexibility

2. Temporal Descriptor Calculation

Compute time-averaged and conformationally dependent descriptors

3. 4D-QSAR Model Development

Build models incorporating conformational flexibility and entropic contributions
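
One common way to turn per-conformer descriptors into ensemble-level descriptors is Boltzmann weighting by conformer energy, sketched below. This is an assumption about how such averaging could be done, not a description of QSAR-X's method; for frames sampled from a molecular dynamics trajectory, a plain average over frames would typically be used instead. The function name `boltzmann_average` and the example values are hypothetical.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann_average(descriptors, energies, temperature=298.15):
    """Boltzmann-weighted mean and spread of descriptors over a conformer ensemble.

    descriptors: (n_conformers, n_descriptors); energies: (n_conformers,) in kcal/mol.
    Returns (mean, std, weights).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    energies = np.asarray(energies, dtype=float)
    rel = energies - energies.min()  # shift so the lowest-energy conformer is 0
    weights = np.exp(-rel / (K_B * temperature))
    weights /= weights.sum()
    mean = weights @ descriptors
    # The weighted standard deviation captures how much a descriptor
    # varies across the ensemble (a simple flexibility measure).
    std = np.sqrt(weights @ (descriptors - mean) ** 2)
    return mean, std, weights

# Two conformers with two descriptors each; equal energies give equal weights.
desc = [[1.0, 10.0], [3.0, 20.0]]
mean_eq, std_eq, w_eq = boltzmann_average(desc, [0.0, 0.0])

# A conformer 100 kcal/mol higher in energy contributes essentially nothing.
mean_hi, std_hi, w_hi = boltzmann_average(desc, [0.0, 100.0])
```

Both the weighted mean and the weighted spread can be used as model inputs, so that flexible and rigid molecules with similar average descriptors remain distinguishable.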

Key Features
15+ ML Algorithms

Random Forest, XGBoost, SVM, Neural Networks, and more

Advanced Validation

Cross-validation, Y-scrambling, external validation, bootstrapping

Model Explainability

SHAP, LIME, partial dependence plots, feature importance

1000+ Descriptors

Comprehensive molecular descriptor library from RDKit

Applicability Domain

Multiple applicability domain (AD) methods: leverage, bounding box, and k-NN

Export & Reporting

Comprehensive reports, model files, and publication-ready figures
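
The leverage and bounding-box methods listed under Applicability Domain can be sketched in a few lines of NumPy. The leverage of a query compound is h = a (AᵀA)⁻¹ aᵀ, where A is the training descriptor matrix with an intercept column, and a widely used warning threshold is h* = 3(p + 1)/n for p descriptors and n training compounds. The helper names `fit_domain` and `in_domain` and the random training data are assumptions for illustration; the k-NN method is not shown.

```python
import numpy as np

def fit_domain(X_train):
    """Precompute what is needed for leverage and bounding-box checks."""
    X_train = np.asarray(X_train, dtype=float)
    n, p = X_train.shape
    A = np.hstack([np.ones((n, 1)), X_train])  # add intercept column
    return {
        "xtx_inv": np.linalg.pinv(A.T @ A),  # pseudo-inverse for stability
        "h_star": 3.0 * (p + 1) / n,         # leverage warning threshold
        "lo": X_train.min(axis=0),           # per-descriptor training range
        "hi": X_train.max(axis=0),
    }

def in_domain(domain, X_query):
    """Return (leverage, within_leverage, within_box) for each query row."""
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))
    A = np.hstack([np.ones((len(X_query), 1)), X_query])
    # Row-wise quadratic form a (A^T A)^-1 a^T.
    leverage = np.einsum("ij,jk,ik->i", A, domain["xtx_inv"], A)
    within_box = np.all((X_query >= domain["lo"]) & (X_query <= domain["hi"]), axis=1)
    return leverage, leverage <= domain["h_star"], within_box

# Random stand-in for a training descriptor matrix (50 compounds, 3 descriptors).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))
domain = fit_domain(X_train)

# One query at the training centroid, one far outside the training range.
queries = np.vstack([X_train.mean(axis=0), [10.0, 10.0, 10.0]])
leverage, within_leverage, within_box = in_domain(domain, queries)
```

A prediction for a compound that fails either check is an extrapolation beyond the training data and should be treated with caution.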

Ready to Build Your QSAR Model?

Start by creating a project, then upload your dataset and train models
