Database Access Nodes

These nodes provide access to major chemical and biological databases, including PubChem, RCSB PDB, and KNApSAcK, for compound and structure retrieval.

Node Reference

Detailed documentation for each database access node available in Bioshift.

PubChem Search

Search the PubChem database for chemical compounds and retrieve molecular data.

Type: pubchem_search
Category: Chemical Database

Key Features

  • Multiple search types (name, SMILES, CID, CAS)
  • Batch compound retrieval
  • Property data extraction
  • 3D structure download
  • Synonym and identifier lookup
  • Error handling and retry logic
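
The page does not describe how the node's retry logic is implemented. As a rough illustration of the general idea, retrying with exponential backoff can be sketched as below. The helper name `with_retries` and its parameters are hypothetical and are not part of Bioshift.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() up to `attempts` times, doubling the wait after each failure.

    Hypothetical helper for illustration only. `sleep` is injectable so the
    backoff schedule can be checked without actually waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            # In real code, catch only transient errors (timeouts, HTTP 5xx).
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

Wrapping each database request in a helper like this keeps a single transient network failure from aborting a whole batch retrieval.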

Input Ports

query (string)

Search query (name, SMILES, CAS, etc.)

search_type (string)

Search type (name, smiles, cid, etc.)

Output Ports

molecules (molecules)

Found molecular structures

compound_info (data)

Compound metadata and properties

search_results (data)

Search results summary
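
To make the `query` and `search_type` inputs concrete, here is a minimal sketch of how such a pair could map onto a request URL for PubChem's public PUG REST service. It is not the node's actual implementation: the function name, the mapping table, and the default property list are assumptions. PUG REST has no dedicated CAS namespace, so the sketch treats a CAS number as a name lookup.

```python
from urllib.parse import quote

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Assumed mapping from the node's search_type values to PUG REST namespaces.
# CAS numbers are looked up through the "name" namespace (matched as synonyms).
NAMESPACE_BY_SEARCH_TYPE = {
    "name": "name",
    "smiles": "smiles",
    "cid": "cid",
    "cas": "name",
}


def build_pubchem_url(query, search_type="name",
                      properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL that returns the requested properties as JSON."""
    key = search_type.lower()
    if key not in NAMESPACE_BY_SEARCH_TYPE:
        raise ValueError(f"unsupported search_type: {search_type!r}")
    namespace = NAMESPACE_BY_SEARCH_TYPE[key]
    # Percent-encode everything outside the unreserved set; SMILES strings
    # often contain characters such as '=', '(' and '/'.
    identifier = quote(str(query), safe="")
    return (f"{PUG_REST_BASE}/compound/{namespace}/{identifier}"
            f"/property/{','.join(properties)}/JSON")


print(build_pubchem_url("aspirin"))
print(build_pubchem_url("2244", search_type="cid", properties=("IUPACName",)))
```

Fetching the resulting URL with any HTTP client returns a JSON property table, which corresponds roughly to the `compound_info` output described above.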

RCSB PDB Search

Search the RCSB Protein Data Bank for protein structures.

Type: rcsb_pdb
Category: Protein Database

Key Features

  • Advanced query capabilities
  • Structure and sequence search
  • Ligand-bound structure filtering
  • Resolution and quality filters
  • Bulk download support
  • Metadata extraction

Input Ports

query (string)

PDB search query

search_type (string)

Search type (protein, ligand, etc.)

Output Ports

structures (molecules)

Retrieved protein structures

pdb_info (data)

PDB entry metadata

search_summary (data)

Search results overview
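
As an illustration of what a text query with a resolution filter might look like underneath, the sketch below builds a JSON payload for the public RCSB Search API and a download URL for a PDB entry. It is only a sketch under assumptions: the function names are hypothetical, and how the node maps its `query` and `search_type` inputs onto the API is not documented here.

```python
import json

RCSB_SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_rcsb_query(text, max_resolution=None, rows=10):
    """Build a Search API payload: full-text match, optional resolution cap."""
    nodes = [{
        "type": "terminal",
        "service": "full_text",
        "parameters": {"value": text},
    }]
    if max_resolution is not None:
        # Keep only entries whose resolution (in angstroms) is at or below the cap.
        nodes.append({
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": "rcsb_entry_info.resolution_combined",
                "operator": "less_or_equal",
                "value": max_resolution,
            },
        })
    if len(nodes) == 1:
        query = nodes[0]
    else:
        query = {"type": "group", "logical_operator": "and", "nodes": nodes}
    return {
        "query": query,
        "return_type": "entry",
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


def pdb_download_url(pdb_id):
    """Return the download URL for an entry in legacy PDB format."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


payload = build_rcsb_query("cyclooxygenase", max_resolution=2.5)
print(json.dumps(payload, indent=2))  # POST this body to RCSB_SEARCH_URL
```

The payload is sent as the body of a POST request to `RCSB_SEARCH_URL`; the response lists matching entry identifiers, which can then be passed to `pdb_download_url`.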

KNApSAcK Scraper

Scrape chemical compound data from the KNApSAcK database.

Type: knapsack_scraper
Category: Natural Products Database

Key Features

  • Organism-specific compound search
  • Natural product database access
  • Bioactivity data extraction
  • Structure-activity relationships
  • Traditional medicine connections
  • Batch scraping capabilities

Input Ports

organism_query (string)

Organism name for search

compound_query (string)

Compound name (optional)

Output Ports

compounds (molecules)

Natural product structures

organism_data (data)

Organism and source information

compound_info (data)

Compound properties and activities

Workflow Examples

Common database integration workflows for research and drug discovery.

Drug Discovery Pipeline

Search for compounds and screen against protein targets

  1. Search PubChem for compounds with specific properties
  2. Download 3D structures from RCSB PDB
  3. Filter compounds by drug-likeness criteria
  4. Prepare structures for molecular docking
  5. Run virtual screening with AutoDock Vina
  6. Analyze docking results and select hits
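
The drug-likeness step does not say which criteria are applied. One common choice is Lipinski's rule of five, sketched below as an assumption rather than as what Bioshift actually uses. The dictionary keys are hypothetical and would need to match whatever property names the `compound_info` output provides.

```python
def lipinski_violations(props):
    """Count rule-of-five violations for one compound's property dict."""
    checks = [
        props["molecular_weight"] > 500,   # molecular weight in daltons
        props["logp"] > 5,                 # octanol-water partition coefficient
        props["h_bond_donors"] > 5,
        props["h_bond_acceptors"] > 10,
    ]
    return sum(checks)


def filter_drug_like(compounds, max_violations=1):
    """Keep compounds with at most `max_violations` rule-of-five violations."""
    return [c for c in compounds if lipinski_violations(c) <= max_violations]


# Made-up example records, for illustration only.
candidates = [
    {"id": "small", "molecular_weight": 180.2, "logp": 1.2,
     "h_bond_donors": 1, "h_bond_acceptors": 4},
    {"id": "bulky", "molecular_weight": 812.0, "logp": 6.3,
     "h_bond_donors": 7, "h_bond_acceptors": 14},
]
print([c["id"] for c in filter_drug_like(candidates)])
```

Allowing one violation follows the usual reading of the rule; setting `max_violations=0` gives a stricter filter.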

Natural Products Research

Discover bioactive compounds from natural sources

  1. Search KNApSAcK for compounds from specific organisms
  2. Extract bioactivity data and traditional uses
  3. Filter compounds by pharmacological activity
  4. Compare with known drug structures
  5. Analyze structure-activity relationships
  6. Generate research reports
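
Comparing natural products with known drug structures is often done with a fingerprint similarity score. The sketch below uses the Tanimoto coefficient on fingerprints represented as sets of "on" bit indices. This is an assumed approach for illustration: the page does not state which similarity measure or fingerprint Bioshift uses, and the fingerprints shown are invented.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(a & b) / len(union)


def rank_by_similarity(query_fp, reference_fps):
    """Return (name, score) pairs for reference drugs, most similar first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in reference_fps.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented fingerprints (sets of on-bit indices), for illustration only.
natural_product = {1, 4, 9, 16, 25}
known_drugs = {
    "drug_a": {1, 4, 9, 16, 30},
    "drug_b": {2, 3, 5},
}
print(rank_by_similarity(natural_product, known_drugs))
```

In practice the fingerprints would come from a cheminformatics toolkit rather than being written by hand.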

Target Identification

Find protein targets for compounds of interest

  1. Search PubChem for compounds of interest
  2. Find co-crystallized structures in RCSB PDB
  3. Extract protein-ligand interaction data
  4. Identify key binding residues
  5. Analyze binding pocket characteristics
  6. Generate target validation data
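
One simple way to identify binding residues in a co-crystallized structure is a distance cutoff: any residue with an atom within a few angstroms of a ligand atom counts as part of the binding site. The sketch below assumes atoms have already been parsed into plain tuples. The tuple layout, the function name, and the 4.0 Å default are illustrative assumptions, not Bioshift behavior.

```python
import math


def binding_residues(protein_atoms, ligand_coords, cutoff=4.0):
    """Return residues having at least one atom within `cutoff` of the ligand.

    protein_atoms: iterable of (chain, residue_name, residue_number, (x, y, z))
    ligand_coords: iterable of (x, y, z) ligand atom positions, in angstroms
    """
    ligand_coords = list(ligand_coords)
    found = set()
    for chain, res_name, res_num, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_coords):
            found.add((chain, res_num, res_name))
    return sorted(found)


# Toy coordinates, for illustration only.
protein = [
    ("A", "SER", 530, (0.0, 0.0, 0.0)),
    ("A", "SER", 530, (1.0, 0.0, 0.0)),
    ("A", "TYR", 385, (10.0, 0.0, 0.0)),
]
ligand = [(3.0, 0.0, 0.0)]
print(binding_residues(protein, ligand))
```

This brute-force loop is fine for a single binding site; for large structures a spatial index would be a better fit.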

Supported Databases

Integration with major chemical and biological databases for comprehensive data access.

PubChem

The world's largest collection of freely accessible chemical information

  • 110+ million compounds
  • Chemical structures and properties
  • Biological activities
  • Literature references

RCSB PDB

Protein Data Bank with 3D structural data for biological molecules

  • 190,000+ structures
  • Protein and nucleic acid structures
  • Small molecule ligands
  • Experimental metadata

KNApSAcK

Natural products database with species-metabolite relationships

  • 50,000+ natural products
  • Species-metabolite links
  • Traditional medicine data
  • Bioactivity information