File I/O Operations

Comprehensive file input and output capabilities supporting 15+ file formats for molecular data, experimental results, and analysis outputs.

File Operations

Essential file management operations for scientific workflows.

File Input/Output

Basic file reading and writing operations

  • Read text and binary files
  • Write data to files
  • File path handling
  • Encoding support
  • File existence checking
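
The basic operations listed above can be sketched with Python's standard library. This is a minimal illustration, not this tool's own API; the helper names `write_text`, `read_text`, and `read_binary` are hypothetical.

```python
from pathlib import Path


def write_text(path, text, encoding="utf-8"):
    """Write text to a file, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding=encoding)
    return path


def read_text(path, encoding="utf-8"):
    """Read a text file, failing early if the path is not a file."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    return path.read_text(encoding=encoding)


def read_binary(path):
    """Read a file as raw bytes, with no decoding."""
    return Path(path).read_bytes()
```

Passing the encoding explicitly avoids depending on the platform default, which differs between operating systems.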

Folder Operations

Directory and folder management

  • List directory contents
  • Create new folders
  • File system navigation
  • Path validation
  • Recursive operations

Import/Export

Data import and export in multiple formats

  • Bulk file processing
  • Format conversion
  • Batch operations
  • Progress tracking
  • Error handling
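
As a rough sketch of how batch conversion, progress tracking, and error handling fit together, the example below converts a list of CSV files to JSON using only the standard library. The function names and the `on_progress` callback are assumptions made for illustration; they are not part of the documented nodes.

```python
import csv
import json
from pathlib import Path


def convert_csv_to_json(csv_path, out_dir):
    """Convert one CSV file to a JSON list of row dictionaries."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    out_path = Path(out_dir) / (Path(csv_path).stem + ".json")
    out_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return out_path


def convert_batch(csv_paths, out_dir, on_progress=None):
    """Convert many files; collect failures instead of stopping at the first."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    converted, failed = [], []
    for done, csv_path in enumerate(csv_paths, start=1):
        try:
            converted.append(convert_csv_to_json(csv_path, out_dir))
        except (OSError, UnicodeDecodeError, csv.Error) as exc:
            failed.append((str(csv_path), str(exc)))
        if on_progress is not None:
            on_progress(done, len(csv_paths))  # report (files done, total files)
    return converted, failed
```

Collecting failures in a list lets one unreadable file be reported at the end without discarding the work done on the others.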

Save Operations

Specialized save operations for different data types

  • Save DataFrames to CSV/Excel
  • Export molecular structures
  • Save plots and images
  • Save analysis results
  • Custom file formats

Supported File Formats

Comprehensive support for molecular, data, and visualization formats.

Molecular Formats

Formats for molecular structures

Format  Access      Description
SDF     Read/Write  Structure Data File
MOL     Read/Write  MDL Molfile
MOL2    Read/Write  Tripos Mol2
PDB     Read/Write  Protein Data Bank
PDBQT   Read/Write  PDBQT format
XYZ     Read/Write  XYZ coordinates

Data Formats

Formats for tabular and structured data

Format  Access      Description
CSV     Read/Write  Comma-separated values
Excel   Read/Write  Microsoft Excel
JSON    Read/Write  JavaScript Object Notation
TXT     Read/Write  Plain text files
XML     Read        Extensible Markup Language

Image Formats

Formats for plots and images

Format    Access      Description
PNG       Read/Write  Portable Network Graphics
JPG/JPEG  Read/Write  Joint Photographic Experts Group
SVG       Read/Write  Scalable Vector Graphics
PDF       Write       Portable Document Format

File I/O Nodes

Essential nodes for file input and output operations.

File Input

Read data from files on disk

Load data files for processing

file_input

Input Ports

No input ports

Output Ports

file_data (file)

File contents or path

file_info (data)

File metadata

Properties

Property     Type    Default  Description
file_path    string  (none)   Path to input file
encoding     string  utf-8    File encoding
auto_detect  bool    true     Auto-detect file type
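
To make the three properties concrete, here is a hedged sketch of what a node like this might do internally. The function `file_input`, the `KNOWN_TYPES` table, and the shape of the returned metadata are all assumptions; the node's real detection rules and output structure are not documented here.

```python
from pathlib import Path

# Hypothetical suffix-to-type table; the real detection rules may differ.
KNOWN_TYPES = {
    ".sdf": "sdf", ".mol": "mol", ".pdb": "pdb",
    ".csv": "csv", ".json": "json", ".txt": "txt",
}


def file_input(file_path, encoding="utf-8", auto_detect=True):
    """Return (file_data, file_info) for one text file on disk."""
    path = Path(file_path)
    if not path.is_file():
        raise FileNotFoundError(f"Input file not found: {path}")
    # With auto_detect off, this sketch makes no guess about the file type.
    file_type = KNOWN_TYPES.get(path.suffix.lower(), "unknown") if auto_detect else "unknown"
    file_data = path.read_text(encoding=encoding)
    file_info = {
        "name": path.name,
        "size_bytes": path.stat().st_size,
        "type": file_type,
    }
    return file_data, file_info
```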

Folder Input

Process entire directories of files

Batch process multiple files

folder_input

Input Ports

No input ports

Output Ports

file_list (data)

List of files in folder

folder_info (data)

Folder metadata

Properties

Property     Type    Default  Description
folder_path  string  (none)   Path to folder
recursive    bool    false    Include subdirectories
pattern      string  *        File pattern to match
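
The effect of `recursive` and `pattern` can be illustrated with `pathlib` globbing. This is a sketch under the assumption that `pattern` is a shell-style glob; the function name `folder_input` and the metadata fields are hypothetical.

```python
from pathlib import Path


def folder_input(folder_path, recursive=False, pattern="*"):
    """Return (file_list, folder_info) for files matching a glob pattern."""
    folder = Path(folder_path)
    if not folder.is_dir():
        raise NotADirectoryError(f"Not a folder: {folder}")
    # rglob descends into subdirectories; glob stays at the top level.
    matches = folder.rglob(pattern) if recursive else folder.glob(pattern)
    file_list = sorted(p for p in matches if p.is_file())
    folder_info = {"path": str(folder), "file_count": len(file_list)}
    return file_list, folder_info
```

For example, `folder_input("structures", recursive=True, pattern="*.sdf")` would collect every SDF file under a folder named `structures`, including those in subfolders.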

Save DataFrame

Export DataFrame to various formats

Export processed data for external analysis

save_dataframe

Input Ports

data (data)

DataFrame to save

Output Ports

output_path (file)

Path to saved file

Properties

Property   Type    Default  Description
file_path  string  (none)   Output file path
format     string  csv      Output format (csv, excel, json)
index      bool    false    Include row index
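
The sketch below shows how `format` and `index` might interact, using a list of row dictionaries as a stand-in for a DataFrame so that it runs with the standard library alone. The function `save_table` is hypothetical, and Excel output is left out because it needs a third-party library.

```python
import csv
import json
from pathlib import Path


def save_table(rows, file_path, format="csv", index=False):
    """Write a list of row dicts as CSV or JSON and return the output path."""
    if index:
        # Prepend a 0-based row number, similar to a DataFrame index column.
        rows = [{"index": i, **row} for i, row in enumerate(rows)]
    out_path = Path(file_path)
    if format == "csv":
        fieldnames = list(rows[0]) if rows else []
        with open(out_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    elif format == "json":
        out_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    else:
        raise ValueError(f"Unsupported format in this sketch: {format}")
    return out_path
```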

Save Molecule

Export molecular structures

Export molecular structures for external tools

save_molecule

Input Ports

molecules (molecules)

Molecular structures

Output Ports

output_path (file)

Path to saved file

Properties

Property     Type    Default  Description
file_path    string  (none)   Output file path
format       string  sdf      Molecular format (sdf, mol, pdb)
single_file  bool    true     Save all to one file
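
One way to picture `single_file` for SDF output: an SDF file is a sequence of records, each ending with a line containing `$$$$`. The sketch below assumes the molecules are already available as MOL block strings; the function `save_sdf` and the numbered file naming are assumptions, not the node's documented behavior.

```python
from pathlib import Path


def save_sdf(mol_blocks, file_path, single_file=True):
    """Write MOL blocks as SDF records; return the list of files written."""
    out_path = Path(file_path)
    # Each SDF record is a MOL block followed by the "$$$$" delimiter line.
    records = [block.rstrip("\n") + "\n$$$$\n" for block in mol_blocks]
    if single_file:
        out_path.write_text("".join(records), encoding="utf-8")
        return [out_path]
    written = []
    for number, record in enumerate(records, start=1):
        # e.g. ligands.sdf -> ligands_1.sdf, ligands_2.sdf (naming is an assumption)
        target = out_path.with_name(f"{out_path.stem}_{number}{out_path.suffix}")
        target.write_text(record, encoding="utf-8")
        written.append(target)
    return written
```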

Common File Workflows

Typical file processing patterns for scientific data management.

High-Throughput Data Processing

Process large numbers of data files efficiently

Difficulty: Intermediate

Workflow Steps:

  1. Select input folder with data files
  2. Filter files by type and date
  3. Process each file in parallel
  4. Aggregate results
  5. Export processed data

Required Nodes:

Folder Input, Filter Rows, Data Processing Nodes, Save DataFrame
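
Outside the node editor, the select, process in parallel, and aggregate steps of this workflow could look roughly like the following. It is a sketch only: `count_lines` is a placeholder for real per-file processing, filtering by date is omitted, and a thread pool is chosen on the assumption that the work is I/O-bound (CPU-heavy work would usually favor processes).

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def count_lines(path):
    """Per-file work: count lines (a placeholder for real processing)."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def process_folder(folder, pattern="*.csv", max_workers=4):
    """Process matching files concurrently and aggregate the results."""
    files = sorted(Path(folder).glob(pattern))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input files.
        counts = list(pool.map(count_lines, files))
    per_file = {path.name: n for path, n in zip(files, counts)}
    return {"files": len(files), "total_lines": sum(counts), "per_file": per_file}
```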

Molecular Database Creation

Build molecular databases from structure files

Difficulty: Beginner

Workflow Steps:

  1. Import molecular structure files
  2. Calculate molecular descriptors
  3. Filter by properties
  4. Export to database format
  5. Generate molecular fingerprints

Required Nodes:

File Input, Molecular Descriptors, Filter Rows, Save Molecule, Fingerprint Generation

Automated Report Generation

Create automated analysis reports

Difficulty: Advanced

Workflow Steps:

  1. Load analysis data
  2. Generate plots and visualizations
  3. Create summary statistics
  4. Export to PDF or HTML report
  5. Save all outputs to organized folder

Required Nodes:

Data Reader, Plotting Nodes, Statistics Nodes, Report Generator, Folder Output

Best Practices

Guidelines for efficient and reliable file operations.

File Path Management

Handle file paths correctly across different systems

  • Use absolute paths when possible
  • Handle path separators (/ vs \)
  • Validate file existence before reading
  • Use the session temp directory for temporary files
  • Check file permissions
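
In a Python script, these path guidelines map onto `pathlib`, `os.access`, and `tempfile`. The helpers below are illustrative; in particular, `scratch_file` uses a fresh system temporary directory as a stand-in for the session temp directory, whose actual location is not specified here.

```python
import os
import tempfile
from pathlib import Path


def resolve_input(path):
    """Return an absolute path after checking existence and read permission."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No such file: {resolved}")
    if not os.access(resolved, os.R_OK):
        raise PermissionError(f"File is not readable: {resolved}")
    return resolved


def scratch_file(name):
    """Build a path inside a fresh temporary directory."""
    # The "/" operator joins path parts with the correct separator per platform.
    return Path(tempfile.mkdtemp(prefix="session_")) / name
```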

Error Handling

Robust handling of file operation errors

  • Check if files exist before reading
  • Handle permission errors gracefully
  • Validate file formats
  • Provide meaningful error messages
  • Use try-catch blocks for file operations
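
In Python the try-catch pattern is written with `try`/`except`. A minimal sketch of turning low-level exceptions into readable messages follows; the function `safe_read` and its `(text, error)` return convention are choices made for this example.

```python
from pathlib import Path


def safe_read(path, encoding="utf-8"):
    """Return (text, error); exactly one of the two is None."""
    try:
        return Path(path).read_text(encoding=encoding), None
    except FileNotFoundError:
        return None, f"File not found: {path}"
    except PermissionError:
        return None, f"Permission denied: {path}"
    except UnicodeDecodeError:
        return None, f"Could not decode {path} as {encoding}"
    except OSError as exc:
        # Any other I/O failure, for example the path being a directory.
        return None, f"Could not read {path}: {exc}"
```

The specific exceptions are listed before the general `OSError` so that the most informative message wins.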

Performance Optimization

Efficient file processing for large datasets

  • Use streaming for large files
  • Process files in chunks when possible
  • Close file handles promptly
  • Use appropriate buffer sizes
  • Monitor memory usage
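
Streaming and chunked processing can be sketched with a generator that reads a fixed number of bytes at a time. The 1 MiB default chunk size is an arbitrary choice for illustration, and both function names are hypothetical.

```python
def iter_chunks(path, chunk_size=1024 * 1024):
    """Yield successive byte chunks so the whole file is never held in memory."""
    # The with-block closes the handle as soon as iteration finishes.
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def count_bytes(path, chunk_size=1024 * 1024):
    """Example consumer: total file size computed one chunk at a time."""
    return sum(len(chunk) for chunk in iter_chunks(path, chunk_size))
```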

Data Integrity

Ensure data integrity during file operations

  • Validate data after reading
  • Check file sizes and line counts
  • Verify data types and formats
  • Use checksums for critical files
  • Backup important data
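
A checksum comparison is one simple way to confirm that a critical file was copied or transferred intact. The sketch below uses SHA-256 from Python's `hashlib`; the function names are hypothetical and the choice of SHA-256 is an assumption, since no particular algorithm is prescribed here.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source, copy):
    """Return True when both files have identical contents."""
    return sha256_of(source) == sha256_of(copy)
```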
