Complete machine learning toolkit with 8+ algorithms, model management, evaluation metrics, and clustering capabilities for data science workflows.
Essential nodes for machine learning workflows from data preparation to model deployment.
Split datasets into training and testing sets
Inputs: Input DataFrame, Target variable column
Outputs: Training features, Testing features, Training target, Testing target
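The split can be illustrated with a minimal plain-Python sketch. It is not the node's actual implementation: the function name `train_test_split_rows`, the `test_size` and `seed` parameters, and the list-of-dicts stand-in for a DataFrame are all assumptions made for illustration.

```python
import random

def train_test_split_rows(rows, target, test_size=0.25, seed=0):
    """Split a list of dict rows into X_train, X_test, y_train, y_test.

    Hypothetical helper: `rows` stands in for the input DataFrame and
    `target` for the target variable column.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be between 0 and 1")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded shuffle for reproducibility
    n_test = max(1, round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def features(i):
        # Every column except the target is treated as a feature.
        return {k: v for k, v in rows[i].items() if k != target}

    X_train = [features(i) for i in train_idx]
    X_test = [features(i) for i in test_idx]
    y_train = [rows[i][target] for i in train_idx]
    y_test = [rows[i][target] for i in test_idx]
    return X_train, X_test, y_train, y_test

rows = [{"height": 150 + i, "weight": 50 + i, "label": i % 2} for i in range(8)]
X_train, X_test, y_train, y_test = train_test_split_rows(rows, target="label")
print(len(X_train), len(X_test))
```

The four return values follow the same order as the outputs listed for the node: training features, testing features, training target, testing target.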
Binary and multiclass classification using logistic regression
Inputs: Training features, Training target, Testing features
Outputs: Trained logistic regression model, Predicted labels, Prediction probabilities
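To show how the three outputs relate to one another, here is a small plain-Python sketch of binary logistic regression trained with batch gradient descent. It covers the binary case only, and the function names, learning rate, and epoch count are illustrative assumptions; the node's real solver and parameters are not documented here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit binary logistic regression by batch gradient descent (hypothetical helper)."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias)
            err = p - yi  # gradient of the log loss w.r.t. the linear score
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(model, X):
    """Probability of the positive class for each row."""
    weights, bias = model
    return [sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) for xi in X]

def predict(model, X, threshold=0.5):
    """Turn probabilities into 0/1 labels."""
    return [1 if p >= threshold else 0 for p in predict_proba(model, X)]

X_train = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[0.5], [4.5]]

model = fit_logistic(X_train, y_train)        # trained model
labels = predict(model, X_test)               # predicted labels
probabilities = predict_proba(model, X_test)  # prediction probabilities
print(labels, probabilities)
```

The predicted labels are simply the probabilities cut at a threshold, which is why both outputs are worth inspecting when classes are imbalanced.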
Ensemble classification using random forest algorithm
Inputs: Training features, Training target, Testing features
Outputs: Trained random forest model, Predicted labels, Feature importance scores
Linear regression for continuous target prediction
Inputs: Training features, Training target, Testing features
Outputs: Trained linear regression model, Predicted values, Model coefficients
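As a rough illustration of what the model coefficients mean, the sketch below fits ordinary least squares for a single feature using the closed-form solution. The single-feature restriction and the names `fit_linear` and `predict_linear` are assumptions for brevity, not the node's interface.

```python
def fit_linear(xs, ys):
    """Closed-form least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_linear(model, xs):
    slope, intercept = model
    return [slope * x + intercept for x in xs]

x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [3.0, 5.0, 7.0, 9.0]   # generated from y = 2x + 1
x_test = [5.0, 6.0]

model = fit_linear(x_train, y_train)        # trained model / coefficients
predictions = predict_linear(model, x_test)  # predicted values
print(model, predictions)
```

Because the training data lie exactly on the line y = 2x + 1, the fitted slope and intercept recover those two numbers.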
Load a saved model from disk (.pkl file)
Inputs: Path to .pkl model file
Outputs: Loaded model object
Save a trained model to disk (.pkl file)
Inputs: Trained model to save, Destination file path (optional)
Outputs: Path to saved model file
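The save and load steps can be sketched with Python's standard `pickle` module, which is the usual producer of .pkl files. The helper names `save_model` and `load_model`, and the fallback to a temporary file when no destination path is given, are assumptions for illustration; how the nodes choose a default path is not stated here.

```python
import os
import pickle
import tempfile

def save_model(model, path=None):
    """Pickle `model` to `path` and return the path that was written.

    Assumption: when no path is given, a temporary .pkl file is created.
    """
    if path is None:
        fd, path = tempfile.mkstemp(suffix=".pkl")
        os.close(fd)
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path

def load_model(path):
    """Load a pickled model. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)

# A plain dict stands in for a trained model object.
model = {"kind": "linear", "slope": 2.0, "intercept": 1.0}
saved_path = save_model(model)
restored = load_model(saved_path)
os.remove(saved_path)  # clean up the temporary file
print(restored == model)
```

Unpickling can execute arbitrary code, so a .pkl file should be loaded only when its origin is trusted.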
Test trained models on new data
Inputs: Trained model, Testing features
Outputs: Model predictions, Prediction confidence scores, Model performance metrics
Make predictions using trained models
Inputs: Trained model, New data for prediction
Outputs: Model predictions, Detailed prediction results
Unsupervised learning algorithms for data exploration and feature extraction.
Partition data into k clusters using K-means algorithm
cluster_kmeans: Unsupervised data segmentation
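The K-means procedure alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. The sketch below shows that loop in plain Python; the function signature, the random initialisation from the data, and the fixed iteration count are assumptions and may differ from what the cluster_kmeans node does internally.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Minimal K-means on tuples of floats; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

points = [(1.0, 1.0), (1.2, 1.2), (0.8, 0.8), (9.0, 9.0), (9.3, 9.3), (8.7, 8.7)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

On this toy data the two groups are far apart, so the loop settles on one cluster around (1, 1) and one around (9, 9); which group receives label 0 depends on the initialisation.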
Hierarchical clustering with customizable linkage methods
cluster_agglomerative: Hierarchical data organization
Principal Component Analysis for dimensionality reduction
dim_pca: Feature extraction and visualization
Comprehensive evaluation tools for assessing model performance and generating insights.
Create confusion matrix for classification evaluation
eval_confusion_matrix: Classification model evaluation
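A confusion matrix counts, for every true class, how often each class was predicted. The following plain-Python sketch builds one with rows as true labels and columns as predicted labels; the function name and the row/column orientation are assumptions, since the output layout of eval_confusion_matrix is not specified here.

```python
def confusion_matrix(y_true, y_pred, labels=None):
    """Return (labels, matrix) where matrix[i][j] counts true labels[i] predicted as labels[j]."""
    if labels is None:
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return labels, matrix

y_true = ["cat", "dog", "cat", "cat", "dog", "bird"]
y_pred = ["cat", "cat", "cat", "dog", "dog", "bird"]
labels, matrix = confusion_matrix(y_true, y_pred)
for label, row in zip(labels, matrix):
    print(label, row)
```

Correct predictions sit on the diagonal; off-diagonal cells show which classes are confused with each other.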
Generate detailed classification performance report
eval_classification_report: Comprehensive classification metrics
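A classification report typically lists precision, recall, F1 score, and support per class. The sketch below computes those four values in plain Python; the exact set of metrics produced by eval_classification_report is an assumption, as is the dictionary layout used here.

```python
def classification_report(y_true, y_pred):
    """Per-class precision, recall, F1 and support (hypothetical helper)."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true instances of this class
        }
    return report

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
report = classification_report(y_true, y_pred)
for label, row in report.items():
    print(label, row)
```

Classes with no predictions or no true instances get 0.0 here rather than raising a division error, which is one common convention among several.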
Calculate regression performance metrics
eval_regression_metrics: Regression model evaluation
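Common regression metrics are mean absolute error, mean squared error, root mean squared error, and the coefficient of determination (R²). The plain-Python sketch below computes all four; which of them eval_regression_metrics actually reports is an assumption.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

R² is undefined when every true value is identical, so the sketch returns NaN in that case instead of dividing by zero.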
Typical machine learning pipeline using Bioshift ML nodes.