Release # [0.10] Performance Optimization, Enhanced Preprocessing, and much more! · ombhojane/explainableai

ExplainableAI v0.10 introduces significant performance improvements, enhanced data preprocessing capabilities, and a more robust logging system.

New Features

Dask Integration for Large Datasets

Added support for Dask DataFrames to handle larger-than-memory datasets efficiently.
Implemented the `_preprocess_data_dask` method for parallel data preprocessing.

Enhanced `analyze` Function

Added three parameters covering batch processing, parallel execution, and per-instance interpretation:
- `batch_size`: Processes large datasets in smaller chunks. Default is `None` (process all data at once).
- `parallel`: Enables parallel processing of batches using multiprocessing. Default is `False`.
- `instance_index`: Specifies the zero-based index of the instance to interpret in detail. Default is `0`.
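
To make the batching and parallel options above more concrete, here is a minimal standard-library sketch of the general pattern. It is not the library's implementation: `split_into_batches`, `summarize_batch`, and `analyze_in_batches` are hypothetical names, and the per-batch work (a simple mean) is a stand-in for the real analysis.

```python
from concurrent.futures import ProcessPoolExecutor
from statistics import mean


def split_into_batches(rows, batch_size=None):
    """Split rows into consecutive chunks; None means a single chunk."""
    if batch_size is None:
        return [rows]
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def summarize_batch(batch):
    """Hypothetical per-batch work: here, just the mean of the batch."""
    return mean(batch)


def analyze_in_batches(rows, batch_size=None, parallel=False):
    """Apply summarize_batch to each chunk, optionally in worker processes."""
    batches = split_into_batches(rows, batch_size)
    if parallel:
        # Worker processes must be able to import summarize_batch,
        # so it is defined at module top level.
        with ProcessPoolExecutor() as pool:
            return list(pool.map(summarize_batch, batches))
    return [summarize_batch(batch) for batch in batches]


if __name__ == "__main__":
    data = list(range(10))
    print(analyze_in_batches(data, batch_size=4))
```

Keeping the parallel path behind a flag mirrors the `parallel=False` default: for small inputs, the cost of starting worker processes can outweigh any speedup.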

Enhanced Logging

Implemented a more comprehensive logging system using Python's logging module.
Added colorized console output for better readability using the colorama library.
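
As an illustration of how colorized output can be layered on the standard `logging` module, the sketch below colors each record by level. It is an assumption-laden example rather than the library's actual logging code: `ColorFormatter` and `build_logger` are hypothetical names, and the raw ANSI codes are only a fallback for environments where colorama is not installed.

```python
import logging

try:
    # colorama exposes ANSI colour codes as named constants.
    # On Windows, colorama.init() is typically called first.
    from colorama import Fore, Style

    LEVEL_COLORS = {
        logging.DEBUG: Fore.CYAN,
        logging.INFO: Fore.GREEN,
        logging.WARNING: Fore.YELLOW,
        logging.ERROR: Fore.RED,
        logging.CRITICAL: Fore.RED,
    }
    RESET = Style.RESET_ALL
except ImportError:
    # Fallback (assumption for this sketch): equivalent raw ANSI escape codes.
    LEVEL_COLORS = {
        logging.DEBUG: "\x1b[36m",
        logging.INFO: "\x1b[32m",
        logging.WARNING: "\x1b[33m",
        logging.ERROR: "\x1b[31m",
        logging.CRITICAL: "\x1b[31m",
    }
    RESET = "\x1b[0m"


class ColorFormatter(logging.Formatter):
    """Wrap each formatted record in the colour assigned to its level."""

    def format(self, record):
        color = LEVEL_COLORS.get(record.levelno, "")
        return f"{color}{super().format(record)}{RESET}"


def build_logger(name="xai_demo"):
    """Return a logger with one colourized console handler attached."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(ColorFormatter("%(levelname)s - %(message)s"))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("Model fitting started")
    log.warning("Feature 'age' contains missing values")
```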

Expanded Documentation

Created a new /doc directory for additional documentation:
- API reference guide
- User guide with detailed explanations and best practices
- Installation and setup instructions

Use Cases

Added an /examples directory showcasing various use cases:
- Small code snippets for quick start
- Comprehensive examples of ExplainableAI in larger projects
- Jupyter notebooks demonstrating step-by-step analysis

Improvements

Core Functionality

Refactored the `XAIWrapper` class for improved performance and modularity.
Enhanced error handling and added more informative error messages.

Data Preprocessing

Improved categorical and numerical feature handling in the preprocessing pipeline.
Added support for handling missing values and outliers.
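
The release notes do not say which strategies are used for missing values and outliers. As one common approach, the sketch below fills gaps with the median and clips values outside the interquartile-range fences. The function names and the choice of median/IQR are assumptions made for illustration only.

```python
from statistics import median, quantiles


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


if __name__ == "__main__":
    raw = [10, 11, None, 13, 14, 15, 16, 1000]
    filled = impute_missing(raw)
    print(clip_outliers(filled))
```

Clipping keeps every row in the dataset, which matters when the instance index passed to later analysis steps must stay stable; dropping outlier rows instead would shift those indices.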

Model Comparison

Enhanced model comparison functionality with more detailed metrics.
Improved selection of the best model based on cross-validation scores.
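
Selecting a model from cross-validation scores usually comes down to comparing per-model summaries. The sketch below assumes the fold scores are already available as a dictionary; `select_best_model` is a hypothetical helper, and the scores in the demo are made-up illustrative numbers, not results from the library.

```python
from statistics import mean, stdev


def select_best_model(cv_scores):
    """Return (best_name, summary) given {model_name: [fold scores]}.

    The best model is the one with the highest mean fold score.
    """
    summary = {
        name: {"mean": mean(scores), "std": stdev(scores)}
        for name, scores in cv_scores.items()
    }
    best_name = max(summary, key=lambda name: summary[name]["mean"])
    return best_name, summary


if __name__ == "__main__":
    # Illustrative fold scores only.
    scores = {
        "Logistic Regression": [0.80, 0.82, 0.81],
        "Random Forest": [0.85, 0.87, 0.86],
    }
    best, summary = select_best_model(scores)
    print(best, summary[best])
```

Reporting the standard deviation alongside the mean helps spot a model whose average is high but whose fold-to-fold performance is unstable.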

Visualization

Added new visualization options, including correlation heatmaps.
Improved existing plots for better interpretability.

Report Generation

Enhanced PDF report generation with more customizable options.
Added ability to selectively include sections in the generated report.

Exploratory Data Analysis (EDA)

Implemented a new `perform_eda` method in `XAIWrapper` for quick dataset insights.
Added correlation analysis and outlier detection to the EDA process.
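
For readers unfamiliar with what a correlation analysis step computes, here is a small self-contained sketch of a Pearson correlation matrix over named numeric columns. It does not reproduce `perform_eda`; the helper names `pearson` and `correlation_matrix` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {col_a: {col_b: r}} for every pair of columns."""
    names = list(columns)
    return {
        a: {b: pearson(columns[a], columns[b]) for b in names}
        for a in names
    }


if __name__ == "__main__":
    table = {"x": [1, 2, 3, 4], "y": [2, 4, 6, 8], "z": [4, 3, 2, 1]}
    for name, row in correlation_matrix(table).items():
        print(name, {k: round(v, 2) for k, v in row.items()})
```

A matrix like this is also the input a correlation heatmap would visualize.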

Bug Fixes

Fixed issues related to feature importance calculation and visualization.
Resolved compatibility issues with the latest versions of dependencies.

Performance Optimization

Implemented more efficient data handling techniques for large datasets.
Optimized SHAP value calculations and other computationally intensive operations.

Installation

pip install explainableai==0.10

Usage

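from explainableai import XAIWrapper
from sklearn.ensemble import RandomForestClassifier
import pandas as pd

# Load your dataset
df = pd.read_csv('your_dataset.csv')
X = df.drop(columns=['target_column'])
y = df['target_column']

# Models to compare (assumed: a dict mapping a display name to a
# scikit-learn estimator; replace with your own models)
models = {'Random Forest': RandomForestClassifier(random_state=42)}

# Initialize XAIWrapper
xai = XAIWrapper()

# Fit and analyze models
xai.fit(models, X, y)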
results = xai.analyze(batch_size=100, parallel=False, instance_index=0)

# Generate a comprehensive report
xai.generate_report('analysis_report.pdf')

# Make and explain predictions
new_data = {...} # Dictionary of feature values
prediction, probabilities, explanation = xai.explain_prediction(new_data)

Analyze with batch processing and parallel execution

The call below will:

- Process the data in batches of 1000 samples
- Use parallel processing for faster computation
- Provide a detailed interpretation of the 43rd instance (`instance_index=42`, zero-based)

xai = XAIWrapper()
xai.fit(models, X, y)

results = xai.analyze(batch_size=1000, parallel=True, instance_index=42)

Breaking Changes

The `analyze` method now supports batch processing and parallel execution options.
Some internal method signatures have been updated to accommodate new features.

We encourage users to update to this version for improved performance and new capabilities. As always, please report any issues or suggestions through our GitHub issue tracker.

For more detailed information, please refer to the documentation in the /doc directory and explore the ExplainableAI use cases in the /examples directory.
