This documentation page is for backwards compatibility. For new projects, we recommend using the new semantic dataframes.

Migrate SmartDatalakes

To migrate from SmartDatalake to the new semantic dataframes:

  1. Use the new multiple dataframe syntax with pai.chat()
  2. Update configuration to use pai.config.set()

Example migration:

# Old code
from pandasai import SmartDatalake, SmartDataframe
lake = SmartDatalake([
    SmartDataframe(customers_df),
    SmartDataframe(orders_df)
])
response = lake.chat("Query across dataframes")

# New code
import pandasai as pai
df_customers = pai.DataFrame(customers_df)
df_orders = pai.DataFrame(orders_df)
response = pai.chat("Query across dataframes", df_customers, df_orders)

SmartDatalakes (Legacy)

The SmartDatalake class was the previous way to work with multiple dataframes in PandaAI. It allowed you to combine multiple SmartDataframes and ask questions that span across them.

Basic Usage

from pandasai import SmartDatalake, SmartDataframe
import pandas as pd

# Create sample dataframes
customers_df = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['John', 'Emma', 'Alex']
})
orders_df = pd.DataFrame({
    'order_id': [101, 102, 103],
    'customer_id': [1, 2, 1],
    'amount': [100, 200, 150]
})

# Convert to SmartDataframes
smart_customers = SmartDataframe(customers_df)
smart_orders = SmartDataframe(orders_df)

# Create SmartDatalake
lake = SmartDatalake([smart_customers, smart_orders])

# Ask questions spanning multiple dataframes
response = lake.chat("What is the total order amount for each customer?")

Configuration

SmartDatalake inherited configuration from its SmartDataframes, but you could also set config directly:

lake = SmartDatalake(
    [smart_customers, smart_orders],
    config={
        "llm": llm,
        "verbose": True
    }
)