Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.

Once you have turned raw data into semantically enhanced dataframes with the semantic layer, you can load them as either materialized or virtualized dataframes, depending on the data source. With the .chat method, you can ask questions and receive responses and charts. You can share these dataframes with your team by pushing them to our data platform.

Materialized Dataframes

When working with local files (CSV, Parquet) or datasets based on such files, the dataframes are materialized, meaning:

  • Data is loaded entirely into memory
  • Fast access to all data
  • Ideal for local file processing or cross-source analysis
import pandasai as pai

# Load local files as materialized dataframes
file = pai.read_csv("local_file.csv")

df = pai.create(
    path="organization/dataset-name",
    name="dataset-name",
    df=file,
    description="describe your dataset",
)

Virtualized Dataframes

When loading remote datasets, dataframes are virtualized by default, providing:

  • Minimal memory usage through on-demand data loading
  • Efficient handling of large datasets
  • Optimal for remote data sources
import pandasai as pai

# Load remote datasets (virtualized by default)
df = pai.load("organization/dataset-name")
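
To make the difference between the two loading modes concrete, here is a minimal conceptual sketch using only the Python standard library. It is not how the library implements either mode: the `MaterializedTable` and `VirtualizedTable` classes, the `write_sample_csv` helper, and the sample data are all hypothetical names invented for illustration. The sketch assumes that "materialized" means every row is read into memory up front, and that "virtualized" means only a reference to the source is kept and data is read on demand.

```python
import csv
import os
import tempfile


class MaterializedTable:
    """Hypothetical illustration: reads every row into memory when constructed."""

    def __init__(self, path):
        with open(path, newline="") as handle:
            self.rows = list(csv.DictReader(handle))

    def column_sum(self, column):
        # Works purely on the in-memory copy; the source is never touched again.
        return sum(float(row[column]) for row in self.rows)


class VirtualizedTable:
    """Hypothetical illustration: keeps only the path and streams rows per request."""

    def __init__(self, path):
        self.path = path

    def column_sum(self, column):
        # Re-opens the source on every call and holds one row at a time.
        with open(self.path, newline="") as handle:
            return sum(float(row[column]) for row in csv.DictReader(handle))


def write_sample_csv():
    """Create a small temporary CSV file and return its path."""
    handle = tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False)
    with handle:
        writer = csv.writer(handle)
        writer.writerow(["region", "sales"])
        writer.writerows([["north", 10], ["south", 20], ["east", 12.5]])
    return handle.name


sample_path = write_sample_csv()
materialized = MaterializedTable(sample_path)
virtualized = VirtualizedTable(sample_path)

materialized_total = materialized.column_sum("sales")
virtualized_total = virtualized.column_sum("sales")

# Once the source is gone, only the materialized copy can still answer queries.
os.remove(sample_path)
total_after_removal = materialized.column_sum("sales")
```

The trade-off in the sketch mirrors the bullet lists above: the materialized object pays the full memory cost once and then answers from memory, while the virtualized object stays small but depends on the source being reachable each time it is queried.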