Semantic Dataframes

Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.

Once you have turned raw data into semantically enhanced dataframes with the semantic layer, you can load them as either materialized or virtualized dataframes, depending on the data source. With the .chat method, you can ask questions and get answers and charts in response. You can share these dataframes with your team by pushing them to our data platform.

Materialized Dataframes

When working with local files (CSV, Parquet) or datasets based on such files, the dataframes are materialized, meaning:

Data is loaded entirely into memory
Fast access to all data
Ideal for local file processing or cross-source analysis

import pandasai as pai

# Load local files as materialized dataframes
file = pai.read_csv("local_file.csv")

df = pai.create(
    path="organization/dataset-name",
    name="dataset-name",
    df=file,
    description="describe your dataset",
)

Virtualized Dataframes

When loading remote datasets, dataframes are virtualized by default, providing:

Minimal memory usage through on-demand data loading
Efficient handling of large datasets
Optimal for remote data sources

import pandasai as pai

# Load remote datasets (virtualized by default)
df = pai.load("organization/dataset-name")
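
To make the difference between the two loading modes concrete, here is a conceptual sketch in plain Python. It is not how the library implements materialized or virtualized dataframes; it only contrasts reading everything up front with reading rows on demand. The names SAMPLE_CSV, load_materialized, and load_virtualized are hypothetical and exist only for this illustration.

```python
import csv
import io
from typing import Iterator

# Hypothetical sample data standing in for a CSV source.
SAMPLE_CSV = "region,revenue\nnorth,100\nsouth,250\neast,175\n"


def load_materialized(source: str) -> list[dict[str, str]]:
    """Eagerly read every row into memory (the 'materialized' idea)."""
    return list(csv.DictReader(io.StringIO(source)))


def load_virtualized(source: str) -> Iterator[dict[str, str]]:
    """Yield rows one at a time, only when requested (the 'virtualized' idea)."""
    yield from csv.DictReader(io.StringIO(source))


# Materialized: all rows are held in memory, so repeated access is cheap.
rows = load_materialized(SAMPLE_CSV)
total = sum(int(row["revenue"]) for row in rows)

# Virtualized: nothing is read until a row is actually requested.
lazy_rows = load_virtualized(SAMPLE_CSV)
first = next(lazy_rows)  # reads only the first row
```

The eager version trades memory for fast access to all data, while the generator version keeps memory use small at the cost of fetching data when it is needed, which mirrors the trade-off described in the two sections above.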
