docs: update docs for sql, create, links, examples #1571

Merged · 2 commits · Jan 31, 2025
Changes from 1 commit
4 changes: 2 additions & 2 deletions docs/mint.json
@@ -54,7 +54,7 @@
"navigation": [
{
"group": "Overview",
"pages": ["v3/introduction", "v3/getting-started", "v3/cli", "v3/privacy-security"],
"pages": ["v3/introduction", "v3/getting-started"],
"version": "v3"
},
{
@@ -74,7 +74,7 @@
},
{
"group": "Advanced Usage",
"pages": ["v3/agent"],
"pages": ["v3/cli", "v3/privacy-security","v3/agent"],
"version": "v3"
},
{
2 changes: 1 addition & 1 deletion docs/v3/chat-and-output.mdx
@@ -108,7 +108,7 @@ You can inspect the code that was generated to produce the result:

```python
response = df.chat("Calculate the correlation between age and salary")
print(response.last_code_generated)
print(response.last_code_executed)
# Output: df['age'].corr(df['salary'])
```
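For context — and not part of this diff — a minimal, self-contained sketch of the snippet above. The `pandasai-openai` extension, the `OpenAI` class, the placeholder API key, and the sample data are assumptions, not something this PR introduces:

```python
import pandasai as pai
from pandasai_openai import OpenAI  # assumes the optional pandasai-openai extension is installed

# Configure the LLM that generates the code (the API key is a placeholder)
pai.config.set({"llm": OpenAI(api_token="your-api-key")})

# Wrap sample data in a PandaAI DataFrame
df = pai.DataFrame({
    "age": [25, 32, 47, 51],
    "salary": [40000, 52000, 81000, 90000],
})

response = df.chat("Calculate the correlation between age and salary")
print(response.last_code_executed)
```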

211 changes: 38 additions & 173 deletions docs/v3/data-ingestion.mdx
@@ -31,7 +31,6 @@ file = pai.read_csv("data.csv")
# Use the semantic layer on CSV
df = pai.create(
path="company/sales-data",
name="sales_data",
df = file,
description="Sales data from our retail stores",
columns={
@@ -50,182 +49,48 @@ response = df.chat("Which product has the highest sales?")

## How to work with SQL in PandaAI?

PandaAI provides a sql extension for you to work with SQL, PostgreSQL, MySQL, SQLite databases.
PandaAI provides a SQL extension for working with PostgreSQL, MySQL, and CockroachDB databases.
To make the library lightweight and easy to use, the basic installation of the library does not include this extension.
It can be easily installed using either `poetry` or `pip`.
It can be installed with pip, specifying the database you want to use:

```bash
poetry add pandasai-sql
pip install pandasai-sql[postgres]
pip install pandasai-sql[mysql]
pip install pandasai-sql[cockroachdb]
```

```bash
pip install pandasai-sql
```

Once you have installed the extension, you can use it to connect to SQL databases.
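For illustration only (not part of this change), here is a PostgreSQL connection sketched with the same `pai.create()` source shape as the MySQL example added later in this file. It assumes `pandasai-sql[postgres]` is installed; the host, credentials, table, and columns are placeholders:

```python
import pandasai as pai

# Hypothetical PostgreSQL-backed dataset; connection details are placeholders
orders = pai.create(
    path="example/postgres-dataset",
    description="Orders table from a PostgreSQL database",
    source={
        "type": "postgres",
        "connection": {
            "host": "db.example.com",
            "port": 5432,
            "user": "${DB_USER}",
            "password": "${DB_PASSWORD}",
            "database": "analytics",
        },
        "table": "orders",
        "columns": [
            {"name": "order_id", "type": "string", "description": "Unique order identifier"},
            {"name": "order_date", "type": "datetime", "description": "Date of the order"},
            {"name": "amount", "type": "float", "description": "Order total"},
        ],
    },
)
```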

### PostgreSQL

```yaml
name: sales_data

source:
type: postgres
connection:
host: db.example.com
port: 5432
database: analytics
user: ${DB_USER}
password: ${DB_PASSWORD}
table: sales_data

destination:
type: local
format: parquet
path: company/sales-data

columns:
- name: transaction_id
type: string
description: Unique identifier for each sale
- name: sale_date
type: datetime
description: Date and time of the sale
- name: product_id
type: string
description: Product identifier
- name: quantity
type: integer
description: Number of units sold
- name: price
type: float
description: Price per unit

transformations:
- type: convert_timezone
params:
column: sale_date
from: UTC
to: America/New_York
- type: calculate
params:
column: total_amount
formula: quantity * price

update_frequency: daily
Once you have installed the extension, you can use the [semantic data layer](/v3/semantic-layer#for-sql-databases-using-the-create-method) and perform [data transformations](/v3/transformations).

order_by:
- sale_date DESC

limit: 100000
```
### MySQL
```yaml
name: customer_data

source:
type: mysql
connection:
host: db.example.com
port: 3306
database: analytics
user: ${DB_USER}
password: ${DB_PASSWORD}
table: customers

destination:
type: local
format: parquet
path: company/customer-data

columns:
- name: customer_id
type: string
description: Unique identifier for each customer
- name: name
type: string
description: Customer's full name
- name: email
type: string
description: Customer's email address
- name: join_date
type: datetime
description: Date when customer joined
- name: total_purchases
type: integer
description: Total number of purchases made

transformations:
- type: anonymize
params:
column: email
- type: split
params:
column: name
into: [first_name, last_name]
separator: " "

update_frequency: daily

order_by:
- join_date DESC

limit: 100000
```
### SQLite
```yaml
name: inventory_data

source:
type: sqlite
connection:
database: path/to/database.db
table: inventory

destination:
type: local
format: parquet
path: company/inventory-data

columns:
- name: product_id
type: string
description: Unique identifier for each product
- name: product_name
type: string
description: Name of the product
- name: category
type: string
description: Product category
- name: stock_level
type: integer
description: Current quantity in stock
- name: last_updated
type: datetime
description: Last inventory update timestamp

transformations:
- type: categorize
params:
column: stock_level
bins: [0, 10, 50, 100, 500]
labels: ["Critical", "Low", "Medium", "High"]
- type: convert_timezone
params:
column: last_updated
from: UTC
to: America/Los_Angeles

update_frequency: hourly

order_by:
- last_updated DESC

limit: 50000
```python
sql_table = pai.create(
path="example/mysql-dataset",
description="Heart disease dataset from MySQL database",
source={
"type": "mysql",
"connection": {
"host": "database.example.com",
"port": 3306,
"user": "${DB_USER}",
"password": "${DB_PASSWORD}",
"database": "medical_data"
},
"table": "heart_data",
"columns": [
{"name": "Age", "type": "integer", "description": "Age of the patient in years"},
{"name": "Sex", "type": "string", "description": "Gender of the patient (M = male, F = female)"},
{"name": "ChestPainType", "type": "string", "description": "Type of chest pain (ATA, NAP, ASY, TA)"},
{"name": "RestingBP", "type": "integer", "description": "Resting blood pressure in mm Hg"},
{"name": "Cholesterol", "type": "integer", "description": "Serum cholesterol in mg/dl"},
{"name": "FastingBS", "type": "integer", "description": "Fasting blood sugar > 120 mg/dl (1 = true, 0 = false)"},
{"name": "RestingECG", "type": "string", "description": "Resting electrocardiogram results (Normal, ST, LVH)"},
{"name": "MaxHR", "type": "integer", "description": "Maximum heart rate achieved"},
{"name": "ExerciseAngina", "type": "string", "description": "Exercise-induced angina (Y = yes, N = no)"},
{"name": "Oldpeak", "type": "float", "description": "ST depression induced by exercise relative to rest"},
{"name": "ST_Slope", "type": "string", "description": "Slope of the peak exercise ST segment (Up, Flat, Down)"},
{"name": "HeartDisease", "type": "integer", "description": "Heart disease diagnosis (1 = present, 0 = absent)"}
]
}
)
```
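As a follow-up sketch (not part of this diff), the dataset returned by `create()` can then be queried like any other PandaAI dataframe, assuming an LLM is configured and the MySQL connection above is reachable:

```python
# Hypothetical follow-up query against the dataset created above
response = sql_table.chat("What is the average cholesterol level for patients with heart disease?")
print(response)

# Inspect the code that was executed to produce the answer
print(response.last_code_executed)
```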

## How to work with Enterprise Cloud Data in PandaAI?
@@ -590,8 +455,8 @@
</tr>
<tr>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}>pandasai_sql</td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>poetry add pandasai-sql</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>pip install pandasai-sql</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>poetry add pandasai-sql[postgres]</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}><code>pip install pandasai-sql[postgres]</code></td>
<td style={{ border: '1px solid #ccc', padding: '8px 16px' }}>No</td>
</tr>
<tr>
16 changes: 9 additions & 7 deletions docs/v3/getting-started.mdx
@@ -72,7 +72,6 @@ df = pai.read_csv("data/companies.csv")
companies = pai.create(
path="my-org/companies",
df=df,
name="companies",
description="Customer companies dataset"
)

@@ -89,7 +88,6 @@ By default, the column will be inferred from the data. For more control, though,
companies = pai.create(
path="my-org/companies",
df=df,
name="companies",
description="Customer companies dataset",
columns=[
{
@@ -121,7 +119,11 @@ stocks = pai.load("organization/coca_cola_stock")
companies = pai.load("organization/companies")

# Query using natural language
result = companies.chat("What's the average revenue by region?")
response = stocks.chat("What is the volatility of the Coca Cola stock?")
response = companies.chat("What is the average revenue by region?")

# Query using multiple datasets
result = pai.chat("Compare the revenue between Coca Cola and Apple", stocks, companies)
```

## Sharing and collaboration
@@ -130,8 +132,8 @@ Share your data layers with your team:

```python
# Push datasets to the platform
stocks.push()
companies.push()
market.push()
```

Team members can then access and query the shared datasets through:
@@ -143,6 +145,6 @@ Of course, they will only be able to see the datasets they have access to. You c
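As an illustrative sketch (not part of this diff), a teammate with access could load the shared dataset by its path and query it; it assumes a PandaAI platform API key and an LLM are already configured, and reuses the dataset path from the getting-started example above:

```python
import pandasai as pai

# Hypothetical: load a dataset a colleague pushed to the platform
companies = pai.load("my-org/companies")
response = companies.chat("What is the average revenue by region?")
print(response)
```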
## Next Steps

- Learn more about [Data Schema Definition](/v3/data-layer)
- Explore [Advanced Views and Joins](/v3/views)
- Check out our [Example Projects](/v3/examples)
- Join our [Discord Community](https://discord.gg/kF7FqH2FwS) for support
- Join our [Discord Community](https://discord.gg/KYKj9F2FRH) for support
{/* - Explore [Advanced Views and Joins](/v3/views) */}
{/* - Check out our [Example Projects](/v3/examples) */}
4 changes: 2 additions & 2 deletions docs/v3/introduction.mdx
@@ -56,8 +56,8 @@ For enhanced capabilities, connect your PandaAI implementation to our [Data Plat
- [Quick Start Guide](/v3/getting-started)
- [Data Layer Documentation](/v3/data-layer)
- [Natural Language Features](/v3/overview-nl)
- [Example Projects](/v3/examples)
{/* - [Example Projects](/v3/examples) */}

## Get in touch

If you can’t find the information you’re looking for in the documentation, or if you need help, get in touch with our Support Team at pm@sinaptik.ai, or join our [Discord](https://discord.gg/kF7FqH2FwS), where our team and community can help you.
If you can’t find the information you’re looking for in the documentation, or if you need help, get in touch with our Support Team at pm@sinaptik.ai, or join our [Discord](https://discord.gg/KYKj9F2FRH), where our team and community can help you.