feat: `scan_csv` #1555

raisadz · 2024-12-10T14:41:51Z

What type of PR is this? (check all applicable)

Related issues

Related issue feat: read_csv #1112
Closes #<issue number>

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below

MarcoGorelli

thanks! there are some merge conflicts, plus some minor comments

MarcoGorelli · 2024-12-10T15:35:11Z

narwhals/functions.py

+    This allows the query optimizer to push down predicates and projections
+    to the scan level, thereby potentially reducing memory overhead.


this is Polars-specific, perhaps we can remove it?

MarcoGorelli · 2024-12-10T15:36:40Z

narwhals/functions.py

+        ...     return (
+        ...         nw.scan_csv("file.csv", native_namespace=native_namespace)
+        ...         .to_native()
+        ...         .collect()


`collect` isn't guaranteed to be present on a native frame, can we keep it out of `agnostic_scan_csv`? you can just put `collect` at the end of the Polars example, and `compute` at the end of the Dask example

either that, or put collect before to_native
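
To illustrate the point above, here is a minimal sketch using hypothetical stand-in classes (`FakePolarsLazyFrame`, `FakeDaskDataFrame`, and `agnostic_scan` are invented for this example and are not the real Polars, Dask, or Narwhals objects). It assumes only what the comment states: a Polars-style lazy frame materialises with `collect`, a Dask-style one with `compute`, so the agnostic function should return the lazy object and let each caller materialise it.

```python
# Hypothetical stand-ins for native lazy frames; NOT the real Polars or Dask classes.
class FakePolarsLazyFrame:
    def __init__(self, rows):
        self._rows = rows

    def collect(self):
        # Polars-style materialisation
        return list(self._rows)


class FakeDaskDataFrame:
    def __init__(self, rows):
        self._rows = rows

    def compute(self):
        # Dask-style materialisation
        return list(self._rows)


def agnostic_scan(native_frame_cls, rows):
    """Return the native lazy object without materialising it."""
    return native_frame_cls(rows)


rows = [{"a": 1}, {"a": 2}]

# Each caller materialises with its own backend's method.
polars_result = agnostic_scan(FakePolarsLazyFrame, rows).collect()
dask_result = agnostic_scan(FakeDaskDataFrame, rows).compute()

# Calling .collect() on the Dask-like object fails: the method does not exist there.
try:
    agnostic_scan(FakeDaskDataFrame, rows).collect()
    collect_on_dask_failed = False
except AttributeError:
    collect_on_dask_failed = True
```

Keeping the materialisation step outside the shared function is what keeps the docstring example valid for every backend it claims to support.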

MarcoGorelli · 2024-12-10T15:37:16Z

narwhals/stable/v1/__init__.py

+    This allows the query optimizer to push down predicates and projections
+    to the scan level, thereby potentially reducing memory overhead.


MarcoGorelli · 2024-12-10T15:38:04Z

narwhals/stable/v1/__init__.py

+        ...         .collect()
+        ...     )
+
+        Then we can read the file by passing Polars or dask namespaces:


how about

by passing, for example, Polars or Dask namespaces

?

MarcoGorelli

so good, thanks @raisadz !

add scan_csv

d5c9766

github-actions bot added the enhancement label (New feature or request) Dec 10, 2024

add collect

ff34f3e

raisadz marked this pull request as ready for review December 10, 2024 15:06

MarcoGorelli reviewed Dec 10, 2024

resolve conflicts, address comments

bb619de

MarcoGorelli approved these changes Dec 10, 2024

MarcoGorelli merged commit ade07ae into narwhals-dev:main Dec 10, 2024
24 checks passed
