
Maintain PySpark compatibility for databricks.labs.lsql.core.Row #99

Merged

Conversation

bishwajit-db
Contributor

Add asDict to databricks.labs.lsql.core.Row to maintain PySpark compatibility
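
As a rough illustration of the shape of this change (a minimal sketch, not the actual lsql implementation; the constructor and `_columns` attribute shown here are invented for the example), the PySpark-style `asDict` can simply delegate to the existing snake_case `as_dict`:

```python
class Row(tuple):
    """Sketch of a PySpark-compatible row; not the real databricks.labs.lsql.core.Row."""

    def __new__(cls, columns: list[str], values: list):
        row = super().__new__(cls, values)
        row._columns = columns  # column names, positionally aligned with the values (illustrative attribute)
        return row

    def as_dict(self) -> dict:
        # Existing snake_case accessor: map each column name to its value.
        return dict(zip(self._columns, self))

    def asDict(self, recursive: bool = False) -> dict:
        # PySpark-compatible alias; `recursive` is accepted for signature parity only,
        # nested Row conversion is not performed here.
        return self.as_dict()


row = Row(["name", "age"], ["alice", 42])
assert row.asDict() == row.as_dict() == {"name": "alice", "age": 42}
```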

Collaborator

@nfx nfx left a comment


lgtm

@nfx nfx merged commit 2245043 into databrickslabs:main May 8, 2024
8 of 10 checks passed
nfx added a commit that referenced this pull request May 8, 2024
* Bump actions/checkout from 4.1.2 to 4.1.3 ([#97](#97)). The `actions/checkout` dependency is updated from version 4.1.2 to 4.1.3 in the `update-main-version.yml` file. The new version checks the git version before attempting to disable `sparse-checkout` and adds an SSH user parameter; see the upstream release notes and CHANGELOG.md for details.
* Maintain PySpark compatibility for databricks.labs.lsql.core.Row ([#99](#99)). In this release, we have added a new `asDict` method to the `Row` class in the `databricks.labs.lsql.core` module to maintain compatibility with PySpark. It returns a dictionary representation of the `Row`, mapping column names to values, and simply delegates to the existing `as_dict` method, so the two behave identically. The optional `recursive` argument is accepted for signature parity with PySpark, but recursive conversion of nested `Row` objects is not yet implemented and the argument defaults to `False`. In addition, the `fetch` function in `backends.py` has been modified to return `Row` objects of `pyspark.sql` when using `self._spark.sql(sql).collect()`; this change is temporary and marked with a `TODO`, and error handling has been added to `fetch`. A usage sketch follows the dependency list below.

Dependency updates:

 * Bump actions/checkout from 4.1.2 to 4.1.3 ([#97](#97)).
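
A hedged usage sketch of why the alias matters, assuming a caller that may receive either `pyspark.sql.Row` or `databricks.labs.lsql.core.Row` objects; the `name` column is hypothetical:

```python
def collect_names(rows) -> list[str]:
    # Works whether `rows` contains pyspark.sql.Row or databricks.labs.lsql.core.Row
    # instances, since both expose asDict(); the "name" column is illustrative.
    return [row.asDict()["name"] for row in rows]
```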
@nfx nfx mentioned this pull request May 8, 2024
@bishwajit-db bishwajit-db deleted the feature/spark-row-compatibility branch May 8, 2024 09:22