Version changelog

0.10.1

  • patch hosted runner (#185). In this release, we have implemented a temporary fix to address issues with publishing artifacts in the release workflow. This fix involves changing the runner used for the job from ubuntu-latest to a protected runner group labeled "linux-ubuntu-latest". This ensures that the job runs on a designated hosted runner with the specified configuration, enhancing the reliability and security of the release process. The permissions section of the job remains unchanged, allowing authentication to PyPI and signing of release artifacts with sigstore-python. It is worth noting that this is a stopgap measure, and further changes to the release workflow may be made in the future.

0.10.0

  • Fixed incorrect script for no-pylint-disable (#178). In this release, we have updated the script used in the no-cheat GitHub workflow to address false positives in stacked pull requests. The updated script fetches the base reference from the remote repository, generates a diff between the base reference and the current branch, and saves it to a file. It then runs the "no_cheat.py" script against this diff file and saves the results to a separate file. If the count of cheats (instances where linting has been intentionally disabled) is greater than one, the script outputs the contents of the results file and exits with a non-zero status, indicating an error. This makes the check behave correctly in a stacked pull request scenario. The no_cheat function, which checks for the presence of certain pylint disable tags in a given diff text, has been updated to the latest version from the ucx project to improve accuracy. The function identifies tags by looking for lines starting with "-" or "+" followed by the disable tag and a list of codes, counts the number of times each code is added and removed, and reports any net additions.
  • Skip dataclass fields only when None (#180). In this release, dataclass fields are skipped during marshalling only when their value is None, so that empty lists, empty strings, and zeros are included in the output. This modification is in response to issue #179 and involves checking for None before marshalling a dataclass field. Specifically, the previous condition if not raw: has been replaced with if raw is None:. As a result, empty values such as [], '', or 0 are no longer skipped during serialization; only fields explicitly set to None are omitted. This gives users working with dataclasses that contain empty values finer control over what is serialized.

Dependency updates:

  • Bump codecov/codecov-action from 4 to 5 (#174).

0.9.3

  • Fixed issue when Databricks SDK config objects were overridden for installation config files (#170). This commit addresses an issue where Databricks SDK config objects were being overridden during installation config files creation, which has been resolved by modifying the _marshal method in the installation class to handle databricks.sdk.core.Config instances more carefully, and by introducing a new helper function get_databricks_sdk_config in the paths.py file, which retrieves the Databricks SDK configuration and improves the reliability and robustness of the SDK configuration. This fixes bug #169 and ensures that the SDK configuration is not accidentally modified during the installation process, preventing unexpected behavior and errors. The changes are isolated to the paths.py file and do not affect other parts of the codebase.

0.9.2

  • Bump actions/checkout from 4.2.1 to 4.2.2 (#160). In this release, the 'actions/checkout' dependency has been updated from version 4.2.1 to 4.2.2. This update includes changes to the 'url-helper.ts' file, which now utilizes well-known environment variables for improved reliability and maintainability. Additionally, unit test coverage for the isGhes function has been expanded. These changes are recommended for adoption to take advantage of the enhancements. The pull request includes a detailed changelog, commit history, and instructions for managing the update using Dependabot commands and options.
  • Bump databrickslabs/sandbox from acceptance/v0.3.1 to 0.4.2 (#166). In the latest release, the databrickslabs/sandbox Python package has been updated from version acceptance/v0.3.1 to 0.4.2. This update includes new features such as installation instructions, additional go-git libraries, and modifications to the README file. Dependency updates include a bump in the version of golang.org/x/crypto used. The pull request for this update was created by a GitHub bot, Dependabot, which will manage any conflicts and respond to comments containing specific commands. It is essential to thoroughly review and test this updated version to ensure that the new methods and modifications to existing functionality do not introduce any issues or regressions, and that the changes are well-documented and justified.
  • Don't draft automated releases (#159). In this release, the draft release feature in the GitHub Actions workflow has been disabled, enhancing the release process for software engineers. The 'draft: true' parameter has been removed from the Draft release job, which means that automated releases will now be published immediately upon creation instead of being created as drafts. This modification simplifies and streamlines the release process, making it more efficient for engineers who adopt the project. The change is aimed at reducing the time and effort required in manually publishing draft releases, thereby improving the overall experience for project contributors and users.
  • Updated custom Path support for python 3.13 (#161). In this revision, the project's continuous integration (CI) workflow has been updated to include Python 3.13, enhancing compatibility and enabling early identification of platform-specific issues. The paths module has been refactored into several submodules for better organization, and a new submodule, databrickspath_posixpath, has been added to distinguish PosixPath from DBFSPath and WorkspacePath. The comparison and equality behavior of _DatabricksPath objects has been modified to include parser property identity checks in Python 3.13, ensuring consistent behavior and eliminating exceptions when built-in paths are compared with custom paths. These updates promote confidence in the project's long-term viability and adaptability in response to evolving language standards.

Dependency updates:

  • Bump actions/checkout from 4.2.1 to 4.2.2 (#160).
  • Bump databrickslabs/sandbox from acceptance/v0.3.1 to 0.4.2 (#166).

0.9.1

  • Bump actions/checkout from 4.1.7 to 4.2.0 (#149). In this pull request, the actions/checkout dependency is upgraded from version 4.1.7 to 4.2.0 in the acceptance.yml and downstreams.yml workflow files. The new version provides additional Ref and Commit outputs, as well as updated dependencies, which aim to improve the functionality and security of the checkout process. The Ref output is a string representing the reference that was checked out, and the Commit output is the SHA-1 hash of the checked-out commit. Dependency updates include bumping the braces package from 3.0.2 to 3.0.3 and updating the minor-npm-dependencies group across one directory with four updates. These changes contribute to a more reliable and efficient checkout process and enhance the overall functionality and maintainability of the Action. Software engineers are recommended to review the changes and ensure they do not introduce conflicts with their current setup before adopting the new version.
  • Bump actions/checkout from 4.2.0 to 4.2.1 (#152). In this update, the version of the actions/checkout GitHub Action is bumped from 4.2.0 to 4.2.1 in a project's GitHub workflow files. This new version includes a modification to check out other refs/* by commit if provided, falling back to the ref. This change enhances the flexibility of the checkout action in handling different types of references, which could be useful for users working with multiple branches or references in their workflows. The update also adds a workflow file for publishing releases to an immutable action package. This release was contributed by the new project collaborator, @orhantoy, who made the change in pull request 1924.
  • Bump databrickslabs/sandbox from acceptance/v0.3.0 to 0.3.1 (#155). In this update, the dependency for databrickslabs/sandbox has been bumped from version acceptance/v0.3.0 to 0.3.1. This change includes bug fixes, upgrades to go-git libraries, and dependency updates. The golang.org/x/crypto library was specifically bumped from version 0.16.0 to 0.17.0 in both /go-libs and /runtime-packages. Additionally, the cac167b commit expanded acceptance test logs and introduced experimental OIDC refresh token rotation. The acceptance test job in the workflow was also updated to use the new version of databrickslabs/sandbox. Ignore conditions were added for previous versions of databrickslabs/sandbox in this release. The README was also modified, and install instructions were added to the changelog.
  • Catch all errors when checking Databricks path, notably BadRequest ones (#156). This commit introduces improvements to the error handling of the exists method in the paths.py file when checking Databricks path. Previously, only NotFound errors were caught, but now BadRequest errors are also handled, addressing issue #2882. The exists method has been updated to catch and manage DatabricksError exceptions, which now encompass BadRequest errors, ensuring comprehensive error handling for Databricks path-related operations. Additionally, the _cached_file_info and _cached_object_info attributes are now initialized when a DatabricksError exception occurs, returning False accordingly. This enhancement maintains consistency and accuracy in the exists method while broadening the range of errors captured, resulting in a more robust and reliable codebase with enhanced error reporting for users.
  • Normalize databricks paths as part of resolving them (#157). In this release, the resolve method in the paths.py file of the databricks/labs/blueprint project has been enhanced to handle parent directory references ("..") consistently with Python's built-in Path object. With the built-in Path, Path("/a/b/../c").resolve() returns Path("/a/c"), while Databricks paths did not behave consistently with this. This modification introduces a new _normalize() method, which processes the path parts and ensures that ".." segments are handled correctly. The commit also includes a new test function, 'test_resolve_is_consistent', which checks the consistent resolution of Databricks paths with various input formats, such as relative paths, ".." or "." components, and absolute paths. The resolved path is now normalized according to the expected behavior regardless of the input format, contributing to the resolution of issue #2882. Normalizing Databricks paths in the same fashion as Python's built-in Path object makes path handling more predictable for software engineers using the project.
  • Updated databrickslabs/sandbox requirement to acceptance/v0.3.0 (#153). In this pull request, the databrickslabs/sandbox package requirement in the downstreams GitHub Actions workflow is updated to version 0.3.0, which is the latest version available. This package provides a sandbox environment for development and testing, and the new version includes bug fixes and dependency updates that may enhance its reliability and performance. Dependabot has been used to ensure a smooth update process, with any conflicts being resolved automatically. However, it is recommended to review the changelog and test the updated version before merging this pull request to ensure compatibility and functionality in your specific use case. Additionally, Dependabot commands are available to manage ignore conditions for this dependency.
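
The normalization described in #157 can be sketched lexically with the standard library. The `normalize` function below is a hypothetical stand-in for the `_normalize()` step, not the library's implementation; how leading ".." segments of relative paths are treated is an assumption of this sketch.

```python
from pathlib import PurePosixPath


def normalize(path: str) -> PurePosixPath:
    """Hypothetical sketch: fold ".." segments without touching any file system."""
    p = PurePosixPath(path)  # PurePosixPath already collapses "." components
    anchor = p.anchor
    parts = p.parts[1:] if anchor else p.parts
    stack: list[str] = []
    for part in parts:
        if part != "..":
            stack.append(part)
        elif stack and stack[-1] != "..":
            stack.pop()  # ".." cancels the previous segment
        elif not anchor:
            stack.append(part)  # assumption: keep leading ".." on relative paths
        # for absolute paths, ".." at the root is dropped
    return PurePosixPath(anchor, *stack)


print(normalize("/a/b/../c"))  # /a/c
print(normalize("/a/./b"))     # /a/b
```

Because the sketch is purely lexical it ignores symlinks, which matches the spirit of the change for paths that have no symlink concept.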

Dependency updates:

  • Bump actions/checkout from 4.1.7 to 4.2.0 (#149).
  • Bump actions/checkout from 4.2.0 to 4.2.1 (#152).
  • Updated databrickslabs/sandbox requirement to acceptance/v0.3.0 (#153).
  • Bump databrickslabs/sandbox from acceptance/v0.3.0 to 0.3.1 (#155).

0.9.0

  • Added Databricks CLI version as part of routed command telemetry (#147). The Databricks CLI version, read from the "DATABRICKS_CLI_VERSION" environment variable, is now included in the telemetry for routed commands. The value is passed to the with_user_agent_extra method, which adds it to the user agent for outgoing requests so that the CLI version can be identified in telemetry data. The with_user_agent_extra method is invoked twice: first with the blueprint prefix and the blueprint version, then with the cli prefix and the value of DATABRICKS_CLI_VERSION, so that both the blueprint and CLI versions are transmitted in the user agent for all requests.
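
A rough illustration of the resulting user agent, assuming each registered pair is rendered as `key/value`. The `build_user_agent` function, the base string, and the version numbers are hypothetical; the real code calls the SDK's with_user_agent_extra rather than building the string itself, and skipping the pair when the variable is unset is an assumption of this sketch.

```python
def build_user_agent(base: str, blueprint_version: str, env: dict[str, str]) -> str:
    """Hypothetical sketch of how the two extra pairs end up in the user agent."""
    extras = [("blueprint", blueprint_version)]
    cli_version = env.get("DATABRICKS_CLI_VERSION")
    if cli_version:  # assumption: no "cli/..." pair when the variable is unset
        extras.append(("cli", cli_version))
    return " ".join([base] + [f"{key}/{value}" for key, value in extras])


print(build_user_agent("example-sdk/1.0", "0.9.0", {"DATABRICKS_CLI_VERSION": "0.221.0"}))
# example-sdk/1.0 blueprint/0.9.0 cli/0.221.0
```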

0.8.3

  • add missing stat() methods to DBFSPath and WorkspacePath (#144). The stat() method has been added to both DBFSPath and WorkspacePath classes, addressing issues #142 and #143. This method, which adheres to the Posix standard, returns file status in the os.stat_result format, providing access to various metadata attributes such as file size, last modification time, and creation time. By incorporating this method, developers can now obtain essential file information for Databricks File System (DBFS) and Databricks Workspace paths when working with these classes. The change includes a new test case for stat() in the test_paths.py file to ensure the correctness of the method for both classes.
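
For readers unfamiliar with os.stat_result, the sketch below shows how such an object can be built from a handful of metadata values. The `make_stat_result` helper and its inputs are hypothetical; which fields the library actually populates is not stated here, so unknown fields are simply set to 0.

```python
import os
import stat


def make_stat_result(size: int, mtime: int, ctime: int, is_dir: bool = False) -> os.stat_result:
    """Hypothetical helper: pack remote file metadata into an os.stat_result."""
    mode = stat.S_IFDIR if is_dir else stat.S_IFREG
    # Tuple order: st_mode, st_ino, st_dev, st_nlink, st_uid, st_gid,
    #              st_size, st_atime, st_mtime, st_ctime
    return os.stat_result((mode, 0, 0, 0, 0, 0, size, 0, mtime, ctime))


info = make_stat_result(size=1024, mtime=1_700_000_000, ctime=1_600_000_000)
print(info.st_size, info.st_mtime, stat.S_ISREG(info.st_mode))
```

Callers can then use the usual attributes (`st_size`, `st_mtime`, `st_ctime`) just as they would with a local pathlib.Path.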

0.8.2

  • Make hatch a prerequisite (#137). hatch, pinned to version 1.9.4, is now a prerequisite that is installed explicitly in the GitHub workflow for the project's main branch, because pip install hatch occasionally failed depending on the local environment. The change defines the hatch version as an environment variable and adds a new workflow step that installs hatch with that specific version, which makes the build and testing process more reliable. The line pip install hatch has been removed from the Makefile, so users must install hatch themselves before running it. This aligns with the approach taken for ucx, and users are expected to install prerequisites before executing the Makefile. To contribute to this project, install hatch using pip install hatch, clone the GitHub repository, and run make dev to set up the development environment and install the necessary dependencies.
  • support files with unicode BOM (#138). This change introduces support for files with a Unicode Byte Order Mark (BOM) during file upload and download operations in Databricks Workspace. The functionality is added to the WorkspacePath class and includes a read_text method that makes reading text from files easier. When a downloaded file starts with a BOM, the BOM is detected and used for decoding, regardless of the preferred encoding based on the system's locale. The change includes a new test function that verifies the encoding and decoding of files with different types of BOM using the appropriate encoding. Databricks notebooks with a BOM could not be tested because the Databricks platform modifies the uploaded data, but the change still improves compatibility with a broader range of file encodings.
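
BOM-based decoding as described in #138 can be sketched with the standard codecs module. The `decode_with_bom` function is a hypothetical illustration, not the library's code; the fallback to a caller-supplied preferred encoding is an assumption.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_with_bom(data: bytes, preferred: str = "utf-8") -> str:
    """Hypothetical sketch: let a leading BOM override the preferred encoding."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # These codecs consume the BOM and use it to pick the byte order.
            return data.decode(encoding)
    return data.decode(preferred)


print(decode_with_bom("héllo".encode("utf-16")))  # héllo
print(decode_with_bom(b"plain text"))             # plain text
```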

0.8.1

  • Fixed py3.10 compatibility for _parts in pathlike (#135). The recent update to our open-source library addresses the compatibility issue with Python 3.10 in the _parts property of a certain type. Prior to this change, there was also a _cparts property that returned the same value as _parts, which has been removed and replaced with a direct reference to _parts. The _parts property can now be accessed via reverse equality comparison, and this change has been implemented in the joinpath and __truediv__ methods as well. This enhancement improves the library's compatibility with Python 3.10 and beyond, ensuring continued functionality and stability for software engineers working with the latest Python versions.

0.8.0

  • Added DBFSPath as os.PathLike implementation (#131). The open-source library has been updated with a new class DBFSPath, an implementation of os.PathLike for Databricks File System (DBFS) paths. This new class extends the existing WorkspacePath support and provides pathlib-like functionality for DBFS paths, including methods for creating directories, renaming and deleting files and directories, and reading and writing files. The addition of DBFSPath includes type hints for improved code linting and is integrated in the test suite with new and updated tests for path-like objects. The behavior of the exists and unlink methods has been updated for WorkspacePath to improve performance and raise appropriate errors.
  • Fixed .as_uri() and .absolute() implementations for WorkspacePath (#127). In this release, the WorkspacePath class in the paths.py module has been updated with several improvements to the .as_uri() and .absolute() methods. These methods now utilize PathLib internals, providing better cross-version compatibility. The .as_uri() method now uses an f-string for concatenation and returns the UTF-8 encoded string representation of the WorkspacePath object via a new __bytes__() dunder method. Additionally, the .absolute() method has been implemented for the trivial (no-op) case and now supports returning the absolute path of files or directories in Databricks Workspace. Furthermore, the glob() and rglob() methods have been enhanced to support case-sensitive pattern matching based on a new case_sensitive parameter. To ensure the integrity of these changes, two new test cases, test_as_uri() and test_absolute(), have been added, thoroughly testing the functionality of these methods.
  • Fixed WorkspacePath support for python 3.11 (#121). The WorkspacePath class in our open-source library has been updated to improve compatibility with Python 3.11. The .expanduser() and .glob() methods have been modified to address internal changes in Python 3.11. The is_dir() and is_file() methods now include a follow_symlinks parameter, although it is not currently used. A new method, _scandir(), has been added for compatibility with Python 3.11. The expanduser() method has also been updated to expand ~ (but not ~user) constructs. Additionally, a new method is_notebook() has been introduced to check if the path points to a notebook in Databricks Workspace. These changes aim to ensure that the library functions smoothly with the latest version of Python and provides additional functionality for users working with Databricks Workspace.
  • Properly verify versions of python (#118). In this release, we have made significant updates to the pyproject.toml file to enhance project dependency and development environment management. We have added several new packages to the dependencies section to expand the library's functionality and compatibility. Additionally, we have removed the python field, as it is no longer necessary. We have also updated the path field to specify the location of the virtual environment, which can improve integration with popular development tools such as Visual Studio Code and PyCharm. These changes are intended to streamline the development process and make it easier to manage dependencies and set up the development environment.
  • Type annotations on path-related unit tests (#128). In this open-source library update, type annotations have been added to path-related unit tests to enhance code clarity and maintainability. The tests encompass various scenarios, including verifying if a path exists, creating, removing, and checking directories, and testing file attributes such as distinguishing directories, notebooks, and regular files. The additions also cover functionality for opening and manipulating files in different modes like read binary, write binary, read text, and write text. Furthermore, tests for checking file permissions, handling errors, and globbing (pattern-based file path matching) have been incorporated. The tests interact with a WorkspaceClient mock object, simulating file system interactions. This enhancement bolsters the library's reliability and assists developers in creating robust, well-documented code when working with file system paths.
  • Updated WorkspacePath to support Python 3.12 (#122). In this release, the WorkspacePath implementation has been updated to ensure compatibility with Python 3.12, in addition to Python 3.10 and 3.11. The class was modified to replace most of the internal implementation and add extensive tests for public interfaces, ensuring that the superclass implementations are not used unless they are known to be safe. This change is in response to the significant changes in the superclass implementations between Python 3.11 and 3.12, which were found to be incompatible with each other. The WorkspacePath class now includes several new methods and tests to ensure that it functions seamlessly with different versions of Python. These changes include testing for initialization, equality, hash, comparison, path components, and various path manipulations. This update enhances the library's adaptability and ensures it functions correctly with different versions of Python. Classifiers have also been updated to include support for Python 3.12.
  • WorkspacePath fixes for the .resolve() implementation (#129). The .resolve() method for WorkspacePath has been updated to improve its handling of relative paths and the strict argument. Previously, relative paths were not properly validated and would be returned as-is. Now, relative paths will cause the method to fail. The strict argument is now checked, and if set to True and the path does not exist, a FileNotFoundError will be raised. The method .absolute() is used to obtain the absolute path of the file or directory in Databricks Workspace and is used in the implementation of .resolve(). A new test, test_resolve(), has been added to verify these changes, covering scenarios where the path is absolute, the path exists, the path does not exist, and the path is relative. In the case of relative paths, a NotImplementedError is raised, as .resolve() is not supported for them.
  • WorkspacePath: Fix the .rename() and .replace() implementations to return the target path (#130). The .rename() and .replace() methods of the WorkspacePath class have been updated to return the target path as part of the public API, with .rename() no longer accepting the overwrite keyword argument and always failing if the target path already exists. A new private method, ._rename(), has been added to include the overwrite argument and is used by both .rename() and .replace(). This update is a preparatory step for factoring out common code to support DBFS paths. The tests have been updated accordingly, combining and adding functions to test the new and updated methods. The .unlink() method's behavior remains unchanged. Please note that the exact error raised when .rename() fails due to an existing target path is yet to be defined.
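
The relationship between .rename(), .replace(), and the private ._rename() from #130 can be sketched with an in-memory stand-in. `FakePath` is hypothetical and backed by a dict; raising FileExistsError is an assumption, since the entry above notes that the exact error is yet to be defined.

```python
class FakePath:
    """Hypothetical in-memory path, used only to illustrate the API shape."""

    def __init__(self, store: dict[str, bytes], name: str) -> None:
        self._store = store
        self.name = name

    def _rename(self, target: "FakePath", overwrite: bool) -> "FakePath":
        if target.name in self._store and not overwrite:
            raise FileExistsError(target.name)  # assumption: error type is not specified
        self._store[target.name] = self._store.pop(self.name)
        return target  # both public methods return the target path

    def rename(self, target: "FakePath") -> "FakePath":
        return self._rename(target, overwrite=False)

    def replace(self, target: "FakePath") -> "FakePath":
        return self._rename(target, overwrite=True)


store = {"a.txt": b"one", "b.txt": b"two"}
moved = FakePath(store, "a.txt").replace(FakePath(store, "b.txt"))
print(moved.name, store)  # b.txt {'b.txt': b'one'}
```

Returning the target mirrors pathlib.Path, where rename() and replace() return the new path so that calls can be chained.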

Dependency updates:

  • Bump sigstore/gh-action-sigstore-python from 2.1.1 to 3.0.0 (#133).

0.7.0

  • Added databricks.labs.blueprint.paths.WorkspacePath as pathlib.Path equivalent (#115). This commit introduces the databricks.labs.blueprint.paths.WorkspacePath library, providing Python-native pathlib.Path-like interfaces to simplify working with Databricks Workspace paths. The library includes WorkspacePath and WorkspacePathDuringTest classes offering advanced functionality for handling user home folders, relative file paths, browser URLs, and file manipulation methods such as read/write_text(), read/write_bytes(), and glob(). This addition brings enhanced, Pythonic ways to interact with Databricks Workspace paths, including creating and moving files, managing directories, and generating browser-accessible URIs. Additionally, the commit includes updates to existing methods and introduces new fixtures for creating notebooks, accompanied by extensive unit tests to ensure reliability and functionality.
  • Added propagation of blueprint version into User-Agent header when it is used as library (#114). When blueprint is used as a library, the blueprint version and the name of the command line interface (CLI) command are now propagated in the User-Agent header. Two new pairs of OtherInfo are added: blueprint/X.Y.Z, to indicate that the request is made using the blueprint library, and cmd/<name>, to store the name of the CLI command used for making the request. The implementation uses the with_user_agent_extra function from databricks.sdk.config to set the user agent consistently with the Databricks CLI. The test file test_useragent.py has a new test case, test_user_agent_is_propagated, which checks that the blueprint version and the name of the command are correctly propagated to the User-Agent header. A context manager http_fixture_server has been added that creates an HTTP server with a custom handler, which extracts the blueprint version and the command name from the User-Agent header and stores them in the user_agent dictionary. The test case calls the foo command with a mocked WorkspaceClient instance, sets the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables, and then asserts that the blueprint version and the name of the command are present and correctly set in the user_agent dictionary.
  • Bump actions/checkout from 4.1.6 to 4.1.7 (#112). In this release, the version of the "actions/checkout" action used in the Checkout Code step of the acceptance workflow has been updated from 4.1.6 to 4.1.7. This update may include bug fixes, performance improvements, and new features, although specific changes are not mentioned in the commit message. The Unshallow step remains unchanged, continuing to fetch and clean up the repository's history. This update ensures that the latest enhancements from the "actions/checkout" action are utilized, aiming to improve the reliability and performance of the code checkout process in the GitHub Actions workflow. Software engineers should be aware of this update and its potential impact on their workflows.
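
The kind of extraction performed by the test handler in #114 can be sketched as a small parser for `key/value` tokens. The `parse_user_agent` function and the sample header are hypothetical; the real test fixture may extract the values differently.

```python
def parse_user_agent(header: str) -> dict[str, str]:
    """Hypothetical sketch: split a User-Agent header into key/value pairs."""
    pairs: dict[str, str] = {}
    for token in header.split():
        key, sep, value = token.partition("/")
        if sep:  # ignore tokens that are not of the form key/value
            pairs[key] = value
    return pairs


user_agent = parse_user_agent("example-sdk/1.0 blueprint/0.7.0 cmd/foo")
print(user_agent["blueprint"], user_agent["cmd"])  # 0.7.0 foo
```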

Dependency updates:

  • Bump actions/checkout from 4.1.6 to 4.1.7 (#112).

0.6.3

  • fixed Command.get_argument_type bug with UnionType (#110). In this release, the Command.get_argument_type method has been updated to include special handling for UnionType, resolving a bug that caused the function to crash when encountering this type. The method now returns the string representation of the annotation if the argument is a UnionType, providing more accurate and reliable results. To facilitate this, modifications were made using the types module. Additionally, the foo function has a new optional argument optional_arg of type str, with a default value of None. This argument is passed to the some function in the assertion. The Prompts type has been added to the foo function signature, and an assertion has been added to verify if prompts is an instance of Prompts. Lastly, the default value of the address argument has been changed from an empty string to "default", and the same changes have been applied to the test_injects_prompts test function.

0.6.2

  • Applied type casting & remove empty kwarg for Command (#108). A new method, get_argument_type, has been added to the Command class in the cli.py file to determine the type of a given argument name based on the function's signature. The _route method has been updated to remove any empty keyword arguments from the kwargs dictionary, and apply type casting based on the argument type using the get_argument_type method. This ensures that the kwargs passed into App.command are correctly typed and eliminates any empty keyword arguments, which were previously passed as empty strings. In the test file for the command-line interface, the foo command's keyword arguments have been updated to include age (int), salary (float), is_customer (bool), and address (str) types, with the name argument remaining and a default value for address. The test_commands and test_injects_prompts functions have been updated accordingly. These changes aim to improve the input validation and type safety of the App.command method.
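
A minimal sketch of dropping empty keyword arguments and casting the rest by annotation. The `cast_kwargs` function is hypothetical, and the set of strings treated as true for bool arguments is an assumption; the `foo` signature follows the argument names listed above.

```python
import inspect


def cast_kwargs(fn, raw: dict[str, str]) -> dict:
    """Hypothetical sketch: remove empty values and cast by annotated type."""
    params = inspect.signature(fn).parameters
    out = {}
    for key, value in raw.items():
        if value == "":
            continue  # empty strings are dropped so the function default applies
        annotation = params[key].annotation if key in params else str
        if annotation is bool:
            out[key] = value.lower() in ("true", "1", "yes")  # assumption
        elif annotation in (int, float):
            out[key] = annotation(value)
        else:
            out[key] = value
    return out


def foo(name: str, age: int, salary: float, is_customer: bool, address: str = "default"):
    return name, age, salary, is_customer, address


kwargs = cast_kwargs(
    foo,
    {"name": "y", "age": "100", "salary": "100.5", "is_customer": "true", "address": ""},
)
print(kwargs)       # {'name': 'y', 'age': 100, 'salary': 100.5, 'is_customer': True}
print(foo(**kwargs))  # ('y', 100, 100.5, True, 'default')
```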

0.6.1

  • Made ProductInfo.version a cached_property to avoid failure when comparing wheel uploads in development (#105). In this release, the apply method of a class has been updated to sort upgrade scripts in semantic versioning order before applying them, addressing potential issues with version comparison during development. The implementation of ProductInfo.version has been refactored to a cached_property called _version, which calculates and caches the project version, addressing a failure during wheel upload comparisons in development. The Wheels class constructor has also been updated to include explicit keyword-only arguments, and a deprecation warning has been added. These changes aim to improve the reliability and predictability of the upgrade process and the library as a whole.
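
Why caching helps can be shown with a small sketch: if the computed version could differ between calls, two lookups on the same object would disagree. `ProductInfoSketch` and its counter-based version string are hypothetical and unrelated to how the library actually derives versions.

```python
from functools import cached_property
from itertools import count


class ProductInfoSketch:
    """Hypothetical stand-in showing a version computed once per instance."""

    def __init__(self) -> None:
        self._counter = count(1)

    @cached_property
    def _version(self) -> str:
        # Pretend the suffix changes on every computation (e.g. a build stamp).
        return f"0.6.1+{next(self._counter)}"

    def version(self) -> str:
        return self._version


info = ProductInfoSketch()
print(info.version(), info.version())  # 0.6.1+1 0.6.1+1
```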

Dependency updates:

  • Bump actions/checkout from 4.1.5 to 4.1.6 (#106).

0.6.0

  • Added upstream wheel uploads for Databricks Workspaces without Public Internet access (#99). This commit introduces a new feature for uploading upstream wheel dependencies to Databricks Workspaces without Public Internet access. A new flag has been added to upload functions, allowing users to include or exclude dependencies in the download list. The WheelsV2 class has been updated with a new method, upload_wheel_dependencies(prefixes), which checks if each wheel's name starts with any of the provided prefixes before uploading it to the Workspace File System (WSFS). This feature also includes two new tests to verify the functionality of uploading the main wheel package and dependent wheel packages, optimizing downloads based on specific use cases. This enables users to more easily use the package in offline environments with restricted internet access, particularly for Databricks Workspaces with extra layers of network security.
  • Fixed bug for double-uploading of unreleased wheels in air-gapped setups (#103). In this release, we have addressed a bug in the upload_wheel_dependencies method of the WheelsV2 class, which caused double-uploading of unreleased wheels in air-gapped setups. This issue occurred due to the condition if wheel.name == self._local_wheel.name not being met, resulting in undefined behavior. We have introduced a cached property _current_version to tackle this bug for unreleased versions uploaded to air-gapped workspaces. We also added a new method, upload_to_wsfs(), that uploads files to the workspace file system (WSFS) in the integration test. This release also includes new tests to ensure that only the Databricks SDK is uploaded and that the number of installation files is correct. These changes have resolved the double-uploading issue, and the number of installation files, Databricks SDK, Blueprint, and version.json metadata are now uploaded correctly to WSFS.
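
The prefix check described for upload_wheel_dependencies(prefixes) amounts to a simple filter on wheel file names. The `select_wheels` function and the file names below are hypothetical examples, not the WheelsV2 implementation.

```python
def select_wheels(wheel_names: list[str], prefixes: list[str]) -> list[str]:
    """Hypothetical sketch: keep wheels whose file name starts with any prefix."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return [name for name in wheel_names if name.startswith(tuple(prefixes))]


wheels = [
    "databricks_sdk-0.20.0-py3-none-any.whl",
    "databricks_labs_blueprint-0.6.0-py3-none-any.whl",
    "requests-2.31.0-py3-none-any.whl",
]
print(select_wheels(wheels, ["databricks_sdk", "databricks_labs"]))
```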

0.5.0

  • Added content assertion for assert_file_uploaded and assert_file_dbfs_uploaded in MockInstallation (#101). The recent commit introduces a content assertion feature to the MockInstallation class, enhancing its testing capabilities. This is achieved by adding an optional expected parameter of type bytes to the assert_file_uploaded and assert_file_dbfs_uploaded methods, allowing users to verify the uploaded content's correctness. The _assert_upload method has also been updated to accept this new parameter, ensuring the actual uploaded content matches the expected content. Furthermore, the commit includes informative docstrings for the new and updated methods, providing clear explanations of their functionality and usage. To support these improvements, new test cases test_assert_file_uploaded and test_load_empty_data_class have been added to the tests/unit/test_installation.py file, enabling more rigorous testing of the MockInstallation class and ensuring that the expected content is uploaded correctly.
  • Added handling for partial functions in parallel.Threads (#93). In this release, we have enhanced the parallel.Threads module with the ability to handle partial functions, addressing issue #93. This improvement includes the addition of a new static method, _get_result_function_signature, to obtain the signature of a function or a string representation of its arguments and keywords if it is a partial function. The _wrap_result class method has also been updated to log an error message with the function's signature if an exception occurs. Furthermore, we have added a new test case, test_odd_partial_failed, to the unit tests, ensuring that the gather function handles partial functions that raise errors correctly. The Python version required for this project remains at 3.10, and the pyproject.toml file has been updated to include "isort", "mypy", "types-PyYAML", and "types-requests" in the list of dependencies. These adjustments are aimed at improving the functionality and type checking in the parallel.Threads module.
  • Align configurations with UCX project (#96). This commit brings project configurations in line with the UCX project through various fixes and updates, enhancing compatibility and streamlining collaboration. It addresses pylint configuration warnings, adjusts GitHub Actions workflows, and refines the pyproject.toml file. Additionally, the NiceFormatter class in logger.py has been improved for better code readability, and the versioning scheme has been updated to ensure SemVer and PEP440 compliance, making it easier to manage and understand the project's versioning. Developers adopting the project will benefit from these alignments, as they promote adherence to the project's standards and up-to-date best practices.
  • Check backwards compatibility with UCX, Remorph, and LSQL (#84). This release includes an update to the dependabot configuration to check for daily updates in both the pip and github-actions package ecosystems, with a new directory parameter added for the pip ecosystem for more precise update management. Additionally, a new GitHub Actions workflow, "downstreams", has been added to ensure backwards compatibility with UCX, Remorph, and LSQL by running automated downstream checks on pull requests, merge groups, and pushes to the main branch. The workflow has appropriate permissions for writing id-tokens, reading contents, and writing pull-requests, and runs the downstreams action from the databrickslabs/sandbox repository using GITHUB_TOKEN for authentication. These changes improve the security and maintainability of the project by ensuring compatibility with downstream projects and staying up-to-date with the latest package versions, reducing the risk of potential security vulnerabilities and bugs.
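
The idea behind describing a partial function for error logging (#93) can be sketched as follows. This is an illustration only: describe_callable is a hypothetical helper and is not the _get_result_function_signature method itself.

```python
import functools
import inspect


def describe_callable(func) -> str:
    """Return a readable description of a callable, including partials (hypothetical helper)."""
    if isinstance(func, functools.partial):
        # A partial has no __name__, so describe the wrapped function and its bound arguments.
        args = [repr(arg) for arg in func.args]
        args += [f"{key}={value!r}" for key, value in func.keywords.items()]
        return f"{func.func.__name__}({', '.join(args)})"
    return f"{func.__name__}{inspect.signature(func)}"


def fails(x, y=0):
    raise ValueError("boom")


task = functools.partial(fails, 1, y=2)
description = describe_callable(task)
```

Here description holds the bound arguments of the partial, which is the information a log line needs when the task raises.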

Dependency updates:

  • Bump actions/setup-python from 4 to 5 (#89).
  • Bump softprops/action-gh-release from 1 to 2 (#87).
  • Bump actions/checkout from 2.5.0 to 4.1.2 (#88).
  • Bump codecov/codecov-action from 1 to 4 (#85).
  • Bump actions/checkout from 4.1.2 to 4.1.3 (#95).
  • Bump actions/checkout from 4.1.3 to 4.1.5 (#100).

0.4.4

  • If Threads.strict() raises just one error, don't wrap it with ManyError (#79). The strict method in the gather function of the parallel.py module in the databricks/labs/blueprint package has been updated to change the way it handles errors. Previously, if any task in the tasks sequence failed, the strict method would raise a ManyError exception containing all the errors. With this change, if only one error occurs, that error will be raised directly without being wrapped in a ManyError exception. This simplifies error handling and avoids unnecessary nesting of exceptions. Additionally, the __tracebackhide__ dunder variable has been added to the method to improve the readability of tracebacks by hiding it from the user. This update aims to provide a more streamlined and user-friendly experience for handling errors in parallel processing tasks.
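
The single-error rule can be sketched as below. This is a simplified illustration under assumptions: the ManyError class here is a local stand-in defined for the example, and raise_collected is a hypothetical helper rather than the library's strict method.

```python
class ManyError(Exception):
    """Stand-in aggregate error, defined only for this sketch."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} tasks failed")
        self.errors = list(errors)


def raise_collected(errors):
    """Raise nothing, the single error, or an aggregate, depending on the count."""
    __tracebackhide__ = True  # pytest omits frames that set this local from tracebacks
    if not errors:
        return
    if len(errors) == 1:
        # One failure: surface it directly so callers can catch its real type.
        raise errors[0]
    raise ManyError(errors)
```

With this shape, a caller that expects a KeyError from a single failing task can catch it directly instead of unpacking an aggregate.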

0.4.3

  • Fixed marshalling & unmarshalling edge cases (#76). The serialization and deserialization methods in the code have been updated to improve handling of edge cases during marshalling and unmarshalling of data. When encountering certain edge cases, the _marshal_list method will now return an empty list instead of None, and both the _unmarshal and _unmarshal_dict methods will return None as is if the input is None. Additionally, the _unmarshal method has been updated to call _unmarshal_generic instead of checking if the type reference is a dictionary or list when it is a generic alias. The _unmarshal_generic method has also been updated to handle cases where the input is None. A new test case, test_load_empty_data_class(), has been added to the tests/unit/test_installation.py file to verify this behavior, ensuring that the correct behavior is maintained when encountering these edge cases during the marshalling and unmarshalling processes. These changes increase the reliability of the serialization and deserialization processes.

0.4.2

  • Fixed edge cases when loading typing.Dict, typing.List and typing.ClassVar (#74). In this release, we have implemented changes to improve the handling of edge cases related to the Python typing.Dict, typing.List, and typing.ClassVar during serialization and deserialization of dataclasses and generic types. Specifically, we have modified the _marshal and _unmarshal functions to check for the __origin__ attribute to determine whether the type is a ClassVar and skip it if it is. The _marshal_dataclass and _unmarshal_dataclass functions now check for the __dataclass_fields__ attribute to ensure that only dataclass fields are marshaled and unmarshaled. We have also added a new unit test for loading a complex data class using the MockInstallation class, which contains various attributes such as a string, a nested dictionary, a list of Policy objects, and a dictionary mapping string keys to Policy objects. This test case checks that the installation object correctly serializes and deserializes the ComplexClass instance to and from JSON format according to the specified attribute types, including handling of the typing.Dict, typing.List, and typing.ClassVar types. These changes improve the reliability and robustness of our library in handling complex data types defined in the typing module.
  • MockPrompts.extend() now returns a copy (#72). In the latest release, the extend() method in the MockPrompts class of the tui.py module has been enhanced. Previously, extend() would modify the original MockPrompts object, which could lead to issues when reusing the same object in multiple places within the same test, as its state would be altered each time extend() was called. This has been addressed by updating the extend() method to return a copy of the MockPrompts object with the updated patterns and answers, instead of modifying the original object. This change ensures that the original MockPrompts object can be securely reused in multiple test scenarios without unintended side effects, preserving the integrity of the original state. Furthermore, additional tests have been incorporated to verify the correct behavior of both the new and original prompts.
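
The copy-on-extend behaviour can be sketched as follows. FakePrompts is a hypothetical stand-in written for this example, not the MockPrompts class; only the idea of returning a new object instead of mutating the original is taken from the entry above.

```python
import re


class FakePrompts:
    """Hypothetical prompt mock that maps question patterns to canned answers."""

    def __init__(self, patterns_to_answers: dict[str, str]):
        self._answers = dict(patterns_to_answers)

    def extend(self, patterns_to_answers: dict[str, str]) -> "FakePrompts":
        # Build a new object instead of mutating self, so the original stays reusable.
        return FakePrompts({**self._answers, **patterns_to_answers})

    def question(self, text: str) -> str:
        # Try longer patterns first so that more specific patterns win.
        for pattern in sorted(self._answers, key=len, reverse=True):
            if re.search(pattern, text):
                return self._answers[pattern]
        raise ValueError(f"not mocked: {text}")


base = FakePrompts({r"Continue\?": "yes"})
extended = base.extend({r"Name": "blueprint"})
```

After the call, base still knows only its original pattern, while extended answers both.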

0.4.1

  • Fixed MockInstallation to emulate workspace-global setup (#69). In this release, the MockInstallation class in the installation module has been updated to better replicate a workspace-global setup, enhancing testing and development accuracy. The is_global method now utilizes the product method instead of _product, and a new instance variable _is_global with a default value of True is introduced in the __init__ method. Moreover, a new product method is included, which consistently returns the string "mock". These changes ensure that a MockInstallation instance behaves as a global installation, which makes tests that rely on it more precise and reliable.
  • Improved MockPrompts with extend() method (#68). In this release, we've added an extend() method to the MockPrompts class in our library's TUI module. This new method allows developers to add new patterns and corresponding answers to the existing list of questions and answers in a MockPrompts object. The added patterns are compiled as regular expressions and the questions and answers list is sorted by the length of the regular expression patterns in descending order. This feature is particularly useful for writing tests where prompt answers need to be changed, as it enables better control and customization of prompt responses during testing. By extending the list of questions and answers, you can handle additional prompts without modifying the existing ones, resulting in more organized and maintainable test code. If a prompt hasn't been mocked, attempting to ask a question with it will raise a ValueError with an appropriate error message.
  • Use Hatch v1.9.4 as build machine requirement (#70). The Hatch package version for the build machine requirement has been updated from 1.7.0 to 1.9.4 in this change. This update streamlines the Hatch setup and version management, removing the specific installation step and listing hatch directly in the required field. The pre-setup command now only includes "hatch env create". Additionally, the acceptance tool version has been updated to ensure consistent project building and testing with the specified Hatch version. This change is implemented in the acceptance workflow file and the version of the acceptance tool used by the sandbox. This update ensures that the project can utilize the latest features and bug fixes available in Hatch 1.9.4, improving the reliability and efficiency of the build process. This change is part of the resolution of issue #70.

0.4.0

  • Added commands with interactive prompts (#66). This commit introduces a new feature in the Databricks Labs project to support interactive prompts in the command-line interface (CLI) for enhanced user interactivity. The Prompts argument, imported from databricks.labs.blueprint.tui, is now integrated into the @app.command decorator, enabling the creation of commands with user interaction like confirmation prompts. An example of this is the me command, which confirms whether the user wants to proceed before displaying the current username. The commit also refactored the code to make it more efficient and maintainable, removing redundancy in creating client instances. The AccountClient and WorkspaceClient instances can now be provided automatically with the product name and version. These changes improve the CLI by making it more interactive, user-friendly, and adaptable to various use cases while also optimizing the codebase for better efficiency and maintainability.
  • Added more code documentation (#64). This release introduces new features and updates to various files in the open-source library. The cli.py file in the src/databricks/labs/blueprint directory has been updated with a new decorator, command, which registers a function as a command. The entrypoint.py file in the databricks.labs.blueprint module now includes a module-level docstring describing its purpose, as well as documentation for the various standard libraries it imports. The Installation class in the installers.py file has new methods for handling files, such as load, load_or_default, upload, load_local, and files. The installers.py file also includes a new InstallationState dataclass, which is used to track installations. The limiter.py file now includes code documentation for the RateLimiter class and the rate_limited decorator, which are used to limit the rate of requests. The logger.py file includes a new NiceFormatter class, which provides a nicer format for logging messages with colors and bold text if the console supports it. The parallel.py file has been updated with new methods for running tasks in parallel and returning results and errors. The tui.py file has been documented, and includes imports for logging, regular expressions, and the collections abstract base classes. Lastly, the upgrades.py file has been updated with additional code documentation and new methods for loading and applying upgrade scripts. Overall, these changes improve the functionality, maintainability, and usability of the open-source library.
  • Fixed init-project command (#65). In this release, the init-project command has been improved with several bug fixes and new functionalities. A new import statement for the sys module has been added, and a docs directory is now included in the copied directories and files during initialization. The init_project function has been updated to open files using the default system encoding, ensuring proper reading and writing of file contents. The relative_paths function in the entrypoint.py file now returns absolute paths if the common path is the root directory, addressing issue #41. Additionally, several test functions have been added to tests/unit/test_entrypoint.py, enhancing the reliability and robustness of the init-project command by providing comprehensive tests for supporting functions. Overall, these changes significantly improve the functionality and reliability of the init-project command, ensuring a more consistent and accurate project initialization process.
  • Using ProductInfo with integration tests (#63). In this update, the ProductInfo class has been enhanced with a new class method for_testing(klass) to facilitate effective integration testing. This method generates a new ProductInfo object with a random product_name, enabling the creation of distinct installation directories for each test execution. Prior to this change, conflicts and issues could arise when multiple test executions shared the same integration test folder. With the introduction of this new method, developers can now ensure that their integration tests run with unique product names and separate installation directories, enhancing testing isolation and accuracy. This update is demonstrated in the provided code snippet and includes a new test case to confirm the generation of unique product names. Furthermore, a pre-existing test case has been modified to provide a more specific error message related to the SingleSourceVersionError. This enhancement aims to improve the integration testing capabilities of the codebase and is designed to be easily adopted by other software engineers utilizing this project.
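
The idea of a randomly named product for test isolation can be sketched as below. FakeProductInfo is a hypothetical stand-in, its for_testing method takes no klass argument, and the "test-" prefix and suffix length are assumptions made for the example; the real ProductInfo.for_testing(klass) may differ in all of these.

```python
import secrets
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeProductInfo:
    """Hypothetical stand-in that models only the product name."""

    product_name: str

    @classmethod
    def for_testing(cls) -> "FakeProductInfo":
        # A random lowercase suffix keeps installation folders from colliding across runs.
        suffix = "".join(secrets.choice(string.ascii_lowercase) for _ in range(8))
        return cls(product_name=f"test-{suffix}")


info = FakeProductInfo.for_testing()
```

Each test run then derives its installation folder from info.product_name, so concurrent runs are very unlikely to share a folder.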

0.3.1

  • Fixed the order of marshal to handle Dataclass with as_dict before other types to avoid SerdeError (#60). In this release, we have addressed an issue that caused a SerdeError during the installation.save operation with a Dataclass object. The error was due to the order of evaluation in the _marshal_dataclass method. The order has been updated to evaluate the as_dict method first if it exists in the Dataclass, which resolves the SerdeError. To ensure the correctness of the fix, we have added a new test_data_class function that tests the save and load functionality with a Dataclass object. The test defines a Policy Dataclass with an as_dict method that returns a dictionary representation of the object and checks if the file is written correctly and if the loaded object matches the original object. This change has been thoroughly unit tested to ensure that it works as expected.
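
The ordering fix can be sketched as follows: check for a custom as_dict() before falling back to field-by-field dataclass handling. This is a simplified illustration, not the library's _marshal code; to_plain is a hypothetical helper, and the Policy fields are assumptions made for the example.

```python
import dataclasses


def to_plain(obj):
    """Convert an object to JSON-friendly data, preferring a custom as_dict() (sketch)."""
    # Check as_dict() first: a class that defines it wants control over its own output.
    if hasattr(obj, "as_dict"):
        return obj.as_dict()
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: to_plain(getattr(obj, f.name)) for f in dataclasses.fields(obj)}
    if isinstance(obj, list):
        return [to_plain(item) for item in obj]
    if isinstance(obj, dict):
        return {key: to_plain(value) for key, value in obj.items()}
    return obj


@dataclasses.dataclass
class Policy:
    policy_id: str
    name: str

    def as_dict(self) -> dict:
        # Upper-casing the name makes it visible that this method produced the output.
        return {"policy_id": self.policy_id, "name": self.name.upper()}


result = to_plain(Policy("p1", "default"))
```

If the dataclass branch came first, the custom as_dict() would never be consulted, which is the kind of ordering problem the entry describes.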

0.3.0

  • Added automated upgrade framework (#50). This update introduces an automated upgrade framework for managing and applying upgrades to the product, with a new upgrades.py file that includes a ProductInfo class having methods for version handling, wheel building, and exception handling. The test code organization has been improved, and new test cases, functions, and a directory structure for fixtures and unit tests have been added for the upgrades functionality. The test_wheels.py file now checks the version of the Databricks SDK and handles cases where the version marker is missing or does not contain the __version__ variable. Additionally, a new Application State Migrations section has been added to the README, explaining the process of seamless upgrades from version X to version Z through version Y, addressing the need for configuration or database state migrations as the application evolves. Users can apply these upgrades by following an idiomatic usage pattern involving several classes and functions. Furthermore, improvements have been made to the _trim_leading_whitespace function in the commands.py file of the databricks.labs.blueprint module, ensuring accurate and consistent removal of leading whitespace for each line in the command string, leading to better overall functionality and maintainability.
  • Added brute-forcing SerdeError with as_dict() and from_dict() (#58). This commit introduces a brute-forcing approach for handling SerdeError using as_dict() and from_dict() methods in an open-source library. The new SomePolicy class demonstrates the usage of these methods for manual serialization and deserialization of custom classes. The as_dict() method returns a dictionary representation of the class instance, and the from_dict() method, decorated with @classmethod, creates a new instance from the provided dictionary. Additionally, the GitHub Actions workflow for acceptance tests has been updated to include the ready_for_review event type, ensuring that tests run not only for opened and synchronized pull requests but also when marked as "ready for review." These changes provide developers with more control over the deserialization process and facilitate debugging in cases where default deserialization fails, but should be used judiciously to avoid brittle code.
  • Fixed nightly integration tests run as service principals (#52). In this release, we have enhanced the compatibility of our codebase with service principals, particularly in the context of nightly integration tests. The Installation class in the databricks.labs.blueprint.installation module has been refactored, deprecating the current method and introducing two new methods: assume_global and assume_user_home. These methods enable users to install and manage blueprint as either a global or user-specific installation. Additionally, the existing method has been updated to work with the new Installation methods. In the test suite, the test_installation.py file has been updated to correctly detect global and user-specific installations when running as a service principal. These changes improve the testability and functionality of our software, ensuring seamless operation with service principals during nightly integration tests.
  • Made test_existing_installations_are_detected more resilient (#51). In this release, we have added a new test function test_existing_installations_are_detected that checks if existing installations are correctly detected and retries the test for up to 15 seconds if they are not. This improves the reliability of the test by making it more resilient to potential intermittent failures. We have also added an import from databricks.sdk.retries named retried which is used to retry the test function in case of an AssertionError. Additionally, the test function test_existing has been renamed to test_existing_installations_are_detected and the xfail marker has been removed. We have also renamed the test function test_dataclass to test_loading_dataclass_from_installation for better clarity. This change will help ensure that the library is correctly detecting existing installations and improve the overall quality of the codebase.
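
Retrying a check on AssertionError until a timeout, as described for test_existing_installations_are_detected, can be sketched as below. The entry above names retried from databricks.sdk.retries; this sketch does not use it and instead defines a hypothetical retry_on_assertion decorator so that the example is self-contained.

```python
import functools
import time


def retry_on_assertion(timeout_seconds: float, interval_seconds: float = 0.01):
    """Retry the wrapped function on AssertionError until the timeout elapses (hypothetical)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            deadline = time.monotonic() + timeout_seconds
            while True:
                try:
                    return func(*args, **kwargs)
                except AssertionError:
                    # Give up and re-raise once the deadline has passed.
                    if time.monotonic() >= deadline:
                        raise
                    time.sleep(interval_seconds)

        return wrapper

    return decorator


attempts = []


@retry_on_assertion(timeout_seconds=15)
def check_installations_detected():
    attempts.append(1)
    # Simulate a lookup that only succeeds on the third attempt.
    assert len(attempts) >= 3
    return len(attempts)


detected_after = check_installations_detected()
```

The simulated check succeeds on its third attempt, well within the 15-second budget mentioned in the entry.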

0.2.5

  • Automatically enable workspace filesystem if the feature is disabled (#42).

0.2.4

  • Added more integration tests for Installation (#39).
  • Fixed yaml optional import error (#38).

0.2.3

  • Added special handling for notebooks in Installation.upload(...) (#36).

0.2.2

  • Fixed issues with uploading wheels to DBFS and loading a non-existing install state (#34).

0.2.1

  • Aligned Installation framework with UCX project (#32).

0.2.0

  • Added common install state primitives with strong typing (#27).
  • Added documentation for Invoking Databricks Connect (#28).
  • Added more documentation for Databricks CLI command router (#30).
  • Enforced pylint standards (#29).

0.1.0

  • Changed python requirement from 3.10.6 to 3.10 (#25).

0.0.6

  • Make find_project_root more deterministic (#23).

0.0.5

  • Make it work with ucx (#21).

0.0.4

  • Fixed sigstore action (#19).

0.0.3

  • Sign artifacts with Sigstore (#17).

0.0.2

  • Added extensive library documentation (#14).
  • Setup release to PyPI via GitHub OIDC (#15).

Release 0.0.1

  • Added .codegen.json and CHANGELOG.md templates for automated releases.
  • Added CODEOWNERS for code governance.
  • Added command framework for Databricks CLI launcher frontend (#10).
  • Added ProductInfo unreleased version fallback (#9).