Is your feature request related to a problem? Please describe.
Add a unit test to validate the DICOM metadata extraction UDF.
Describe the solution you'd like
The real challenge will be adding all the Python artifacts necessary to run via db-connect v2.
This is the test_dicom.py that was started and needs to be abandoned...
import pytest
from databricks.connect import DatabricksSession
from pyspark.sql import SparkSession

from dbx.pixels import Catalog
from dbx.pixels.dicom import DicomMetaExtractor, DicomThumbnailExtractor  # The Dicom transformers
from dbx.pixels.version import __version__

# Public sample DICOM files and the catalog table used by the test
path = "s3://hls-eng-data-public/dicom/ddsm/benigns/patient0007/"
table = "main.pixels_solacc.object_catalog"


@pytest.fixture
def spark() -> SparkSession:
    """
    Create a SparkSession (the entry point to Spark functionality) on
    the cluster in the remote Databricks workspace. Unit tests do not
    have access to this SparkSession by default.
    """
    return DatabricksSession.builder.getOrCreate()


def test_dicom_happy(spark):
    import datetime

    # Catalog the sample files, then extract DICOM metadata from each one
    catalog = Catalog(spark, table=table)
    catalog_df = catalog.catalog(path=path)
    meta_df = DicomMetaExtractor(catalog).transform(catalog_df)
    assert meta_df.count() == 4
    assert len(meta_df.columns) == 9
    # explain() prints the query plan and returns None
    assert meta_df.explain(extended=True) is None
    row = meta_df.selectExpr("left(meta,500)").collect()[0]
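One possible approach to the artifact problem mentioned above is sketched here, untested against this repo. It zips a local package directory and registers the zip with the remote session through `addArtifacts(..., pyfile=True)`, which PySpark's Spark Connect session provides. The helper names `zip_package` and `ship_package` are hypothetical. Whether shipping the package alone is enough for the `dbx.pixels` UDFs, or whether their third-party dependencies must also be installed on the cluster, is an open assumption.

```python
import zipfile
from pathlib import Path


def zip_package(package_dir: str, zip_path: str) -> str:
    """Zip the .py files of a package directory (hypothetical helper)."""
    pkg = Path(package_dir).resolve()
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(pkg.rglob("*.py")):
            # Keep the top-level package name in the archive path so that
            # "import <package>" resolves once the zip is on sys.path.
            zf.write(file, file.relative_to(pkg.parent).as_posix())
    return zip_path


def ship_package(spark, package_dir: str, zip_path: str) -> None:
    """Upload the zipped package to the remote session.

    Assumes `spark` is a Spark Connect session exposing addArtifacts().
    """
    spark.addArtifacts(zip_package(package_dir, zip_path), pyfile=True)
```

In the `spark` fixture, this could be called once after `getOrCreate()`, for example `ship_package(session, "dbx", "/tmp/dbx.zip")`, where both paths are placeholders for wherever the package source and scratch directory live.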
Describe alternatives you've considered
The automated notebook execution jobs do cover these code paths at a higher level.
Additional context
See internal Slack conversations.