Unexpected exception when reading non-qbeast-formatted data #53

Closed
eavilaes opened this issue Dec 9, 2021 · 2 comments
Labels
good first issue (Good for newcomers), type: bug (Something isn't working)

Comments

@eavilaes
Contributor

eavilaes commented Dec 9, 2021

What went wrong?
The exception below should be thrown when loading data that is not in qbeast format, or when the path does not exist. This works as expected when the path does not exist; however, a different exception is thrown when the path exists but contains non-qbeast-formatted data:

if (table.exists) {
  table.load()
} else {
  throw AnalysisExceptionFactory.create(
    s"'$tableID' is not a Qbeast formatted data directory.")
}

My conclusion is that the format of the table is not checked, since the same thing happens when loading a table indexed with an old version of the qbeast-spark format.
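To illustrate the conclusion above, here is a minimal, self-contained sketch of the kind of check that appears to be missing. It is not the qbeast-spark implementation: `TableState`, `QbeastFormatCheck` and `loadOrFail` are hypothetical names, and modelling the table as "exists plus a map of revisions" is an assumption based only on the stack trace in this report, where the revision lookup falls through on an empty map.

```scala
// Hypothetical model of what is known about a path; not part of qbeast-spark.
final case class TableState(exists: Boolean, revisions: Map[Long, String])

object QbeastFormatCheck {

  // Returns the latest revision on success, or the expected error message
  // when the path is missing OR holds no Qbeast revision metadata.
  def loadOrFail(tableID: String, state: TableState): Either[String, String] =
    if (state.exists && state.revisions.nonEmpty) {
      // Assumption: a table with at least one revision counts as Qbeast formatted.
      Right(state.revisions(state.revisions.keys.max))
    } else {
      Left(s"'$tableID' is not a Qbeast formatted data directory.")
    }
}
```

With a check of this shape, an existing delta table with no revisions would produce the same message as a non-existing path, instead of surfacing the revision error.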

How to reproduce?

  1. Code that triggered the bug, or steps to reproduce:
  • Load a non-existing path:
val df = spark.read.format("qbeast").load("nonExistingPath")
org.apache.spark.sql.AnalysisException: 'nonExistingPath' is not a Qbeast formatted data directory.
  • But try to load a delta-formatted table or a qbeast table written with the old version of the format, and the exception will refer to the revision:
val df = spark.read.format("qbeast").load("deltaTablePath")
org.apache.spark.sql.AnalysisException: No space revision available with -1
  2. Branch and commit id:
    main, d9bd04a

  3. Spark version:
    3.1.1

  4. Hadoop version:
    3.2.0

  5. Are you running Spark inside a container? Are you launching the app on a remote K8s cluster? Or are you just running the tests in a local computer?
    N/A

  6. Stack trace:

val df = spark.read.format("qbeast").load("deltaTablePath")
org.apache.spark.sql.AnalysisException: No space revision available with -1
  at org.apache.spark.sql.AnalysisExceptionFactory$.create(AnalysisExceptionFactory.scala:36)
  at io.qbeast.spark.delta.DeltaQbeastSnapshot.$anonfun$getRevision$1(DeltaQbeastSnapshot.scala:81)
  at scala.collection.immutable.Map$EmptyMap$.getOrElse(Map.scala:104)
  at io.qbeast.spark.delta.DeltaQbeastSnapshot.getRevision(DeltaQbeastSnapshot.scala:81)
  at io.qbeast.spark.delta.DeltaQbeastSnapshot.loadLatestRevision(DeltaQbeastSnapshot.scala:140)
  at io.qbeast.spark.internal.sources.QbeastBaseRelation$.forDeltaTable(QbeastBaseRelation.scala:43)
  at io.qbeast.spark.table.IndexedTableImpl.createQbeastBaseRelation(IndexedTable.scala:194)
  at io.qbeast.spark.table.IndexedTableImpl.load(IndexedTable.scala:171)
  at io.qbeast.spark.internal.sources.QbeastDataSource.createRelation(QbeastDataSource.scala:90)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:354)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:326)
  at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:306)
  at scala.Option.map(Option.scala:230)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:266)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:240)
  ... 47 elided
@eavilaes eavilaes added the type: bug Something isn't working label Dec 9, 2021
@osopardo1
Member

We can fix this with #44

@osopardo1 osopardo1 added the low label Dec 21, 2021
@eavilaes eavilaes linked a pull request Jan 10, 2022 that will close this issue
3 tasks
@osopardo1 osopardo1 added the good first issue Good for newcomers label Jun 8, 2022
@eavilaes eavilaes removed their assignment Sep 5, 2022
@osopardo1
Member

I will close this issue because it is more related to #121 and #102.
