Migrate schemaParser property to avroSchemaParser task #33

Merged · 9 commits · Aug 7, 2020
26 changes: 15 additions & 11 deletions README.md
@@ -16,7 +16,7 @@ Add the plugin according to the [sbt documentation](https://www.scala-sbt.org/1.
For instance, add the following lines to `project/plugins.sbt`:

```
addSbtPlugin("com.cavorite" % "sbt-avro" % "2.1.0")
addSbtPlugin("com.cavorite" % "sbt-avro" % "3.0.0-SNAPSHOT")

// Java sources compiled with one version of Avro might be incompatible with a
// different version of the Avro library. Therefore we specify the compiler
@@ -33,16 +33,19 @@ libraryDependencies += "org.apache.avro" % "avro" % "1.10.0"

## Settings

| Name | Default | Description |
|:------------------------------------|:-------------------------------------------|:------------|
| `avroSource` | `sourceDirectory` / `avro` | Source directory with `*.avsc`, `*.avdl` and `*.avpr` files. |
| `avroUnpackDependencies` / `target` | `sourceManaged` / `avro` | Source directory for schemas packaged in the dependencies |
| `avroGenerate` / `taget` | `sourceManaged` / `compiled_avro` | Source directory for generated `.java` files. |
| `avroDependencyIncludeFilter` | `source` typed `avro` classifier artifacts | Dependencies containing avro schema to be unpacked for generation |
| `avroStringType` | `CharSequence` | Type for representing strings. Possible values: `CharSequence`, `String`, `Utf8`. |
| `avroUseNamespace` | `false` | Validate that directory layout reflects namespaces, i.e. `com/myorg/MyRecord.avsc`. |
| `avroFieldVisibility` | `public_deprecated` | Field Visibility for the properties. Possible values: `private`, `public`, `public_deprecated`. |
| `avroEnableDecimalLogicalType` | `true` | Set to true to use `java.math.BigDecimal` instead of `java.nio.ByteBuffer` for logical type `decimal`. |
| Name | Default | Description |
|:-------------------------------------|:-------------------------------------------|:------------|
| `avroSource` | `sourceDirectory` / `avro` | Source directory with `*.avsc`, `*.avdl` and `*.avpr` files. |
| `avroSchemaParserBuilder`             | `DefaultSchemaParserBuilder.default()`      | Builder for the `.avsc` schema parser. |
| `avroUnpackDependencies` / `target`   | `sourceManaged` / `avro`                    | Source directory for schemas unpacked from dependencies. |
| `avroGenerate` / `target`             | `sourceManaged` / `compiled_avro`           | Source directory for generated `.java` files. |
| `avroDependencyIncludeFilter`         | `source` typed `avro` classifier artifacts  | Dependencies containing Avro schemas to be unpacked for generation. |
| `packageAvro` / `artifactClassifier`  | `Some("avro")`                              | Classifier for the published Avro artifact. |
| `packageAvro` / `publishArtifact`     | `false`                                     | Enable / disable publishing of the Avro artifact. |
| `avroStringType` | `CharSequence` | Type for representing strings. Possible values: `CharSequence`, `String`, `Utf8`. |
| `avroUseNamespace` | `false` | Validate that directory layout reflects namespaces, i.e. `com/myorg/MyRecord.avsc`. |
| `avroFieldVisibility` | `public_deprecated` | Field Visibility for the properties. Possible values: `private`, `public`, `public_deprecated`. |
| `avroEnableDecimalLogicalType` | `true` | Set to true to use `java.math.BigDecimal` instead of `java.nio.ByteBuffer` for logical type `decimal`. |

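For example, a minimal `build.sbt` sketch that overrides the new setting (assuming only the `DefaultSchemaParserBuilder` case class introduced in this change) might look like:

```
// a minimal sketch: keep the default parser builder but turn off name validation
import sbtavro.DefaultSchemaParserBuilder

avroSchemaParserBuilder := DefaultSchemaParserBuilder.default()
  .copy(validate = false)
```

Any other implementation of `SchemaParserBuilder` can be assigned in the same way.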
## Examples

@@ -108,3 +111,4 @@ This program is distributed under the BSD license. See the file `LICENSE` for mo
- [Przemysław Dubaniewicz](https://github.com/przemekd)
- [Neville Li](https://github.com/nevillelyh)
- [Michel Davit](https://github.com/RustedBones)
- [Mārtiņš Kalvāns](https://github.com/sisidra)
88 changes: 51 additions & 37 deletions src/main/java/com/spotify/avro/mojo/AvscFilesCompiler.java
@@ -19,6 +19,7 @@ public class AvscFilesCompiler {

private static final Logger LOG = LoggerFactory.getLogger(AvscFilesCompiler.class);

private final SchemaParserBuilder builder;
private Schema.Parser schemaParser;
private String templateDirectory;
private GenericData.StringType stringType;
@@ -29,8 +30,9 @@ public class AvscFilesCompiler {
private Map<AvroFileRef, Exception> compileExceptions;
private boolean logCompileExceptions;

public AvscFilesCompiler(Schema.Parser schemaParser) {
this.schemaParser = schemaParser;
public AvscFilesCompiler(SchemaParserBuilder builder) {
this.builder = builder;
this.schemaParser = builder.build();
}

public void compileFiles(Set<AvroFileRef> files, File outputDirectory) {
@@ -74,46 +76,27 @@ public void compileFiles(Set<AvroFileRef> files, File outputDirectory) {
}

private boolean tryCompile(AvroFileRef src, File outputDirectory) {
// on failure Schema.Parser changes cache state.
// We want last successful state.
Schema.Parser successfulSchemaParser = new Schema.Parser();
successfulSchemaParser.addTypes(schemaParser.getTypes());
successfulSchemaParser.setValidate(schemaParser.getValidate());
successfulSchemaParser.setValidateDefaults(schemaParser.getValidateDefaults());

Schema.Parser successfulSchemaParser = stashParser();
final Schema schema;
try {
Schema schema = schemaParser.parse(src.getFile());

if (useNamespace) {
if (schema.getType() != Schema.Type.RECORD && schema.getType() != Schema.Type.ENUM) {
throw new SchemaGenerationException(String.format(
"Error compiling schema file %s. "
+ "Only one root RECORD or ENUM type is allowed per file.",
src
));
} else if (!src.pathToClassName().equals(schema.getFullName())) {
throw new SchemaGenerationException(String.format(
"Error compiling schema file %s. "
+ "File class name %s does not match record class name %s",
src,
src.pathToClassName(),
schema.getFullName()
));
}
}

SpecificCompiler compiler = new SpecificCompiler(schema);
compiler.setTemplateDir(templateDirectory);
compiler.setStringType(stringType);
compiler.setFieldVisibility(fieldVisibility);
compiler.setEnableDecimalLogicalType(enableDecimalLogicalType);
compiler.setCreateSetters(createSetters);
compiler.compileToDestination(src.getFile(), outputDirectory);

schema = schemaParser.parse(src.getFile());
validateParsedSchema(src, schema);
} catch (SchemaParseException e) {
schemaParser = successfulSchemaParser;
compileExceptions.put(src, e);
return false;
} catch (IOException e) {
throw new SchemaGenerationException(String.format("Error parsing schema file %s", src), e);
}

SpecificCompiler compiler = new SpecificCompiler(schema);
compiler.setTemplateDir(templateDirectory);
compiler.setStringType(stringType);
compiler.setFieldVisibility(fieldVisibility);
compiler.setEnableDecimalLogicalType(enableDecimalLogicalType);
compiler.setCreateSetters(createSetters);
try {
compiler.compileToDestination(src.getFile(), outputDirectory);
} catch (IOException e) {
throw new SchemaGenerationException(
String.format("Error compiling schema file %s to %s", src, outputDirectory), e);
@@ -122,6 +105,37 @@ private boolean tryCompile(AvroFileRef src, File outputDirectory) {
return true;
}

private Schema.Parser stashParser() {
// on failure Schema.Parser changes cache state.
// We want last successful state.
Schema.Parser parser = builder.build();
Set<String> predefinedTypes = parser.getTypes().keySet();
Map<String, Schema> compiledTypes = schemaParser.getTypes();
compiledTypes.keySet().removeAll(predefinedTypes);
parser.addTypes(compiledTypes);
return parser;
}

private void validateParsedSchema(AvroFileRef src, Schema schema) {
if (useNamespace) {
if (schema.getType() != Schema.Type.RECORD && schema.getType() != Schema.Type.ENUM) {
throw new SchemaGenerationException(String.format(
"Error compiling schema file %s. "
+ "Only one root RECORD or ENUM type is allowed per file.",
src
));
} else if (!src.pathToClassName().equals(schema.getFullName())) {
throw new SchemaGenerationException(String.format(
"Error compiling schema file %s. "
+ "File class name %s does not match record class name %s",
src,
src.pathToClassName(),
schema.getFullName()
));
}
}
}

public void setTemplateDirectory(String templateDirectory) {
this.templateDirectory = templateDirectory;
}
7 changes: 7 additions & 0 deletions src/main/java/com/spotify/avro/mojo/SchemaParserBuilder.java
@@ -0,0 +1,7 @@
package com.spotify.avro.mojo;

import org.apache.avro.Schema;

public interface SchemaParserBuilder {
Schema.Parser build();
}

Review comments:

Collaborator: Why not keep `withTypes`, `withValidate` and `withValidateDefaults`, since every `Schema.Parser` has to support those?

Contributor Author: When users implement their own builder, they will most likely set those values themselves within it. The default implementation still allows configuring the validations. This approach keeps the API footprint smaller, gives users more flexibility in how they implement the builder, and does not tie us to future changes in the Avro API (adding more validations, removing some?).

Contributor Author: Then again, adding those functions would give users easier control over validations if an org infra team provided the custom builder (as I imagine it would be for us at Spotify :)).
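To make the trade-off above concrete, here is a hypothetical user-defined builder (a sketch, not part of this PR) that configures validation entirely inside its single `build()` method:

```
import com.spotify.avro.mojo.SchemaParserBuilder
import org.apache.avro.Schema

// hypothetical sketch: the builder decides how validation is configured,
// which is the author's argument for keeping the interface to build() only
object LenientSchemaParserBuilder extends SchemaParserBuilder {
  override def build(): Schema.Parser = {
    val parser = new Schema.Parser()
    parser.setValidate(false)        // skip name validation
    parser.setValidateDefaults(true) // still validate default values
    parser
  }
}
```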
33 changes: 33 additions & 0 deletions src/main/scala/sbtavro/DefaultSchemaParserBuilder.scala
@@ -0,0 +1,33 @@
package sbtavro

import com.spotify.avro.mojo.SchemaParserBuilder
import org.apache.avro.Schema
import scala.collection.JavaConverters._

case class DefaultSchemaParserBuilder(types: Iterable[Schema],
validate: Boolean,
validateDefaults: Boolean)
extends SchemaParserBuilder {

override def build(): Schema.Parser = {
val parser = new Schema.Parser
parser.addTypes(types.map(el => el.getFullName() -> el).toMap.asJava)
parser.setValidate(validate)
parser.setValidateDefaults(validateDefaults)
parser
}
}

object DefaultSchemaParserBuilder {
def default(): DefaultSchemaParserBuilder = {
template(new Schema.Parser())
}

def template(template: Schema.Parser): DefaultSchemaParserBuilder = {
DefaultSchemaParserBuilder(
template.getTypes.values().asScala,
template.getValidate,
template.getValidateDefaults
)
}
}
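As a usage sketch (the enum name and namespace below are made up), `template` lets a build seed the parser with types that every schema file may reference:

```
// build.sbt — hypothetical sketch: pre-register a shared type via a template parser
import org.apache.avro.Schema
import sbtavro.DefaultSchemaParserBuilder

avroSchemaParserBuilder := {
  val seeded = new Schema.Parser()
  seeded.parse(
    """{"type": "enum", "name": "Suit", "namespace": "com.example",
      |"symbols": ["SPADES", "HEARTS", "DIAMONDS", "CLUBS"]}""".stripMargin)
  DefaultSchemaParserBuilder.template(seeded)
}
```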
38 changes: 10 additions & 28 deletions src/main/scala/sbtavro/SbtAvro.scala
@@ -11,7 +11,7 @@ import org.apache.avro.{Protocol, Schema}
import sbt.Keys._
import sbt._
import Path.relativeTo
import com.spotify.avro.mojo.AvroFileRef
import com.spotify.avro.mojo.{AvroFileRef, SchemaParserBuilder}
import sbt.librarymanagement.DependencyFilter

/**
@@ -38,8 +38,7 @@ object SbtAvro extends AutoPlugin {
val avroFieldVisibility = settingKey[String]("Field visibility for the properties. Possible values: private, public, public_deprecated. Default: public_deprecated.")
val avroUseNamespace = settingKey[Boolean]("Validate that directory layout reflects namespaces, i.e. src/main/avro/com/myorg/MyRecord.avsc.")
val avroSource = settingKey[File]("Default Avro source directory.")
val avroValidate = settingKey[Boolean]("Avro Schema.Parser name validation. Default: `new Schema.Parser.getValidate()`")
val avroValidateDefaults = settingKey[Boolean]("Avro Schema.Parser default value validation. Default: `new Schema.Parser.getValidateDefaults()`")
val avroSchemaParserBuilder = settingKey[SchemaParserBuilder](".avsc schema parser builder")
val avroUnpackDependencies = taskKey[Seq[File]]("Unpack avro dependencies.")
val avroDependencyIncludeFilter = settingKey[DependencyFilter]("Filter for including modules containing avro dependencies.")

@@ -71,11 +70,6 @@
// packaging
packageAvro / artifactClassifier := Some(AvroClassifier),
packageAvro / publishArtifact := false,
// clean
clean := {
schemaParser.set(new Schema.Parser())
clean.value
}
) ++ packageTaskSettings(packageAvro, packageAvroMappings) ++ Seq(
packageAvro / artifact := (packageAvro / artifact).value.withType(Artifact.SourceType)
)
@@ -96,8 +90,7 @@
avroFieldVisibility := "public_deprecated",
avroEnableDecimalLogicalType := true,
avroUseNamespace := false,
avroValidate := schemaParser.get().getValidate,
avroValidateDefaults := schemaParser.get().getValidateDefaults
avroSchemaParserBuilder := DefaultSchemaParserBuilder.default()
)

override lazy val projectSettings: Seq[Setting[_]] = defaultSettings ++
@@ -150,18 +143,9 @@
compiler.compileToDestination(null, target)
}

val schemaParser = new AtomicReference(new Schema.Parser())

def compileAvscs(refs: Seq[AvroFileRef], target: File, stringType: StringType, fieldVisibility: FieldVisibility, enableDecimalLogicalType: Boolean, useNamespace: Boolean, validate: Boolean, validateDefaults: Boolean) {
def compileAvscs(refs: Seq[AvroFileRef], target: File, stringType: StringType, fieldVisibility: FieldVisibility, enableDecimalLogicalType: Boolean, useNamespace: Boolean, builder: SchemaParserBuilder) {
import com.spotify.avro.mojo._

val global = schemaParser.get()
// copy of global schemaParser to avoid race condition
val parser = new Schema.Parser()
.addTypes(global.getTypes)
.setValidate(validate)
.setValidateDefaults(validateDefaults)
val compiler = new AvscFilesCompiler(parser)
val compiler = new AvscFilesCompiler(builder)
compiler.setStringType(stringType)
compiler.setFieldVisibility(fieldVisibility)
compiler.setUseNamespace(useNamespace)
@@ -190,8 +174,7 @@
fieldVisibility: FieldVisibility,
enableDecimalLogicalType: Boolean,
useNamespace: Boolean,
validate: Boolean,
validateDefaults: Boolean): Set[File] = {
builder: SchemaParserBuilder): Set[File] = {
(srcDir ** AvroAvdlFilter).get.foreach { idl =>
log.info(s"Compiling Avro IDL $idl")
compileIdl(idl, target, stringType, fieldVisibility, enableDecimalLogicalType)
@@ -201,7 +184,7 @@
log.info(s"Compiling Avro schemas $avsc")
new AvroFileRef(srcDir, avsc.relativeTo(srcDir).get.toString)
}
compileAvscs(avscs, target, stringType, fieldVisibility, enableDecimalLogicalType, useNamespace, validate, validateDefaults)
compileAvscs(avscs, target, stringType, fieldVisibility, enableDecimalLogicalType, useNamespace, builder)

(srcDir ** AvroAvrpFilter).get.foreach { avpr =>
log.info(s"Compiling Avro protocol $avpr")
@@ -220,13 +203,12 @@
val fieldVis = SpecificCompiler.FieldVisibility.valueOf(avroFieldVisibility.value.toUpperCase)
val enbDecimal = avroEnableDecimalLogicalType.value
val useNs = avroUseNamespace.value
val validate = avroValidate.value
val validateDefaults = avroValidateDefaults.value
val builder = avroSchemaParserBuilder.value
val cachedCompile = {
FileFunction.cached(out.cacheDirectory / "avro", FilesInfo.lastModified, FilesInfo.exists) { _ =>
out.log.info(s"Avro compiler using stringType=$strType")
compileAvroSchema(externalSrcDir, outDir, out.log, strType, fieldVis, enbDecimal, useNs, validate, validateDefaults)
compileAvroSchema(srcDir, outDir, out.log, strType, fieldVis, enbDecimal, useNs, validate, validateDefaults)
compileAvroSchema(externalSrcDir, outDir, out.log, strType, fieldVis, enbDecimal, useNs, builder)
compileAvroSchema(srcDir, outDir, out.log, strType, fieldVis, enbDecimal, useNs, builder)

}
}
15 changes: 15 additions & 0 deletions src/sbt-test/sbt-avro/avscparser/build.sbt
@@ -0,0 +1,15 @@
import java.util.Collections.{singletonMap, singletonList}
import org.apache.avro.Schema

name := "avscparser-test"

libraryDependencies ++= Seq(
"org.apache.avro" % "avro" % "1.10.0",
"org.specs2" %% "specs2-core" % "4.9.4" % Test
)

avroSchemaParserBuilder := AnnotateWithArtifactSchemaParser
.newBuilder(projectID.value)
.copy(types = singletonMap(
"B", Schema.createEnum("B", null, "com.cavorite.test.avscparser", singletonList("B1"))
))
@@ -0,0 +1,34 @@
import com.spotify.avro.mojo.SchemaParserBuilder
import org.apache.avro.Schema
import sbt.ModuleID

class AnnotateWithArtifactSchemaParser(
moduleID: ModuleID,
types: java.util.Map[String, Schema]
) extends org.apache.avro.Schema.Parser {

addTypes(types)

override def parse(file: java.io.File): org.apache.avro.Schema = {
val schema = super.parse(file)
if (schema.getType == org.apache.avro.Schema.Type.RECORD) {
schema.addProp("com.cavorite.sbt-avro.artifact", moduleID.toString())
}
schema
}

}

object AnnotateWithArtifactSchemaParser {

case class Builder(moduleID: ModuleID, types: java.util.Map[String, Schema])
extends SchemaParserBuilder {

override def build(): Schema.Parser =
new AnnotateWithArtifactSchemaParser(moduleID, types)
}

def newBuilder(moduleID: ModuleID): AnnotateWithArtifactSchemaParser.Builder =
new Builder(moduleID, java.util.Collections.emptyMap())

}
1 change: 1 addition & 0 deletions src/sbt-test/sbt-avro/avscparser/project/build.properties
@@ -0,0 +1 @@
sbt.version=1.3.13
7 changes: 7 additions & 0 deletions src/sbt-test/sbt-avro/avscparser/project/plugins.sbt
@@ -0,0 +1,7 @@
sys.props.get("plugin.version") match {
case Some(x) => addSbtPlugin("com.cavorite" % "sbt-avro" % x)
case _ => sys.error("""|The system property 'plugin.version' is not defined.
|Specify this property using the scriptedLaunchOpts -D.""".stripMargin)
}

libraryDependencies += "org.apache.avro" % "avro-compiler" % "1.10.0"
11 changes: 11 additions & 0 deletions src/sbt-test/sbt-avro/avscparser/src/main/avro/a.avsc
@@ -0,0 +1,11 @@
{
"name": "A",
"namespace": "com.cavorite.test.avscparser",
"type": "record",
"fields": [
{
"name": "supportsCustomRegisteredType",
"type": "B"
}
]
}
@@ -0,0 +1,17 @@
package sbtavro

import java.io.File

import org.apache.avro.Schema
import org.apache.avro.generic.GenericData.StringType
import org.specs2.mutable.Specification

import com.cavorite.test.avscparser.A

class AvscParserSpec extends Specification {

"A should have artifact property" >> {
A.getClassSchema().getProp("com.cavorite.sbt-avro.artifact") == "avscparser-test:avscparser-test:0.1.0-SNAPSHOT"
}

}