Package, publish and unpack avro schema artifact (#18)
* Package, publish and unpack avro schema artifact

* Update README

* Fix caching and use separate target directory

* Introduce avroDependencyIncludeFilter setting

Extract avro schema from any desired dependency

* Use target setting

* Unpack avro sources dependencies in sourceManaged folder

Co-authored-by: Neville Li <neville@spotify.com>
RustedBones and nevillelyh committed Jun 18, 2020
1 parent dfa04ef commit 3581fe7
Showing 13 changed files with 294 additions and 43 deletions.
52 changes: 41 additions & 11 deletions README.md
@@ -33,14 +33,16 @@ libraryDependencies += "org.apache.avro" % "avro" % "1.9.2"

## Settings

| Name | Default | Description |
|:-------------------------------|:----------------------------------|:------------|
| `avroSource` | `sourceDirectory` / `avro` | Source directory with `*.avsc`, `*.avdl` and `*.avpr` files. |
| `avroGenerated / target` | `sourceManaged` / `compiled_avro` | Source directory for generated `.java` files. |
| `avroStringType` | `CharSequence` | Type for representing strings. Possible values: `CharSequence`, `String`, `Utf8`. |
| `avroUseNamespace` | `false` | Validate that directory layout reflects namespaces, i.e. `src/main/avro/com/myorg/MyRecord.avsc`. |
| `avroFieldVisibility` | `public_deprecated` | Field Visibility for the properties. Possible values: `private`, `public`, `public_deprecated`. |
| `avroEnableDecimalLogicalType` | `true` | Set to true to use `java.math.BigDecimal` instead of `java.nio.ByteBuffer` for logical type `decimal`. |
| Name | Default | Description |
|:------------------------------------|:-------------------------------------------|:------------|
| `avroSource` | `sourceDirectory` / `avro` | Source directory with `*.avsc`, `*.avdl` and `*.avpr` files. |
| `avroUnpackDependencies` / `target` | `sourceManaged` / `avro` | Source directory for schemas packaged in the dependencies. |
| `avroGenerate` / `target` | `sourceManaged` / `compiled_avro` | Source directory for generated `.java` files. |
| `avroDependencyIncludeFilter` | `source` typed `avro` classifier artifacts | Dependencies containing Avro schemas to be unpacked for generation. |
| `avroStringType` | `CharSequence` | Type for representing strings. Possible values: `CharSequence`, `String`, `Utf8`. |
| `avroUseNamespace` | `false` | Validate that directory layout reflects namespaces, i.e. `com/myorg/MyRecord.avsc`. |
| `avroFieldVisibility` | `public_deprecated` | Field Visibility for the properties. Possible values: `private`, `public`, `public_deprecated`. |
| `avroEnableDecimalLogicalType` | `true` | Set to true to use `java.math.BigDecimal` instead of `java.nio.ByteBuffer` for logical type `decimal`. |

## Examples

@@ -52,9 +54,37 @@ avroStringType := "String"

## Tasks

| Name | Description |
|:---------------|:------------|
| `avroGenerate` | Generate Java sources for Avro schemas. This task is automatically executed before `compile`. |
| Name | Description |
|:-------------------------|:------------|
| `avroUnpackDependencies` | Unpack avro schemas from dependencies. This task is automatically executed before `avroGenerate`. |
| `avroGenerate` | Generate Java sources for Avro schemas. This task is automatically executed before `compile`. |
| `packageAvro` | Produces an avro artifact, such as a jar containing avro schemas. |

## Packaging Avro files

Avro sources (`*.avsc`, `*.avdl` and `*.avpr` files) can be packaged in a separate jar with the `source` type and
`avro` classifier by running `packageAvro`.

By default, `sbt-avro` does not publish this artifact. You can enable publishing with:
```sbt
packageAvro / publishArtifact := true
```
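
A schema-only module could be set up along these lines. This is a minimal sketch rather than a prescribed layout: the project name, directory and version are placeholders, and the `crossPaths` / `autoScalaLibrary` settings simply mirror the `external` project used in this commit's scripted test.

```sbt
// Sketch of a module that only ships Avro schemas.
// "schemas" and "0.1.0-SNAPSHOT" are placeholder values, not part of the plugin.
lazy val schemas = project
  .in(file("schemas"))
  .settings(
    name := "schemas",
    version := "0.1.0-SNAPSHOT",
    // The module contains no Scala code, so drop the Scala version suffix
    // and the scala-library dependency.
    crossPaths := false,
    autoScalaLibrary := false,
    // Opt in to publishing the jar produced by packageAvro.
    packageAvro / publishArtifact := true
  )
```

The schemas themselves are expected under the module's `avroSource` directory (by default `sourceDirectory` / `avro`).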

## Declaring dependencies

You can specify a dependency on an avro source artifact that contains the schemas like so:

```sbt
libraryDependencies += "org" % "name" % "rev" classifier "avro"
```

If some avro schemas are not packaged in a `source/avro` artifact, you can update the `avroDependencyIncludeFilter`
setting to instruct the plugin to look for schemas in the desired dependency:

```sbt
libraryDependencies += "org" % "name" % "rev" // module containing avro schemas
avroDependencyIncludeFilter := avroDependencyIncludeFilter.value || moduleFilter(organization = "org", name = "name")
```
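
The filter can presumably also be replaced instead of extended, when only one specific module should be scanned. The sketch below assumes hypothetical coordinates (`com.example` % `schemas` % `1.0.0`); `moduleFilter` is the same sbt helper used in the snippet above.

```sbt
// Hypothetical dependency that bundles *.avsc files inside its regular jar.
libraryDependencies += "com.example" % "schemas" % "1.0.0"

// Replace the default filter: only this module is unpacked.
// Note that dependencies matching the default filter (source type, avro
// classifier) are no longer picked up unless they are added back with ||.
avroDependencyIncludeFilter := moduleFilter(organization = "com.example", name = "schemas")
```

Extending the default with `||`, as shown earlier, is the safer choice when `avro` classifier artifacts are also in use.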

# License
This program is distributed under the BSD license. See the file `LICENSE` for more details.
135 changes: 105 additions & 30 deletions src/main/scala/sbtavro/SbtAvro.scala
@@ -10,18 +10,28 @@ import org.apache.avro.generic.GenericData.StringType
import org.apache.avro.{Protocol, Schema}
import sbt.Keys._
import sbt._
import Path.relativeTo
import com.spotify.avro.mojo.AvroFileRef
import sbt.librarymanagement.DependencyFilter

/**
* Simple plugin for generating the Java sources for Avro schemas and protocols.
*/
object SbtAvro extends AutoPlugin {

val AvroClassifier = "avro"

private val AvroAvrpFilter: NameFilter = "*.avpr"
private val AvroAvdlFilter: NameFilter = "*.avdl"
private val AvroAvscFilter: NameFilter = "*.avsc"
private val AvroFilter: NameFilter = AvroAvscFilter | AvroAvdlFilter | AvroAvrpFilter

private val JavaFileFilter: NameFilter = "*.java"

object autoImport {

import Defaults._

// format: off
val avroStringType = settingKey[String]("Type for representing strings. Possible values: CharSequence, String, Utf8. Default: CharSequence.")
val avroEnableDecimalLogicalType = settingKey[Boolean]("Set to true to use java.math.BigDecimal instead of java.nio.ByteBuffer for logical type \"decimal\".")
@@ -30,32 +40,51 @@ object SbtAvro extends AutoPlugin {
val avroSource = settingKey[File]("Default Avro source directory.")
val avroValidate = settingKey[Boolean]("Avro Schema.Parser name validation. Default: `new Schema.Parser.getValidate()`")
val avroValidateDefaults = settingKey[Boolean]("Avro Schema.Parser default value validation. Default: `new Schema.Parser.getValidateDefaults()`")
val avroUnpackDependencies = taskKey[Seq[File]]("Unpack avro dependencies.")
val avroDependencyIncludeFilter = settingKey[DependencyFilter]("Filter for including modules containing avro dependencies.")

val avroGenerate = taskKey[Seq[File]]("Generate Java sources for Avro schemas.")
val packageAvro = taskKey[File]("Produces an avro artifact, such as a jar containing avro schemas.")
// format: on

lazy val defaultSettings: Seq[Setting[_]] = Seq(
avroDependencyIncludeFilter := artifactFilter(`type` = Artifact.SourceType, classifier = AvroClassifier)
) ++ addArtifact(Compile / packageAvro / artifact, Compile / packageAvro)

// settings to be applied for both Compile and Test
lazy val configScopedSettings: Seq[Setting[_]] = Seq(
avroSource := sourceDirectory.value / "avro",
// dependencies
avroUnpackDependencies / target := sourceManaged.value / "avro",
avroUnpackDependencies := unpackDependenciesTask(avroUnpackDependencies).value,
// source generation
avroGenerate / target := sourceManaged.value / "compiled_avro",
managedSourceDirectories += (avroGenerate / target).value,

// source generation
avroGenerate := sourceGeneratorTask(avroGenerate).value,
avroGenerate := sourceGeneratorTask(avroGenerate).dependsOn(avroUnpackDependencies).value,
sourceGenerators += avroGenerate.taskValue,
compile := compile.dependsOn(avroGenerate).value,
// packaging
packageAvro / artifactClassifier := Some(AvroClassifier),
packageAvro / publishArtifact := false,
// clean
clean := {
schemaParser.set(new Schema.Parser())
clean.value
}
) ++ packageTaskSettings(packageAvro, packageAvroMappings) ++ Seq(
packageAvro / artifact := (packageAvro / artifact).value.withType(Artifact.SourceType)
)
}

import autoImport._

override def trigger = allRequirements
def packageAvroMappings = Def.task {
(avroSource.value ** AvroFilter) pair relativeTo(avroSource.value)
}

override def trigger: PluginTrigger = allRequirements

override def requires = sbt.plugins.JvmPlugin
override def requires: Plugins = sbt.plugins.JvmPlugin

override lazy val globalSettings: Seq[Setting[_]] = Seq(
avroStringType := "CharSequence",
@@ -66,9 +95,46 @@ object SbtAvro extends AutoPlugin {
avroValidateDefaults := schemaParser.get().getValidateDefaults
)

override lazy val projectSettings: Seq[Setting[_]] =
override lazy val projectSettings: Seq[Setting[_]] = defaultSettings ++
Seq(Compile, Test).flatMap(c => inConfig(c)(configScopedSettings))

private def unpack(deps: Seq[File],
extractTarget: File,
streams: TaskStreams): Seq[File] = {
def cachedExtractDep(jar: File): Seq[File] = {
val cached = FileFunction.cached(
streams.cacheDirectory / jar.name,
inStyle = FilesInfo.lastModified,
outStyle = FilesInfo.exists
) { deps =>
IO.createDirectory(extractTarget)
deps.flatMap { dep =>
val set = IO.unzip(dep, extractTarget, AvroFilter)
if (set.nonEmpty) {
streams.log.info("Extracted from " + dep + set.mkString(":\n * ", "\n * ", ""))
}
set
}
}
cached(Set(jar)).toSeq
}

deps.flatMap(cachedExtractDep)
}

private def unpackDependenciesTask(key: TaskKey[Seq[File]]) = Def.task {
val avroArtifacts = update
.value
.filter((key / avroDependencyIncludeFilter).value)
.toSeq.map { case (_, _, _, file) => file }.distinct

unpack(
avroArtifacts,
(key / target).value,
(key / streams).value
)
}

def compileIdl(idl: File, target: File, stringType: StringType, fieldVisibility: FieldVisibility, enableDecimalLogicalType: Boolean) {
val parser = new Idl(idl)
val protocol = Protocol.parse(parser.CompilationUnit.toString)
@@ -81,12 +147,8 @@ object SbtAvro extends AutoPlugin {

val schemaParser = new AtomicReference(new Schema.Parser())

def compileAvscs(srcDir: File, target: File, stringType: StringType, fieldVisibility: FieldVisibility, enableDecimalLogicalType: Boolean, useNamespace: Boolean, validate: Boolean, validateDefaults: Boolean) {
def compileAvscs(refs: Seq[AvroFileRef], target: File, stringType: StringType, fieldVisibility: FieldVisibility, enableDecimalLogicalType: Boolean, useNamespace: Boolean, validate: Boolean, validateDefaults: Boolean) {
import com.spotify.avro.mojo._
val refs = (srcDir ** AvroAvscFilter).get.map { avsc =>
sbt.ConsoleLogger().info("Compiling Avro schemas %s".format(avsc))
new AvroFileRef(srcDir, avsc.relativeTo(srcDir).get.toString)
}

val global = schemaParser.get()
// copy of global schemaParser to avoid race condition
@@ -116,42 +178,55 @@ object SbtAvro extends AutoPlugin {
compiler.compileToDestination(null, target)
}

private[this] def compileAvroSchema(srcDir: File, target: File, log: Logger, stringTypeName: String, fieldVisibilityName: String, enableDecimalLogicalType: Boolean, useNamespace: Boolean, validate: Boolean, validateDefaults: Boolean): Set[File] = {
val stringType = StringType.valueOf(stringTypeName)
val fieldVisibility = SpecificCompiler.FieldVisibility.valueOf(fieldVisibilityName.toUpperCase)
log.info("Avro compiler using stringType=%s".format(stringType))

for (idl <- (srcDir ** AvroAvdlFilter).get) {
log.info("Compiling Avro IDL %s".format(idl))
private[this] def compileAvroSchema(srcDir: File,
target: File,
log: Logger,
stringType: StringType,
fieldVisibility: FieldVisibility,
enableDecimalLogicalType: Boolean,
useNamespace: Boolean,
validate: Boolean,
validateDefaults: Boolean): Set[File] = {
(srcDir ** AvroAvdlFilter).get.foreach { idl =>
log.info(s"Compiling Avro IDL $idl")
compileIdl(idl, target, stringType, fieldVisibility, enableDecimalLogicalType)
}

compileAvscs(srcDir, target, stringType, fieldVisibility, enableDecimalLogicalType, useNamespace, validate, validateDefaults)
val avscs = (srcDir ** AvroAvscFilter).get.map { avsc =>
log.info(s"Compiling Avro schemas $avsc")
new AvroFileRef(srcDir, avsc.relativeTo(srcDir).get.toString)
}
compileAvscs(avscs, target, stringType, fieldVisibility, enableDecimalLogicalType, useNamespace, validate, validateDefaults)

for (avpr <- (srcDir ** AvroAvrpFilter).get) {
log.info("Compiling Avro protocol %s".format(avpr))
(srcDir ** AvroAvrpFilter).get.foreach { avpr =>
log.info(s"Compiling Avro protocol $avpr")
compileAvpr(avpr, target, stringType, fieldVisibility, enableDecimalLogicalType)
}

(target ** "*.java").get.toSet
(target ** JavaFileFilter).get.toSet
}

private def sourceGeneratorTask(key: TaskKey[Seq[File]]) = Def.task {
val out = (key / streams).value
val srcDir = (key / avroSource).value
val outDir = (key / avroGenerate / target).value
val strType = avroStringType.value
val fieldVis = avroFieldVisibility.value
val externalSrcDir = (avroUnpackDependencies / target).value
val srcDir = avroSource.value
val outDir = (key / target).value
val strType = StringType.valueOf(avroStringType.value)
val fieldVis = SpecificCompiler.FieldVisibility.valueOf(avroFieldVisibility.value.toUpperCase)
val enbDecimal = avroEnableDecimalLogicalType.value
val useNs = avroUseNamespace.value
val validate = avroValidate.value
val validateDefaults = avroValidateDefaults.value
val cachedCompile = FileFunction.cached(out.cacheDirectory / "avro",
inStyle = FilesInfo.lastModified,
outStyle = FilesInfo.exists) { (in: Set[File]) =>
val cachedCompile = {
FileFunction.cached(out.cacheDirectory / "avro", FilesInfo.lastModified, FilesInfo.exists) { _ =>
out.log.info(s"Avro compiler using stringType=$strType")
compileAvroSchema(externalSrcDir, outDir, out.log, strType, fieldVis, enbDecimal, useNs, validate, validateDefaults)
compileAvroSchema(srcDir, outDir, out.log, strType, fieldVis, enbDecimal, useNs, validate, validateDefaults)

}
cachedCompile((srcDir ** AvroFilter).get.toSet).toSeq
}

cachedCompile(((externalSrcDir +++ srcDir) ** AvroFilter).get.toSet).toSeq
}

}
48 changes: 48 additions & 0 deletions src/sbt-test/sbt-avro/publishing/build.sbt
@@ -0,0 +1,48 @@
import sbt.Keys.scalaVersion

lazy val commonSettings = Seq(
organization := "com.cavorite",
publishTo := Some(Opts.resolver.sonatypeReleases),
libraryDependencies ++= Seq(
"org.apache.avro" % "avro" % "1.9.2"
)
)

lazy val `external`: Project = project
.in(file("external"))
.settings(commonSettings)
.settings(
name := "external",
version := "0.0.1-SNAPSHOT",
crossPaths := false,
autoScalaLibrary := false,
packageAvro / publishArtifact := true
)

lazy val `transitive`: Project = project
.in(file("transitive"))
.settings(commonSettings)
.settings(
name := "transitive",
version := "0.0.1-SNAPSHOT",
scalaVersion := "2.12.11",
packageAvro / publishArtifact := true,
libraryDependencies ++= Seq(
"com.cavorite" % "external" % "0.0.1-SNAPSHOT" classifier "avro",
)
)

lazy val root: Project = project
.in(file("."))
.settings(commonSettings)
.settings(
name := "publishing-test",
scalaVersion := "2.12.11",
avroDependencyIncludeFilter := avroDependencyIncludeFilter.value ||
// add avro jar to unpack its json avsc schema
moduleFilter(organization = "org.apache.avro", name = "avro"),
libraryDependencies ++= Seq(
"com.cavorite" %% "transitive" % "0.0.1-SNAPSHOT" classifier "avro",
"org.specs2" %% "specs2-core" % "3.10.0" % "test"
)
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
@namespace("com.cavorite.external")
protocol ProtocolAvdl {
record Avdl {
string stringField;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
"namespace": "com.cavorite.external",
"protocol": "ProtocolAvpr",
"types": [
{
"name": "Avpr",
"type": "record",
"fields": [
{
"name": "stringField",
"type": "string"
}
]
}
]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
{
"name": "Avsc",
"namespace": "com.cavorite.external",
"type": "record",
"fields": [
{
"name": "stringField",
"type": "string"
}
]
}
1 change: 1 addition & 0 deletions src/sbt-test/sbt-avro/publishing/project/build.properties
@@ -0,0 +1 @@
sbt.version=1.3.10
7 changes: 7 additions & 0 deletions src/sbt-test/sbt-avro/publishing/project/plugins.sbt
@@ -0,0 +1,7 @@
sys.props.get("plugin.version") match {
case Some(x) => addSbtPlugin("com.cavorite" % "sbt-avro" % x)
case _ => sys.error("""|The system property 'plugin.version' is not defined.
|Specify this property using the scriptedLaunchOpts -D.""".stripMargin)
}

libraryDependencies += "org.apache.avro" % "avro-compiler" % "1.9.2"
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
import com.cavorite.external
import com.cavorite.transitive


object Main extends App {

external.Avsc.newBuilder().setStringField("external").build()
transitive.Avsc.newBuilder().setStringField("transitive").build()

println("success")
}