Add developer documentation about dependency management and rules to follow.
1 parent
b52425b
commit e00b23c
Showing 4 changed files with 225 additions and 2 deletions.
@@ -0,0 +1,221 @@
=====================
Dependency Management
=====================

.. contents:: |toctitle|
  :local:
|
||
Dataverse is (currently) a Java EE 7 based application that uses many additional libraries for special purposes.
This includes features like support for the SWORD API, S3 storage and many others.

Besides writing the code that glues the individual pieces together, developers need to declare the dependencies they use for the
Maven-based build system. As any Maven user knows, this happens in the "Project Object Model" (POM) living in
``pom.xml`` at the root of the project repository. Recursive and convergent dependency resolution makes dependency
management with Maven very easy. But sometimes, in projects with many large dependencies like Dataverse, you have
to help Maven make the right choices.
|
||
Terms
-----

As a developer, you should familiarize yourself with the following terms:

- **Direct dependencies**: things *you use* yourself in your own code for Dataverse.
- **Transitive dependencies**: things *others use* for things you use, pulled in recursively.
  See also the `Maven docs <https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Transitive_Dependencies>`_.
|
||
.. graphviz::

  digraph {
    rankdir="LR";
    node [fontsize=10]

    yc [label="Your Code"]
    da [label="Direct Dependency A"]
    db [label="Direct Dependency B"]
    ta [label="Transitive Dependency TA"]
    tb [label="Transitive Dependency TB"]
    tc [label="Transitive Dependency TC"]
    dtz [label="Direct/Transitive Dependency Z"]

    yc -> da -> ta;
    yc -> db -> tc;
    da -> tb -> tc;
    db -> dtz;
    yc -> dtz;
  }
|
||
Direct dependencies
-------------------

Within the POM, any direct dependencies live within the ``<dependencies>`` tag:

.. code:: xml

  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>example</artifactId>
      <version>1.1.0</version>
      <scope>compile</scope>
    </dependency>
  </dependencies>

Anytime you add a ``<dependency>``, Maven will try to fetch it from the configured repositories and use it
within the build lifecycle. You have to define a ``<version>``, but ``<scope>`` may be omitted when it is ``compile``, the default.
(See `Maven docs: Dep. Scope <https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Scope>`_)
|
||
|
||
During fetching, Maven will analyse all transitive dependencies (see graph above) and, if necessary, fetch those, too.
Everything downloaded once is cached locally by default, so nothing needs to be fetched again and again, as long as the
dependency definition does not change.

**Rules to follow:**

1. You should only use direct dependencies for **things you are actually using** in your code.
2. **Clean up** direct dependencies no longer in use. They will bloat the deployment package otherwise!
3. Mind the **scope**. Do not include "testing only" dependencies in the package - it will hurt you in IDEs [#ide]_ and bloat things.
4. Avoid using different dependencies for the **same purpose**, e.g. different JSON parsing libraries.
5. Refactor your code to **use Java EE** standards as much as possible.
6. When you rely on big SDKs or similar big cool stuff, try to **include the smallest portion possible**. Complete SDK
   bundles are typically heavyweight and most of the time unnecessary.
7. **Don't include transitive dependencies.** [#ide2]_

   * Exception: if you are relying on it in your code (see *Z* in the graph above), you must declare it. See below
     for proper handling in these (rare) cases.
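As an illustration of rule 3, the following sketch shows two dependencies with a non-default scope. The coordinates and
versions are picked for illustration only and are not taken from the actual Dataverse ``pom.xml``.

.. code-block:: xml

  <dependencies>
    <!-- Assumed example: a testing library. With scope "test" it is only on the
         classpath for compiling and running tests and is not packaged. -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
    <!-- Assumed example: the Java EE API. With scope "provided" it is available at
         compile time, but the application server supplies it at runtime. -->
    <dependency>
      <groupId>javax</groupId>
      <artifactId>javaee-api</artifactId>
      <version>7.0</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>

Neither of these two ends up in the deployment package, which keeps it small and keeps test-only classes out of reach of
production code.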
|
||
|
||
Transitive dependencies
-----------------------

Maven is convenient for developers as it handles recursive resolution, downloading and adding "dependencies of dependencies".
But as life is a box of chocolates, you might find yourself in *version conflict hell* sooner rather than later without even
knowing it, only experiencing unintended side effects.

When you look at the graph above, imagine *B* and *TB* rely on different *versions* of *TC*. How does Maven decide
which version it will include? Easy: the version requested nearest to your project in the dependency tree wins
(here, the one requested by *B*). The same applies when *Z* is requested in two versions:
|
||
.. graphviz::

  digraph {
    rankdir="LR";
    node [fontsize=10]

    yc [label="Your Code"]
    db [label="Direct Dependency B"]
    dtz1 [label="Z v1.0"]
    dtz2 [label="Z v2.0"]

    yc -> db -> dtz1;
    yc -> dtz2;
  }
|
||
In this case, version "2.0" will be included. If you know something about semantic versioning, alarm bells should be ringing right now.
How do we know that *B* is compatible with *Z v2.0* when it depends on *Z v1.0*?
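In POM terms, the situation in the graph might look like the following sketch. The ``org.example`` coordinates are
hypothetical placeholders for *B* and *Z*.

.. code-block:: xml

  <dependencies>
    <!-- Hypothetical "Z": declared directly, so this request sits at depth 1
         of the dependency tree and is the nearest one. -->
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>z</artifactId>
      <version>2.0</version>
    </dependency>
    <!-- Hypothetical "B": its own POM requests z 1.0. That request sits at
         depth 2, so it loses against the declaration above. -->
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>b</artifactId>
      <version>1.0</version>
    </dependency>
  </dependencies>

Maven resolves this without failing the build, which is exactly why such a mismatch can go unnoticed until runtime.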
|
||
Another scenario that gets us into trouble: indirect use of transitive dependencies. Imagine the following: we rely on *Z*
in our code, but do not include a direct dependency for it within the POM. Now *B* is updated and drops its dependency
on *Z*. *Z* silently vanishes from our classpath and the build breaks. You definitely don't want to head down that road.
|
||
**Follow the rules to be safe:**

1. Do **not use transitive deps implicitly**: add a direct dependency for transitive deps you re-use in your code.
2. On every build, check that no implicit usage was added by accident.
3. **Explicitly declare versions** of transitive dependencies in use by multiple direct dependencies.
4. On every build, check that there are no convergence problems hiding in the shadows.
5. **Run tests** on every build to verify these explicit combinations work.
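The checks in rules 2 and 4 can be automated with standard Maven plugins. The following is a minimal sketch, assuming the
Maven Enforcer Plugin and the Maven Dependency Plugin are acceptable for the project. The plugin versions and execution
ids are assumptions, so pick whatever is current. This is not the configuration Dataverse actually uses.

.. code-block:: xml

  <build>
    <plugins>
      <!-- Rule 4: fail the build if transitive dependencies do not converge on one version. -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-enforcer-plugin</artifactId>
        <version>3.4.1</version> <!-- assumed version -->
        <executions>
          <execution>
            <id>enforce-convergence</id> <!-- hypothetical id -->
            <goals>
              <goal>enforce</goal>
            </goals>
            <configuration>
              <rules>
                <dependencyConvergence/>
              </rules>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <!-- Rule 2: report dependencies that are used but undeclared (or declared but unused)
           and turn those warnings into a build failure. -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <version>3.6.1</version> <!-- assumed version -->
        <executions>
          <execution>
            <id>analyze-dependencies</id> <!-- hypothetical id -->
            <goals>
              <goal>analyze-only</goal>
            </goals>
            <configuration>
              <failOnWarning>true</failOnWarning>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

With both plugins bound to the build, a forgotten direct dependency or a version conflict shows up as a failed build
instead of a surprise at runtime. On a project with many existing dependencies, expect to fix or explicitly manage a
number of findings before the build turns green.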
|
||
Managing transitive dependencies in ``pom.xml``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Maven can manage versions of transitive dependencies in four ways:

1. Make a transitive dependency a direct one, which needs a ``<version>`` tag. Typically a bad idea, don't do that.
2. Use ``<optional>`` or ``<exclusion>`` tags on direct dependencies that request the transitive dependency.
   *Last resort*, you really should avoid this. Not explained or used here. `See Maven docs <https://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html>`_.
3. Explicitly declare the dependency in ``<dependencyManagement>`` and add a ``<version>`` tag.
4. For more complex transitive dependencies, reuse a "Bill of Materials" (BOM) within ``<dependencyManagement>``
   and add a ``<version>`` tag. Many larger, widely used projects provide those, making the POM much less bloated.
|
||
Examples to follow:

.. code-block:: xml
  :linenos:

  <properties>
      <aws.version>1.11.172</aws.version>
      <!-- We need to ensure that our chosen version is compatible with every dependency relying on it.
           This is manual work and needs testing, but a good investment in stability and up-to-date dependencies. -->
      <jackson.version>2.9.6</jackson.version>
      <joda.version>2.10.1</joda.version>
  </properties>

  <!-- Transitive dependencies, bigger library "bill of materials" (BOM) and
       versions of dependencies used both directly and transitively are managed here. -->
  <dependencyManagement>
      <dependencies>
          <!-- First example for case 4. Only one part of the SDK (S3) is used and transitive deps
               of that are again managed by the upstream BOM. -->
          <dependency>
              <groupId>com.amazonaws</groupId>
              <artifactId>aws-java-sdk-bom</artifactId>
              <version>${aws.version}</version>
              <type>pom</type>
              <scope>import</scope>
          </dependency>
          <!-- Second example for case 4 and an example for explicit direct usage of a transitive dependency.
               Jackson is used by AWS SDK and others, but we also use it in Dataverse. -->
          <dependency>
              <groupId>com.fasterxml.jackson</groupId>
              <artifactId>jackson-bom</artifactId>
              <version>${jackson.version}</version>
              <scope>import</scope>
              <type>pom</type>
          </dependency>
          <!-- Example for case 3. Joda is not used in Dataverse (as of writing this). -->
          <dependency>
              <groupId>joda-time</groupId>
              <artifactId>joda-time</artifactId>
              <version>${joda.version}</version>
          </dependency>
      </dependencies>
  </dependencyManagement>

  <!-- Declare any DIRECT dependencies here.
       In case the dependency is both transitive and direct (e.g. some common lib for logging),
       manage the version above and add the direct dependency here WITHOUT version tag, too.
  -->
  <dependencies>
      <dependency>
          <groupId>com.amazonaws</groupId>
          <artifactId>aws-java-sdk-s3</artifactId>
          <!-- no version here as managed by BOM above! -->
      </dependency>
      <!-- Should be refactored and removed once on Java EE 8 -->
      <dependency>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-core</artifactId>
          <!-- no version here as managed above! -->
      </dependency>
      <!-- Should be refactored and removed once on Java EE 8 -->
      <dependency>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-databind</artifactId>
          <!-- no version here as managed above! -->
      </dependency>
  </dependencies>

Helpful tools
~~~~~~~~~~~~~
|
||
TODO
|
||
|
||
.. [#ide] Modern IDEs import your Maven POM and offer import autocompletion for classes based on direct dependencies
   in the model. You might end up using legacy or repackaged classes because of a wrong scope.
.. [#ide2] This is going to bite back in modern IDEs when importing classes from transitive dependencies by "autocompletion accident".

----

Previous: :doc:`documentation` | Next: :doc:`debugging`