Finishing up the cleanup on org.bdgenomics.adam.rdd. #1264

Merged 1 commit on Nov 16, 2016

Conversation

fnothaft
Member

  • Made MDTagging private to read. Exposed it through AlignmentRecordRDD.
    Cleaned up where it was used in Transform.
  • Made FlagStat and most related models private to read.
  • Made all read.recalibration and read.realignment classes private to
    package. Depending on the class, this varied from:
    • adam for classes that need to be registered with kryo
    • read for the core RealignIndels and BaseQualityRecalibrator engines
    • Their subpackage where possible.
  • Made class values for genomic partitioners as private as possible.
  • Added documentation to all InFormatters, InFormatterCompanions, and OutFormatters. Made the InFormatters package private with private constructors.
  • Added method/class level scaladoc where missing.
  • Moved org.bdgenomics.adam.cli.FlagStatSuite to org.bdgenomics.adam.read, along with the NA12878.sam test resource, which was otherwise unused in the adam-cli submodule.
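
For readers less familiar with Scala's qualified access modifiers, the scoping described in the list above can be sketched roughly as follows. This is an illustration only, not ADAM's code: nested objects stand in for the packages so the example fits in one file, and `Tagger`, `Reads`, `Observation`, and `serialization` are hypothetical names.

```scala
// Hypothetical stand-ins: nested objects play the role of the packages
// adam / rdd / read / recalibration so the sketch fits in a single file.
object adam {
  object rdd {
    object read {
      // Visible only inside `read`, similar to making a helper "private to read".
      private[read] object Tagger {
        def tag(sequence: String): String = s"MD:${sequence.length}"
      }

      // Public entry point that delegates to the hidden helper, analogous to
      // exposing functionality through a public dataset class.
      final case class Reads(sequences: Seq[String]) {
        def computeTags(): Seq[String] = sequences.map(Tagger.tag)
      }

      object recalibration {
        // Widened to `adam` so that code elsewhere in the project
        // (for example, a serializer registrar) can still reference the class.
        private[adam] final case class Observation(total: Long, mismatches: Long)
      }
    }
  }

  object serialization {
    // Compiles because Observation is private[adam] rather than
    // private[recalibration].
    val registered: Seq[Class[_]] =
      Seq(classOf[rdd.read.recalibration.Observation])
  }
}
```

Code outside `adam` can call `adam.rdd.read.Reads(...).computeTags()`, but it cannot reference `Tagger` or `Observation` directly.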

@heuermh
Member

heuermh commented Nov 13, 2016

I may have asked this elsewhere, but it seems like the InFormatters and OutFormatters need to be publicly accessible, e.g.

https://github.com/heuermh/adam-snpeff/blob/master/src/main/scala/com/github/heuermh/adam/snpeff/AdamSnpEff.scala#L52

It may be that I'm not using them correctly here, in that they aren't really implicit if I have to explicitly instantiate them.
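
One way to read the concern above: in Scala, an implicit parameter still needs a value that someone created and marked `implicit`. Only the passing at the call site is automatic. The sketch below is a minimal illustration under that assumption. `pipe`, `OutFormatter`, and `IntOutFormatter` here are simplified, hypothetical shapes and do not reproduce ADAM's actual signatures.

```scala
object pipeSketch {
  // Hypothetical formatter type; the real OutFormatter may look different.
  trait OutFormatter[T] {
    def read(line: String): T
  }

  // The formatter is required, but it is taken from implicit scope.
  def pipe[T](lines: Seq[String])(implicit formatter: OutFormatter[T]): Seq[T] =
    lines.map(formatter.read)

  final class IntOutFormatter extends OutFormatter[Int] {
    def read(line: String): Int = line.trim.toInt
  }
}

object pipeUsage {
  import pipeSketch._

  // Instantiated explicitly once, which is why the class must be accessible...
  implicit val formatter: OutFormatter[Int] = new IntOutFormatter

  // ...and then supplied to every call by implicit resolution.
  val parsed: Seq[Int] = pipe[Int](Seq("1", " 2 "))
}
```

Under this reading, explicitly instantiating the formatter is expected usage, provided the formatter type is visible to the caller.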

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1595/
Test PASSed.

@fnothaft
Member Author

The InFormatterCompanion and OutFormatter need to be public. The InFormatter has to be accessible from rdd, but its constructor only has to be accessible from the companion.
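
A rough sketch of that visibility arrangement, assuming simplified types: `Records`, `LineInFormatter`, and this `pipe` signature are hypothetical, objects stand in for packages, and the companion is passed explicitly to keep the focus on access rather than on implicits.

```scala
object rdd {
  // Simplified stand-in for a dataset type.
  final case class Records(lines: Seq[String])

  trait InFormatter {
    def render(): String
  }

  // Public: this is what callers hand to `pipe`.
  trait InFormatterCompanion[F <: InFormatter] {
    def apply(records: Records): F
  }

  // Code inside `rdd` needs to see the formatter in order to use it.
  def pipe[F <: InFormatter](records: Records,
                             companion: InFormatterCompanion[F]): String =
    companion(records).render()

  object read {
    // The class is visible throughout `rdd`, but its constructor is private,
    // so only the companion object below can instantiate it.
    private[rdd] class LineInFormatter private (records: Records)
        extends InFormatter {
      def render(): String = records.lines.mkString("\n")
    }

    // Public companion acting as the only factory.
    object LineInFormatter extends InFormatterCompanion[LineInFormatter] {
      def apply(records: Records): LineInFormatter =
        new LineInFormatter(records)
    }
  }
}

object client {
  import rdd._

  // Callers outside `rdd` name only the public companion object.
  // `new read.LineInFormatter(...)` would not compile here.
  val rendered: String = pipe(Records(Seq("a", "b")), read.LineInFormatter)
}
```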

@heuermh
Copy link
Member

heuermh commented Nov 14, 2016

Thanks for the clarification. I'll take a closer look at the formatter docs and the rest of this pull request tomorrow morning.

Member

@heuermh heuermh left a comment

Looks good, thanks!

fnothaft added a commit to fnothaft/adam that referenced this pull request Nov 15, 2016
Along with bigdatagenomics#1263 and bigdatagenomics#1264, this resolves bigdatagenomics#1083.

* Removing unused org.bdgenomics.adam.models.ReadBucket class.
* Move org.bdgenomics.adam.models.ReferencePositionPair and
  org.bdgenomics.adam.models.SingleReadBucket in to org.bdgenomics.adam.rdd.read
  and make package private.
* Clean up duplicated methods and methods that were incorrectly in companion
  singleton for SequenceDictionary and ReadGroupDictionary.
* Removed all SamReader references.
* Make writable file headers private to ADAM.
* Eliminated manual VCF parsing code in SnpTable.
* Cleaned up scaladoc for all classes and singleton objects.
* Moved `NonoverlappingRegions` test code out of `InnerBroadcastRegionJoinSuite`.
fnothaft added a commit to fnothaft/adam that referenced this pull request Nov 15, 2016
Along with bigdatagenomics#1263 and bigdatagenomics#1264, this resolves bigdatagenomics#1083.

* Removing unused org.bdgenomics.adam.models.ReadBucket class.
* Move org.bdgenomics.adam.models.ReferencePositionPair and
  org.bdgenomics.adam.models.SingleReadBucket in to org.bdgenomics.adam.rdd.read
  and make package private.
* Clean up duplicated methods and methods that were incorrectly in companion
  singleton for SequenceDictionary and ReadGroupDictionary.
* Removed all SamReader references.
* Make writable file headers private to ADAM.
* Eliminated manual VCF parsing code in SnpTable.
* Cleaned up scaladoc for all classes and singleton objects.
* Moved `NonoverlappingRegions` test code out of `InnerBroadcastRegionJoinSuite`.
* Made `MDTagging` private to `read`. Exposed it through `AlignmentRecordRDD`.
  Cleaned up where it was used in `Transform`.
* Made `FlagStat` and most related models private to `read`.
* Made all `read.recalibration` and `read.realignment` classes private to
  package. Depending on the class, this varied from:
  * `adam` for classes that need to be registered with `kryo`
  * `read` for the core `RealignIndels` and `BaseQualityRecalibrator` engines
  * Their subpackage where possible.
* Made class values for genomic partitioners as private as possible.
* Added documentation to all `InFormatter`s, `InFormatterCompanion`s, and
  `OutFormatter`s. Made the `InFormatter`s package private with private
  constructors.
* Added method/class level scaladoc where missing.
* Moved `org.bdgenomics.adam.cli.FlagStatSuite` to `org.bdgenomics.adam.read`,
  along with the `NA12878.sam` test resource, which was otherwise unused in the
  `adam-cli` submodule.
@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1608/
Test PASSed.

@heuermh heuermh merged commit e929676 into bigdatagenomics:master Nov 16, 2016
@heuermh
Member

heuermh commented Nov 16, 2016

Thank you, @fnothaft!

heuermh pushed a commit that referenced this pull request Nov 16, 2016
Along with #1263 and #1264, this resolves #1083.