-
-
Notifications
You must be signed in to change notification settings - Fork 505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[2.0] Add GridFS implementation on top of mongodb/mongodb #1790
Conversation
2efc50b
to
d852a68
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work so far! Thanks @alcaeus
*/ | ||
public function setDefaultRepositoryClassName($className) | ||
public function setDefaultDocumentRepositoryClassName($className) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A note for future: this requires deprecation notice and maybe a forward compatibility in 1.3
} | ||
|
||
/** | ||
* Get default repository class for documents. |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
} | ||
|
||
/** | ||
* Get default repository class. | ||
* Get default repository class for GridFS files. |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
@@ -2,15 +2,22 @@ | |||
|
|||
declare(strict_types=1); | |||
|
|||
namespace Doctrine\ODM\MongoDB; | |||
namespace Doctrine\ODM\MongoDB\Repository; |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
*/ | ||
public function setDefaultGridFSRepositoryClassName($className) | ||
{ | ||
$reflectionClass = new \ReflectionClass($className); |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
|
||
private function isAllowedGridFSField(string $name): bool | ||
{ | ||
return in_array($name, ['_id', 'chunkSize', 'filename', 'length', 'metadata', 'uploadDate'], true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this array to a constant?
@@ -451,6 +461,15 @@ public function update($document, array $options = []) | |||
*/ | |||
public function delete($document, array $options = []) | |||
{ | |||
if ($this->bucket) { | |||
$id = $this->uow->getDocumentIdentifier($document); | |||
$id = $this->class->getDatabaseIdentifierValue($id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming these variables might make things a little more clear.
$documentIdentifier = $this->uow->getDocumentIdentifier($document);
$databaseIdentifier = $this->class->getDatabaseIdentifierValue($documentIdentifier);
*/ | ||
public function setDefaultRepositoryClassName($className) | ||
public function setDefaultDocumentRepositoryClassName($className) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
$metadata = $this->metadataFactory->getMetadataFor($className); | ||
if ($metadata->isFile) { | ||
return $this->getDocumentBucket($className)->getFilesCollection(); |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
@@ -758,6 +765,10 @@ public function setDiscriminatorField($discriminatorField) | |||
return; | |||
} | |||
|
|||
if ($this->isFile) { | |||
throw MappingException::discriminatorNotAllowedForGridFS($this->name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense for this to go above the previous conditional that allows null
assignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial thought was to allow an extending class to remove the discriminator. On second thought, that kind of object hierarchy doesn't really make sense - I'll move the check.
@@ -820,6 +835,10 @@ public function setDefaultDiscriminatorValue($defaultDiscriminatorValue) | |||
return; | |||
} | |||
|
|||
if ($this->isFile) { | |||
throw MappingException::discriminatorNotAllowedForGridFS($this->name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise a question if it makes sense to do this before the null
assignment condition above.
@@ -81,6 +85,9 @@ public function loadMetadataForClass($className, \Doctrine\Common\Persistence\Ma | |||
$class->setCollection((string) $xmlRoot['collection']); | |||
} | |||
} | |||
if (isset($xmlRoot['bucket'])) { | |||
$class->setCollection((string) $xmlRoot['bucket']); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, are you setting the GridFS bucket prefix (e.g. "fs") as the collection name? If so, does that mean that interpretation of the collection property depends on the value of $class->isFile
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, yes. The alternative would be to bring in a bucket
property and use that. That would mean consuming bucket
if $class->isFile
is true
and collection
if it isn't. It doesn't change the logic, but it removes the dual-use of the collection
property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should enhance this to make it more clear and remove the dual-use of the collection property.
|
||
class DefaultGridFSRepository extends DocumentRepository implements GridFSRepository | ||
{ | ||
public function downloadToStream($id, $destination): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have placeholder doc blocks here, even if they only report @see MongoDB\GridFS\Bucket::method()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the style for return types with a space before :
also used in ODM?
public function downloadToStream($id, $destination) : void
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently exclude the rule - this will be adjusted in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the original comment, I'll add a docblock with a reference to the original method.
|
||
$id = $this->getDocumentBucket()->uploadFromStream($filename, $source, $options); | ||
|
||
// TODO: apply primary read preference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this to ensure that the subsequent find()
sees the same document that was just uploaded? Is there a particular reason you don't want to just return the document's ID as the library does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct - it's to avoid issues like doctrine/mongodb#324 from the start. I haven't considered only returning the ID, it would've been nice to get a file you can use directly. The other option would be to return a proxy to avoid a read cycle that may be useless if the user doesn't want to use the newly persisted file.
public function uploadFromFile(string $source, ?string $filename = null, $metadata = null) | ||
{ | ||
$resource = fopen($source, 'r'); | ||
if (! $resource) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a strict check for false
? I'm not sure if a valid file pointer in PHP can possibly evaluate to zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know a successful created resource is never true with a check like this $resource == false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a type-safe check never hurts - will update this.
throw MongoDBException::cannotReadGridFSSourceFile($source); | ||
} | ||
|
||
if (! $filename) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to explicitly check for null
here, in case someone wanted to use an empty string or "0"
as a filename?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the check should be stricter.
@@ -225,6 +234,10 @@ public function ensureDocumentIndexes($documentName, $timeout = null) | |||
throw new \InvalidArgumentException('Cannot create document indexes for mapped super classes, embedded documents or query result documents.'); | |||
} | |||
|
|||
if ($class->isFile) { | |||
$this->ensureGridFSIndexes($class); |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
{ | ||
$chunksCollection = $this->dm->getDocumentBucket($class->getName())->getChunksCollection(); | ||
foreach ($chunksCollection->listIndexes() as $index) { | ||
if ($index->isUnique() && $index->getKey() === ['files_id' => 1, 'n' => 1]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this duplicated array be in a constant or variable? It's also used in createIndex()
a few lines later.
{ | ||
$filesCollection = $this->dm->getDocumentCollection($class->getName()); | ||
foreach ($filesCollection->listIndexes() as $index) { | ||
if ($index->getKey() === ['filename' => 1, 'uploadDate' => 1]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as with the array above.
try { | ||
$gridFSMetadata = $class->getFieldMappingByDbFieldName('metadata'); | ||
$gridFSMetadataProperty = $gridFSMetadata['fieldName']; | ||
} catch (MappingException $e) { |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
/** | ||
* @Annotation | ||
*/ | ||
final class Metadata extends AbstractField |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
XML mappings have been finalized, see original comment for example. I've also created #1806 to track changing the ID mapping to the same kind of mapping as with GridFS. This is mainly done to avoid allowing generic |
5033ad0
to
35be2ef
Compare
Optional attributes: | ||
|
||
- | ||
targetDocument - A |FQCN| of the target document. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap targetDocument, discriminatorField, etc. in ``? We do it in other parts of the docs and it is done in storing-files-with-mongogridfs.rst in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy-pasted it from the original documentation, I'll fix it throughout the file for consistency 👍
37243a5
to
11ba674
Compare
$image = $profile->getImage(); | ||
$stream = fopen('tmp/path/to/copy', 'w+'); | ||
try { | ||
$repository->downloadToStream($file->getid(), $stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking, but you have lowercase "i" in getId()
;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, will fix before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to disregard my comments and let's have this merged 👍
@@ -342,20 +345,39 @@ public function getFilterParameters(string $name): ?array | |||
/** | |||
* @throws MongoDBException If not is a ObjectRepository. | |||
*/ | |||
public function setDefaultRepositoryClassName(string $className): void | |||
public function setDefaultDocumentRepositoryClassName(string $className): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think compat layer for this could actually should land in 1.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, definitely.
@@ -183,6 +185,12 @@ class ClassMetadata implements BaseClassMetadata | |||
*/ | |||
public $collection; | |||
|
|||
/** | |||
* READ-ONLY: The name of the GridFS bucket the document is mapped to. | |||
* @var string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be string|null
I think
* | ||
* @var int|null | ||
*/ | ||
public $chunkSizeBytes = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
= null
is redundant
|
||
public static function cannotPersistGridFSFile(string $className): self | ||
{ | ||
return new self(sprintf('Cannot persist GridFS file for class "%s" through UnitOfWork', $className)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing "." by the end
|
||
public static function cannotReadGridFSSourceFile(string $filename): self | ||
{ | ||
return new self(sprintf('Cannot open file "%s" for uploading to GridFS', $filename)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing "." by the end
|
||
.. _file_chunksize: | ||
|
||
@File\ChunkSize |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
ObjectId, although you can override this in the mapping. | ||
- | ||
``chunkSize`` stores the size of a single chunk in bytes. By default, chunks | ||
are 255 kb in size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"kb" may be ambiguous. I'd suggest quoting the PHPLIB docs here: "261120 (i.e. 255 KiB)"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, thanks for the suggestion.
echo $image->getFile()->getBytes(); | ||
With XML mappings, the fields are automatically mapped to camel-cased properties. | ||
To change property names, simply override the ``fieldName`` attribute for each | ||
field. You can not override any other options for GridFS fields. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be "cannot"? I'm not sure what is most commonly used in the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A quick search favours "cannot" in our docs - updated.
You can of course make references to this Image document from | ||
another document. Imagine you had a Profile document and you wanted | ||
every Profile to have a profile image: | ||
The ``ImageMetadata`` class can be any embedded document: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to say that the metadata document can be any embedded document class, but it seems odd to say that about ImageMetadata
. ImageMetadata
either is an embedded document class (can be used) or it isn't (cannot be used).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded to "must be an embedded document".
$image->setName('Test image'); | ||
$image->setFile('/path/to/image.png'); | ||
$repository = $documentManager->getRepository(Documents\Image::class); | ||
$file = $repository->uploadFromFile('image.jpg', '/tmp/path/to/image', new Documents\ImageMetadata('image/jpeg')); |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
$dm->flush(); | ||
When reading GridFS files, they behave like all other documents. You can query | ||
for them using the ``find*`` methods in the repository, create query or | ||
aggregation pipeline builders and also use them as ``targetDocument`` in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oxford comma: "builders, and"
*/ | ||
public function setDefaultGridFSRepositoryClassName(string $className): void | ||
{ | ||
$reflectionClass = new \ReflectionClass($className); |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
public $db; | ||
|
||
/** @var string */ | ||
public $bucketName = 'fs'; |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
/** | ||
* @see Bucket::downloadToStream | ||
*/ | ||
public function downloadToStream($id, $destination): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you still intend to create an openDownloadStream()
method? If not in this PR, should there be a separate ticket to do so before 2.0?
See: https://github.com/doctrine/mongodb-odm/pull/1790/files#r203132136
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That must have slipped through, I added it now.
This removes the dual-use from the collection property in ClassMetadata, storing the bucket name for GridFS files in the bucketName property. For compatibility, the collection name is initialized to the fs.files collection for the corresponding bucket.
3b04df8
to
d75dccc
Compare
private $name; | ||
If you want to pass options, such as a metadata object to the uploaded file, you | ||
If you want to add metadata to the uploaded file, you can pass it as the last | ||
can pass an ``UploadOptions`` object as the last argument to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"can pass it as the last" on the line above should be deleted.
|
||
/** @Field */ | ||
private $name; | ||
If you want to pass options, such as a metadata object to the uploaded file, you |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
If you want to pass options, such as a metadata object to the uploaded file, you | ||
If you want to add metadata to the uploaded file, you can pass it as the last | ||
can pass an ``UploadOptions`` object as the last argument to the | ||
``uploadFromFile``, ``uploadFromStream`` or ``openUploadStream`` method call: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oxford comma after uploadFromStream
.
Summary
This PR brings back support for GridFS. The idea is to provide mapped documents for GridFS files, while not re-implementing upload/download functionality currently in mongodb/mongodb. The general idea is:
_id
,length
,chunkSize
,filename
,uploadDate
andmetadata
metadata
field is checked for changes (if mapped)persist
on a new file instance is not supported - the file has to be uploaded using the methods in the newGridFSRepository
. This is because the specification suggests using streams, which isn't entirely feasible at this timeMapping
Annotation mapping is done using special annotations (e.g.
@File
,@File\Filename
,@File\UploadDate
) that have predefined types and database field names and are marked asnotSaved
as a precaution.XML mapping is done using special field mappings:
Note: the metadata field is treated like any other
embed-one
relationship; the repository does not check the given metadata object for its type when uploading a new file (see below).Files and objects
Files can be read and hydrated into objects like all other documents. File contents are no longer available in the object itself but has to be read using the repository methods (
downloadToStream
at this time).GridFS repository
This PR introduces a new interface for GridFS repositories along with a default implementation. The repository provides additional methods to upload and download files from and to streams. The repository is the only way to upload files, as mentioned above persisting a new file is not allowed and will cause an exception in
UnitOfWork
when callingpersist($newFile)
. Once a file has been read from the database, it can be updated like all other documents.BC breaks
Doctrine\ODM\MongoDB\Configuration
:setDefaultRepositoryClassName
andgetDefaultRepositoryClassName
have been replaced bysetDefaultDocumentRepositoryClassName
andgetDefaultDocumentRepositoryClassName
, respectively to accomodate the new GridFS repositories.@File
annotation has been changed and is no longer a property-level annotation but a class-level annotationmetadata
documentDoctrine\ODM\MongoDB\DocumentRepository
class was moved toDoctrine\ODM\MongoDB\Repository\DocumentRepository
.Todo
Document BC breaks inUPGRADE
document