Datasources - part 4: documentation #3105
47bf1a7 to 2a4ba27
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
8916995 to 86fd669
Signed-off-by: Florian Dupuy <florian.dupuy@rte-france.com>
… into nro/datasources_4_documentation
It allows users to read and write files. It is for example used under the hood by Importers to access the filesystem
during Network imports when using `Network.read()` methods.

For importers and exporters, datasources are used to access files corresponding to a single network
Original: For importers and exporters, datasources are used to access files corresponding to a single network
Suggested: For importers and exporters, datasources are used to access files corresponding to a single network.
Note: this does not apply to compression extensions.

_**Example:**
For a file named `europe.west.xiidm.gz`, the base name could be `europe.west` for instance (or `europe` or `europe.w` or ...), while the data extension would be `xiidm`._
Original: For a file named `europe.west.xiidm.gz`, the base name could be `europe.west` for instance (or `europe` or `europe.w` or ...), while the data extension would be `xiidm`._
Suggested: For a file named `europe.west.xiidm.gz`, the base name could be `europe.west` for instance (or `europe` or `europe.w` or ...), while the data extension would be `xiidm` and the compression extension `gz`._
I prefer to remove the `.gz` in the filename since we are not yet talking about compression.
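To make the naming convention discussed in this thread concrete, here is a minimal sketch of how a file name can be decomposed into a base name, a data extension and a compression extension. It is not PowSyBl code: the `FileNameParts` class, its `split` method and the list of compression extensions are assumptions made for illustration. The sketch picks the longest possible base name, which is only one of the valid choices mentioned in the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of PowSyBl): splits a file name into the three parts discussed above. */
final class FileNameParts {
    // Assumed set of compression extensions, for illustration only.
    private static final List<String> COMPRESSION_EXTENSIONS = Arrays.asList("gz", "bz2", "xz", "zst");

    final String baseName;
    final String dataExtension;        // empty string if the file has none
    final String compressionExtension; // empty string if the file is not compressed

    private FileNameParts(String baseName, String dataExtension, String compressionExtension) {
        this.baseName = baseName;
        this.dataExtension = dataExtension;
        this.compressionExtension = compressionExtension;
    }

    static FileNameParts split(String fileName) {
        String remaining = fileName;

        // 1. Peel off the compression extension, if the last extension is a known one.
        String compression = "";
        int lastDot = remaining.lastIndexOf('.');
        if (lastDot > 0 && COMPRESSION_EXTENSIONS.contains(remaining.substring(lastDot + 1))) {
            compression = remaining.substring(lastDot + 1);
            remaining = remaining.substring(0, lastDot);
        }

        // 2. The data extension is the last remaining extension.
        String dataExtension = "";
        lastDot = remaining.lastIndexOf('.');
        if (lastDot > 0) {
            dataExtension = remaining.substring(lastDot + 1);
            remaining = remaining.substring(0, lastDot);
        }

        // 3. Whatever is left is taken as the (longest possible) base name.
        return new FileNameParts(remaining, dataExtension, compression);
    }

    public static void main(String[] args) {
        FileNameParts parts = split("europe.west.xiidm.gz");
        // prints "europe.west | xiidm | gz"
        System.out.println(parts.baseName + " | " + parts.dataExtension + " | " + parts.compressionExtension);
    }
}
```

With a file name that has no compression extension, such as `europe.west.xiidm`, the same sketch yields an empty compression extension, which matches the idea of introducing compression only later in the documentation.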
Two classes implement the `DataSource` interface:
- `MemDataSource`: extension of `ReadOnlyMemDataSource` implementing the writing features of `DataSource`
- `AbstractFileSystemDataSource`: abstract class used to define datasources based on files present in the file system,
either directly or in an archive.
Original: either directly or in an archive.
Suggested: either directly (see below the DirectoryDataSource class and its children) or in an archive (see below the AbstractArchiveDataSource and its children).
`ZstdDirectoryDataSource`.

`DirectoryDataSource` integrates the notions of base name and data extension:
- The base name is used to access files that all start with the same String. For example, `network` would
Original: - The base name is used to access files that all start with the same String. For example, `network` would
Suggested: - The base name is used to access files that all start with the same prefix. For example, `network` would
`(String suffix, String ext)` as parameters, you still have the possibility to use files that do not correspond to the
base name and data extension by using the methods with `(String filename)` as parameter, excluding the compression
extension if there is one.
Document that `listNames` filters by the base name, contrary to `exists(String filename)`.
given as a parameter in the datasource constructor, the archive file name is even defined using the base name and the
data extension, as `<directory>/<basename>.<dataExtension>.<archiveExtension>.<compressionExtension>` with the
compression extension being optional depending on the archive format. For example `network.xiidm.zip` contains
`network.xiidm`.
Document that `listNames` lists everything without filtering by the base name.
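The archive naming pattern quoted in this hunk, `<basename>.<dataExtension>.<archiveExtension>.<compressionExtension>`, can be sketched as a small string-building function. This is an illustration only: `ArchiveNameSketch` and `archiveFileName` are hypothetical names and do not come from PowSyBl, and the sketch returns only the file name, leaving the `<directory>/` part to the caller.

```java
/** Hypothetical helper (not part of PowSyBl): builds an archive file name from its parts. */
final class ArchiveNameSketch {

    /**
     * Builds "<basename>.<dataExtension>.<archiveExtension>.<compressionExtension>".
     * The data extension and the compression extension are treated as optional:
     * pass null or an empty string to leave them out.
     */
    static String archiveFileName(String baseName, String dataExtension,
                                  String archiveExtension, String compressionExtension) {
        StringBuilder name = new StringBuilder(baseName);
        if (dataExtension != null && !dataExtension.isEmpty()) {
            name.append('.').append(dataExtension);
        }
        name.append('.').append(archiveExtension);
        if (compressionExtension != null && !compressionExtension.isEmpty()) {
            name.append('.').append(compressionExtension);
        }
        return name.toString();
    }

    public static void main(String[] args) {
        // prints "network.xiidm.zip"
        System.out.println(archiveFileName("network", "xiidm", "zip", null));
        // prints "network.xiidm.tar.gz"
        System.out.println(archiveFileName("network", "xiidm", "tar", "gz"));
    }
}
```

The first call reproduces the `network.xiidm.zip` example from the text; the second shows the case where the archive format is additionally compressed.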
datasource.exists("network.south") // Returns false: the file "network.south.gz" does not exist
datasource.exists("network.xiidm") // Returns true: the file "network.xiidm.gz" exists

// Check if some files exist in the datasource by using the `exists(String fileName)` method
Original: // Check if some files exist in the datasource by using the `exists(String fileName)` method
Suggested: // Check if some files exist in the datasource by using the `exists(String suffix, String ext)` method
}

// List the files in the datasource
Set<String> files = datasource.listNames(".*") // returns a set containing: "network", "network.south", "network.xiidm", "network.v3.xiidm", "network_test.txt"
Highlight that it filters out `toto.xiidm.gz` because of the base name filtering.
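The difference the reviewers point out, `listNames` filtering on the base name while `exists(String fileName)` does not, can be illustrated with a toy model. The sketch below is not the PowSyBl implementation: `ToyGzDirectory` is a hypothetical class that keeps file names in memory and assumes every file carries a `.gz` compression extension. It only mirrors the behaviour described in the comments above.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

/** Toy in-memory model (not PowSyBl code) of a gzip-compressed directory datasource. */
final class ToyGzDirectory {
    private static final String COMPRESSION_SUFFIX = ".gz"; // assumption: all files are gzipped

    private final String baseName;
    private final Set<String> filesOnDisk;

    ToyGzDirectory(String baseName, String... filesOnDisk) {
        this.baseName = baseName;
        this.filesOnDisk = new TreeSet<>(Arrays.asList(filesOnDisk));
    }

    /** Looks up a file by its full name (without compression extension); the base name is ignored. */
    boolean exists(String fileName) {
        return filesOnDisk.contains(fileName + COMPRESSION_SUFFIX);
    }

    /** Lists names matching the regex, keeping only those that start with the base name. */
    Set<String> listNames(String regex) {
        Pattern pattern = Pattern.compile(regex);
        Set<String> names = new TreeSet<>();
        for (String file : filesOnDisk) {
            if (!file.endsWith(COMPRESSION_SUFFIX)) {
                continue; // not a gzipped file: ignored by this toy model
            }
            String name = file.substring(0, file.length() - COMPRESSION_SUFFIX.length());
            if (name.startsWith(baseName) && pattern.matcher(name).matches()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        ToyGzDirectory datasource =
                new ToyGzDirectory("network", "network.xiidm.gz", "network_test.txt.gz", "toto.xiidm.gz");
        System.out.println(datasource.listNames(".*"));       // "toto.xiidm" is filtered out
        System.out.println(datasource.exists("toto.xiidm"));  // found despite the different base name
    }
}
```

In this model a file such as `toto.xiidm.gz` is invisible to `listNames` on a datasource whose base name is `network`, yet `exists("toto.xiidm")` still finds it, which is the asymmetry the review asks to document.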
// List the files in the datasource
Set<String> files = datasource.listNames(".*") // returns a set containing: "network", "network.south", "network.xiidm", "network.v3.xiidm", "network_test.txt"
```
More things to possibly document (in this PR or in another):
- Using different datasources on the same directory to select different networks
- `exists(filename)` on a file with a different base name that returns true
## Principles

Datasources are Java-objects used for I/O operations around PowSyBl.
It allows users to read and write files. It is for example used under the hood by Importers to access the filesystem
Original: It allows users to read and write files. It is for example used under the hood by Importers to access the filesystem
Suggested: A Datasource allows users to read and write files. It is for example used under the hood by Importers to access the filesystem
reading features.
It has two parameters:
- a base name, which is a prefix that can be used to consider only files with this prefix (while reading) or as a prefix for
the output file (while writing),
Original: the output file (while writing),
Suggested: the output file name (while writing),
`ReadOnlyDataSource` is the most basic datasource interface available. As you can tell by the name, it only provides
reading features.
It has two parameters:
- a base name, which is a prefix that can be used to consider only files with this prefix (while reading) or as a prefix for
Original: - a base name, which is a prefix that can be used to consider only files with this prefix (while reading) or as a prefix for
Suggested: - a base name, which is a prefix that can be used to consider only files with names starting with this prefix (while reading) or as a prefix for
Two classes implement the `DataSource` interface:
- `MemDataSource`: extension of `ReadOnlyMemDataSource` implementing the writing features of `DataSource`
- `AbstractFileSystemDataSource`: abstract class used to define datasources based on files present in the file system,
either (see below the DirectoryDataSource class and its children) or in an archive (see below the
Original: either (see below the DirectoryDataSource class and its children) or in an archive (see below the
Suggested: either directly (see below the DirectoryDataSource class and its children) or in an archive (see below the
be a good base name if your files are `network.xiidm`, `network_mapping.csv`, etc.
- The data extension is the last extension of your main files, excluding the compression extension if they have one.
It usually corresponds to the data format extension: `csv`, `xml`, `json`, `xiidm`, etc. This extension is mainly used
to disambiguate the files to use in the datasource, for example when you have files that differ only by the data
This is a bit similar to the earlier sentence "(optionally) a data extension, mainly used to disambiguate identically named data of different type.", so maybe either remove it or make it a lot more specific than "mainly used". Maybe something like:
"just like you can create 2 different datasources selecting a different subset of files in a folder based on a different prefix (e.g. france.xiidm and europe.xiidm), you can use the data extension to select either france.xiidm or france.uct"
// Using a datasource with different parameters allows to use other files, even on the same directory
GzDirectoryDataSource totoDatasource = new GzDirectoryDataSource(testDir, "toto", "xiidm", observer);
oolean totoDatasource.exists(null, "xiidm"); // Returns true: the file "toto.xiidm.gz" exists in the directory
Original: oolean totoDatasource.exists(null, "xiidm"); // Returns true: the file "toto.xiidm.gz" exists in the directory
Suggested: totoDatasource.exists(null, "xiidm"); // Returns true: the file "toto.xiidm.gz" exists in the directory
Please check if the PR fulfills these requirements
What kind of change does this PR introduce?
Documentation update --> add documentation on datasources.
Does this PR introduce a breaking change or deprecate an API?