Translate script - S3 bucket support is no longer compatible with options for translating a file or sequence of files #685
Comments
@mmartin9684-sil Are you currently blocked because of this issue? @TaperChipmunk32 Can you take a look at this when you get a chance?
Yes, I can look into this tomorrow.
@ddaspit - at the moment, I just need support for the
It might be good to check the hybrid drafting feature against this issue. I believe the hybrid drafting feature modifies the file names of the target language drafts that get created, and it would be good to be sure that drafts with those modified file names can be copied back from the remote server.
When we produce multiple translations, we add a draft number to the output file name. For example, if your single-draft output was
The first two bullet points appear to be straightforward to address, and would just be behind-the-scenes changes that do not affect how you would run the command. But for the third, I have a question. I believe I can get Alternatively, we could just add a message specifying that all files must be in the
If the For the `--src-prefix` et al. options, it would be enough to plan for a single source/target directory; there is no need to support multiple directories. An option for a different target directory (instead of placing the drafts in the same directory as the source files) would be a nice flexibility.
Could we make
I can do both of those things. I will also go ahead and support multiple directories, since it does not add much complexity.
Yes, I could do that
@mmartin9684-sil Do you have an example project/command I could use for testing?
With the move from Gutenberg to the S3 bucket, the translate script is no longer usable with the options for translating a single specific source file (`--src` and `--trg`) or a series of source files (`--src-prefix`, `--trg-prefix`, `--start-seq`, `--end-seq`).
- `copy_experiment_from_bucket` is only coded to copy the model checkpoints from the S3 bucket to the local drive. The source files for these 2 translation options are not copied.
- `copy_experiment_to_bucket` is only coded to copy `*.SFM` files from the local drive to the S3 bucket. No other file types (e.g., `*.txt`, `*.docx`) supported by these 2 translation options are targeted.
- An additional improvement would be an intuitive scheme for specifying the source and target files on the command line, or help text of some kind to explain how to place the source files in the S3 bucket, where to find the target files once they are created, and how to specify these files as command line arguments for this command.
Note that this issue is not due to the introduction of MinIO.
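To make the `copy_experiment_to_bucket` limitation described above concrete, here is a minimal sketch of selecting files by a set of extensions rather than only `*.SFM`. It is an illustration under stated assumptions, not the project's actual code: the helper name `select_files_to_upload` and the `TRANSLATABLE_SUFFIXES` set are hypothetical, and the real upload step to the S3 bucket is deliberately left out.

```python
from pathlib import Path
from typing import Iterable, List

# Hypothetical extension set: the issue names *.SFM, *.txt and *.docx as examples;
# the real list of supported types may differ.
TRANSLATABLE_SUFFIXES = {".sfm", ".txt", ".docx"}


def select_files_to_upload(
    local_dir: Path, suffixes: Iterable[str] = TRANSLATABLE_SUFFIXES
) -> List[Path]:
    """Return every file under local_dir whose extension is in suffixes.

    Matching is case-insensitive, so both 'draft.SFM' and 'draft.sfm' are selected.
    """
    wanted = {s.lower() for s in suffixes}
    return sorted(
        p for p in local_dir.rglob("*") if p.is_file() and p.suffix.lower() in wanted
    )
```

A caller would pass the resulting paths to whatever routine performs the copy to the bucket; the same filter could be mirrored on the download side so that source files are fetched along with the model checkpoints.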