-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Append HADOOP_CONF_DIR to the tools CLASSPATH execution cmd #1308
Conversation
Signed-off-by: Ahmed Hussein <ahussein@nvidia.com> Fixes NVIDIA#1253 Fixes NVIDIA#1302 This change includes the following: - the python wrapper pulls the hadoop configuration directory `$HADOOP_CONF_DIR` env var. If the latter is not defined, the wrapper tries `$HADDOP_HOME/etc/hadoop`. - If the `hadoop_conf_dir` is defined then it is appended to the java CLASSPATH iff it is a valid local directory path - If none of the above applies, the class path will be the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this fixing #1253? Just because we pick up conf now?
Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
Signed-off-by: Ahmed Hussein <ahussein@nvidia.com>
Signed-off-by: Ahmed Hussein <ahussein@nvidia.com> Fixes NVIDIA#1303
except Exception as ex: # pylint: disable=broad-except | ||
self.logger.error('Failed in processing output arguments. Output_folder must be a local directory') | ||
raise ex | ||
# self.output_folder = FSUtil.get_abs_path(self.output_folder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can be removed
deps_arr = [self.prop_container.get_jar_file()] | ||
hadoop_cp = self._get_hadoop_classpath() | ||
# append hadoop conf dir if any | ||
if hadoop_cp is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this throw an exception when hadoop conf dir was not found but event logs are in hdfs?
Signed-off-by: Ahmed Hussein <ahussein@nvidia.com>
Signed-off-by: Ahmed Hussein <ahussein@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @amahussein.
Signed-off-by: Ahmed Hussein ahussein@nvidia.com
Fixes #1253
Fixes #1283
Fixes #1302
Fixes #1303
This change includes the following:
The wrapper gets Hadoop's configuration directory from the environment variables. The first valid directory is added to the java cmd CLASSPATH. The order of available hadoop configuration directories are:
This PR also enforces URI to the
--output-folder
argument to the java cmd. This is required to prevent the core tools from storing the output-folder on the remote storage in case HDFS defines a default FileSystem.