[SPARK-20816][CORE] MetricsConfig doesn't trim the properties file, causing a very confusing exception #18041

Closed
wants to merge 1 commit

Conversation

@LantaoJin (Contributor) commented May 20, 2017


What changes were proposed in this pull request?

Spark's metrics system uses a properties file to load its configuration but doesn't trim the keys and values. This can lead to a very confusing exception when the property value is a class name.
In the example below, it is easy to miss that there is a trailing space at the end of the line.

*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink

Unfortunately, the ClassNotFoundException thrown from the driver doesn't explain what actually happened, which is confusing because the related jar is definitely on the CLASSPATH.

17/05/20 12:47:04 ERROR SparkContext: Error initializing SparkContext.
java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.GangliaSink
at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
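
To make the failure mode concrete (this repro is mine, not part of the original report): the class can be present on the classpath, yet a single trailing space in the looked-up name is enough to make reflection fail. A minimal Scala sketch, using a JDK class so it runs anywhere:

object TrailingSpaceRepro {
  def main(args: Array[String]): Unit = {
    // The class exists and loads fine with the exact name...
    println(Class.forName("java.lang.String"))
    // ...but the same name with a trailing space throws
    // java.lang.ClassNotFoundException: java.lang.String
    println(Class.forName("java.lang.String "))
  }
}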

As a reference, I checked the code of Log4j, a classic properties-driven library. It trims the entries when loading the properties. See org.apache.log4j.filter.PropertyFilter.java:

private Hashtable parseProperties(String props) {
	Hashtable hashTable = new Hashtable();
	StringTokenizer pairs = new StringTokenizer(props, ",");
	while (pairs.hasMoreTokens()) {
		StringTokenizer entry = new StringTokenizer(pairs.nextToken(), "=");
		// both the key and the value are trimmed before being stored
		hashTable.put(entry.nextElement().toString().trim(), entry.nextElement().toString().trim());
	}
	return hashTable;
}
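
For comparison, here is a rough Scala sketch of the same idea applied when loading a metrics-style properties file. The helper name loadTrimmed is made up for illustration and is not the actual MetricsConfig code:

import java.io.FileInputStream
import java.util.Properties
import scala.collection.JavaConverters._

// Illustrative helper (not the real MetricsConfig): load a properties
// file, then trim whitespace around every key and value.
def loadTrimmed(path: String): Properties = {
  val raw = new Properties()
  val in = new FileInputStream(path)
  try raw.load(in) finally in.close()

  val trimmed = new Properties()
  raw.stringPropertyNames().asScala.foreach { k =>
    trimmed.setProperty(k.trim, raw.getProperty(k).trim)
  }
  trimmed
}

With something like this, the trailing space after GangliaSink in the example above would be stripped before the class name ever reaches reflection.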

How was this patch tested?

Added unit tests.

It can also be tested manually by setting the following line in the metrics.properties file
(the "_" characters mark where the spaces go; replace them with " "):

*.sink.csv.class=_org.apache.spark.metrics.sink.CsvSink_
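
A hypothetical ScalaTest sketch of what such a unit test could look like, written against the illustrative loadTrimmed helper above rather than the suite actually added by this PR:

import java.nio.file.Files
import org.scalatest.FunSuite

class TrimPropertiesSuite extends FunSuite {
  test("keys and values are trimmed when loading a properties file") {
    // Note the trailing spaces after CsvSink, mirroring the manual test above.
    val file = Files.createTempFile("metrics", ".properties")
    Files.write(file, "*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink  \n".getBytes)

    val props = loadTrimmed(file.toString)
    assert(props.getProperty("*.sink.csv.class") === "org.apache.spark.metrics.sink.CsvSink")
  }
}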

@AmplabJenkins commented
Can one of the admins verify this patch?

@srowen (Member) commented May 20, 2017

Why do you think this relates to trimming whitespace?

@LantaoJin (Contributor, Author) commented
@srowen It's not a normal class-not-found case, and I do know what happened here. What I'm pointing out is that whitespace at the end of the class name causes a ClassNotFoundException, which is very confusing to users. If the value were trimmed before reflection, that would be much better, I think.

@jerryshao (Contributor) commented
@LantaoJin I don't think you have to trim the metrics conf coming from SparkConf:

  1. SparkConf already handles trimming when reading from the spark-defaults.conf file.
  2. If you deliberately add trailing whitespace in code, like here, I think it's up to you to guarantee the correctness of such a configuration.

For metrics conf read from the metrics properties file, I think we could trim the trailing whitespace when reading from the property file.
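
A rough sketch of what that split could look like (names and structure are illustrative, not the real MetricsConfig code): trim the entries read from the file, and take the SparkConf entries as they are, on the assumption that SparkConf has already normalized them.

import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.spark.SparkConf

// Illustrative merge: file-based entries are trimmed defensively,
// SparkConf-based entries (spark.metrics.conf.*) are trusted as-is.
def buildMetricsProperties(conf: SparkConf, fromFile: Properties): Properties = {
  val merged = new Properties()
  fromFile.stringPropertyNames().asScala.foreach { k =>
    merged.setProperty(k.trim, fromFile.getProperty(k).trim)
  }
  conf.getAll
    .filter { case (k, _) => k.startsWith("spark.metrics.conf.") }
    .foreach { case (k, v) =>
      merged.setProperty(k.stripPrefix("spark.metrics.conf."), v)
    }
  merged
}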

@srowen (Member) commented May 30, 2017

Let's close this

@jerryshao (Contributor) commented
@srowen, this issue exists when reading from the metrics.properties conf file, so I think we should fix that part. As for the SparkConf part, I don't think a fix is necessary.

@srowen mentioned this pull request Jun 7, 2017
@asfgit closed this in b771fed Jun 8, 2017