[CARBONDATA-736]Fixed Performance issue for dictionary loading during decoder #470

kumarvishal09
Contributor

Problem
Currently, dictionary loading in the Carbon decoder is slow because the get method is called for each dictionary individually.

Solution
Call the getAll API to load the dictionaries concurrently.
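
The difference between per-key and batched loading can be sketched as follows. This is a minimal illustration, not the CarbonData implementation: `SketchCache`, its `load` function, and the pool size of 4 are assumptions made for the example.

```scala
import java.util.concurrent.{Callable, Executors}
import scala.collection.JavaConverters._

// Hypothetical stand-in for a dictionary cache; not the CarbonData API.
class SketchCache(load: String => String) {
  // Loads one entry; a caller looping over keys pays each load cost in sequence.
  def get(key: String): String = load(key)

  // Submits every load to a thread pool and returns the results in key order.
  def getAll(keys: java.util.List[String]): java.util.List[String] = {
    // Pool size is an arbitrary choice for this sketch.
    val pool = Executors.newFixedThreadPool(math.max(1, math.min(keys.size, 4)))
    try {
      val tasks = keys.asScala.map { key =>
        new Callable[String] { override def call(): String = load(key) }
      }.asJava
      // invokeAll returns its futures in the same order as the submitted tasks.
      pool.invokeAll(tasks).asScala.map(_.get()).asJava
    } finally {
      pool.shutdown()
    }
  }
}
```

With a slow `load`, calling `get` in a loop costs roughly the sum of the individual load times, while `getAll` lets the loads overlap.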

@CarbonDataQA

Build Failed with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/332/

@kumarvishal09 kumarvishal09 force-pushed the DictionaryLoadingPerformanceIssue branch from 44abae4 to 29051d8 Compare December 28, 2016 14:56
@CarbonDataQA

Build Failed with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/360/

@kumarvishal09 kumarvishal09 force-pushed the DictionaryLoadingPerformanceIssue branch from 29051d8 to e534a05 Compare December 28, 2016 15:11
@CarbonDataQA

Build Failed with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/361/

@kumarvishal09 kumarvishal09 force-pushed the DictionaryLoadingPerformanceIssue branch from e534a05 to 6668f75 Compare December 28, 2016 15:18
@CarbonDataQA

Build Failed with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/362/

@kumarvishal09 kumarvishal09 force-pushed the DictionaryLoadingPerformanceIssue branch 2 times, most recently from a1a0d74 to 3fa0b0f Compare January 8, 2017 17:48
@CarbonDataQA

Build Failed with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/509/

@CarbonDataQA

Build Failed with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/510/

@kumarvishal09
Contributor Author

retest this please

@CarbonDataQA

Build Failed with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/517/

try {
val noDictionaryIndexes = new java.util.ArrayList[Int]()
dictionaryColumnIds.zipWithIndex.foreach { x =>
if (x._1 == null) {
Contributor

can you use (columnId, index) instead of x?

Contributor Author

ok
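
The suggested naming can be sketched as below. This is an illustrative rewrite of the loop under review, not the code that was merged: `NamedTupleSketch`, the `String` id type, and the `ArrayBuffer` result are assumptions made for the example.

```scala
import scala.collection.mutable.ArrayBuffer

object NamedTupleSketch {
  // Returns the positions of columns that have no dictionary (null id).
  def noDictionaryIndexes(dictionaryColumnIds: Seq[String]): Seq[Int] = {
    val indexes = ArrayBuffer.empty[Int]
    // `case (columnId, index)` replaces the positional x._1 / x._2 accessors.
    dictionaryColumnIds.zipWithIndex.foreach { case (columnId, index) =>
      if (columnId == null) {
        indexes += index
      }
    }
    indexes.toList
  }
}
```

Destructuring the tuple in the closure header names both parts at the point of use, so the body reads without having to recall which slot holds the id.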

cache: Cache[DictionaryColumnUniqueIdentifier, Dictionary]) = {
val dicts: Seq[Dictionary] = getDictionaryColumnIds.map { f =>
cache: Cache[DictionaryColumnUniqueIdentifier, Dictionary]) = {
val dictionaryColumnIds = getDictionaryColumnIds.map { f =>
if (f._2 != null) {
Contributor

Please avoid using f; destructure the tuple with meaningful names instead.

Contributor Author

ok

val dict = cache.getAll(dictionaryColumnIds.filter(_ != null).toSeq.asJava);
val finalDict = new java.util.ArrayList[Dictionary]()
var dictIndex: Int = 0
dictionaryColumnIds.zipWithIndex.foreach { x =>
Contributor

same as previous comment

Contributor Author

ok

finalDict.add(dict.get(dictIndex))
dictIndex += 1
} else {
finalDict.add(null)
Contributor

Why is null added here?
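
The thread does not record an answer to this question. One plausible reading of the quoted loop is that the null placeholders keep the result list aligned by position with the column list, since the batched load only returns entries for non-null ids. A minimal sketch of that idea, where `AlignSketch`, `align`, and the `String` element types are hypothetical:

```scala
object AlignSketch {
  // Re-inserts null placeholders so the result lines up with columnIds by position.
  // `loaded` holds one entry per non-null id, in the same relative order.
  def align(columnIds: Seq[String], loaded: Seq[String]): Seq[String] = {
    val remaining = loaded.iterator
    columnIds.toList.map { columnId =>
      // Consume the next loaded entry only for columns that had an id.
      if (columnId != null) remaining.next() else null
    }
  }
}
```

A consumer can then look up the entry for column i at index i and treat a null as "no dictionary for this column".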

@kumarvishal09 kumarvishal09 force-pushed the DictionaryLoadingPerformanceIssue branch from 3fa0b0f to 87410a8 Compare February 28, 2017 10:19
@CarbonDataQA

Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/975/

@kumarvishal09 kumarvishal09 force-pushed the DictionaryLoadingPerformanceIssue branch from 87410a8 to 0c68fb5 Compare February 28, 2017 14:02
@kumarvishal09 kumarvishal09 changed the title from "[WIP]Fixed Performance issue for dictionary loading during decoder" to "[CARBONDATA-736]Fixed Performance issue for dictionary loading during decoder" Mar 1, 2017
@jackylk
Contributor

jackylk commented Mar 2, 2017

test this please

@CarbonDataQA

Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/997/

@jackylk
Contributor

jackylk commented Mar 5, 2017

LGTM

@asfgit asfgit closed this in 50ec532 Mar 5, 2017
Beyyes pushed a commit to Beyyes/carbondata that referenced this pull request Jul 12, 2018
…eLogUrl

[Web Portal] Fix the url of service log.