Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20386][Spark Core]modify the log info if the block exists on the slave already #17683

Closed
wants to merge 10 commits into from

Conversation

eatoncys
Copy link
Contributor

What changes were proposed in this pull request?

Modify the added memory size to memSize-originalMemSize if the block exists on the slave already
since if the block exists, the added memory size should be memSize-originalMemSize; if originalMemSize is bigger than memSize ,then the log info should be Removed memory, removed size should be originalMemSize-memSize

How was this patch tested?

Multiple runs on existing unit tests

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

blockId, blockManagerId.hostPort, Utils.bytesToString(memSize-originalMemSize),
Utils.bytesToString(_remainingMem)))
}
else{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of style problems in this change. Spaces are missing around operators, else shouldn't be on a new line. Also go ahead and use interpolated string style in the log message, and pull out some helper vars for clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, i'm dealing right away

@@ -497,11 +497,12 @@ private[spark] class BlockManagerInfo(

updateLastSeenMs()

var originalMemSize: Long = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why pull out a var here?

logInfo("Added %s in memory on %s (size: %s, free: %s)".format(
blockId, blockManagerId.hostPort, Utils.bytesToString(memSize),
Utils.bytesToString(_remainingMem)))
if((memSize-originalMemSize) >= 0){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (memSize >= originalMemSize) is clearer and safer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, i'm dealing right away

@eatoncys
Copy link
Contributor Author

@srowen Thank you for suggestion, i have modified the style problems, any other suggestion?

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally I think you might be able to further unify and standardize the log statements in the code below where it deals with just dropping the blocks. This should stay somewhat consistent.

@@ -520,8 +520,18 @@ private[spark] class BlockManagerInfo(
blockStatus = BlockStatus(storageLevel, memSize = memSize, diskSize = 0)
_blocks.put(blockId, blockStatus)
_remainingMem -= memSize
logInfo("Added %s in memory on %s (size: %s, free: %s)".format(
blockId, blockManagerId.hostPort, Utils.bytesToString(memSize),
val originalMemSize = if (_blocks.containsKey(blockId)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, it's not a Scala Map or else this could be .getOrElse(..., 0). Now I understand why you pulled out the variable earlier. I actually dont' think this works now because it happens after the new status has been put in the Map. OK, you could go back to what you had.

} else {
0
}
val (addedOrRemoved, size) = if (memSize >= originalMemSize) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference doesn't need to be conditional. You can use math.abs

} else {
("Removed", originalMemSize - memSize)
}
logInfo("%s %s in memory on %s (size: %s, free: %s)".format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still doesn't use interpolation

("Removed", originalMemSize - memSize)
}
logInfo("%s %s in memory on %s (size: %s, free: %s)".format(
addedOrRemoved, blockId, blockManagerId.hostPort, Utils.bytesToString(size),
Utils.bytesToString(_remainingMem)))
}
if (storageLevel.useDisk) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the disk case need a similar treatment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have modified the disk case, and add a "update" style log info, any other suggestions? thank you

@eatoncys
Copy link
Contributor Author

@srowen I have modified the "Removed" to "Updated" if the block exists already, and keep the size to current block size. Add a "Updated" log info including current size and original size maybe better

logInfo("Added %s on disk on %s (size: %s)".format(
blockId, blockManagerId.hostPort, Utils.bytesToString(diskSize)))
if (blockExists) {
logInfo("Updated %s on disk on %s (current size: %s, original size %s)".format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are still not using string interpolation, please do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry,i don't understand the meaning of "string interpolation" very clearly, is it like this: logInfo(s"Added $blockId on disk on ${blockManagerId.hostPort} (size: $diskSize)") ? but this is different from other log infos in this file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you've got it. It's worth changing all of the related log messages here, but not necessarily every one in the file. We are slowly updating to use string interpolation as this code gets updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,i have modified it, thank you!

if (_blocks.containsKey(blockId)) {
// The block exists on the slave already.
val blockStatus: BlockStatus = _blocks.get(blockId)
val originalLevel: StorageLevel = blockStatus.storageLevel
val originalMemSize: Long = blockStatus.memSize
originalMemSize = blockStatus.memSize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLocks of code even farther down can reuse these new values and need similar changes to the log comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,thanks for suggestion, I have modified it

@@ -497,11 +497,18 @@ private[spark] class BlockManagerInfo(

updateLastSeenMs()

var blockExists = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although you could just assign val blockExists = _blocks.containsKey(blockId) once and reuse it instead of a var, it's minor. This looks good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea, i have modified it, thank you

@SparkQA
Copy link

SparkQA commented Apr 21, 2017

Test build #3670 has finished for PR 17683 at commit 60ca9e0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 21, 2017

@eatoncys I think you have lines that are too long

@eatoncys
Copy link
Contributor Author

eatoncys commented Apr 21, 2017

@srowen Yes, i have two lines more than 100 characters, i have modified it, thank you

@SparkQA
Copy link

SparkQA commented Apr 21, 2017

Test build #3672 has finished for PR 17683 at commit 664dfb8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in 05a4514 Apr 22, 2017
@srowen
Copy link
Member

srowen commented Apr 22, 2017

Merged to master/2.2

asfgit pushed a commit that referenced this pull request Apr 22, 2017
…the slave already

## What changes were proposed in this pull request?
Modify the added memory size to memSize-originalMemSize if the  block exists on the slave already
since if the  block exists, the added memory size should be memSize-originalMemSize; if originalMemSize is bigger than memSize ,then the log info should be Removed memory, removed size should be originalMemSize-memSize

## How was this patch tested?
Multiple runs on existing unit tests

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: eatoncys <chen.yanshan@zte.com.cn>

Closes #17683 from eatoncys/SPARK-20386.

(cherry picked from commit 05a4514)
Signed-off-by: Sean Owen <sowen@cloudera.com>
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…the slave already

## What changes were proposed in this pull request?
Modify the added memory size to memSize-originalMemSize if the  block exists on the slave already
since if the  block exists, the added memory size should be memSize-originalMemSize; if originalMemSize is bigger than memSize ,then the log info should be Removed memory, removed size should be originalMemSize-memSize

## How was this patch tested?
Multiple runs on existing unit tests

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: eatoncys <chen.yanshan@zte.com.cn>

Closes apache#17683 from eatoncys/SPARK-20386.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants