Inefficient Usage of Java Collection #1079

Merged
merged 12 commits into from
Oct 25, 2021


FastAtlas
Contributor

Hi,

We found several ArrayList objects that are never accessed by index. Because successive insertions trigger reallocation of the backing array, the add method of ArrayList runs in amortized O(1) time rather than strict O(1) time. These objects are used only for traversal and for retrieving or removing the first or the last element.

A LinkedList can provide the same functionality. Moreover, insertion into a LinkedList takes strictly O(1) time, because no reallocation of a backing array occurs.
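
To make the access pattern described above concrete, here is a minimal sketch. The class and method names are hypothetical and are not taken from the sofa-rpc code; the comments restate the documented complexity of each call, not measured performance.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class EndAccessSketch {

    // Hypothetical helper: append n values, traverse them once,
    // then remove the first and the last element. No get(i) is used.
    static long appendTraverseTrim(List<Integer> items, int n) {
        for (int i = 0; i < n; i++) {
            items.add(i); // amortized O(1) on ArrayList, O(1) on LinkedList
        }
        long sum = 0;
        for (int value : items) { // sequential traversal, O(n) on both
            sum += value;
        }
        if (!items.isEmpty()) {
            items.remove(0); // O(n) on ArrayList (elements shift), O(1) on LinkedList
        }
        if (!items.isEmpty()) {
            items.remove(items.size() - 1); // O(1) on both
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> arrayBacked = new ArrayList<>();
        List<Integer> linked = new LinkedList<>();
        System.out.println(appendTraverseTrim(arrayBacked, 1000));
        System.out.println(appendTraverseTrim(linked, 1000));
        // Both implementations end up with the same contents.
        System.out.println(arrayBacked.equals(linked));
    }
}
```

Both lists behave identically from the caller's point of view; only the cost of the individual operations differs.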

We also found several LinkedHashSet and LinkedHashMap objects whose insertion order is never relied on. HashMap and HashSet are enough to provide the same functionality. The replacement can also reduce the time cost of modifying these collections.
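
As an illustration of when this replacement is safe, the following sketch (hypothetical names, not from the patch) shows the one guarantee that is given up: LinkedHashMap iterates in insertion order, while HashMap makes no promise about iteration order. Code that only calls put, get, or contains is unaffected.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InsertionOrderSketch {

    // Hypothetical helper: insert three keys and report the iteration order.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        map.put("zeta", 1);
        map.put("alpha", 2);
        map.put("mid", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is guaranteed: prints [zeta, alpha, mid]
        System.out.println(keysInIterationOrder(new LinkedHashMap<>()));
        // Same three keys, but the order is unspecified and may differ
        System.out.println(keysInIterationOrder(new HashMap<>()));
    }
}
```

If any caller iterates over the map and depends on the order, the LinkedHashMap has to stay.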

We discovered these inefficient container usages with our tool, Ditto, and have submitted a patch. Could you please review and accept it? We have tested the patch on our PC, and the patched program works well.

Best,

Ditto

Transform LinkedHashMap to HashMap
Transform LinkedHashMap to HashMap
Transform LinkedHashMap to HashMap
Transform LinkedHashMap to HashMap
Transform LinkedHashSet to HashSet
Transform LinkedHashSet to HashSet
Transform ArrayList to LinkedList
Transform ArrayList to LinkedList
Transform ArrayList to LinkedList
Transform ArrayList to LinkedList
Transform ArrayList to LinkedList
@sofastack-bot

sofastack-bot bot commented Sep 6, 2021

Hi @DittoTool, welcome to the SOFAStack community. Please sign the Contributor License Agreement!

After you sign the CLA, we will automatically sync the status of this pull request within 3 minutes.

@sofastack-bot sofastack-bot bot added cla:no Need sign CLA First-time contributor First-time contributor size/M labels Sep 6, 2021
@codecov

codecov bot commented Sep 7, 2021

Codecov Report

Merging #1079 (522b9c8) into master (5f3dd7e) will decrease coverage by 0.00%.
The diff coverage is 88.88%.


@@             Coverage Diff              @@
##             master    #1079      +/-   ##
============================================
- Coverage     68.93%   68.93%   -0.01%     
- Complexity      825      826       +1     
============================================
  Files           409      409              
  Lines         17693    17693              
  Branches       2744     2744              
============================================
- Hits          12197    12196       -1     
- Misses         4079     4081       +2     
+ Partials       1417     1416       -1     
Impacted Files Coverage Δ
...ay/sofa/rpc/client/AllConnectConnectionHolder.java 60.51% <0.00%> (-0.26%) ⬇️
...ofa/rpc/codec/sofahessian/BlackListFileLoader.java 77.55% <100.00%> (ø)
...n/java/com/alipay/sofa/rpc/client/RouterChain.java 89.87% <100.00%> (ø)
...n/java/com/alipay/sofa/rpc/filter/FilterChain.java 83.15% <100.00%> (ø)
...m/alipay/sofa/rpc/config/JAXRSProviderManager.java 75.00% <100.00%> (ø)
...ava/com/alipay/sofa/rpc/common/MetadataHolder.java 90.00% <100.00%> (ø)
.../sofa/rpc/tracer/sofatracer/RestTracerAdapter.java 54.83% <100.00%> (ø)
...a/com/alipay/sofa/rpc/common/utils/ClassUtils.java 78.23% <0.00%> (-2.05%) ⬇️
...lipay/sofa/rpc/registry/consul/ConsulRegistry.java 47.80% <0.00%> (+0.54%) ⬆️
... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5f3dd7e...522b9c8.

@OrezzerO
Contributor

OrezzerO commented Sep 7, 2021

Thank you for your PR, we will check the changes later.

@leyou240
Contributor

leyou240 commented Sep 8, 2021

@DittoTool Where can I get the Ditto tool?

@FastAtlas
Contributor Author

> @DittoTool Where can I get the Ditto tool?

Currently, it is a commercial tool. It might be open-sourced later. :)

Contributor

@OrezzerO OrezzerO left a comment

LGTM

@FastAtlas
Contributor Author

> LGTM

Thank you for the confirmation :)

@leizhiyuan
Contributor

please see quarkusio/quarkus#19962 (comment)

I won't comment on the other changes, but I'd like to highlight one thing about the ArrayList -> LinkedList move.

The advantages of linked lists over arrays in scenarios where the list is only iterated or appended are only real under the RAM computation model. Real computers are not RAM computers, they employ memory hierarchies, and so the same logic does not apply. Using a LinkedList is almost always worse than using an ArrayList.
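
One rough way to check this claim on a given machine is to time the same append-and-iterate workload on both list types. The sketch below is an assumption-laden illustration, not a rigorous benchmark: it uses System.nanoTime with a simple warm-up loop, the class and method names are hypothetical, and the numbers will vary with the JVM, heap settings, and hardware. A harness such as JMH would give more trustworthy results.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class TraversalTimingSketch {

    // Hypothetical workload: append n values, then sum them by iteration.
    static long fillAndSum(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    // Wall-clock time of one run of the workload, in nanoseconds.
    static long timeNanos(List<Integer> list, int n) {
        long start = System.nanoTime();
        fillAndSum(list, n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 200_000;
        // Warm-up runs so the JIT compiler has a chance to optimize both paths.
        for (int i = 0; i < 20; i++) {
            fillAndSum(new ArrayList<>(), n);
            fillAndSum(new LinkedList<>(), n);
        }
        System.out.println("ArrayList  ns: " + timeNanos(new ArrayList<>(), n));
        System.out.println("LinkedList ns: " + timeNanos(new LinkedList<>(), n));
    }
}
```

The two lists compute the same result; any difference in the printed times comes from allocation and memory-access behavior, which the asymptotic complexity does not capture.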

@leizhiyuan leizhiyuan self-requested a review September 11, 2021 11:26
@FastAtlas
Contributor Author

FastAtlas commented Sep 12, 2021

> please see quarkusio/quarkus#19962 (comment)
>
> I won't comment on the other changes, but I'd like to highlight one thing about the ArrayList -> LinkedList move.
>
> The advantages of linked lists over arrays in scenarios where the list is only iterated or appended are only real under the RAM computation model. Real computers are not RAM computers, they employ memory hierarchies, and so the same logic does not apply. Using a LinkedList is almost always worse than using an ArrayList.

I think there is a gap between the design and the implementation of these collections. Newer JVM versions might introduce some optimizations for ArrayList. However, from a complexity perspective, LinkedList is still better than ArrayList in some cases. Otherwise, why does LinkedList still exist in the JCF?

Ditto takes the complexity specification as its input and is unaware of JVM optimizations. On the one hand, its results only suggest potentially inefficient usage, and dynamic profiling statistics could be introduced to make the recommendations more precise. On the other hand, we simply take the complexity specification as our assumption. If we adopted the behavior of the real JVM implementation instead, we could also recommend replacing the LinkedList objects in the program with ArrayList objects.

Of course, such changes might not make a big difference. The sofa-rpc project does not have severe cases of inefficient container usage. The patch contains only three kinds of transformations: ArrayList->LinkedList, LinkedHashMap->HashMap, and LinkedHashSet->HashSet. These three patterns actually have little influence on complexity. In other cases, however, a list may be used as a set, and Ditto will then recommend replacing the list object with a HashSet object, because the contains method has a completely different complexity on a list than on a set.
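
The list-used-as-a-set case can be sketched as follows. The names are hypothetical and the example is not taken from Ditto or from sofa-rpc; it only shows that the same membership test is a linear scan on a list and an expected constant-time hash lookup on a HashSet.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MembershipSketch {

    // Hypothetical helper: count how many probes are present in the collection.
    // With a List, each contains call scans up to all elements, O(n).
    // With a HashSet, each contains call is an expected O(1) hash lookup.
    static int countHits(Collection<Integer> haystack, int[] probes) {
        int hits = 0;
        for (int probe : probes) {
            if (haystack.contains(probe)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> asList = new ArrayList<>();
        for (int i = 0; i < 10_000; i += 2) {
            asList.add(i); // even numbers only
        }
        Set<Integer> asSet = new HashSet<>(asList);

        int[] probes = {0, 1, 2, 3, 9_998, 9_999};
        // Same answer from both; only the cost per lookup differs.
        System.out.println(countHits(asList, probes));
        System.out.println(countHits(asSet, probes));
    }
}
```

The replacement is only valid when duplicates and element order do not matter to the surrounding code, which is the situation described above.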

@EvenLjj EvenLjj added this to the 5.8.0 milestone Sep 14, 2021
@EvenLjj EvenLjj linked an issue Sep 18, 2021 that may be closed by this pull request
Collaborator

@EvenLjj EvenLjj left a comment

LGTM

@EvenLjj EvenLjj merged commit e8ac663 into sofastack:master Oct 25, 2021
Labels
cla:no Need sign CLA First-time contributor First-time contributor size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inefficient Usage of JCF
5 participants