Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix container inefficiencies in FieldInfos.java #13254

Merged
merged 2 commits into from
Apr 2, 2024
Merged

Conversation

cinsttool
Copy link
Contributor

@cinsttool cinsttool commented Apr 1, 2024

Description

Hi,

We find that there exist container inefficiencies in FieldInfos.java.

At line 169, there maybe a redundant operation that Arrays.asList(valuesTemp.toArray(new FieldInfo[0])) transform List object valuesTemp to array and then back to List.

We discovered the above containers inefficiencies by our tool cinst. The patch is submitted. Could you please check and accept it? We have tested the patch on our PC. The patched program works well.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking for inefficiencies in the Lucene code. For these two ones that you found, the change doesn't look like a good trade-off to me however.

@@ -165,8 +164,7 @@ public FieldInfos(FieldInfo[] infos) {
valuesTemp.add(byNumberTemp[i]);
}
}
values =
Collections.unmodifiableCollection(Arrays.asList(valuesTemp.toArray(new FieldInfo[0])));
values = Collections.unmodifiableCollection(valuesTemp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This essentially trades RAM for CPU, as we are saving a copy but the values list may now be oversized. This doesn't obviously look like a better trade-off to me?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about counting the number of non-null infos in the for loop at line 86 and converting valuesTemp into a fixed-size array to store them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about counting the number of non-null infos in the for loop at line 86 and converting valuesTemp into a fixed-size array to store them?

Or use that non-null infos count to set the initial capacity of valuesTemp.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pre-size valuesTemp correctly, it looks like it would work indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pre-size valuesTemp correctly, it looks like it would work indeed.

We observe that the info will never be null since the if statements already visit info.number here at line 87. Therefore, we pre-size valuesTemp to infos.length.

maxFreqForLowerNorms = maxFreq;
}
}
return Collections.unmodifiableSet(freqNormPairs);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is optimizing the uncommon path (when otherFreqNormPairs is not empty) at the expense of additional code duplication. This doesn't look like a good trade-off to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

@jpountz jpountz changed the title Fix container inefficiencies in FieldInfos.java and CompetitiveImpact… Fix container inefficiencies in FieldInfos.java Apr 2, 2024
@jpountz jpountz merged commit 7c3ee79 into apache:main Apr 2, 2024
3 checks passed
@mikemccand
Copy link
Member

We discovered the above containers inefficiencies by our tool cinst.

Could you share a pointer to this tool? I'm curious how it works... thanks.

@cinsttool
Copy link
Contributor Author

We discovered the above containers inefficiencies by our tool cinst.

Could you share a pointer to this tool? I'm curious how it works... thanks.

Thank you for your interest in cinst! Our tool has not been released yet. Once cinst is open-sourced, we will inform you at first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants