Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NotificationIndex Content Column Size is too high #13935

Merged
merged 1 commit into from
Jul 6, 2023
Merged

Conversation

jtkech
Copy link
Member

@jtkech jtkech commented Jul 3, 2023

Fixes #13934

With MySql in a given inner index the total of the included columns sizes should not be greater than 3072 bytes, so with the utf8mb4_general_ci not greater than 3072 / 4 = 768 chars. We already fixed the same issue for the Alias column size that was a little too high, see #13585 (comment).

So at least for MySql, NotificationIndexContentLength which is equal to 2500 is too high.


For info when using directly .QueryIndex<SomeIndex>() it is useful that the related inner index include all columns, being a full inner index, this because .QueryIndex<SomeIndex>() selects all columns. Most of our inner indexes are missing the Id column but we began to include it, see below.

SchemaBuilder.AlterIndexTable<PublishLaterPartIndex>(table => table
.CreateIndex($"IDX_{nameof(PublishLaterPartIndex)}_{nameof(ContentItemIndex.DocumentId)}",
"Id",
nameof(ContentItemIndex.DocumentId),
nameof(PublishLaterPartIndex.ContentItemId),
nameof(PublishLaterPartIndex.ScheduledPublishDateTimeUtc),
nameof(PublishLaterPartIndex.Published),
nameof(PublishLaterPartIndex.Latest))
);

About the NotificationIndex table, its inner index is missing the Id column too and also the ReadAtUtc column, but because no direct .QueryIndex<NotificationIndex>() is done, for now I let it as is.

/// If this number is increased after the site is setup, you must alter the <see cref="Notification" /> using migration
/// </summary>
public const int NotificationIndexContentLength = 2500;
// Maximum length that MySql can support in an inner index under utf8mb4 collation is 768,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we store the 768 as part of the provider default settings so it is not hard coded here or other places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I thought about having for example a value depending on the provider. Hmm but potentially we can deploy data from a database to another, so I don't know, for now the issue/PR is just to notify the limitation but open to any suggestion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But 2500 chars is high for a column to include in a inner index, if only for searching we would need to remove duplicate terms, stop words, delimiters, like a search engine may do for indexing.

Just saw that you use your own StripHTML() method using a Regex, not sure it is good, we have a RemoveTags() string extension, also allowing to specify whether we htmlDecode or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that column isn't a good idea. It would be nice to at least have full-index search column. The fact we are searching large index is very slow when we have lots of indexes. Maybe we should consider using a search index for notification. The only downside would be that we would have to force using the search module.

@jtkech jtkech merged commit 7f76f77 into main Jul 6, 2023
@jtkech jtkech deleted the jtkech/notif-content branch July 6, 2023 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NotificationIndex Content Column Size is too high
3 participants