Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BRD/Auth API - table data is very slow to load when there are a lot of records #25080

Open
severinbeauvais opened this issue Dec 27, 2024 · 6 comments
Assignees
Labels
Entities - Olga A label to filter on the tickets for the Entities based team that Olga is PO for. Priority1 sbc-connect

Comments

@severinbeauvais
Copy link
Collaborator

severinbeauvais commented Dec 27, 2024

When the registry dashboard (both old MBR and new BRD) has a lot of entries, the table is very slow to load. I've also heard rumours that the affiliations call to fetch the data can time out in Prod.

This ticket is to analyze or fix the slow load issue. I imagine the best way will be to paginate the results (so that less data is fetched), but there may be other options, left to the Assignee to look into.

See also #25073 as a potential or partial solution.

I also got the sense that Thor had some ideas on refactoring the "middleware" that might solve this issue, so he should be consulted.

@severinbeauvais severinbeauvais added Entities - Olga A label to filter on the tickets for the Entities based team that Olga is PO for. sbc-connect labels Dec 27, 2024
@severinbeauvais severinbeauvais changed the title BRD - table data is very slow to load when there are a lot of records MBR/BRD - table data is very slow to load when there are a lot of records Dec 27, 2024
@JazzarKarim JazzarKarim changed the title MBR/BRD - table data is very slow to load when there are a lot of records MBR/BRD/Auth API - table data is very slow to load when there are a lot of records Feb 3, 2025
@JazzarKarim JazzarKarim changed the title MBR/BRD/Auth API - table data is very slow to load when there are a lot of records BRD/Auth API - table data is very slow to load when there are a lot of records Feb 3, 2025
@thorwolpert
Copy link
Collaborator

I think there's an old ticket around this, but I wasn't able to easily find it.
Afaik, pagination has to be added to the API, and then the UI table needs to use it. With some much larger accounts coming online, this will probably be needed.

@severinbeauvais
Copy link
Collaborator Author

The plot thickens. This is probably not as easy as straight pagination, because we should honour any filtering or sorting that the user has selected, but that would force the API to look up all results (which is the time-consuming part) before filtering, sorting and returning just a specified set (page) of results.

This is being discuss in a MS Teams chat with Thor and Travis.

@severinbeauvais
Copy link
Collaborator Author

severinbeauvais commented Feb 14, 2025

@janisrogers @jacqueline-williams-549

The current technical design makes it virtually impossible to improve the performance of the BRD as it is.

Idea 1 - if we can ignore filtering and sorting of the table items then we can paginate on "raw affiliations". But how would we convey to users that their filters and sorting only apply to the current page (eg, records 100-199)?

Idea 2 - could we split NRs and businesses into separate concerns (separate tabs?). Then each set of data would be quicker to process and load than everything at once. We would also not need to "combine" NRs and businesses, which would allow us to optimize the back end.

To really improve things will require a significant redesign of the APIs and databases.

Any thoughts so far?

@severinbeauvais
Copy link
Collaborator Author

severinbeauvais commented Feb 19, 2025

Severin to describe potential solutions.

Describe best solution for today.

Describe tickets needed and size. tickets will be created according to selected solution

Describe dependencies and deployment plan. will be recorded in task tickets

@severinbeauvais
Copy link
Collaborator Author

severinbeauvais commented Feb 21, 2025

(See epic for The Problem and Goals.)

The basic solution

We need to handle smaller amount of data (at least in the UI). This would be done by "paging" results (eg, show only 100 at a time).

However, it is expected that users will use filtering (eg, show only "Approved" NRs) or searching (eg, show only business "BC1234567") to quickly access the item they want to manage. For filtering and pagination to work together correctly, first all the records for the subject account need to be filtered, then the first 100 (or next 100, etc) need to be returned. Therefore, a large amount of data is still being handled at some point, but filtering/paging as close to the db as possible means less data transfer and faster API operations.

Approach # 1

Have the required data already available in one API/database. For example, one can imagine a "rich" affiliations table that is kept up to date whenever NR/business statuses change. Then, it's a fast, simple query for filtering, sorting and paging.

Option 1

Add a table to Auth db (or a new API+db), and implement async updates to it.

Pros:

  • can fetch "rich" affiliations, including filters and pagination, in 1 database query
  • should be very fast

Cons:

  • would pollute Auth with NRs and business data, unless we stand up another API + db (increased cost?)
  • need to implement async updates (from Namex API and Legal API/Filer)

Option 2

Use a new search service (using SOLR/ElasticSearch/Google Cloud Search) to store all data and return it based on search parameters, and implement async updates to it.

Pros:

  • should be very fast
  • no pollution of Auth space with NR or businesses
  • could be called directly by BRD UI

Cons:

  • use of new or not-well-understood technology
  • possibly increased costs
  • need to implement async updates (from Namex API and Legal API/Filer)

Option 3

Enhance existing Business Search (BTR/SOLR) to replace the affiliations lookup.

  • need to add auth details, NR details and business details to SOLR
  • but how often is SOLR db updated? (it's not real-time)
  • call Business Search instead of Auth API to fetch affiliations; filter out NRs with drafts in UI

Pros:

  • should be very fast
  • no pollution of Auth space with NR or businesses
  • could be called directly by BRD UI

Cons:

  • doesn't contain all NRs (or all NR data)
  • doesn't contain all business state (eg, frozen, in dissolution)
  • not refreshed often enough (not real-time)

Option 4

Use one db for Auth+Names+Legal with a bunch of schemas (for Auth, Names and Legal); this allows a single query across all data (eg, similar to data warehouse).

Pros:

  • should be very fast
  • no pollution of Auth space with NR or businesses
  • could be called directly by BRD UI

Cons:

  • some databases are not yet migrated to GCP
  • authn/authz issues

Approach # 2

Perform real-time queries from respective APIs/databases, but filter the data as early as possible and return partial results (eg, pages of 100 results each).

Option 1

  • BRD UI passes along filtering/sorting/paging parameters
  • Auth API calls Namex API and Legal API for latest NR/business data
  • Namex API and Legal API search for matching NRs/businesses and return a (slim, if possible) set of all matching NRs/businesses
  • Auth API combines the data, filters it and paginates it (eg, return page X, Y items)
  • BRD UI displays a page of data at a time

Pros:

  • supports filtering and paging
  • maybe simple changes to Namex API and Legal API
  • minimal changes to Auth API (just need to pass along the extra parameters)
  • retain existing Auth combination logic
  • few or no changes to Auth API and Legal API

Cons:

  • almost as slow as today (except for the benefits of filtering and pagination)
  • continues to pollute Auth space with NR and business data

Option 2

  • BRD UI passes along filtering/sorting/paging parameters
  • Auth API calls Namex API and Legal API, with extra parameters, for latest NR/business data
  • Namex API and Legal API search/filter/paginate matching NRs/businesses and return a (slim, if possible) set of all matching NRs/businesses
  • Auth API combines the data and returns all of it
    • for example, if Namex returns a page of 50 NRs and Legal returns a page of 100 businesses, then there will be 150 results even though the user asked for 100
  • BRD UI displays a page of data at a time

Pros:

  • should be fast (but not as fast as Approach 1)
  • filtering works correctly
  • maybe simple changes to Namex API and Legal API
  • minimal changes to Auth API (just need to pass along the extra parameters)
  • retain existing Auth combination logic

Cons:

  • it might seem weird to return more than the requested items per page
  • continues to pollute Auth space with NR and business data

Option 3

  • BRD UI has separate tables for NRs and businesses
  • BRD could call Namex API and Legal API separately (or proxy via Auth API)
  • BRD passes along filtering/sorting/paging parameters
  • Namex API and Legal API search/filter/paginate matching NRs/businesses and return a (slim, if possible) set of all matching NRs or businesses
  • BRD UI displays a page of data at a time

Pros:

  • should be very fast
  • filtering and pagination work as expected
  • stop polluting Auth with Names and Legal data

Cons:

  • separation of NRs and businesses may be undesirable (per UI/UX Team and Kaine)

Option 4

When user lands on BRD, show a page of (eg, first 100) unfiltered affiliations, with table options to show "next 100" or whatever. (Paging uses the order of affiliations in Auth db.)

If user applies a filter then return all data that matches the filter (where pagination is optional -- wouldn't make much difference in response time so it's a UX decision -- as long the number of items remains workable, say, less than 1000).

Pros:

  • first page should be fast (but not as fast as Approach 1)
  • filtering and pagination work as expected
  • should be moderately fast
  • reatains existing architecture (not a crazy amount of code changes)

Cons:

  • second page could be slow if there are a lot of items matching the filters (Auth will still paginate according to results, but it means all results need to be fetched from Namex and Legal API)
  • continues to pollute Auth space with NR and business data
  • affects BRD UI, Auth API, Namex API and Legal API

Approach 3

Ask users to have a limited number of affiliations per account (say, less than 5000). The BRD UI page would no longer fail to load, though it might take 30-60 seconds. Large accounts would have to be split up into smaller accounts (eg, 20000 / 5000 = 4 accounts). This may complicate the user's process for finding a particular NR or business.

This could be mitigated by providing a "search for business" feature that would lead them to the respective Business Dashboard (and similar actions for a NR).

Pros:

  • no changes to existing code
  • limits the BRD response time (but 5000 is still too many items)

Cons:

  • large users will not be able to work with this
  • will still be slow
  • not scalable

Approach 4

Improve the performance of the current Names and Legal databases in OCP. See #26052 and #26053.

Pros:

  • no code changes
  • every little bit helps

Cons:

  • some risk and effort to changing storage parameters
  • probably won't help as much as we want (existing code is inefficient)

@severinbeauvais
Copy link
Collaborator Author

Recommendation

Approach 1, Option 4 (ie, combined database with separate schemas) is probably the cleanest and highest-performance option, but many things need to be completed before we can get there.

Approach 2, Option 4 (ie, initial show first 100, when filtering show all results) is probably the most feasible for MVP, and should perform well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Entities - Olga A label to filter on the tickets for the Entities based team that Olga is PO for. Priority1 sbc-connect
Projects
None yet
Development

No branches or pull requests

4 participants