shutdown (reason: CPU time limit reached) #2

Closed
KristianRykkje opened this issue Nov 22, 2023 · 32 comments
Labels
bug Something isn't working

Comments

@KristianRykkje

Bug report

Describe the bug

First of all, thank you for this great resource!

Trying to follow along the video you posted a few days ago, but when I upload a md file and the edge functions start I get a

shutdown (reason: CPU time limit reached)

I am using the same Roman Empire file (from sample_files) that you used in the video.

One key difference is that I am not running it locally, but in the cloud (on supabase.com).

To Reproduce

  1. Clone the repo
  2. Create a project in supabase with the lowest architecture.
  3. Link the project locally to the one in the cloud and run npx supabase db push
  4. Add the env vars to the .env file
  5. Run the project locally with npm run dev
  6. Upload a file (roman-empire)
  7. Look at the edge function logs and see a warning, then look at the db and see missing data in the embedding column for most rows

Expected behavior

The function should not throw a warning and shut down before all the rows have been processed.

Screenshots

image

System information

  • OS: Windows
  • Version of supabase-js: [e.g. 6.0.2]
  • Version of Node.js: [e.g. 10.10.0]

Additional context

@KristianRykkje KristianRykkje added the bug Something isn't working label Nov 22, 2023
@gregnr
Collaborator

gregnr commented Nov 22, 2023

Thanks for reporting @KristianRykkje - we're seeing a couple of other reports about this. Will get back when I have more info!

@gregnr
Collaborator

gregnr commented Nov 23, 2023

@KristianRykkje reporting back:

On the hosted cloud platform, CPU time on edge functions is currently limited to 100ms on the free tier. Note that this refers to actual time spent executing instructions on the CPU, not total wall-clock time. It looks like this CPU limit is reached before all your embeddings finish generating.

To solve this I would first try adjusting the batch size on the embed trigger to somewhere in the 1-5 range - hopefully this will allow the embedding generation to complete before hitting the limit.

Otherwise if you are on a paid plan, I would create a support ticket to see if there's an opportunity to increase the limits. Edit: We've changed the way CPU limits are enforced - see below
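
To illustrate what the batch size controls, here is a minimal TypeScript sketch. It is not the repo's code (there, batching happens in the database trigger). It assumes each batch of section ids becomes one edge function invocation, so a smaller batch size means less inference work, and therefore less CPU time, per invocation. The `chunk` helper is hypothetical.

```typescript
// Hypothetical helper: split a list of section ids into batches.
// Assumption: each batch is sent to the embed function as one invocation.
function chunk<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 7 sections with a batch size of 3 give 3 invocations:
// [[1, 2, 3], [4, 5, 6], [7]]
const exampleBatches = chunk([1, 2, 3, 4, 5, 6, 7], 3);
```

The trade-off is more invocations overall: a batch size of 1 gives one invocation per section, which lowers the work per invocation but not the cost of loading the model in each worker.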

@gregnr
Collaborator

gregnr commented Nov 23, 2023

I lowered the batch size in this repo to 5 to help address this: #5

Going to close this for now, but feel free to report back and let me know if this solved the issue for you.

@gregnr gregnr closed this as completed Nov 23, 2023
@KristianRykkje
Author

Hello! Thanks for looking into this! I still get the error and I have now tried with batch size of 5, 3, 2 and 1... Maybe you can try to explain what the problem is and I can look into it?
image

I even tried setting the batch size manually to 1 and it still gave the same errors:
image

@gregnr
Collaborator

gregnr commented Nov 24, 2023

@KristianRykkje sorry to hear that didn't fix it - I'm surprised a batch size of 1 continued to fail. I'll take a further look to see if I can understand why.

As for what the problem is - did my explanation about CPU limits above make sense? You're unfortunately hitting those limits during inference which causes the Edge Function to exit early. I'm going to continue looking into this to see why a batch size of even 1 still fails.

@gregnr gregnr reopened this Nov 24, 2023
@KristianRykkje
Author

@gregnr thank you for reopening the issue. I am very much looking forward to taking it to production use :)

No, sorry. I think I was confused about whether the problem was the trigger or the edge function. You of course wrote that it was the edge function, so that's my bad. 😅

@gregnr
Collaborator

gregnr commented Nov 28, 2023

@KristianRykkje our edge function team has adjusted the way CPU limits are enforced which I'm hoping will fix this issue. Would you mind redeploying your edge functions and letting me know if you continue experiencing the CPU limit?

@ZacharyHangoc

@gregnr Thanks for the awesome tutorial Greg. Unfortunately I just ran the edge function by following the video, and even with the batch size set to 2 I'm still hitting the CPU limit.

@ZacharyHangoc

This is the log @gregnr :

{ "boot_time": null, "cpu_time_used": 3019, "deployment_id": "dcwtesuimokwvarpfwgl_be10764a-7159-4ae6-99fa-6de280d0ec53_1", "event_type": "Shutdown", "execution_id": "e7423631-963e-4920-8b1f-6fa0112fc226", "function_id": "be10764a-7159-4ae6-99fa-6de280d0ec53", "level": "warning", "memory_used": [ { "external": 138765157, "heap": 18596992, "total": 157362149 } ],...other project info}

@KristianRykkje
Author

KristianRykkje commented Dec 6, 2023

I still get the shutdown (reason: CPU time limit reached) @gregnr ...

Any ideas how I can fix it? Or is it a bug?

image

The image is from a run with a batch size of 1. I tried 5 as well.

@enormousrodent

I am getting the exact same issue. I have a pro subscription so not sure what else i can do.

@nanana123

nanana123 commented Dec 26, 2023

Still getting this issue. I'm trying to figure out how to incorporate this PR to make it work but there is no clear direction on how to achieve either tweaking the limits or making this code work without rewriting the logic completely.

For my use case I actually want to upload documents myself and not users. So I'll just proceed to do embedding locally on-the-fly, but from my brief research there seems to be no way to get around this issue without taking the embedding outside of edge functions.

@stonediggity

I've just worked through this video and the repo and am experiencing the same issue. I also upgraded to the pro plan, as I was planning on using it for a production app, but it still shuts down when the CPU time limit is reached. I have also reduced the batch size to 1 and reduced the text chunk size to 500 in the process-markdown script.
Screenshot 2024-01-05 174646

@Skrekliam

Hi @gregnr. I researched this and my findings may be useful. First of all, I updated xenova/transformers from 2.6.1 to 2.13.4. Then I compared batch sizes of 1 and 10. With a batch size of 1 I got the following error: TypeError: error sending a request for URL (https://cdn-lfs.huggingface.co/repos/... This could be some DDoS protection, since we are spawning a bunch of workers and sending a huge number of requests per second to Hugging Face. With a batch size of 10 I didn't get this error, but some requests failed because of timeouts. A solution could be caching the model, like we do in the UI chat component.
Also, check this reply from Xenova: huggingface/transformers.js#4 (comment)

@gregnr
Collaborator

gregnr commented Jan 10, 2024

Thanks everyone for reporting.

@Skrekliam - this CDN error is interesting (never seen this one yet). Let's open a new issue for this and discuss there if it persists.

With respect to CPU limits, I'm currently working with the edge function team to see what other options are available - thanks for your patience.

@jmn

jmn commented Jan 22, 2024

I migrated my function that was running into CPU limits to Deno deploy and it seems to be working well. For what it's worth.

@sreeshas

sreeshas commented Feb 2, 2024

This is what I have experienced.
If your sections are small, running inference in the edge function might finish within the CPU limit.
If your sections are big, or if your document is anything other than small, the embed function will time out.
The only way around it is to do the embeddings outside.

For my use case, I make a network call to OpenAI for embeddings and was able to resolve it.
Now I'm trying to work out the optimum number of calls to make for each statement.

@Huzaifa785

Hi @sreeshas

Could you please share the code snippets from embed function of supabase and chat page of frontend where you integrated OpenAIEmbeddings?

Thanks!

@johnonline35

johnonline35 commented Feb 14, 2024

Hi @gregnr - amazing tutorial - thank you!

I am developing this locally and have run into the exact same issues as everyone else: "CPU time limit reached." I am on the pro plan.

I have tried adjusting the batch size to 1, but this also has not worked. Everything else up to this point is working as expected. The only problem is this embedding function timing out.

Just wanted to check if there is a fix for this?

@johnonline35

Was there any outcome from the discussion with the edge functions team?

@gregnr
Collaborator

gregnr commented Feb 15, 2024

Hey folks, thanks for your patience (and thanks for sharing @jmn, @sreeshas, @johnonline35).

The edge function team is working on a way to perform inference natively within the runtime itself, which means faster load times (i.e. because models can be cached), faster inference (inference is performed natively), and less CPU usage.

There is experimental support in the latest versions. Since these are early stages, definitely consider this version of the API unstable and likely to change. Mainly just wanted to provide an update so that you know what's coming up. I'll continue to post updates as things stabilize.

In the meantime, if you're in a crunch to get this working ASAP, @sreeshas's suggestion of using a third party embedding API (like OpenAI) should work well. OpenAI's latest models can shorten output dimensions, which means you can continue to use a smaller dimension size if that's important to you (or you can take advantage of faster vector search techniques like adaptive retrieval - see the linked post for more details!).
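
As a rough sketch of that workaround in TypeScript: the code below builds a request for OpenAI's embeddings endpoint with a shortened output size. The model name `text-embedding-3-small` and the value 384 (to match gte-small's output size) are assumptions to adapt to your schema, and the helper names and the `FetchLike` type are hypothetical. The fetch function is passed in so the sketch does not depend on any particular runtime.

```typescript
// Hypothetical shapes for this sketch (not taken from the repo).
type EmbeddingRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};
type EmbeddingResponse = { data: { index: number; embedding: number[] }[] };
type FetchLike = (
  url: string,
  init: EmbeddingRequest["init"],
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the HTTP request. `dimensions` asks the model for shorter vectors.
function buildEmbeddingRequest(
  input: string[],
  apiKey: string,
  dimensions: number,
): EmbeddingRequest {
  return {
    url: "https://api.openai.com/v1/embeddings",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // Assumption: text-embedding-3-small; any model supporting `dimensions` works.
      body: JSON.stringify({ model: "text-embedding-3-small", input, dimensions }),
    },
  };
}

// Return the embeddings in the same order as the inputs.
function parseEmbeddings(json: EmbeddingResponse): number[][] {
  return json.data
    .slice()
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

async function embedWithOpenAI(
  fetchFn: FetchLike,
  input: string[],
  apiKey: string,
  dimensions = 384,
): Promise<number[][]> {
  const { url, init } = buildEmbeddingRequest(input, apiKey, dimensions);
  const res = await fetchFn(url, init);
  if (!res.ok) {
    throw new Error(`Embedding request failed with status ${res.status}`);
  }
  return parseEmbeddings((await res.json()) as EmbeddingResponse);
}
```

Inside an edge function this would be called with the runtime's global `fetch`. Because the inference happens on OpenAI's side, the function mostly waits on the network, which is wall-clock time rather than CPU time.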

@tim-nelson

Hi everyone, and thank you @gregnr for the fantastic workshop!

I have been building on this project locally for the past few weeks. I had no problems with the CPU time limit, even after increasing the length of my segments. However, yesterday I decided to move from local to hosted, following the "🚀 Going to prod" steps at the end. I immediately encountered the CPU time limit, which persists when I switch back to local development. I'm stuck with it! I'm going to follow the suggestion to try OpenAI's latest models as a workaround.

I thought it was too much of a coincidence that the issue cropped up when moving to hosted, so hoping this nugget of info is helpful. 😊

@kallebysantos

kallebysantos commented Apr 9, 2024

Hi, I'm facing the same issue, but running completely locally.

I followed the video, building a complete project from scratch, and also cloned this repo.
Both failed with the following logs:

Download https://cdn.jsdelivr.net/npm/@xenova/transformers@2.6.1
Download https://cdn.jsdelivr.net/npm/@xenova/transformers@2.6.1
Download https://cdn.jsdelivr.net/npm/@xenova/transformers@2.6.1
CPU time soft limit reached. isolate: 535e3205-9869-473c-a247-7003421f3e49
memory limit reached for the worker. isolate: 16207a13-5f51-41b7-a408-496a3009baaa
failed to send request to user worker: request has been cancelled by supervisor
user worker failed to respond: request has been cancelled by supervisor
WorkerRequestCancelled: request has been cancelled by supervisor
    at async Promise.allSettled (index 1)
    at async UserWorker.fetch (ext:sb_user_workers/user_workers.js:70:21)
    at async Server.<anonymous> (file:///home/deno/main/index.ts:146:12)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18) {
  name: "WorkerRequestCancelled"
}
ReferenceError: Status is not defined
    at Server.<anonymous> (file:///home/deno/main/index.ts:164:13)
    at eventLoopTick (ext:core/01_core.js:64:7)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18)
runtime has escaped from the event loop unexpectedly: event loop error: Top-level await promise never resolved
    at file:///home/deno/functions/embed/index.ts:6:27
memory limit reached for the worker. isolate: 535e3205-9869-473c-a247-7003421f3e49
failed to send request to user worker: request has been cancelled by supervisor
user worker failed to respond: request has been cancelled by supervisor
WorkerRequestCancelled: request has been cancelled by supervisor
    at async Promise.allSettled (index 1)
    at async UserWorker.fetch (ext:sb_user_workers/user_workers.js:70:21)
    at async Server.<anonymous> (file:///home/deno/main/index.ts:146:12)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18) {
  name: "WorkerRequestCancelled"
}
ReferenceError: Status is not defined
    at Server.<anonymous> (file:///home/deno/main/index.ts:164:13)
    at eventLoopTick (ext:core/01_core.js:64:7)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18)
memory limit reached for the worker. isolate: 481812d3-3469-4bfd-9bbd-30b8ac86cfba
failed to send request to user worker: request has been cancelled by supervisor
user worker failed to respond: request has been cancelled by supervisor
WorkerRequestCancelled: request has been cancelled by supervisor
    at async Promise.allSettled (index 1)
    at async UserWorker.fetch (ext:sb_user_workers/user_workers.js:70:21)
    at async Server.<anonymous> (file:///home/deno/main/index.ts:146:12)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18) {
  name: "WorkerRequestCancelled"
}
ReferenceError: Status is not defined
    at Server.<anonymous> (file:///home/deno/main/index.ts:164:13)
    at eventLoopTick (ext:core/01_core.js:64:7)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18)
runtime has escaped from the event loop unexpectedly: event loop error: Top-level await promise never resolved
    at file:///home/deno/functions/embed/index.ts:6:27
wall clock duration warning. isolate: ebfbdc6b-b69e-4e17-8f29-48f2817fde45
wall clock duration reached. isolate: ebfbdc6b-b69e-4e17-8f29-48f2817fde45

I also tried just returning an OK status response without processing any embeddings. I figured out that the problem occurs in the following:

const generateEmbedding = await pipeline(
  'feature-extraction',
  'Supabase/gte-small'
);

I have a GPU in my local machine on which I have already run LLMs and other tools, and everything worked well. But when using the Supabase edge function locally it doesn't work. Is there any way to tell Deno that I want to run the models on my GPU instead? I intend to move this project to our local servers as a proof of concept, so it is critical to get it working as soon as possible. In the meantime, I will prepare an external Python API to handle our embeddings. But it would be nice to keep it all integrated with Supabase, to reduce the complexity of our POC.

@gregnr
Collaborator

gregnr commented Apr 22, 2024

Good news everyone, the native Supabase inference engine is now available 🚀

This means:

Background/recap

  • The edge runtime enforces CPU limits per worker isolate to prevent overloading the platform
  • Transformers.js uses WASM in edge functions, which turns out to be expensive CPU-wise
  • Many people were hitting CPU limits because of this
  • The new Supabase.ai inference APIs replace Transformers.js and perform inference within the runtime itself (vs. WASM)
  • The new inference APIs still count toward CPU time, but do so more efficiently and should no longer reach the limits.

How to upgrade

The exact changes needed to use these APIs can be found in this PR:
#26

tl;dr

  1. Stop supabase if it's running locally:
    npx supabase stop
  2. Update the supabase NPM dependency to the latest version:
    npm i -D supabase
    This will reference the latest version of the edge runtime.
  3. Start supabase back up:
    npx supabase start
  4. Update /embed to use the new APIs.

I'm going to close this issue now, but please do reach out if you continue to experience CPU limits.

@JanDoeTian

JanDoeTian commented Oct 30, 2024

Hi @gregnr, first of all thanks for the great video and high-quality code!

Unfortunately this error still persists after using the Edge-Runtime's Supabase.ai API from the CLI with the batch size set to 3.

What's more, I'm getting another error that says InvalidWorkerCreation: worker did not respond in time. After some research, this seems to be related to the edge function runtime rather than the code.
Screenshot 2024-10-30 at 10 30 46

We intend to use this codebase in production, but are a little worried about this.

Can you reopen the issue and possibly ask them for a solution?

Using supabase-edge-runtime-1.59.0 (compatible with Deno v1.45.2)
Thanks!

@laktek

laktek commented Oct 31, 2024

@JanDoeTian Can you try deploying your function to a Supabase project? Do you still get the same CPU time limit error?

Also, how many records per batch do you have?

@kallebysantos

kallebysantos commented Oct 31, 2024

Hi @JanDoeTian and future readers,
First of all "CPU time limit reached" is not a bug but how edge-runtime is doing its job.

A brief explanation about Edge Workers

image

An edge-runtime worker is the sandbox in which the edge function actually executes. This isolation lets the runtime put a boundary around the function, which means it can limit CPU and memory consumption.

To simplify, the worker has two main stages:

  • Init: Called once on worker creation. All code before Deno.serve() is executed at this point, which explains why model loading needs to happen outside the request handler.
  • Execution: Called on every request.

After the first request the worker stays warm and skips the init phase, so we don't need to reload everything again.
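
The two stages can be sketched in plain TypeScript. This is only an illustration of the pattern, not edge-runtime code: `loadModel` is a hypothetical stand-in for an expensive model load, and in a real edge function `handleRequest` would be the handler passed to Deno.serve().

```typescript
// Hypothetical stand-in for a loaded embedding model.
type Model = { embed: (text: string) => number[] };

// Counts how many times the expensive load runs.
let initCount = 0;

function loadModel(): Model {
  initCount += 1;
  // Placeholder "embedding": just the text length.
  return { embed: (text) => [text.length] };
}

// Init stage: module-level code runs once, when the worker is created.
const model = loadModel();

// Execution stage: runs on every request and reuses the loaded model.
function handleRequest(text: string): number[] {
  return model.embed(text);
}

// Two requests against a warm worker trigger only one model load.
const firstResult = handleRequest("hello");
const secondResult = handleRequest("edge functions");
```

If `loadModel()` were called inside `handleRequest` instead, every request would pay the load cost again and eat into its own CPU budget.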

How it affects the embedding processing

Since the function has limited CPU and memory, we can't do massive tasks in a single request, so it is impossible to batch a lot of sections at once.

Also, the current Supabase.ai doesn't support native batch inference. This means that while Transformers models can handle batch input, we still need a for loop over each received section to process them, increasing the CPU time consumption:

for (const row of rows) {
    const { id, [contentColumn]: content } = row;

    if (!content) {
      console.error(`No content available in column '${contentColumn}'`);
      continue;
    }

    const output = (await model.run(content, {
      mean_pool: true,
      normalize: true,
    })) as number[];

    const embedding = JSON.stringify(output);

...
}

It may work on hosted Supabase, since it's backed by a cloud with a lot of optimizations and auto-scaling features, but in a dev environment with the Supabase CLI the resources are more limited and sometimes we'll face CPU time exceptions.

How to avoid heavy processing errors

Instead of fighting with the edge runtime supervisor, why not prepare our application to self-heal? Processing errors can always occur, so we should know how to deal with them.

For embedding processing, a cool thing to do is to apply request tracking plus automated recovery jobs. With that we can automatically embed missing or failed sections.

The request tracker

I always start my projects by adding a request tracker to pg_net

Automated recovery jobs

With a request tracker we can create a retry_failed_requests function that can be used alongside pg_cron

-- Retry failed requests scheduling
create or replace function net.retry_failed_requests(
  retry_count int default 5
)
returns bigint
language plpgsql
security definer
as $$
declare
    request_count bigint;
begin
  with retry_request as (
    select
        id, 
        method,
        url,
        request_params as params,
        request_body as body,
        request_headers as headers
    from net.failed_requests
    limit retry_count
  ),
  delete_http_response as (
      delete from net._http_response
      where id in (select id from retry_request)
      returning *
  ),
  delete_http_request_tracker as (
    delete from net.request_tracker
    where request_id in (select id from retry_request)
    returning *
  )
  select count(*) from (
    select
      net.http_request(
        method := retry_request.method,
        url := retry_request.url,
        params := retry_request.params,
        body := retry_request.body,
        headers := retry_request.headers
    ) from retry_request
  ) as new_requests into request_count;

  return request_count;
end;
$$;

-- Run retry_failed_requests every 3rd minute (adjust as needed)
select cron.schedule('retry-failed-requests', '*/3 * * * *', 'select net.retry_failed_requests(50)');
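
For readers less familiar with PL/pgSQL, the retry logic above can be modelled in memory with TypeScript. This is only a sketch of the idea: `TrackedRequest` and its fields are hypothetical and do not reflect pg_net's actual tables. Like the SQL, it picks up to `retryCount` failed requests, drops their old tracking records, and re-issues them as new pending requests.

```typescript
type RequestStatus = "pending" | "success" | "failed";

// Hypothetical in-memory version of a tracked request.
interface TrackedRequest {
  id: number;
  url: string;
  body: string;
  status: RequestStatus;
}

function retryFailedRequests(
  tracker: TrackedRequest[],
  retryCount: number,
): { tracker: TrackedRequest[]; retried: number } {
  // Select at most `retryCount` failed requests (the SQL `limit retry_count`).
  const failed = tracker.filter((r) => r.status === "failed").slice(0, retryCount);
  const failedIds = failed.map((r) => r.id);

  // Drop the old records for the requests being retried.
  const kept = tracker.filter((r) => failedIds.indexOf(r.id) === -1);

  // Re-issue each one as a fresh pending request with a new id.
  let nextId = tracker.reduce((max, r) => Math.max(max, r.id), 0) + 1;
  const reissued = failed.map(
    (r): TrackedRequest => ({ id: nextId++, url: r.url, body: r.body, status: "pending" }),
  );

  return { tracker: kept.concat(reissued), retried: reissued.length };
}
```

Capping each run with `retryCount` keeps a burst of failures from being replayed all at once, which would likely push the workers into the same limits again.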

An apply_embed_document_sections function could also be useful for when we want to trigger embedding manually from the database

-- Function that can be triggered manually
create or replace function private.apply_embed_document_sections(
  batch_size int default 50,
  apply_limit int default 5,
  timeout_milliseconds int default 5 * 60 * 1000
)
returns void
language plpgsql
security definer
as $$
declare
  batch_count int = least(
    ceiling((select count(*) from document_sections where (ai).embedding is null) / batch_size::float),
    apply_limit
  );
begin
  SET search_path = 'public','extensions';

  -- Loop through each batch and invoke an edge function to handle the embedding generation
  for i in 0 .. (batch_count-1) loop
  perform
    net.http_request(
      method := 'POST',
      url := supabase_url() || '/functions/v1/embed',
      body := jsonb_build_object(
        'ids', (select json_agg(id) from (
            select id from document_sections
              where (ai).embedding is null
              order by id limit batch_size offset i*batch_size
          ) as ds)
      ),
      headers := jsonb_build_object(
        'Content-Type', 'application/json',
        'Authorization','Bearer ' || vault.get('supabase_anon_key') -- Consider adding the key to Vault and creating a helper function to get it
      ),
      timeout_milliseconds := timeout_milliseconds
    );
  end loop;
end;
$$;


-- Apply missing embeddings every minute
-- select cron.schedule('apply-missing-embedding', '*/1 * * * *', 'select private.apply_embed_document_sections(10,5)');
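
The batch arithmetic in that function (batch_count is the smaller of ceil(missing / batch_size) and apply_limit, with one limit/offset window per batch) can be checked with a small TypeScript sketch. `planBatches` and `BatchWindow` are hypothetical names used only for this illustration.

```typescript
// One limit/offset window per edge function invocation.
type BatchWindow = { limit: number; offset: number };

function planBatches(
  missingCount: number,
  batchSize: number,
  applyLimit: number,
): BatchWindow[] {
  if (batchSize < 1) {
    throw new RangeError("batchSize must be at least 1");
  }
  // Same formula as the SQL: least(ceiling(missing / batch_size), apply_limit)
  const batchCount = Math.min(Math.ceil(missingCount / batchSize), applyLimit);
  const plan: BatchWindow[] = [];
  for (let i = 0; i < batchCount; i++) {
    plan.push({ limit: batchSize, offset: i * batchSize });
  }
  return plan;
}

// 120 sections without embeddings, 50 per batch, at most 5 batches per run:
// 3 windows, at offsets 0, 50 and 100.
const examplePlan = planBatches(120, 50, 5);
```

With 1000 missing sections the same call would stop at 5 windows (250 sections), and the remainder would be picked up by the next scheduled run.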

@JanDoeTian

Hey Kelly,

Thank you so much for the detailed explanation & insights, I gave it a detailed read and really enjoyed it.

There is something I disagree with, though correct me if I'm wrong because I'm no expert. You mentioned: Processing errors can always occur so we should know how to deal with that. I don't really agree in this use case, where embedding should be deterministic and we shouldn't expect any errors.

I guess what troubles me a little is that the author of this repo manages to run the edge function with the given dataset successfully, but when I deploy the exact same codebase it fails.

Is there any way to push the Supabase team to change their pricing model for edge functions to allow a flexible CPU time limit? Because it seems that using automated recovery jobs is an anti-pattern.

Best.

@JanDoeTian

JanDoeTian commented Oct 31, 2024

Hi Laktek, @laktek

I'm running with only 1 row and still observing the CPU time limit.

I have run a few more experiments and have some observations to share:

  1. Failures due to the CPU time limit can occur on both the CLI and the cloud.
  2. Failures occur randomly, not related to a specific row.
  • A separate issue:
    The Invocations tab of edge functions on the cloud dashboard seems to show the wrong number of invocations, while the booted log shows the correct number. I have 26 rows, and the function booted 26 times under Logs, but only 20 invocations were counted. (This isn't related to the failures caused by hitting the CPU time limit.)

@kallebysantos

kallebysantos commented Oct 31, 2024

Hi @JanDoeTian

There is something I disagree with you, correct me if I'm wrong because I'm not expert. You mentioned: Processing errors can always occur so we should know how to deal with that. I don't really agree with you in this use case where embedding should be deterministic and we shouldn't expect any errors.
To simplify, let's take this repo's use case as an example: after uploading a document we must do background processing before it becomes available for users to search. This is done by database triggers, so the embed function is called once after each section insert.

If something happens, like exceeding the CPU limit in your case or some other kind of failure (network, memory, etc.), we may lose that particular section. So my point was about how we can deal with failures, because they will always occur and our system should be prepared to recover unprocessed sections.

The current use case is about document embedding, but I have already applied the same strategies in other kinds of situations, like document conversion, image extraction, etc.

  1. Failures occur randomly, not related to a specific row.

This can occur when we try to invoke a lot of requests at the same time. I faced the same issues while embedding on self-hosted (with GPU support). Using the strategies I mentioned before, I could process a lot of embeddings, more than 1 million.

@JanDoeTian

Thanks Kelly, it makes sense, appreciate the insights from your experiences.

I guess I'll use a separate table to do the embedding and move the successful ones to the destination table!

@kallebysantos

kallebysantos commented Dec 5, 2024

Hi @JanDoeTian, I found some useful information about the following question:

The Invocations tab of edge functions on cloud dashboard seems to produce the wrong number of invocations, while the booted log shows the correct number, I have 26 rows, and booted 26 times under Logs but only counted 20 invocations. (This doesn't relate to failing due to hitting cpu-time limit)

In this issue, supabase/edge-runtime#408, Nyannyacha explained how edge functions are handled in dev.
