Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cupsd blocked in read(2) #1157

Closed
michaelrsweet opened this issue Apr 26, 2005 · 12 comments
Closed

cupsd blocked in read(2) #1157

michaelrsweet opened this issue Apr 26, 2005 · 12 comments

Comments

@michaelrsweet
Copy link
Collaborator

Version: 1.1-current
CUPS.org User: jlovell

I received a bug saying a busy cupsd can become blocked with the following backtrace:

Analysis of sampling pid 2216 every 10.000000 milliseconds
Call graph:
4998 Thread_0e0b
4998 start (in cupsd) (ipp.c:1017)
4998 __start (in cupsd) (crt.c:267)
4998 _main (in cupsd) (main.c:633)
4998 _UpdateJob (in cupsd) (job.c:2134)
4998 read

My theory is file descriptors can be reused within one pass of the main select loop. It goes something like this:

  • A job is started which creates a pipe for status messages.
  • In one pass of the select loop data is available on a client connection and the job's status pipe.
  • The client data turns out to be an IPP_HOLD_JOB request. This stops the job and clears the status fd from 'InputSet' (but not from main's 'input' fd_set)
  • As part of the hold CheckJobs() starts another job. This in turn reuses the fd for this new job's status pipe. No data is yet available on this fd.
  • Farther down in main we get to "if (job->pipe >= 0 && FD_ISSET(job->pipe, input))" which is still true. UpdateJob is called which blocks in read().

A fix is to make the local fd_set 'input' a global and wherever we see this kind of code:

close(current->pipe);
FD_CLR(current->pipe, InputSet);

add a:
FD_CLR(current->pipe, input);

Thoughts?

Thanks

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: mike

I believe this is a bug we fixed a while back (checking the revision history, looks like it was fixed in CUPS 1.1.19) - the current code clears the main() input set bit:

  if (job->pipe >= 0 && FD_ISSET(job->pipe, input))
  {
   /*
    * Clear the input bit to avoid updating the next job
* using the same status pipe file descriptor...
*/

    FD_CLR(job->pipe, input);

Which version of CUPS did you import from?

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: jlovell

You did fix the case where UpdateJob starts a new job (the usual path to job completion / next job starting) but not when ReadClient puts a job on hold or cancels it.

Thanks!

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: mike

OK, probably the easiest fix is to move the client checking AFTER all of the job checking (or visa-versa) so that that condition can't occur... :)

I really don't want to make the main() FD_SET global, since it is really just transient state info...

Try the attached patch and let me know if that fixes things for you...

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: jlovell

Hmm... I think it can still be a problem between two calls to ReadClient. if the first call causes CloseClient to be called (invalid IPP request) and the second starts a cgi such that a reused 'con->file' is added to InputSet then WriteClient() can block trying to read the cgi data.

Sound possible?

Thanks.

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: mike

Hmm, possibly.

Try the additional patch below which will take care of that edge case...

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: jlovell

I think there's still a small window here... WriteClient gets an error closing the fd, next ReadClient starts a cgi, a following WriteClient then blocks reading cgi data. It might be enough to "FD_CLR(con->file, input)" no matter the return value from ReadClient.

It's a little worse in our main() because mdns (Bonjour)involves opening & closing sockets when printers come & go. This provides another opportunity to reuse an fd. I'll track your changes but I'll probably also go with my original plan of adding FD_CLRs of a global 'input' since it removes all the ordering issues.

Thanks!

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: mike

Sorry I didn't respond sooner, still in the "get everything commited to 1.2" mode... :)

OK, so if I understand the issue correctly, it is the non-clearing of the output set here:

  if (FD_ISSET(con->http.fd, output) &&
      (!con->pipe_pid || con->file_ready))
    if (!WriteClient(con))
{
  con --;
  continue;
}

If so, then the attached patch should clear things up (literally :) without making the transient locale input or output sets global.

I might look at adding cupsdSetInput, cupsdSetOutput, cupsdClearInput, and cupsdClearOutput calls in CUPS 1.2 which manage both the current global sets as well as the sets used by the main() loop...

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: jlovell

Thanks. I think the patch solves the WriteClient case.

I have some mostly completed code that implements something like what you suggest; the goal was to add encapsulated support for select(2), poll(2) and kqueue(2). The SPI looks something like:

extern int cupsPollInit();
extern void cupsPollFree();
extern int cupsPollAdd(int fd, int event_mask, int (*notifier)(void * context), void *context);
extern void cupsPollRemove(int fd);
extern int cupsPollForEvents(int timeout);

The general idea came from:
http://www.kegel.com/c10k.html#nb.edge
http://www.kegel.com/dkftpbench/doc/Poller.html

(I found this when was using his excellent ISDN info back when I used one of those cool dual channel ISDN modems -- Ha!)

@michaelrsweet
Copy link
Collaborator Author

CUPS.org User: mike

Feel free to send a patch in a separate STR for the poller SPI - it might come in useful for future porting work...

@michaelrsweet
Copy link
Collaborator Author

"str1157.patch":

Index: main.c

--- main.c (revision 4504)
+++ main.c (working copy)
@@ -583,6 +583,66 @@
break;
}

  • /*
  • * Check for status info from job filters...
  • */
  • for (job = Jobs; job != NULL; job = next)
  • {
  •  next = job->next;
    
  •  if (job->pipe >= 0 && FD_ISSET(job->pipe, input))
    
  •  {
    
  •   /*
    
  •    \* Clear the input bit to avoid updating the next job
    
  • * using the same status pipe file descriptor...
  • */
  •    FD_CLR(job->pipe, input);
    
  •   /*
    
  •    \* Read any status messages from the filters...
    
  • */
  •    UpdateJob(job);
    
  •  }
    
  • }
  • /*
  • * Update CGI messages as needed...
  • */
  • if (CGIPipes[0] >= 0 && FD_ISSET(CGIPipes[0], input))
  •  UpdateCGI();
    
  • /*
  • * Update the browse list as needed...
  • */
  • if (Browsing && BrowseProtocols)
  • {
  •  if (BrowseSocket >= 0 && FD_ISSET(BrowseSocket, input))
    
  •    UpdateCUPSBrowse();
    
  •  if (PollPipe >= 0 && FD_ISSET(PollPipe, input))
    
  •    UpdatePolling();
    
    +#ifdef HAVE_LIBSLP
  •  if ((BrowseProtocols & BROWSE_SLP) && BrowseSLPRefresh <= time(NULL))
    
  •    UpdateSLPBrowse();
    
    +#endif /* HAVE_LIBSLP */
    +
  •  if (time(NULL) > browse_time)
    
  •  {
    
  •    SendBrowseList();
    
  • browse_time = time(NULL);
  •  }
    
  • }
  • /*
  • * Check for new connections on the "listen" sockets...
  • */

for (i = NumListeners, lis = Listeners; i > 0; i --, lis ++)
if (FD_ISSET(lis->fd, input))
{
@@ -590,6 +650,10 @@
AcceptClient(lis);
}

  • /*
  • * Check for new data on the client sockets...
  • _/

for (i = NumClients, con = Clients; i > 0; i --, con ++)
{
/_
@@ -658,62 +722,6 @@
}

/*

  • * Check for status info from job filters...

- */

  • for (job = Jobs; job != NULL; job = next)
  • {

- next = job->next;

  •  if (job->pipe >= 0 && FD_ISSET(job->pipe, input))
    
  •  {
    
  •   /*
    
  •    \* Clear the input bit to avoid updating the next job
    
  • * using the same status pipe file descriptor...

- */

- FD_CLR(job->pipe, input);

  •   /*
    
  •    \* Read any status messages from the filters...
    

- */

  •    UpdateJob(job);
    
  •  }
    

- }

  • /*
  • * Update CGI messages as needed...

- */

  • if (CGIPipes[0] >= 0 && FD_ISSET(CGIPipes[0], input))

- UpdateCGI();

  • /*
  • * Update the browse list as needed...

- */

  • if (Browsing && BrowseProtocols)
  • {
  •  if (BrowseSocket >= 0 && FD_ISSET(BrowseSocket, input))
    

- UpdateCUPSBrowse();

  •  if (PollPipe >= 0 && FD_ISSET(PollPipe, input))
    

- UpdatePolling();

-#ifdef HAVE_LIBSLP

  •  if ((BrowseProtocols & BROWSE_SLP) && BrowseSLPRefresh <= time(NULL))
    
  •    UpdateSLPBrowse();
    

    -#endif /* HAVE_LIBSLP */

  •  if (time(NULL) > browse_time)
    
  •  {
    
  •    SendBrowseList();
    
  • browse_time = time(NULL);
  •  }
    

- }

  • /*
  • Update any pending multi-file documents...
    */

@michaelrsweet
Copy link
Collaborator Author

"str1157part2.patch":

Index: main.c

--- main.c (revision 4509)
+++ main.c (working copy)
@@ -666,6 +666,9 @@

     if (!ReadClient(con))
{
  • if (con->pipe_pid)
    
  •   FD_CLR(con->file, input);
    
    • con --;
      continue;
      }

@michaelrsweet
Copy link
Collaborator Author

"str1157part3.patch":

Index: main.c

--- main.c (revision 4515)
+++ main.c (working copy)
@@ -700,14 +700,18 @@
}
}

  •  if (FD_ISSET(con->http.fd, output) &&
    
  •      (!con->pipe_pid || con->file_ready))
    
  •    if (!WriteClient(con))
    
  • {
  • con --;
    
  • continue;
    
  • }
  •  if (FD_ISSET(con->http.fd, output))
    
  •  {
    
  •    FD_CLR(con->http.fd, output);
    
  • if (!con->pipe_pid || con->file_ready)
  •      if (!WriteClient(con))
    
  • {
    
  •   con --;
    
  •   continue;
    
  • }
    
  •  }
    
    • /*
    • Check the activity and close old clients...
      */

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant