
Gearmand cannot read hyphenated function names used in Redis keys #159

Open
efilson opened this issue Mar 13, 2018 · 3 comments
efilson commented Mar 13, 2018

When gearmand is restarted and reads the persistent queue back into memory, it truncates each function name at its first hyphen.

Debian 9
gearmand-1.1.18 (from source)
redis-server 3.2.6
hiredis 0.13.3

For example:

127.0.0.1:6379> keys *
  1) "_gear_-foo-bar-1-8e1000a0-2629-11e8-8e09-f23c91d45ae0"
  2) "_gear_-foo-bar-1-44f8384e-2618-11e8-99e7-f23c91d45ae0"
  3) "_gear_-foo-bar-2-2c37c006-2625-11e8-9f1b-f23c91d45a58"
  4) "_gear_-foo-bar-2-f3dc10ca-25bd-11e8-8c7a-f23c91d45a58"
  5) "_gear_-foo-bar-2-9efa27c2-260d-11e8-a2be-f23c91d45a58"
  6) "_gear_-foo-bar-1-050d6344-2631-11e8-99e7-f23c91d45ae0"
  7) "_gear_-foo-bar-2-8037f7c6-25bb-11e8-8c7a-f23c91d45a58"
  8) "_gear_-foo-bar-2-ce7359b8-25a6-11e8-b058-f23c91d45a58"
  9) "_gear_-foo-bar-1-0a92824e-2623-11e8-8e09-f23c91d45ae0"
 10) "_gear_-foo-bar-1-56e1aa5a-2621-11e8-99e7-f23c91d45ae0"

Shows up in gearman_top as:

Queue Name       | Worker Available | Jobs Waiting | Jobs Running
---------------------------------------------------------------------------------
 foo-bar-1       |              10  |           0  |           0
 foo-bar-2       |              10  |           0  |           0
 foo             |               0  |          10  |           0

Python is not my forte, but I suspect Line 405 of libgearman-server/plugins/queue/redis/queue.cc to be the culprit:

    int fmt_str_length= snprintf(fmt_str, sizeof(fmt_str), "%%%d[^-]-%%%d[^-]-%%%ds",
                                 int(GEARMAND_QUEUE_GEARMAND_DEFAULT_PREFIX_SIZE),
                                 int(GEARMAN_FUNCTION_MAX_SIZE),
                                 int(GEARMAN_MAX_UNIQUE_SIZE));

It might make sense, now that gearmand utilizes Redis hash keys, to store the function name in a hash field instead, since we aren't utilizing any sort of key-name pattern matching at the moment.

If key-name pattern matching did become a necessity, I believe it would be cleaner to create a set keyed by Gearman function (with a unique prefix) containing pointers to the hash keys.

To illustrate using the above example (it would be best to use a pipeline if possible):

> HMSET "_gear_8e1000a0-2629-11e8-8e09-f23c91d45ae0" "data" "{foo:\"bar\"}" "priority" 1
OK
> SADD "_gearFunc_foo-bar-1" "_gear_8e1000a0-2629-11e8-8e09-f23c91d45ae0"
(integer) 1
> HMSET "_gear_44f8384e-2618-11e8-99e7-f23c91d45ae0" "data" "{foo:\"bar\"}" "priority" 1
OK
> SADD "_gearFunc_foo-bar-2" "_gear_44f8384e-2618-11e8-99e7-f23c91d45ae0"
(integer) 1
> etc...
SpamapS (Member) commented Mar 13, 2018

Not sure how Python is involved?

Anyway, I think that's a legitimate bug, and we should replace "-" with a more deliberate string to separate the function name from the unique ID.

efilson (Author) commented Mar 13, 2018

Hahah, Python is definitely not involved. I've been burning the candle at both ends, bringing up a bunch of new Gearman workers for text and image classification (those are Python) to handle a large traffic spike. Which is, of course, exactly when the gearman job servers needed a kick, and they came back up with all jobs in truncated buckets. But hey, at least the jobs persisted in an easy-to-access format so I could recreate the payloads 👍

SpamapS (Member) commented Mar 14, 2018

Indeed.

Just as a general rule, though, I'd suggest that you have your workers store the data in Redis rather than relying on gearmand to do it. Using background jobs and a queue plugin is basically the least scalable way to use gearmand.

I understand that the queue plugins are there, so they seem attractive, but you're far better off having workers store and recover in-flight jobs in a place that scales with the workers.

I wrote an example worker in python to do just that:

https://github.com/SpamapS/gearstore

It's not production tested, but it follows the general pattern others have used for large, gearmand-based reliable queuing.
