Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redis driver for blob cache information and metadb #2865

Merged
merged 18 commits into from
Jan 30, 2025

Conversation

andaaron
Copy link
Contributor

@andaaron andaaron commented Jan 7, 2025

Rebased #2412 and continued work on missing method implementation for cache driver, and implemented MetaDB methods.

Still need to:

  • run linter
  • make sure all existing tests pass
  • add more tests to cover redis
  • account for separate redis configuration for different substores
  • see if we can do anything for image signatures (right now testing with local storage, but we should have an alternative).
  • use an effective locking system:
    • need to check what support they offer in this library for concurrent goroutines
    • even if they have support for concurrent goroutines in the library we need a solution to work with multiple zot instances using the same database instance - using https://github.com/go-redsync/redsync

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

This was referenced Jan 7, 2025
@elee1766
Copy link
Contributor

elee1766 commented Jan 7, 2025

@andaaron

this is awesome work. thank you.

i wish i could finish this but company needs put me elsewhere. will be super excited to use this feature when it is ready.

I wanted to note a few things, since i see some notes in your PR.

  1. for locks, i have had good success in https://github.com/go-redsync/redsync in the past, using it with redis-v9 driver. however, they only provide a somewhat-safe mutex, and you still need to implement either a leadership election on top of the mutex primative, or force the hot-paths to use mutex. for my needs, this has been okay, since mostly I am using mutexes to save on duplication of work.

  2. this is a slight tangent, but the more modern redis library is https://github.com/redis/rueidis
    which has https://github.com/redis/rueidis/tree/main/rueidislock. i have been using it for some more recent work, as it claims function to better in a redis cluster, however, the locking package is much less tested and used vs redsync/v4. I wouldn't use it for this project personally, but the option does exist, so i thought i would say

  3. redis cluster, maybe you know, has very interesting behavior + latency during failover, reshuffle, etc, or at least in my experience. i'm not sure if this is in scope for the project, but especially if a lock is going to be in the hot path, it may be important to test that zot does not explode when there are 10-15 second timeouts on random queries to redis

  4. fwiw, i dont think that redis is a great choice for a cache driver. the max value size is 512mb, which is much smaller than the max size for an oci blob (i think its a few gb?), there are many layers with size above 512mb. You could potentially shard/multipart the blob into multiple keys, by creating some container format around blob parts to indicate where the next "part" of the blob is, (now you get into some weird stuff with eviction, so you probably want this to be a lua script), but i dont know, it seems a lot more effort than its worth.

@rchincha
Copy link
Contributor

rchincha commented Jan 7, 2025

@andaaron thx for taking this up

@andaaron
Copy link
Contributor Author

andaaron commented Jan 8, 2025

  1. fwiw, i dont think that redis is a great choice for a cache driver. the max value size is 512mb, which is much smaller than the max size for an oci blob (i think its a few gb?), there are many layers with size above 512mb. You could potentially shard/multipart the blob into multiple keys, by creating some container format around blob parts to indicate where the next "part" of the blob is, (now you get into some weird stuff with eviction, so you probably want this to be a lua script), but i dont know, it seems a lot more effort than its worth.

With regards to this specific point, we have a misunderstanding. The role of the cache driver is to store information on what blob was deduped and and where are the duplicates. The role of metadb is to store information taken from manifests, as well as other metadata about manifests, repos, and users.

We wouldn't be using it to store and serve the actual image blobs, that is a separate storage implementation, where we support local folders and S3.

@andaaron andaaron force-pushed the redis2 branch 5 times, most recently from 2651e26 to b82e496 Compare January 9, 2025 16:14
Copy link

codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 82.78689% with 378 lines in your changes missing coverage. Please review.

Project coverage is 91.03%. Comparing base (6723123) to head (11f462a).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pkg/meta/redis/redis.go 78.90% 275 Missing and 76 partials ⚠️
pkg/storage/cache/redis.go 90.67% 18 Missing and 4 partials ⚠️
pkg/meta/meta.go 93.47% 2 Missing and 1 partial ⚠️
pkg/api/config/redis/redis.go 98.98% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2865      +/-   ##
==========================================
- Coverage   91.58%   91.03%   -0.55%     
==========================================
  Files         171      174       +3     
  Lines       30488    32636    +2148     
==========================================
+ Hits        27921    29710    +1789     
- Misses       1931     2213     +282     
- Partials      636      713      +77     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andaaron andaaron force-pushed the redis2 branch 19 times, most recently from 4f0eb30 to 68f3926 Compare January 14, 2025 13:53
@andaaron andaaron force-pushed the redis2 branch 2 times, most recently from eca3bdd to c5f0d20 Compare January 27, 2025 08:30
@vooon
Copy link
Contributor

vooon commented Jan 28, 2025

Need to pass logger into redis client, else it uses default, which do not obey log config.

Jan 28 09:59:44 vuktyl systemd[1]: Started OCI Distribution Registry for dev.
Jan 28 09:59:45 vuktyl zot[3195275]: redis: 2025/01/28 09:59:45 sentinel.go:745: sentinel: discovered new sentinel="65.1.1.1:26379" for master="zotmeta"
Jan 28 09:59:45 vuktyl zot[3195275]: redis: 2025/01/28 09:59:45 sentinel.go:745: sentinel: discovered new sentinel="213.1.1.1:26379" for master="zotmeta"
Jan 28 09:59:45 vuktyl zot[3195275]: redis: 2025/01/28 09:59:45 sentinel.go:709: sentinel: new master="zotmeta" addr="213.1.1.1:6379"
Jan 28 09:59:46 vuktyl zot[3195275]: redis: 2025/01/28 09:59:46 sentinel.go:745: sentinel: discovered new sentinel="213.1.1.1:26379" for master="zotmeta"
Jan 28 09:59:46 vuktyl zot[3195275]: redis: 2025/01/28 09:59:46 sentinel.go:745: sentinel: discovered new sentinel="148.2.2.2:26379" for master="zotmeta"
Jan 28 09:59:46 vuktyl zot[3195275]: redis: 2025/01/28 09:59:46 sentinel.go:709: sentinel: new master="zotmeta" addr="213.1.1.1:6379"

@andaaron
Copy link
Contributor Author

Ah, we'll see what we can do about that.

@andaaron
Copy link
Contributor Author

andaaron commented Jan 28, 2025

Looked at the code they have for logging and here's some notes:

  • they set the logger as a global variable, and setting it is not protected by a mutex (we'd have to add some global locking when setting it), as you have already seen, some messages repeat because there's 2 separate client instances.
    • I don't think locking in Printf() would be effective as the lock needs to protect accessing internal.Logger attribute, not what it does inside Printf(). My concern is it would make a call to an object which has already been replaced.
    • Maybe we should find a way to call redis.SetLogger() just once, and have a lock for setting the log.Logger attribute of the RedisLogger
  • we can't customize the messages too much, we'd have formatted string resulting in message sentinel: discovered new sentinel="65.1.1.1:26379" for master="zotmeta", in our json
  • the advantage would be we log to the same file, in json format.

Implementation would be similar to:

var redisLoggerLock sync.Mutex

type RedisLogger struct {
	log log.Logger
}

func (r RedisLogger) Printf(ctx context.Context, format string, v ...interface{}) {
	redisLoggerLock.Lock()
	defer redisLoggerLock.Unlock()

	r.log.Debug().Msgf(format, v...)
}

func setRedisLock(log log.Logger) {
	redisLoggerLock.Lock()
	defer redisLoggerLock.Unlock()

	redis.SetLogger(RedisLogger{log})
}

func GetRedisClient(redisConfig map[string]interface{}, log log.Logger) (redis.UniversalClient, error) {
	setRedisLock(log)
        [...]
}

I really don't like globals, but I don't see an alternative.

Probably better, if it works:

var once sync.Once

type RedisLogger struct {
	log log.Logger
}

func (r RedisLogger) Printf(ctx context.Context, format string, v ...interface{}) {
	r.log.Debug().Msgf(format, v...)
}

func GetRedisClient(redisConfig map[string]interface{}, log log.Logger) (redis.UniversalClient, error) {
	once.Do(func(){ redis.SetLogger(RedisLogger{log}) })
        [...]
}

Still uses a global.

@andaaron
Copy link
Contributor Author

Need to pass logger into redis client, else it uses default, which do not obey log config.

Jan 28 09:59:44 vuktyl systemd[1]: Started OCI Distribution Registry for dev.
Jan 28 09:59:45 vuktyl zot[3195275]: redis: 2025/01/28 09:59:45 sentinel.go:745: sentinel: discovered new sentinel="65.1.1.1:26379" for master="zotmeta"
Jan 28 09:59:45 vuktyl zot[3195275]: redis: 2025/01/28 09:59:45 sentinel.go:745: sentinel: discovered new sentinel="213.1.1.1:26379" for master="zotmeta"
Jan 28 09:59:45 vuktyl zot[3195275]: redis: 2025/01/28 09:59:45 sentinel.go:709: sentinel: new master="zotmeta" addr="213.1.1.1:6379"
Jan 28 09:59:46 vuktyl zot[3195275]: redis: 2025/01/28 09:59:46 sentinel.go:745: sentinel: discovered new sentinel="213.1.1.1:26379" for master="zotmeta"
Jan 28 09:59:46 vuktyl zot[3195275]: redis: 2025/01/28 09:59:46 sentinel.go:745: sentinel: discovered new sentinel="148.2.2.2:26379" for master="zotmeta"
Jan 28 09:59:46 vuktyl zot[3195275]: redis: 2025/01/28 09:59:46 sentinel.go:709: sentinel: new master="zotmeta" addr="213.1.1.1:6379"

Can you try with my latest commit and see if the result is better?

@vooon
Copy link
Contributor

vooon commented Jan 29, 2025

@andaaron no more messages on stderr:

Jan 29 11:04:56 volkhov systemd[1]: Starting OCI Distribution Registry for dev...
Jan 29 11:04:56 volkhov zot[543219]: {"level":"info","config":"/etc/zot/config.dev.json","time":"2025-01-29T11:04:56.186315426+01:00","message":"config file is valid"}
Jan 29 11:04:56 volkhov systemd[1]: Started OCI Distribution Registry for dev.

zot.log:

{"level":"info","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:204","time":"2025-01-29T11:17:52.105865075+01:00","message":"finished parsing redis universal options"}
{"level":"debug","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:24","time":"2025-01-29T11:17:52.347111167+01:00","message":"sentinel: discovered new sentinel=\"65.1.1.1:26379\" for master=\"zotmeta\""}
{"level":"debug","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:24","time":"2025-01-29T11:17:52.347178456+01:00","message":"sentinel: discovered new sentinel=\"213.1.1.1:26379\" for master=\"zotmeta\""}
{"level":"debug","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:24","time":"2025-01-29T11:17:52.406516304+01:00","message":"sentinel: new master=\"zotmeta\" addr=\"213.1.1.1:6379\""}
{"level":"info","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","redisConfig":{"addr":["a.repo:26379","b.repo:26379","c.repo:26379"],"db":1,"keyprefix":"zotdev","master_name":"zotmeta","name":"redis","password":"******"},"goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:78","time":"2025-01-29T11:17:52.40720876+01:00","message":"parsing redis universal options"}
{"level":"info","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:204","time":"2025-01-29T11:17:52.407257593+01:00","message":"finished parsing redis universal options"}
{"level":"debug","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:24","time":"2025-01-29T11:17:52.624540182+01:00","message":"sentinel: discovered new sentinel=\"213.1.1.1:26379\" for master=\"zotmeta\""}
{"level":"debug","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:24","time":"2025-01-29T11:17:52.624603664+01:00","message":"sentinel: discovered new sentinel=\"148.2.2.2:26379\" for master=\"zotmeta\""}
{"level":"debug","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","goroutine":1,"caller":"zotregistry.dev/zot/pkg/api/config/redis/redis.go:24","time":"2025-01-29T11:17:52.719892916+01:00","message":"sentinel: new master=\"zotmeta\" addr=\"213.1.1.1:6379\""}
{"level":"info","clusterMember":"213.1.1.1:8181","clusterMemberIndex":"1","component":"metadb","goroutine":1,"caller":"zotregistry.dev/zot/pkg/meta/parse.go:29","time":"2025-01-29T11:17:52.720962422+01:00","message":"parsing storage and initializing"}

elee1766 and others added 18 commits January 30, 2025 17:55
Currently, we have dynamoDB as the remote shared cache but ideal only
for the cloud use case.
For on-prem use case, add support for redis.

Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
Signed-off-by: Petu Eusebiu <peusebiu@cisco.com>
Signed-off-by: Alexei Dodon <adodon@cisco.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
- add missing method GetAllBlobs
- add redis cache tests, with and without mocking

Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
…s DB

Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
…ethod meta.Crate()

Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
…r pkg/api/config/redis

Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
@rchincha rchincha merged commit 05823cd into project-zot:main Jan 30, 2025
37 of 41 checks passed
@andaaron
Copy link
Contributor Author

andaaron commented Jan 30, 2025

@vooon, thank you for taking the time to test this change. It is now merged into main branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants