
[bug] loading saved weights of an open_clip model does give back the same results #915

Closed
yxchng opened this issue Jul 13, 2024 · 2 comments

yxchng commented Jul 13, 2024

import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms('ViT-L-14-336', pretrained='openai')
model.eval()  # model in train mode by default, impacts some models with BatchNorm or stochastic depth active
tokenizer = open_clip.get_tokenizer('ViT-L-14-336')

image = preprocess(Image.open("CLIP.png")).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)  # prints: [[1., 0., 0.]]

torch.save(model.state_dict(), 'tmp.pt')


model, _, preprocess = open_clip.create_model_and_transforms('ViT-L-14-336', pretrained='tmp.pt')
model.eval()  # model in train mode by default, impacts some models with BatchNorm or stochastic depth active
tokenizer = open_clip.get_tokenizer('ViT-L-14-336')

image = preprocess(Image.open("CLIP.png")).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)  # prints: [[1., 0., 0.]]

gives

Label probs: tensor([[0.9326, 0.0627, 0.0047]])
Label probs: tensor([[0.8960, 0.0976, 0.0064]])
@yxchng yxchng changed the title loading saved weights of an open_clip model does give back the same results [bug] loading saved weights of an open_clip model does give back the same results Jul 13, 2024
rwightman (Collaborator) commented

It's #771 ... the 'openai' pretrained tag force-overrides the activation to QuickGELU (which is less efficient and uses more memory than nn.GELU). When you later load your own checkpoint, you need to manually force that via an argument to the create function (or use a model config with QuickGELU).
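To see why the two runs disagree, here is a minimal stdlib-only sketch comparing exact GELU with QuickGELU (the sigmoid approximation used by the original OpenAI CLIP weights); the small per-activation gap compounds across layers:

```python
import math

def gelu(x: float) -> float:
    # Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def quick_gelu(x: float) -> float:
    # QuickGELU: x * sigmoid(1.702 * x) -- the approximation the
    # OpenAI CLIP checkpoints were trained with
    return x / (1.0 + math.exp(-1.702 * x))

# The two activations are close but not identical, so running
# OpenAI-trained weights through nn.GELU shifts every layer's output.
for x in (-2.0, -0.5, 0.5, 2.0):
    print(f"x={x:+.1f}  gelu={gelu(x):+.6f}  quick_gelu={quick_gelu(x):+.6f}")
```

The mismatch per value is only a few thousandths, which is why the probabilities above are similar but not equal.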


yxchng commented Jul 14, 2024

@rwightman how do I manually force it via an argument?
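For reference, a sketch of the likely fix, assuming the installed open_clip version supports the `force_quick_gelu` keyword on `create_model_and_transforms`:

```python
import open_clip

# Assumption: this open_clip release accepts force_quick_gelu.
# It swaps nn.GELU for QuickGELU (x * sigmoid(1.702 * x)),
# matching what pretrained='openai' forces automatically, so a
# locally saved checkpoint reproduces the original outputs.
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-L-14-336',
    pretrained='tmp.pt',
    force_quick_gelu=True,
)
model.eval()
```

Alternatively, if the installed version ships a QuickGELU model config (e.g. a `-quickgelu` variant of the model name), selecting that config avoids the override flag entirely.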
