Add gh models view command #6

Merged
merged 33 commits into main from view-cmd on Oct 9, 2024

Conversation

cheshire137
Member

@cheshire137 cheshire137 commented Oct 8, 2024

I referenced the kind of information shown on a page like https://github.com/marketplace/models/azure-openai/gpt-4o-mini. I think there's still additional formatting we could do on the output produced, but I thought this was a good enough first pass to get some kind of details view in there.

[screenshot of ./gh-models view gpt-4o-mini in a terminal]

Sample output:

% script/build && ./gh-models view gpt-4o-mini
Building extension (GOOS= GOARCH=)
Output: gh-models
Display name:            OpenAI GPT-4o mini
Summary name:            gpt-4o-mini
Publisher:               OpenAI
Summary:                 An affordable, efficient AI solution for diverse text and image tasks.
Context:                 up to 131072 input tokens and 4096 output tokens
Rate limit tier:         low
Tags:                    multipurpose, multilingual, multimodal
Supported input types:   text, image, audio
Supported output types:  text
Supported languages:     English, Italian, Afrikaans, Spanish, German, French, Indonesian, Russian, Polish, Ukrainian, Greek, Latvian, Chinese, A...
License:                 custom
License description:     Use of Azure OpenAI Service is subject to applicable Microsoft Product Terms https://www.microsoft.com/licensing/terms/welcome/welcomepage including the Universal License Terms for Microsoft Generative AI Services and the service-specific terms for the Azure OpenAI product offering.

Description:             GPT-4o mini enables a broad range of tasks with its low cost and latency, such as applications that chain or parallelize multiple model calls (e.g., calling multiple APIs), pass a large volume of context to the model (e.g., full code base or conversation history), or interact with customers through fast, real-time text responses (e.g., customer support chatbots).

Today, GPT-4o mini supports text and vision in the API, with support for text, image, video and audio inputs and outputs coming in the future. The model has a context window of 128K tokens and knowledge up to October 2023. Thanks to the improved tokenizer shared with GPT-4o, handling non-English text is now even more cost effective.

GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o. It also demonstrates strong performance in function calling, which can enable developers to build applications that fetch data or take actions with external systems, and improved long-context performance compared to GPT-3.5 Turbo.

Resources
OpenAI announcement https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/

Notes:                   Model Provider
This model is provided through the Azure OpenAI service.

Relevant documents
The following documents are applicable:

Overview of Responsible AI practices for Azure OpenAI models https://learn.microsoft.com/en-us/legal/cognitive-services/openai/overview
Transparency Note for Azure OpenAI Service https://learn.microsoft.com/en-us/legal/cognitive-services/openai/transparency-note

Acknowledgments
Leads: Jacob Menick, Kevin Lu, Shengjia Zhao, Eric Wallace, Hongyu Ren, Haitang Hu, Nick Stathas, Felipe Petroski Such

Program Lead: Mianna Chen

Contributions noted in https://openai.com/gpt-4o-contributions/

Responsible AI Considerations
Built-in safety measures - Safety is built into our models from the beginning, and reinforced at every step of our development process. In pre-training, we filter out information that we do not want our models to learn from or output, such as hate speech, adult content, sites that primarily aggregate personal information, and spam. In post-training, we align the model's behavior to our policies using techniques such as reinforcement learning with human feedback (RLHF) to improve the accuracy and reliability of the models' responses.

GPT-4o mini has the same safety mitigations built-in as GPT-4o, which we carefully assessed using both automated and human evaluations according to our Preparedness Framework and in line with our voluntary commitments. More than 70 external experts in fields like social psychology and misinformation tested GPT-4o to identify potential risks, which we have addressed and plan to share the details of in the forthcoming GPT-4o system card and Preparedness scorecard. Insights from these expert evaluations have helped improve the safety of both GPT-4o and GPT-4o mini.

Building on these learnings, our teams also worked to improve the safety of GPT-4o mini using new techniques informed by our research. GPT-4o mini in the API is the first model to apply our instruction hierarchy method, which helps to improve the model's ability to resist jailbreaks, prompt injections, and system prompt extractions. This makes the model's responses more reliable and helps make it safer to use in applications at scale.

We'll continue to monitor how GPT-4o mini is being used and improve the model's safety as we identify new risks.

Content Filtering
Prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about Azure AI Content Safety https://learn.microsoft.com/en-us/azure/ai-services/content-safety/overview. Additional classification models and configuration options are available when you deploy an Azure OpenAI model in production; learn more https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cuser-prompt%2Cpython-new.

Evaluation:              GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o. It also demonstrates strong performance in function calling, which can enable developers to build applications that fetch data or take actions with external systems, and improved long-context performance compared to GPT-3.5 Turbo.

GPT-4o mini has been evaluated across several key benchmarks.

Reasoning tasks: GPT-4o mini is better than other small models at reasoning tasks involving both text and vision, scoring 82.0% on MMLU, a textual intelligence and reasoning benchmark, as compared to 77.9% for Gemini Flash and 73.8% for Claude Haiku.

Math and coding proficiency: GPT-4o mini excels in mathematical reasoning and coding tasks, outperforming previous small models on the market. On MGSM, measuring math reasoning, GPT-4o mini scored 87.0%, compared to 75.5% for Gemini Flash and 71.7% for Claude Haiku. GPT-4o mini scored 87.2% on HumanEval, which measures coding performance, compared to 71.5% for Gemini Flash and 75.9% for Claude Haiku.

Multimodal reasoning: GPT-4o mini also shows strong performance on MMMU, a multimodal reasoning eval, scoring 59.4% compared to 56.1% for Gemini Flash and 50.2% for Claude Haiku.

  TASK                             | GPT-4O MINI SCORE | GEMINI FLASH SCORE | CLAUDE HAIKU SCORE
  ---------------------------------+-------------------+--------------------+--------------------
  MMLU (Reasoning Text and Vision) | 82.0%             | 77.9%              | 73.8%
  MGSM (Math Reasoning)            | 87.0%             | 75.5%              | 71.7%
  HumanEval (Coding Performance)   | 87.2%             | 71.5%              | 75.9%
  MMMU (Multimodal Reasoning)      | 59.4%             | 56.1%              | 50.2%

Source: GPT-4o mini: advancing cost-efficient intelligence https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/.
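
For readers who want a sense of how a subcommand like this is typically wired up in a gh extension, here is a minimal sketch of a cobra "view" command that fetches model details and prints aligned label/value pairs. The ModelDetails type, the getModelDetails helper, and the exact field set are illustrative assumptions, not the actual code added in this PR:

package cmd

import (
	"fmt"
	"text/tabwriter"

	"github.com/spf13/cobra"
)

// ModelDetails is an assumed shape for the catalog metadata shown above.
type ModelDetails struct {
	FriendlyName    string
	Publisher       string
	Summary         string
	MaxInputTokens  int
	MaxOutputTokens int
}

// getModelDetails stands in for whatever client call fetches model metadata.
func getModelDetails(name string) (ModelDetails, error) {
	return ModelDetails{FriendlyName: name}, nil // placeholder
}

// NewViewCommand returns a hypothetical "view" subcommand that renders the
// fields as an aligned two-column listing, similar to the sample output above.
func NewViewCommand() *cobra.Command {
	return &cobra.Command{
		Use:   "view [model]",
		Short: "View details about a model",
		Args:  cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			details, err := getModelDetails(args[0])
			if err != nil {
				return err
			}
			w := tabwriter.NewWriter(cmd.OutOrStdout(), 0, 4, 2, ' ', 0)
			fmt.Fprintf(w, "Display name:\t%s\n", details.FriendlyName)
			fmt.Fprintf(w, "Publisher:\t%s\n", details.Publisher)
			fmt.Fprintf(w, "Summary:\t%s\n", details.Summary)
			fmt.Fprintf(w, "Context:\tup to %d input tokens and %d output tokens\n",
				details.MaxInputTokens, details.MaxOutputTokens)
			return w.Flush()
		},
	}
}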

@cheshire137 cheshire137 self-assigned this Oct 8, 2024
@cheshire137 cheshire137 changed the title Add gh models view command Add gh models view command and update run model validation Oct 8, 2024
@cheshire137 cheshire137 changed the title Add gh models view command and update run model validation Add gh models view command Oct 9, 2024
@cheshire137 cheshire137 marked this pull request as ready for review October 9, 2024 21:47
@cheshire137 cheshire137 requested a review from sgoedecke October 9, 2024 21:48
Collaborator

@sgoedecke sgoedecke left a comment

LGTM, nice work!

@@ -234,7 +234,7 @@ func NewRunCommand() *cobra.Command {
 			foundMatch := false
 			for _, model := range models {
-				if strings.EqualFold(model.FriendlyName, modelName) || strings.EqualFold(model.Name, modelName) {
+				if model.HasName(modelName) {
Collaborator

Nice
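
For context, a HasName helper of this shape might look like the sketch below, assuming it simply wraps the case-insensitive comparison it replaced; the receiver type and package name here are guesses, not the actual code in this PR:

package azuremodels // assumed package name

import "strings"

// ModelSummary stands in for whatever struct carries the two model identifiers.
type ModelSummary struct {
	Name         string
	FriendlyName string
}

// HasName reports whether the given name matches either identifier,
// case-insensitively, mirroring the condition the new call replaced.
func (m ModelSummary) HasName(name string) bool {
	return strings.EqualFold(m.Name, name) || strings.EqualFold(m.FriendlyName, name)
}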

@cheshire137 cheshire137 requested a review from a team as a code owner October 9, 2024 21:55
@cheshire137
Member Author

cheshire137 commented Oct 9, 2024

I feel like such a rebel, merging this without any passing tests. 😅 I swear it works on my machine, though!

@cheshire137 cheshire137 merged commit ec140bd into main Oct 9, 2024
3 checks passed
@cheshire137 cheshire137 deleted the view-cmd branch October 9, 2024 22:00