Benchmark that evaluates LLMs using 601 NYT Connections puzzles extended with extra trick words
Updated Mar 22, 2025 · Python
UnrealMCP is here! An Unreal Engine plugin for LLM/GenAI models and an MCP UE5 server. It supports automatic blueprint and scene generation from the Claude Desktop App and Cursor. It currently includes OpenAI's GPT-4o/GPT-4o-mini, DeepSeek R1, and Claude Sonnet 3.7 APIs for Unreal Engine 5.1 or higher, with plans to add Gemini, Grok 3, and realtime APIs soon.
Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLMs to engage in public conversation before secretly picking a move (1, 3, or 5 steps). Whenever two or more players choose the same number, all colliding players fail to advance.
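The collision rule described above can be sketched in a few lines. This is a minimal illustration of the scoring logic only, not code from the benchmark itself; the function name and data layout are hypothetical.

```python
from collections import Counter

def resolve_round(moves):
    """Apply the step-race collision rule: a player whose secretly
    chosen step count (1, 3, or 5) is unique this round advances by
    that amount; players who picked the same number advance 0 steps.

    `moves` maps player name -> chosen step count. Hypothetical
    helper, not taken from the benchmark's repository.
    """
    counts = Counter(moves.values())
    return {player: (step if counts[step] == 1 else 0)
            for player, step in moves.items()}

# Example round: B and C collide on 3, so only A advances.
print(resolve_round({"A": 5, "B": 3, "C": 3}))
# → {'A': 5, 'B': 0, 'C': 0}
```

In a full game, each round's public conversation happens before the secret picks, and the advances returned here would accumulate until some player crosses the finish line.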
Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a small set of examples and anti-examples, then detect which item truly fits that theme among a collection of misleading candidates.
An agentic abstraction layer for building high-precision vertical AI agents, written in Python. Middleware for the Model Context Protocol.
This repository hosts code samples, benchmarks, and experiments exploring the capabilities of Large Language Models (LLMs) like ChatGPT, Claude, DeepSeek, Grok, and more, spanning AI-driven coding, gaming, creativity, and education. Fork, explore, and contribute! 🚀
"Type or Die" – A weekend challenge built with Cursor and Claude Sonnet 3.7, where speed and accuracy are your only survival tools! 🚀🔥