SparkNLP 1004 - Introducing MiniCPM #14205
Merged
Description
MiniCPM is a series of edge-side large language models. The base model, MiniCPM-2B, has 2.4B non-embedding parameters. On comprehensive benchmarks it performs close to Mistral-7B (with better results in Chinese, mathematics, and coding) and surpasses models like Llama2-13B, MPT-30B, and Falcon-40B. On MTBench, the benchmark closest to user experience, MiniCPM-2B also outperforms many representative open-source models, including Llama2-70B-Chat, Vicuna-33B, Mistral-7B-Instruct-v0.1, and Zephyr-7B-alpha.