Adding some imatrix tools #5302
Conversation
Thank you, Ikawrakow, I really needed the first and third features! If I understand properly, `--continue file_name` loads an iMatrix, and `--from-chunk N` allows the continuation of the loaded iMatrix? As for `--combine comma_separated_list_of_files`, could you please give some examples of the use cases, and of the methodology employed for combining the iMatrix data, because it's unclear to me.
If my guess is correct, I think the second feature can combine two different imatrix results from two processes.
Yes. If you have an imatrix calculated from, e.g., the first 50 chunks of the training data, you can continue the calculation on top of it with `--continue` and `--from-chunk`. Or, you can store the result from the next chunks in a separate file, and then use the `--combine` option to combine the two results and store the combined data in a single output file.
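As a sketch of how that might look on the command line (the file names here are made up for illustration):

```sh
# Suppose imatrix_0_50.dat holds imatrix data for the first 50 chunks of some_training_data.

# Option 1: continue accumulating on top of the existing data, skipping the first 50 chunks
./imatrix -m some_model -f some_training_data --continue imatrix_0_50.dat --from-chunk 50 -o imatrix_all.dat

# Option 2: compute the remaining chunks into a separate file, then merge the two files
./imatrix -m some_model -f some_training_data --from-chunk 50 -o imatrix_50_plus.dat
./imatrix -m some_model --combine imatrix_0_50.dat,imatrix_50_plus.dat -o imatrix_all.dat
```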
That's great! I'd love to combine English and French iMatrix files indeed, to improve quant quality in both languages, especially with Miqu out! But won't such a combination be a sort of "blur" between the values of the first and second iMatrix files, partly negating the benefit of each, and all the more so if we combine more than 2 files? Also, should the combined iMatrix files preferably have the same number of chunks and the same ctx size?

And also, I just noticed: if I make the iMatrix on a Yi 34x2 MoE fp16, I get this:

[1]3.2549,[2]7.9056,[3]8.2650,[4]6.4178,

It seems that the count is not done properly, maybe affected by the number of models/experts of the MoE acting as a divisor.

[245]13.2935,[246]13.3249,[247]13.4073,[248]13.4650,[249]13.5338,
save_imatrix: stored collected data after 500 chunks in Y:\iMatrix\TomGrc_FusionNet_34Bx2_MoE_v0.1-b2054-Q8_0.iMatrix_Wiki_c32_ch500.dat.at_500

But once the count of the autosave is reached, it keeps crunching as it should. Which means that the problematic count is likely part of the autosave feature.
@ikawrakow
I second @sorasoras. Having a random ctx size, or simply multiple user choices (like 32, 64, 128, 256, 512 ctx sizes, multiplied by the number of chunks) to make the iMatrix could be interesting to test!
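One possible way to set up such a test (hypothetical file names; `-c` being the usual context-size option) could be to collect imatrix data at each context size separately and then merge the files:

```sh
# Collect imatrix data at several context sizes (illustrative file names)
for ctx in 32 64 128 256 512; do
    ./imatrix -m some_model -f some_training_data -c ${ctx} -o imatrix_c${ctx}.dat
done

# Merge the per-context-size files into a single imatrix file
./imatrix -m some_model --combine imatrix_c32.dat,imatrix_c64.dat,imatrix_c128.dat,imatrix_c256.dat,imatrix_c512.dat -o imatrix_mixed.dat
```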
Can you both give some specific examples of what you did and how this improved your results? I did a quick try with a manually prepared mix of context lengths, and it didn't seem to help.
Basically, I have a friend who fine-tuned 1.8B, 7B, and 13B Qwen models into a translation machine from Chinese to Japanese with the same set of data.
* imatrix: adding --combine and --continue-from
* imatrix: be able to start from a specific chunk

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
I was playing around with various imatrix calculations and needed some additional functionality currently not available in the `imatrix` tool. The result is this PR, which adds the following functionality:

`--continue file_name`
If specified on the command line, the imatrix data in `file_name` will be loaded, and the subsequent calculation will accumulate on top of that.

`--combine comma_separated_list_of_files`
If specified on the command line, the `imatrix` tool will load and combine the imatrix data in the list of provided files. The files are comma separated, so sorry, no commas (or spaces) allowed in file names. The data will then be saved (either in `imatrix.dat` or in the file specified via the `-o` option), and the program will terminate. No calculation is done when this option is specified.
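As a sketch (purely illustrative file names; the model argument is shown for completeness, since no new calculation is run with this option):

```sh
# Load three imatrix files, merge their data, write the result to the -o file, then exit
./imatrix -m some_model --combine imatrix_part1.dat,imatrix_part2.dat,imatrix_part3.dat -o imatrix_merged.dat
```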
`--from-chunk N`
After tokenizing the supplied dataset, the first `N` token chunks will be discarded before proceeding with the calculation. For instance, if one has done a calculation with 100 chunks using `some_training_data`, and wants to continue from there, one can use `./imatrix -m some_model -f some_training_data --continue previous_imatrix --from-chunk 100`.

I was playing around with the C4 datasets, which are huge, so it takes a very long time to tokenize (e.g., 1.5 minutes for `c4-validation.00000-of-00008.json` on my computer). I was bothered by that, so was tempted to add an option to tokenize just a portion of the data. But to do that correctly one needs to deal with utf8, so I did not implement it for now.