Adding some imatrix tools #5302

Merged
ikawrakow merged 2 commits into master from ik/imatrix_tools on Feb 4, 2024
Conversation

ikawrakow
Contributor

I was playing around with various imatrix calculations and needed some additional functionality that is currently not available in the imatrix tool. The result is this PR, which adds the following functionality:

  • --continue file_name If specified on the command line, the imatrix data in file_name will be loaded, and the subsequent calculation will accumulate on top of it.
  • --combine comma_separated_list_of_files If specified on the command line, the imatrix tool will load and combine the imatrix data from the listed files. The file names are comma separated, so, sorry, no commas (or spaces) are allowed in file names. The combined data is then saved (either in imatrix.dat or in the file specified via the -o option), and the program terminates. No calculation is done when this option is specified.
  • --from-chunk N After tokenizing the supplied dataset, the first N token chunks are discarded before proceeding with the calculation. For instance, if one has done a calculation with 100 chunks using some_training_data and wants to continue from there, one can use ./imatrix -m some_model -f some_training_data --continue previous_imatrix --from-chunk 100 (a combined example is sketched just after this list).
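To make the interplay of the three options concrete, here is a sketch of a possible workflow (model and file names are placeholders, not files from this PR):

# compute the first 100 chunks and save the partial result
./imatrix -m some_model -f some_training_data --chunks 100 -o imatrix_first_100.dat
# later: load that result, skip the 100 chunks already processed, and keep accumulating
./imatrix -m some_model -f some_training_data --continue imatrix_first_100.dat --from-chunk 100
# or: compute the next chunks separately and merge the two files (no new calculation is done)
./imatrix -m some_model -f some_training_data --from-chunk 100 --chunks 100 -o imatrix_next_100.dat
./imatrix --combine imatrix_first_100.dat,imatrix_next_100.dat -o imatrix_all.dat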

I was playing around with the C4 datasets, which are huge, so they take a very long time to tokenize (e.g., 1.5 minutes for c4-validation.00000-of-00008.json on my computer). I was bothered by that, so I was tempted to add an option to tokenize just a portion of the data. But to be done correctly, one needs to deal with UTF-8, so I did not implement it for now.

@Nexesenex
Contributor

Nexesenex commented Feb 3, 2024

Thank you, Ikawrakow, I really needed the first and third features!

If I understand properly, --continue file_name loads an iMatrix, and --from-chunk N allows the continuation of the loaded iMatrix?

As for --combine comma_separated_list_of_files, could you please give some examples of the use cases, and of the methodology employed for combining the iMatrix data? It's unclear to me.

@sorasoras

Thank you, Ikawrakow, I really needed the first and third features!

If I understand properly, --continue file_name loads an iMatrix, and --from-chunk N allows the continuation of the loaded iMatrix?

As for --combine comma_separated_list_of_files, could you please give some examples of the use cases, and of the methodology employed for combining the iMatrix data? It's unclear to me.

If my guess is correct, I think the second feature can combine two different imatrix results from two separate processes.

@ikawrakow
Contributor Author

If I understand properly, --continue file_name loads an iMatrix, and --from-chunk N allows the continuation of the loaded iMatrix?

Yes. If you have an imatrix calculated from, e.g., the first 50 chunks of wiki.train.raw and stored in imatrix_1_50.dat, and you want to add N more chunks, you can use

./imatrix -m some_model -f wiki.train.raw --continue imatrix_1_50.dat --from-chunk 50 --chunks N

Or, you can store the result from the next N chunks separately, and then use the --combine option to combine the two results:

./imatrix -m some_model -f wiki.train.raw --from-chunk 50 --chunks N -o imatrix_50_100.dat
./imatrix --combine imatrix_1_50.dat,imatrix_50_100.dat

As an example for using --combine (apart from the example above): suppose you have calculated an imatrix using an English training dataset, and another imatrix using a French training dataset. Let the results be in imatrix_en.dat and imatrix_fr.dat. You can use

./imatrix --combine imatrix_en.dat,imatrix_fr.dat -o imatrix_en_plus_fr.dat

to combine them and store the result in imatrix_en_plus_fr.dat.
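As a further note (the imatrix_de.dat file name below is hypothetical): --combine takes a comma-separated list of files, so more than two can be merged at once, and if -o is omitted the result goes to the default imatrix.dat:

./imatrix --combine imatrix_en.dat,imatrix_fr.dat,imatrix_de.dat
# the merged data ends up in imatrix.dat because no -o was given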

@Nexesenex
Contributor

Nexesenex commented Feb 3, 2024

That's great!

I'd indeed love to combine English and French iMatrix files to improve quant quality in both languages, especially with Miqu out!

But won't such a combination be a sort of "blur" between the values of the first and second iMatrix files, partly negating the benefit of each, and won't this be amplified if we combine more than 2 files?

Also, should the combined iMatrix files preferably have the same number of chunks and the same ctx size?


Also, I just noticed that if I make the iMatrix on a Yi 34Bx2 MoE fp16, I get this:

[1]3.2549,[2]7.9056,[3]8.2650,[4]6.4178,
save_imatrix: stored collected data after 10 chunks in Y:\iMatrix\TomGrc_FusionNet_34Bx2_MoE_v0.1-b2054-Q8_0.iMatrix_Wiki_c32_ch500.dat
[5]7.2867,[6]7.2855,[7]7.2059,[8]7.1582,[9]7.3868,
save_imatrix: stored collected data after 20 chunks in Y:\iMatrix\TomGrc_FusionNet_34Bx2_MoE_v0.1-b2054-Q8_0.iMatrix_Wiki_c32_ch500.dat
[10]7.5491,[11]8.1775,[12]8.2683,[13]7.4605,[14]7.7463,
save_imatrix: stored collected data after 30 chunks in Y:\iMatrix\TomGrc_FusionNet_34Bx2_MoE_v0.1-b2054-Q8_0.iMatrix_Wiki_c32_ch500.dat

It seems that the count is not done properly: maybe the number of models/experts of the MoE acts as a multiplier on the reported chunk count (i.e., the real count would be the displayed one divided by that number).

[245]13.2935,[246]13.3249,[247]13.4073,[248]13.4650,[249]13.5338,
save_imatrix: stored collected data after 500 chunks in Y:\iMatrix\TomGrc_FusionNet_34Bx2_MoE_v0.1-b2054-Q8_0.iMatrix_Wiki_c32_ch500.dat

save_imatrix: stored collected data after 500 chunks in Y:\iMatrix\TomGrc_FusionNet_34Bx2_MoE_v0.1-b2054-Q8_0.iMatrix_Wiki_c32_ch500.dat.at_500
[250]13.5561,[251]13.6481,[252]13.6523,[253]13.6452,[254]13.6817,

But once the autosave count is reached, it keeps crunching as it should, which means the problematic count is likely part of the autosave feature.

@ikawrakow ikawrakow merged commit 5ed26e1 into master Feb 4, 2024
54 of 56 checks passed
@ikawrakow ikawrakow deleted the ik/imatrix_tools branch February 4, 2024 08:40
@sorasoras

@ikawrakow
I was thinking you could add a feature that allows randomizing the context length in the imatrix process, so you can basically add randomization into the calculation.
When I merge results from the same data with different context lengths (16, 128, 512), it does improve my results.
This combining of results really works for me.
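For reference, here is a sketch of how one might try this with the options from this PR (file names are placeholders; it assumes the usual -c/--ctx-size option controls the chunk length used by imatrix, as it does for perplexity):

for ctx in 16 128 512; do
    ./imatrix -m some_model -f some_training_data -c $ctx -o imatrix_c$ctx.dat
done
./imatrix --combine imatrix_c16.dat,imatrix_c128.dat,imatrix_c512.dat -o imatrix_mixed_ctx.dat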

@Nexesenex
Contributor

Nexesenex commented Feb 5, 2024

I second @sorasoras. Having a random ctx size, or simply multiple user-chosen ones (like 32, 64, 128, 256, 512, each multiplied by the number of chunks), to make the iMatrix could be interesting to test!

@ikawrakow
Contributor Author

Can you both give some specific examples of what you did and how this improved your results? I did a quick try with a manually prepared mix of context lengths, and it didn't seem to help.

@sorasoras

Can you both give some specific examples of what you did and how this improved your results? I did a quick try with a manually prepared mix of context lengths, and it didn't seem to help.

Basically, I have a friend who fine-tuned 1.8B, 7B, and 13B Qwen models into a Chinese-to-Japanese translation machine with the same set of data.
I tried to use imatrix to get a better quantization of these models.
I started with the 1.8B because you can compute its imatrix quickly.
I first compared contexts 500 and 1024; C500 gave me a better result, judged by comparing the translation output against a much larger model like the 13B Q8.
Then I tried context 16 after reading
#5006 (comment)
but there was some degradation with C16 compared to C500.
Since you can combine two imatrices, why not combine them and give it a try? It did give a more or less better result.
So I tried to combine results from different context sizes:
16, 32, 64, 128, 500, 600, 700, 800, 900, 1000, 1500, and so on.
It gives me a much better translation in terms of readability, getting quite close to the 13B Q8 that I tested against.
That got me thinking of something I read:

https://www.microsoft.com/en-us/research/quarterly-brief/jan-2024-brief/articles/improving-reasoning-in-language-models-with-laser-layer-selective-rank-reduction/

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* imatrix: adding --combine and --continue-from

* imatrix: be able to start from a specific chunk

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* imatrix: adding --combine and --continue-from

* imatrix: be able to start from a specific chunk

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>