
gemma2: add sliding window mask #8227

Merged · 10 commits · Jul 1, 2024

Conversation

ngxson
Collaborator

@ngxson ngxson commented Jun 30, 2024

This is a hack to support sliding window attention for gemma 2 by masking past tokens.

The goal is simply to make it work. While the ideal solution is per-layer KV cache management (with a different n_kv per layer), that seems to be quite challenging (ref: #3377 (comment)).

This implementation is mainly inspired by @arlo-phoenix's work: arlo-phoenix@265a8f2

(Tests & perplexity results are in the comments below.)

Link to working gguf: https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/tree/main
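
For context, the masking idea can be sketched roughly as follows (an illustrative stand-alone example with made-up names, not the code in this PR): in the SWA layers a query may only attend to keys that are both causal and within the last n_swa positions.

#include <cmath>
#include <cstdint>
#include <vector>

// Build an attention mask for one SWA layer: entry (q, k) is 0.0f when the key
// is visible to the query and -INFINITY when it must be ignored by the softmax.
// A key at position pos_k is visible to a query at position pos_q only if it is
// causal (pos_k <= pos_q) and inside the window (pos_q - pos_k < n_swa).
static std::vector<float> build_swa_mask(const std::vector<int32_t> & pos, uint32_t n_swa) {
    const size_t n = pos.size();
    std::vector<float> mask(n * n, -INFINITY);
    for (size_t q = 0; q < n; ++q) {
        for (size_t k = 0; k < n; ++k) {
            const bool causal = pos[k] <= pos[q];
            const bool inside = causal && (uint32_t)(pos[q] - pos[k]) < n_swa;
            if (inside) {
                mask[q*n + k] = 0.0f;
            }
        }
    }
    return mask;
}

The full-attention layers use the same construction without the window check; because Gemma 2 alternates the two layer types, only two masks are needed in practice (one regular, one SWA).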


@ngxson ngxson added the help wanted (Extra attention is needed) label Jun 30, 2024
@matteoserva
Contributor

Thanks for your work.

I tested your PR by regenerating the GGUF from the HF model with:
python3 convert-hf-to-gguf.py gemma-2-27b-it
and then ran the resulting file without quantization.

The model is still unable to solve questions that are easy for AI Studio Gemma 2. It could be that something is missing in your implementation, or there are other issues besides SWA.

Example problem (the answer is 7 or 8):

<bos><start_of_turn>user
Matteo has 20 apples, he buys 20 oranges. Then he discards half of his fruits equally. Then he discards a quarter of his fruits equally between apples and oranges. How many apples remain?<end_of_turn>
<start_of_turn>model
Here's how to solve the problem step-by-step:

1. **Total Fruit:** Matteo starts with 20 apples + 20 oranges = 40 fruits.

2. **First Discard:** He discards half, so he loses 40 fruits / 2 = 20 fruits.

3. **Remaining Fruit:** He now has 40 fruits - 20 fruits = 20 fruits.

4. **Second Discard:** He discards a quarter, which is 20 fruits / 4 = 5 fruits.

5. **Final Fruit:** He's left with 20 fruits - 5 fruits = 15 fruits.

6. **Apples Remaining:** Since he discarded apples and oranges equally, he lost 5 fruits / 2 = 2.5 apples.  Since you can't have half an apple, we'll assume he lost 2 apples.

7. **Final Apple Count:** Matteo started with 20 apples and lost 2, leaving him with 20 apples - 2 apples = 18 apples.


**Answer:** Matteo has 18 apples remaining.

I ran the inference without offloading the entire model to the GPU since I don't have enough VRAM.

@ngxson
Collaborator Author

ngxson commented Jun 30, 2024

@matteoserva I think it's normal for a small model like this one to make a math mistake. What this PR tries to address is that Gemma 2 currently breaks after generating more than 4096 tokens.

We could try, for example, inputting a long document (like Shakespeare) and then asking it something related.

@matteoserva
Contributor

@ngxson

Sorry, I was in a hurry and I didn't explain why I made that post. With this PR (and also without it) the model breaks on even simple questions, well before the 4096-token limit.
For comparison, I sent the same questions to Gemma 2 in AI Studio and to Gemma 2 in chatllm.cpp. Both of these projects answer all my questions correctly.
I have many more questions that work correctly in other implementations of Gemma 2 but not in this one.

It could be related to how SWA was implemented but I'm not sure.

@ngxson
Collaborator Author

ngxson commented Jun 30, 2024

@matteoserva I think the bug you describe is unrelated to this PR. The goal here is to change nothing when generating fewer than 4096 tokens.

You should probably open an issue so other users can share their results (e.g. with different quantizations, sampling settings, etc.).

@bfroemel

Can we have one or two test cases (prompt + expected outcome) that work in AI Studio and should also work with llama.cpp and this PR?

@ngxson
Collaborator Author

ngxson commented Jun 30, 2024

@bfroemel Other users have reported that after 4096 tokens the generation breaks completely (gibberish output), so you probably just need to input about 4096 tokens or more (it doesn't need to be exact) and then see whether it still speaks English or has gone off the rails.

(If someone knows this better, feel free to correct what I said.)

@matteoserva
Contributor

matteoserva commented Jun 30, 2024

I tested this PR using gemma-2-9b unquantized.
The prompt length is around 6k tokens: a transcript of a YouTube video.

Without this PR:

<bos><start_of_turn>user
Here is the transcript of a youtube video, with timestamps: 
[...]
Summarize it in one sentence. Then find the best jokes and output the jokes with their timestamps.<end_of_turn>
<start_of_turn>model
This video is about the author's experience with oversized heating systems and how it's a problem that needs to be addressed.

The video discusses the author's experience with oversized heating systems and how it's a problem that needs to be addressed.

The video is about the author's experience with oversized heating systems and how it's a problem that needs to be addressed.

The video is about the author's experience with oversized heating systems and how it's a problem that needs to be addressed.

The video is about the author's experience with oversized heating systems and how it's a problem that needs to

With this PR:

<bos><start_of_turn>user
Here is the transcript of a youtube video, with timestamps: 
[...]
Summarize it in one sentence. Then find the best jokes and output the jokes with their timestamps.<end_of_turn>
<start_of_turn>model
This YouTube video argues that HVAC systems are often oversized, leading to inefficiency and higher costs, and advocates for heat pumps as a more sustainable and efficient alternative.

Here are some of the best jokes from the video with their timestamps:

* **"It’s almost comical how little my furnace runs."** (00:05:38.287)
* **"But... this is Chicago. It gets a lot colder than that."** (00:07:05.348)
* **"Turns out, heat pumps … ARE JUST AIR CONDITIONERS!"** (00:13:44.600)
* **"Shout it from the rooftops, put it on a T-shirt, annoy your friends, and spread the word."** (00:13:50.608)

Let me know if you'd like me to find more jokes or summarize any specific sections of the transcript!

@bfroemel

bfroemel commented Jun 30, 2024

@ngxson Attached is a test prompt that should be about 6k tokens. I tried it on AI Studio (I only have the 27b-it model available), and I get this output:

Let's count the words starting with 'a' in the text you provided.

After carefully reviewing the text, I've counted 187 words that begin with the letter 'a'.

Let me know if you have any other text you'd like me to analyze!

However, I am not sure about the gibberish output without SWA. For example, on llama.cpp (without the applied PR) with the 27b-it I am getting pretty sane output, but unrelated to the question in the prompt: /edit: polluted by ollama, which limits the context size to 4k

This is a fantastic scene from Shakespeare's The Tempest! You've captured the key elements beautifully:
 
 
Prospero's Control: We see Prospero's power over Ariel and Caliban, highlighting his manipulative nature. His treatment of Caliban is particularly brutal, showcasing the colonial themes present in the play.
 
Ariel's Magic: Ariel's song and appearance as a water-nymph create a sense of enchantment and otherworldly beauty. The music draws Ferdinand in, setting the stage for their encounter.
 
Ferdinand's Arrival: Ferdinand's entrance is full of wonder and grief. He's lost his father and is stranded on this strange island. His reaction to Miranda and her beauty is immediate and intense.
 
Miranda's Innocence: Miranda's naivety and fascination with Ferdinand are endearing. She's never seen anyone like him before, and she's instantly smitten.
 
 
Some things to consider as you continue the scene:
 
 
Prospero's Plan: What is Prospero's ultimate goal in bringing Ferdinand and Miranda together? Is it purely for his own amusement, or does he have a deeper purpose?
 
Ferdinand and Miranda's Relationship: How will their relationship develop? Will they fall in love despite the circumstances?
 
Caliban's Role: What role will Caliban play in the unfolding events? Will he seek revenge on Prospero, or will he find an unexpected ally?
 
 
I'm eager to see where you take this scene next!

Of course, I regenerated the output from both AI Studio and llama.cpp a couple of times: AI Studio always tried to answer the question in the prompt, while llama.cpp always commented on the "fantastic scene". I'll report back as soon as I can test with this PR, unless someone else is faster.

@slaren
Collaborator

slaren commented Jun 30, 2024

Perplexity with 8192 context improves a lot.

$ ./llama-perplexity -f wikitext-2-raw/wiki.test.raw -m models/gemma-2-9b-it/ggml-model-f16.gguf -ngl 99 -c 8192

master:
[1]60.0631,[2]35.6986,[3]29.8380,[4]30.1761,[5]27.8885,[6]28.4963,[7]31.7245,[8]32.1660,[9]31.4798,[10]29.1953,[11]30.8328,[12]31.5990,[13]30.7990,[14]28.9782,[15]30.2532,[16]30.3491,[17]29.6455,[18]29.7374,[19]29.6457,[20]29.7762,[21]29.6722,[22]29.7747,[23]30.6936,[24]31.1542,[25]31.3473,[26]31.6713,[27]31.3694,[28]31.6611,[29]31.4997,[30]31.3214,[31]31.4099,[32]31.1238,[33]31.1150,[34]30.3834,[35]30.4868,
Final estimate: PPL = 30.4868 +/- 0.28072

PR:
[1]12.2630,[2]7.8748,[3]7.9286,[4]8.2527,[5]8.0558,[6]8.3889,[7]9.0239,[8]9.2015,[9]8.8839,[10]8.2007,[11]8.7389,[12]8.8758,[13]8.6679,[14]8.4109,[15]8.7539,[16]8.6858,[17]8.5179,[18]8.5817,[19]8.5995,[20]8.6795,[21]8.5801,[22]8.5626,[23]8.7545,[24]8.8413,[25]8.8929,[26]9.0335,[27]9.0750,[28]9.1664,[29]9.1345,[30]9.1329,[31]9.1417,[32]9.0576,[33]9.0947,[34]8.9226,[35]8.9857,
Final estimate: PPL = 8.9857 +/- 0.07196

@ngxson
Collaborator Author

ngxson commented Jun 30, 2024

Perfect, thanks @slaren @bfroemel

To correct what I said earlier: without SWA, the model does not output gibberish, but repeated output (ref: #8197 (comment)). That explains what @bfroemel got from the master branch. However, even with this PR, it seems we still have issues with generation quality in general. The test with the video transcript seems to be a good idea (better than Shakespeare), so let's keep testing with that.

@bfroemel

bfroemel commented Jun 30, 2024

Hmm, just to correct my report: now I see the same repeated text on the master branch (the thing I saw earlier was polluted by ollama; on pure llama.cpp master I see the repeating mess).
And to complete the picture: with the PR applied, the repeats disappear, but the model now attempts to list every word it found starting with 'a', which AI Studio doesn't do:

Here's a count of the words starting with 'a' in the provided excerpt from *The Tempest*:

1. **a** tempestuous
2. **a**
3. **a**
4. **a**
5. **a**
.
.
90. **a**
91. **a**



There are **91 words** that start with the letter 'a' in this excerpt.

The test with the video transcript seems to be a good idea (better than Shakespeare), so let's keep testing with that.

-> OK, also focusing on the video transcript test from now on.

@github-actions github-actions bot added the python (python script changes) label Jun 30, 2024
Co-authored-by: Arlo Phoenix <arlo-phoenix@users.noreply.github.com>
@ngxson ngxson requested review from slaren and ggerganov June 30, 2024 21:12
@ngxson ngxson marked this pull request as ready for review June 30, 2024 21:12
src/llama.cpp Outdated
Comment on lines 12694 to 12700
if (lctx.model.arch == LLM_ARCH_GEMMA2) {
GGML_ASSERT(lctx.inp_KQ_mask_SWA);
GGML_ASSERT(hparams.n_sliding > 0);
data = (float *) lctx.inp_KQ_mask->data;
data_swa = (float *) lctx.inp_KQ_mask_SWA->data;
// because layer masks are alternate for gemma 2, we only need to take first 2 layers
}
Collaborator

This can be simplified a bit.

Suggested change
if (lctx.model.arch == LLM_ARCH_GEMMA2) {
GGML_ASSERT(lctx.inp_KQ_mask_SWA);
GGML_ASSERT(hparams.n_sliding > 0);
data = (float *) lctx.inp_KQ_mask->data;
data_swa = (float *) lctx.inp_KQ_mask_SWA->data;
// because layer masks are alternate for gemma 2, we only need to take first 2 layers
}
if (lctx.inp_KQ_mask_SWA) {
data_swa = (float *) lctx.inp_KQ_mask_SWA->data;
}

Collaborator

If I am not mistaken, Mistral uses SWA on every layer. So maybe this needs to be separated to allow having only inp_KQ_mask_SWA? Will the same implementation work?

Collaborator Author

I've just looked at the Mistral reference implementation; they seem to use a different mask for each layer. Link: https://github.com/mistralai/mistral-inference/blob/main/src/mistral_inference/cache.py

So I think my previous version (using std::vector) can handle that. Do you think I should revert the change?

It surprises me a bit, since Mistral's quality doesn't seem to degrade even though it's missing SWA (or does it only break after 4096 tokens?)

Collaborator

I have been looking at this code for a while and reviewing the Mistral paper, and I think this is an implementation of the rolling buffer cache rather than sliding window attention. As far as I can tell, Mistral has the same sliding window of 4096 tokens on each layer. Knowing that, it is possible to reduce the size of the KV cache to the sliding window size, but that requires some additional housekeeping so that e.g. RoPE still receives the absolute positions of the tokens, while the data is actually stored at position pos % sliding_window. But maybe I am misunderstanding something; can you point me to the specific code?
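
For illustration, that rolling-buffer idea could look roughly like this (a minimal sketch with hypothetical names, not llama.cpp or Mistral code):

#include <algorithm>
#include <cstdint>
#include <vector>

// Toy rolling KV buffer holding only the last n_swa entries of one layer.
// Each token is written to slot pos % n_swa (overwriting the oldest entry),
// while its absolute position is kept so RoPE / the mask can still use it.
struct rolling_kv {
    uint32_t n_swa;
    size_t   d;                      // per-token K row size in floats
    std::vector<int32_t> cell_pos;   // absolute position stored in each slot, -1 = empty
    std::vector<float>   k;          // flattened K rows, one per slot

    rolling_kv(uint32_t n_swa, size_t d)
        : n_swa(n_swa), d(d), cell_pos(n_swa, -1), k((size_t) n_swa * d) {}

    void write(int32_t pos, const float * k_row) {
        const size_t slot = (size_t) pos % n_swa;
        cell_pos[slot] = pos;
        std::copy(k_row, k_row + d, k.begin() + slot * d);
    }

    // A query at position pos_q may attend to a slot only if the token stored
    // there is still inside the window: 0 <= pos_q - cell_pos[slot] < n_swa.
    bool visible(int32_t pos_q, size_t slot) const {
        const int32_t p = cell_pos[slot];
        return p >= 0 && pos_q >= p && (uint32_t)(pos_q - p) < n_swa;
    }
};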

Owner

Yes, it should be possible. The thing I cannot figure out is how to avoid calling llama_kv_cache_find_slot() per layer; it seems it would be a big waste to do it like this, although it would generalize to support arbitrary per-layer KV cache sizes.

Collaborator Author (@ngxson, Jul 1, 2024)

Yeah, I assume that code is a reference implementation, so not of very good quality. Having a rolling buffer would be ideal for llama.cpp, but that seems like too many changes. This is mostly to answer your earlier question: Will the same implementation work? Yes, it works with a different sliding window mask per layer, but it would waste memory without a rolling buffer.

Collaborator

How would the mask differ in each layer? My understanding is that the mask would be the same for all the layers, and it relies on the fact that the states in the KV cache depend on all the previous tokens to be able to access information beyond the sliding window.

Collaborator Author

I looked deeper into the paper; it seems like I missed something.

Looking at this figure from the Mistral paper:

[figure: sliding window attention diagram]

And the explanation:

[paper excerpt explaining sliding window attention]

I'd assume that the mask for each layer is shifted by the size of window - 1, for example:

  • layer 0: 0, 0, 0, 1, 1
  • layer 1: 0, 0, 1, 1, 0
  • layer 2: 0, 1, 1, 0, 0
  • ...

But then what I don't understand is the phrase "position i of the layer k, h_i, attends to all hidden states from the previous layer with positions between i − W and i". On the surface, it seems to explain how layer 1 knows about the tokens that fall outside of its window (which are in layer 0), but what's not clear to me is how one layer can attend to the previous one.

Also, looking at the HF implementation code, it seems there is no such thing. They just apply the same attention mask to all layers: https://github.com/huggingface/transformers/blob/e65502951593a76844e872fee9c56b805598538a/src/transformers/models/mistral/modeling_mistral.py#L354

Collaborator Author

This can be simplified a bit.

Changed in ed5496f

I think for now we can keep the implementation this way; I'll need more time to figure out how Mistral actually uses SWA.

Contributor

But then what I don't understand is the phrase "position i of the layer k, h_i, attends to all hidden states from the previous layer with positions between i − W and i". On the surface, it seems to explain how layer 1 knows about the tokens that fall outside of its window (which are in layer 0), but what's not clear to me is how one layer can attend to the previous one.

I think it doesn't directly "attend" to the tokens from the previous one. It just receives information about those tokens through the output of the previous layer.

I have also been trying to understand this concept for the past 3 days. I did not pay attention to it when Mistral v1 was released, and I remember seeing that Mistral v2 removed SWA.

@Dampfinchen

Do quants need to be redone again, or is this just on the inference side?

@ngxson
Collaborator Author

ngxson commented Jun 30, 2024

@Dampfinchen re-generating is recommended, but not required. We have a default value for the added metadata, so at least existing GGUFs won't break.

@bartowski1182
Contributor

The only benefit presumably being that long-context imatrix measurements become more accurate?

src/llama.cpp Outdated
@@ -2099,6 +2101,7 @@ struct llama_hparams {
uint32_t n_ff_shexp = 0;
uint32_t n_expert_shared = 0;
float expert_weights_scale = 0.0;
uint32_t n_sliding = 0; // sliding window attention (SWA)
Owner

Suggested change
uint32_t n_sliding = 0; // sliding window attention (SWA)
uint32_t n_swa = 0; // sliding window attention (SWA)

Collaborator Author

Changed in ed5496f

src/llama.cpp Outdated
@@ -2661,6 +2664,9 @@ struct llama_context {
struct ggml_tensor * inp_s_mask; // F32 [1, n_kv]
struct ggml_tensor * inp_s_seq; // I32 [n_kv, n_batch]

// KQ mask per layer, used by sliding window attention (gemma 2)
struct ggml_tensor * inp_KQ_mask_SWA;
Owner

Suggested change
struct ggml_tensor * inp_KQ_mask_SWA;
struct ggml_tensor * inp_KQ_mask_swa;

Collaborator Author

Changed in ed5496f

src/llama.cpp Outdated
float * data = (float *) lctx.inp_KQ_mask->data;
float * data_swa = nullptr;
const llama_pos n_keep_swa = hparams.n_sliding - batch.n_tokens;
Owner

I don't understand the meaning of n_keep_swa. It seems this won't work with batches containing multiple sequences.

Collaborator Author

Yeah, I'm not sure I'm doing it correctly: it is meant to emulate the rolling. If we input n_tokens, then we only keep n_sliding - n_tokens tokens in the cache, so the total number of tokens available for attention is n_tokens + (n_sliding - n_tokens) = n_sliding.

Owner

It seems to me that just restricting the position delta to be less than n_swa is enough:

diff --git a/src/llama.cpp b/src/llama.cpp
index 71b7ef62..fa207234 100644
--- a/src/llama.cpp
+++ b/src/llama.cpp
@@ -12722,7 +12722,7 @@ static void llama_set_inputs(llama_context & lctx, const llama_batch & batch) {
 
                         // may need to cut off old tokens for sliding window
                         if (data_swa) {
-                            if (pos - lctx.kv_self.cells[i].pos > n_keep_swa) {
+                            if (pos - lctx.kv_self.cells[i].pos >= hparams.n_sliding) {
                                 f = -INFINITY;
                             }
                             data_swa[h*(n_kv*n_tokens) + j*n_kv + i] = f;

This way, in SWA layers, the token with position 4096 does not "see" the token with position 0, but does "see" the token at position 1.
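
A quick sanity check of that condition (an illustrative sketch with a hypothetical helper, not code from the PR):

#include <cassert>
#include <cstdint>

// The suggested condition, spelled out: in SWA layers a key at position pos_k
// is masked for a query at position pos_q once the delta reaches the window size.
static bool swa_masked(int32_t pos_q, int32_t pos_k, uint32_t n_swa) {
    return (uint32_t)(pos_q - pos_k) >= n_swa;
}

int main() {
    assert( swa_masked(4096, 0, 4096)); // position 4096 does not "see" position 0
    assert(!swa_masked(4096, 1, 4096)); // but it does "see" position 1
    return 0;
}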

Collaborator Author

OK thanks, that's clear to me now. I changed this code in ed5496f.

@Dampfinchen

The only benefit presumably being that long-context imatrix measurements become more accurate?

I think the point of this PR is that right now the usable context size is effectively fixed at 4K, and this enables sliding window attention so you get accurate results at 8K, so it's very important.

@slaren
Collaborator

slaren commented Jul 1, 2024

Perplexity improved a bit with the latest change.

Final estimate: PPL = 8.9711 +/- 0.07180

@bfroemel

bfroemel commented Jul 1, 2024

Looking really good, but I'm still seeing seemingly degraded performance/quality compared to the AI Studio Gemma 2 model output :/ I am able to test the 27b-it fp16 model locally (same temperature and top-p). Maybe it's just expected degradation, because the original model was bf16?

Here is the same perplexity test for the 27b-it, fp16:

Final estimate: PPL = 7.7068 +/- 0.05720

@ggerganov
Owner

@bfroemel Degraded quality is not expected. Show us the exact commands that you are using; otherwise we mostly ignore such comments, because there are many ways to use the examples incorrectly and in the majority of cases it is user error.

@ggerganov
Owner

Long-term we should refactor the KV cache code to support SWA properly and with less memory. For now we can merge this so that we have Gemma 2 support.

@ngxson
Collaborator Author

ngxson commented Jul 1, 2024

Let's merge once CI passes.

@ngxson ngxson added the merge ready (indicates that this may be ready to merge soon and is just holding out in case of objections) label Jul 1, 2024
@bfroemel

bfroemel commented Jul 1, 2024

@ggerganov At first I thought it was something related to longer context and maybe a bug in the SWA implementation, but looking back at @matteoserva's test, it is as simple as this:

./llama-cli -m /models/gemma-2-27b-it-fp16.gguf  --gpu-layers 100 --host 0.0.0.0 --temp 0 --top-p 0.95 -c 8192 -p "<bos><start_of_turn>user\nMatteo has 20 apples, he buys 20 oranges. Then he discards half of his fruits equally. Then he discards a quarter of his fruits equally between apples and oranges. How many apples remain?<end_of_turn>\n<start_of_turn>model\n"

Locally with llama.cpp and this PR applied, I get the confused answer of 18 apples, while the model on AI Studio correctly answers 8 apples (also at a temperature of 0). Gemma 2 works through these reasoning problems step by step, as @matteoserva already showed, and along the way on llama.cpp it probably confused the two objects (fruits and apples) and ended up with the wrong result.

-> Probably best to open a new issue.

@qnixsynapse
Contributor

qnixsynapse commented Jul 1, 2024

@bfroemel Have you tried it in bf16 instead of fp16?

@bfroemel

bfroemel commented Jul 1, 2024

Ah, of course, I can try this out without offloading. /edit: grr, now I am confusing things. The test is still ongoing. /edit2: same bad result (18 apples), so bf16 vs. fp16 is not the issue.

@ngxson ngxson merged commit 49122a8 into ggerganov:master Jul 1, 2024
54 checks passed
@ngxson
Collaborator Author

ngxson commented Jul 1, 2024

@bfroemel @qnixsynapse @matteoserva I moved the discussion related to generation quality to #8240; could you copy-paste your results there and continue the discussion there? Thank you.

@ggerganov
Owner

@bfroemel You have an extra BOS token in your command. There is no need to add the token explicitly because it is added automatically. Use --verbose-prompt to see the actual tokens.

@bfroemel

bfroemel commented Jul 1, 2024

(@ggerganov I am feeling a bit dumb now :) Thanks for the hint! Indeed, the extra BOS token degrades the model's performance significantly further. With a correct prompt I am at least getting a good apple count for that particular prompt.)

jart pushed a commit to Mozilla-Ocho/llamafile that referenced this pull request Jul 1, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 2, 2024
* gemma2: add sliding window mask

* fix data_swa uninitialized

* better naming

* add co-author

Co-authored-by: Arlo Phoenix <arlo-phoenix@users.noreply.github.com>

* replace list with single tensor

* update

* llama : minor styling

* convert : add sanity check for query_pre_attn_scalar

* fix small typo in README

---------

Co-authored-by: Arlo Phoenix <arlo-phoenix@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Labels: help wanted (Extra attention is needed), merge ready, python (python script changes)

8 participants