allocate one NeighborQueue per search for results #12255

Merged
merged 1 commit into from
May 8, 2023

jbellis
Contributor

@jbellis jbellis commented Apr 29, 2023

As discussed on the hashmap PR, this saves about 1% of graph build time.

@jbellis
Contributor Author

jbellis commented Apr 29, 2023

@msokolov wrote earlier,

I'm not entirely sure why just from inspection, but this seems to have broken some of the backwards-compatibility tests. What it means is this can no longer read indexes written by a prior point release.

How can I reproduce this locally? gradlew test reports all successful.

@jbellis jbellis marked this pull request as draft April 29, 2023 22:04
@jbellis
Contributor Author

jbellis commented Apr 29, 2023

hmm, this is interesting, the hashmap branch passes tests, the neighborqueue branch passes tests, but putting them together does not pass. investigating...

@jbellis
Contributor Author

jbellis commented Apr 29, 2023

^ this is addressed on the hashmap PR now

@jbellis jbellis marked this pull request as ready for review April 29, 2023 22:38
@msokolov
Contributor

msokolov commented May 8, 2023

I think we can close if this has already been addressed, right?

@jbellis
Contributor Author

jbellis commented May 8, 2023

The backwards-compatibility breakage was fixed on the hashmap branch, but this optimization is not included there, so this PR is still needed.

@msokolov msokolov merged commit 9a7efe9 into apache:main May 8, 2023
@msokolov
Contributor

msokolov commented May 8, 2023

Got it, thank you! I think we can later add a CHANGES entry where we track improvements to help with release notes.

@mikemccand
Member

Hello, is it expected that this change alters the KNN hits returned? That's fine (if it was expected) ... Lucene's nightly benchmarks are angry about it though, so I'll just regold if this is OK/expected.

@jbellis
Contributor Author

jbellis commented May 9, 2023

Not expected. How can I run the test locally?

@msokolov
Contributor

msokolov commented May 9, 2023

Those tests are run using a separate package called luceneutil -- see https://github.com/mikemccand/luceneutil

@msokolov
Contributor

msokolov commented May 9, 2023

I wonder if it could have been #12248 that caused the difference?

@jbellis
Contributor Author

jbellis commented May 9, 2023

Downloading now. In the meantime, can you give more details on the failure? Some variance is expected just by the nature of hnsw randomness, but it shouldn't go from e.g. 90% recall to 80%.

@mikemccand
Member

Thanks @jbellis -- I'm not sure this change caused any difference. Something in the past couple days tweaked the KNN results and this one jumped out at me as a possibility.

This is the only detail the nightly benchmark produced:

Traceback (most recent call last):
  File "/l/util.nightly/src/python/nightlyBench.py", line 1818, in <module>
    run()
  File "/l/util.nightly/src/python/nightlyBench.py", line 701, in run
    raise RuntimeError('search result differences: %s' % str(errors))
RuntimeError: search result differences: ["query=KnnFloatVectorQuery:vector[0.028473025,...][100] filter=None sort=None groupField=None hitCount=100: hit 15 has wrong field/score value ([17135768], '0.9621086') vs ([26065483], '0.9620853')", "query=KnnFloatVectorQuery:vector[-0.047548626,...][100] filter=None sort=None groupField=None hitCount=100: hit 0 has wrong field/score value ([20712471], '0.8335463') vs ([15605918], '0.8440397')", "query=KnnFloatVectorQuery:vector[0.02625591,...][100] filter=None sort=None groupField=None hitCount=100: hit 7 has wrong field/score value ([23761647], '0.8285247') vs ([25459412], '0.8309758')"]

Unfortunately it is not so simple to reproduce these nightly benchmarks. Note that they only check for exact hit/score differences and not any precision/recall tradeoff.
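
The difference between an exact hit/score check and a recall measurement can be sketched as follows. This is a minimal illustration, not luceneutil's actual code: `exact_diffs` and `recall_at_k` are hypothetical helpers, and the doc IDs and scores are made up.

```python
def exact_diffs(expected, actual):
    """Return the positions where the (doc_id, score) pair differs."""
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


def recall_at_k(true_neighbors, hits):
    """Fraction of the true nearest neighbors that appear in the returned hits."""
    true_ids = set(true_neighbors)
    returned_ids = {doc_id for doc_id, _ in hits}
    return len(true_ids & returned_ids) / len(true_ids)


# Hypothetical data: the new run finds doc 5, which the old run missed.
old_hits = [(1, 0.99), (2, 0.97), (3, 0.96), (4, 0.90)]
new_hits = [(1, 0.99), (2, 0.97), (5, 0.965), (3, 0.96)]
true_neighbors = [1, 2, 5, 3]

print(exact_diffs(old_hits, new_hits))        # [2, 3]: a strict check reports differences
print(recall_at_k(true_neighbors, old_hits))  # 0.75
print(recall_at_k(true_neighbors, new_hits))  # 1.0
```

A strict comparison flags any change in the hit list, even one that improves recall, which is why a changed result set needs a separate recall check before it is accepted.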

@msokolov
Contributor

msokolov commented May 9, 2023

The idea is that the hnsw randomness should be predictable based on its fixed random seed (42 IIRC). It isn't really a problem if we changed that as long as we have some idea how / why, and as you say the recall remains unchanged or improved. Also, we'd want to make sure that the new normal is stable.
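
As a minimal sketch of what a fixed seed buys: the snippet below draws an HNSW-style level for each node from a seeded generator. It assumes the usual HNSW level formula and a hypothetical helper `assign_levels`; it is not Lucene's builder code, and `max_conn=16` is only an illustrative value.

```python
import math
import random


def assign_levels(num_nodes, seed, max_conn=16):
    """Draw an HNSW-style level for each node from a seeded generator."""
    rng = random.Random(seed)
    ml = 1.0 / math.log(max_conn)
    levels = []
    for _ in range(num_nodes):
        u = 1.0 - rng.random()  # in (0, 1], which avoids log(0)
        levels.append(int(-math.log(u) * ml))
    return levels


run_a = assign_levels(1000, seed=42)
run_b = assign_levels(1000, seed=42)
print(run_a == run_b)  # True: the same seed reproduces the same level assignment
```

With the seed fixed, any change in results between two builds has to come from the code or the data, not from the random draws.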

@jbellis
Contributor Author

jbellis commented May 10, 2023

If the algorithm is implemented correctly, and I think that it is, then in theory the order of neighbor traversal should not matter.

But we are seeing a difference here, so I think what causes that is the limited precision that you get in practice when computing vector similarities. If you have enough vectors, and enough dimensions, then the round off error can accumulate enough to make a difference. That is why the test suite does not surface this difference.
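
The underlying effect is that floating-point addition is not associative, so the order in which terms are accumulated can change the last bits of a result. A minimal Python illustration (Python floats are 64-bit; 32-bit floats have fewer bits and therefore larger relative round-off):

```python
values = [0.1, 0.2, 0.3]

# Same three terms, grouped in two different orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)  # False
print(left_to_right, right_to_left)
```

When two candidates have nearly identical similarity scores, a difference of this size can be enough to decide which one is kept.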

I performed 38 runs of the Texmex SIFT benchmark with known-correct KNN. This resulted in the new code having a very tiny bit better recall on average, with p-value 0.16. My statistics is a bit rusty (it's very rusty) but I believe we're justified in concluding that recall is no worse than before, at least on this test.
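
One way to obtain a p-value from paired recall measurements is a sign-flip permutation test. The sketch below is an assumption about method, not necessarily the test behind the figure above; `paired_permutation_p_value` is a hypothetical helper and the recall numbers are made up for illustration.

```python
import random


def paired_permutation_p_value(new, old, trials=10_000, seed=0):
    """Two-sided p-value for the mean paired difference, via random sign flips."""
    rng = random.Random(seed)
    diffs = [n - o for n, o in zip(new, old)]
    observed = abs(sum(diffs) / len(diffs))
    at_least_as_extreme = 0
    for _ in range(trials):
        # Under the null hypothesis, each difference is equally likely to have either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / trials


# Hypothetical recall values for six paired runs (new code vs. old code).
new_recall = [0.912, 0.905, 0.910, 0.908, 0.911, 0.907]
old_recall = [0.910, 0.906, 0.909, 0.907, 0.910, 0.908]

p = paired_permutation_p_value(new_recall, old_recall)
print(f"p-value: {p:.3f}")
```

A large p-value means the data cannot distinguish the two versions, which supports "no detectable difference" rather than proving the new code is better.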

Test harness is here, google sheet is here and raw data is attached as csv.

The first column is the new code, and the second is the old (git sha 1fa2be9).

combined.csv

@mikemccand
Member

Thanks Jonathan for this detailed analysis -- it looks like the returned hits did change a bit, but recall is a bit better, so it's fine. I'll regold the nightly benchmarks.

@benwtrent
Member

@jbellis would this change affect queries per second? There is a 20% drop in QPS since this commit in the Lucene bench nightlies:

https://home.apache.org/~mikemccand/lucenebench/2023.05.08.18.03.38.html

Looking at the other searches, none slow down this significantly, so it doesn't seem to be the lucene util change.

Did your testing for recall changes show any performance differences on single threaded QPS?

@jbellis
Contributor Author

jbellis commented May 17, 2023

The testing I performed pre-merge showed an improvement, but I'm happy to look closer. Is there a way to get a graph of performance over time out of that tool or is it manually pasting things into a spreadsheet?

@jbellis
Contributor Author

jbellis commented May 17, 2023

Also, is it actually this commit, or is it the HashMap commit 3c16374 ?

@benwtrent
Member

I don't see how it is that commit as the change was only detected after this PR was merged and has persisted since then. Though I may be reading the graph incorrectly.

We should probably test the two commits independently to determine the cause with Lucene util.

@msokolov
Contributor

msokolov commented May 17, 2023

Here is the timeline graph with one sample per day, roughly https://home.apache.org/~mikemccand/lucenebench/VectorSearch.html

Hmm, there was a several-day gap around there, I think because the results changed and this breaks the benchmark tool. I suspect @mikemccand eventually decided that was benign and overrode its controls by imposing a "new normal".

@benwtrent
Member

Also, is it actually this commit, or is it the HashMap commit 3c16374 ?

@jbellis that change seems to only be for the writer and the OnHeap vectors, neither of which is accessed during search, unless you think the graph building changed drastically for that commit and thus somehow reduced the queries per second.

@jbellis
Contributor Author

jbellis commented May 17, 2023

image

^ This is almost the entire diff, there is also the same change for byte[] and then private void searchLevel takes the NeighborQueue as a parameter.

The differences that I see are

  • we have a redundant call to clear() now for the very first time that results is used, but that's just setting size variables to 0
  • If we return early while traversing the index levels because we hit visitedLimit, the new code allocates the full NQ of topK instead of just 1.

(I also see that Builder isn't taking advantage of the new API, so that's a missed opportunity on my part but I assume QPS isn't measuring build time.)

It's hard for me to tell what code path the tester is exercising, but if it is specifying a low-ish visitedLimit then that would explain the regression. (It would also mean that the tester should expect pretty bad recall since it's not actually getting to the bottom level of the graph.)

I tried pushing to the branch for this PR but it didn't seem to do anything, so I opened a new PR that addresses the Builder and the visitedLimit issues at #12303.

P.S. I also notice that the byte[] version is missing these lines from the float[] version, not sure if that's intended or not

    int initialEp = graph.entryNode();
    if (initialEp == -1) {
      return new NeighborQueue(1, true);
    }

@mikemccand
Member

Hmm, curiously, clicking through the datapoint with the ~20% QPS drop, these were the changes to Lucene, which is just this PR. Could that have caused this drop? Is VectorQuery really running concurrently in the nightly benchmark?

@benwtrent
Member

So, using lucene util, I have been comparing this PR's commit vs. the one previous. I am consistently getting lower QPS with this PR's change but not near the 20% slow down (only around 3%, so could be system noise...).

Here are all the commits between the last good run and the one recorded with the significant slow down: 397c2e5...223e28e

I will try including Luca's commit to see if that increases the QPS discrepancy.

@benwtrent
Member

Well, with just a single thread, Luca's commit changed nothing in my local tests.

I do seem to have a single threaded slow down (though minor) due to this change.

@mikemccand how can I determine which parameters the vector search task used when querying? Searching in Lucene Util for -concurrentSearches and -searchThreadCount yields few results and none are helpful.

Any additional help with troubleshooting this weird slowdown is welcome :).

@jbellis
Contributor Author

jbellis commented May 17, 2023

Can you check the performance with the changes at #12303 ?

@msokolov
Contributor

I see that constants.SEARCH_NUM_THREADS=2 and this is what is passed to Competitor() by default as numThreads, and is then passed to perf.SearchPerfTest as -searchThreadCount, and eventually used in TaskThreads to run multiple tasks concurrently. SearchPerfTest creates a thread pool for use by IndexSearcher if -concurrentSearches was passed, but ... this is False by default and doesn't seem to get set anywhere. So ... just repeating the same analysis that @benwtrent did I don't see that luceneutil is running concurrent searching (per query), just running two distinct queries ("tasks") simultaneously.

@msokolov
Contributor

One thing that's confusing here is that luceneutil reports "commits since last successful run" relative to a 5/10 run, but there is no graph point showing that run. The last successful run according to the graph was on 5/7 and there were a bunch of other commits between that one and the 5/10 run. Including this one:

commit 9a7efe92c0b02ac161633c57a5f357e6a9002367
Author: Jonathan Ellis <jbellis@datastax.com>
Date:   Mon May 8 16:22:58 2023 -0500

    allocate one NeighborQueue per search for results (#12255)

so I think the change to slice executor is a red herring and we are just missing some datapoints on the graph

@msokolov
Contributor

I tried running luceneutil before/after this change using this command:

comp = competition.Competition()

index = comp.newIndex('baseline', sourceData,
                      vectorFile=constants.GLOVE_VECTOR_DOCS_FILE,
                      vectorDimension=100,
                      vectorEncoding='FLOAT32')

comp.competitor('baseline', 'baseline',
                vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
                index=index, concurrentSearches=concurrentSearches)

comp.competitor('candidate', 'candidate',
                vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
                index=index, concurrentSearches=concurrentSearches)

comp.benchmark("baseline_vs_candidate")

and I get this error:

  File "src/python/vector-test.py", line 65, in <module>
    comp.benchmark("baseline_vs_candidate")
  File "/local/home/sokolovm/workspace/lbench/luceneutil/src/python/competition.py", line 510, in benchmark
    searchBench.run(id, base, challenger,
  File "/local/home/sokolovm/workspace/lbench/luceneutil/src/python/searchBench.py", line 196, in run
    raise RuntimeError('errors occurred: %s' % str(cmpDiffs))
RuntimeError: errors occurred: ([], ["query=KnnFloatVectorQuery:vector[0.0223385,...][100] filter=None sort=None groupField=None hitCount=100: hit 51 has wrong field/score value ([994765], '0.9567487') vs ([824922], '0.9567554')", "query=KnnFloatVectorQuery:vector[-0.061654933,...][100] filter=None sort=None groupField=None hitCount=100: hit 16 has wrong field/score value ([813187], '0.87024546') vs ([134050], '0.8707979')", "query=KnnFloatVectorQuery:vector[-0.111742884,...][100] filter=None sort=None groupField=None hitCount=100: hit 27 has wrong field/score value ([724125], '0.8874463') vs ([817731], '0.88757277')"], 1.0)

maybe it's expected that we changed the results? I think this is what Mike M ran into with the nightly benchmarks

@msokolov
Contributor

Re-running with

comp =  competition.Competition(verifyScores=False)

lets the benchmark complete without errors and I get this result:

Task               QPS baseline   StdDev  QPS candidate   StdDev  Pct diff                  p-value
AndHighHighVector        182.88  (15.7%)         150.83  (14.8%)  -17.5% ( -41% -   15%)    0.000
LowTermVector            260.54  (16.5%)         215.76  (14.0%)  -17.2% ( -40% -   15%)    0.000
AndHighLowVector         201.95  (17.9%)         167.98  (14.3%)  -16.8% ( -41% -   18%)    0.001
AndHighMedVector         208.19  (16.8%)         176.03  (14.4%)  -15.4% ( -39% -   18%)    0.002
MedTermVector            267.48  (17.3%)         232.24  (16.9%)  -13.2% ( -40% -   25%)    0.015
HighTermVector           179.05  (17.4%)         168.01  (16.6%)   -6.2% ( -34% -   33%)    0.251
PKLookup                 195.93   (3.0%)         198.82   (3.1%)    1.5% (  -4% -    7%)    0.126

I don't understand why, but it seems to me we ought to revert this

@mikemccand
Member

@mikemccand how can I determine which parameters the vector search task used when querying? Searching in Lucene Util for -concurrentSearches and -searchThreadCount yields few results and none are helpful.

The nightly benchmarks "just" run src/python/nightlyBench.py from the luceneutil repo.

so I think the change to slice executor is a red herring and we are just missing some datapoints on the graph

Eek -- I'll dig into why the nightly chart is misleading us. Maybe the "last successful run" logic is buggy.

@mikemccand
Member

Note that you can click & drag to zoom into the nightly chart! Very helpful when trying to isolate specific nights' builds!

@mikemccand
Member

so I think the change to slice executor is a red herring and we are just missing some datapoints on the graph

Eek -- I'll dig into why the nightly chart is misleading us. Maybe the "last successful run" logic is buggy.

OK indeed there was a bug in this logic! I pushed a possible fix. Unfortunately it will not retroactively correct the past bug's effect, but going forward, new nightly builds should be fixed. I'll watch the next few nightlies to confirm. Thanks for catching this @msokolov!

@mikemccand
Member

I don't understand why, but it seems to me we ought to revert this

+1 -- let's revert for now and then try to understand the performance regression offline.

msokolov pushed a commit that referenced this pull request May 18, 2023
@msokolov
Contributor

ok, I reverted. Maybe we can scratch our heads and learn something by understanding what the difference was

@msokolov
Contributor

One thing I noticed is that NeighborQueue.clear() does not reset incomplete. I don't think that is causing an issue here, but we ought to fix it.

@msokolov
Contributor

OK I think I see the problem -- when we search the upper levels of the graph we do so using topK=1. This initializes the NeighborQueue to have "initialSize=1" and therefore its LongHeap is created with maxSize=1, and this is never changed when we clear() these data structures. But when we search the bottom layer of the graph we want to use topK=topK as passed by the caller.

So I guess we could re-use the same NeighborQueue for the upper graph layers, but we really do need to set the maxSize of the heap larger for the bottom layer, and it seems as if we might as well just create a new heap.

I guess some of the confusion arises because this data structure is sometimes used in a fixed-size way (we only insert nodes using insertWithOverflow into the results) and in other usages we allow it to grow without bound (we insert nodes into the candidate queue using add).
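
The two usage modes, and the effect of clear() leaving maxSize untouched, can be modelled with a toy heap. This is a Python sketch built on heapq under the assumptions stated in the comment above; `BoundedMinHeap` is a hypothetical class, not Lucene's LongHeap or NeighborQueue.

```python
import heapq


class BoundedMinHeap:
    """Toy model of a min-heap whose max_size is fixed at construction."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._items = []

    def push(self, value):
        # Unbounded insertion, as used for the candidate queue.
        heapq.heappush(self._items, value)

    def insert_with_overflow(self, value):
        # Bounded insertion, as used for results: once full, keep only the largest values.
        if len(self._items) >= self.max_size:
            if value < self._items[0]:
                return False
            heapq.heapreplace(self._items, value)
            return True
        heapq.heappush(self._items, value)
        return True

    def clear(self):
        # Empties the heap but leaves max_size as it was.
        self._items.clear()

    def __len__(self):
        return len(self._items)


results = BoundedMinHeap(max_size=1)  # sized for an upper level (topK=1)
for score in [0.3, 0.9, 0.5]:
    results.insert_with_overflow(score)

results.clear()                       # reused for the bottom level
for score in [0.3, 0.9, 0.5]:
    results.insert_with_overflow(score)
print(len(results))                   # 1: still capped by the original max_size
```

In this model a reused heap keeps whatever capacity it was created with, so reuse across levels only works if the capacity is also reset or the heap is sized for the largest level up front.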

@tang-hi
Contributor

tang-hi commented May 18, 2023

I've noticed that we create a NeighborQueue with an initialSize set to topK. For instance, if topK is 100, the maximum size of LongHeap is also 100. However, when we execute the searchLevel method for any layer other than 0, the maximum size of LongHeap in the original version is only 1.

In the searchLevel method, we attempt to insert qualifying neighbors. LongHeap will push an element only if its size is less than maxSize.

// Call site in searchLevel:
if (results.insertWithOverflow(friendOrd, friendSimilarity) && results.size() >= topK) {

// Body of LongHeap.insertWithOverflow:
if (size >= maxSize) {
  if (value < heap[1]) {
    return false;
  }
  updateTop(value);
  return true;
}
push(value);
return true;

So, if maxSize is 100, it will store up to 100 points. However, if maxSize is 1, it will only store 1 point.

When we try to return the result, we pop elements from the heap. If maxSize is 100, it will pop 99 elements when the layer is not 0. This could potentially be a time-consuming operation. In contrast, the original version does not pop any elements; it simply returns the result.

Here is the relevant code from the Apache Lucene repository:

while (results.size() > topK) {
  results.pop();
}

In summary, the performance degradation could be due to the increased number of pop operations when the maximum size of the heap is larger than 1.
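
The extra work can be illustrated by counting pops in a toy model. The sketch below uses Python's heapq rather than Lucene's code, `search_upper_level` is a hypothetical helper, and the scores are random numbers; a real graph search would also visit different nodes depending on the heap size, which this model ignores.

```python
import heapq
import random


def search_upper_level(scores, heap_max_size, top_k=1):
    """Collect scores into a bounded min-heap, then trim to top_k; return (best, pops)."""
    heap = []
    for s in scores:
        if len(heap) >= heap_max_size:
            if s >= heap[0]:
                heapq.heapreplace(heap, s)
        else:
            heapq.heappush(heap, s)
    pops = 0
    while len(heap) > top_k:
        heapq.heappop(heap)
        pops += 1
    return heap[0], pops


rng = random.Random(42)
scores = [rng.random() for _ in range(500)]

best_small, pops_small = search_upper_level(scores, heap_max_size=1)
best_large, pops_large = search_upper_level(scores, heap_max_size=100)
print(pops_small, pops_large)    # 0 99
print(best_small == best_large)  # True: both keep the single best score
```

Both heap sizes return the same best entry for an upper level, but the larger heap pays for 99 pops on every level above 0.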

@tang-hi
Contributor

tang-hi commented May 18, 2023

But I'm still trying to run the luceneutil benchmark to test my hypothesis.

@tang-hi
Contributor

tang-hi commented May 18, 2023


@msokolov I tried to run luceneutil using the command you provided, but it throws this exception:

Exception in thread "main" java.lang.IllegalArgumentException: facetDim Date was not indexed
	at perf.TaskParser$TaskBuilder.parseFacets(TaskParser.java:289)
	at perf.TaskParser$TaskBuilder.buildQueryTask(TaskParser.java:154)
	at perf.TaskParser$TaskBuilder.build(TaskParser.java:147)
	at perf.TaskParser.parseOneTask(TaskParser.java:108)
	at perf.LocalTaskSource.loadTasks(LocalTaskSource.java:169)
	at perf.LocalTaskSource.<init>(LocalTaskSource.java:48)
	at perf.SearchPerfTest._main(SearchPerfTest.java:543)
	at perf.SearchPerfTest.main(SearchPerfTest.java:133)

my vector-test.py looks like

#!/usr/bin/env python

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import competition
import sys
import constants

# simple example that runs benchmark with WIKI_MEDIUM source and task files
# Baseline here is ../lucene_baseline versus ../lucene_candidate
if __name__ == '__main__':
    #sourceData = competition.sourceData('wikivector1m')
    #sourceData = competition.sourceData('wikivector10k')
    sourceData = competition.sourceData('wikimedium10k')
    comp = competition.Competition(verifyScores=False)

    index = comp.newIndex('baseline', sourceData,
                          vectorFile=constants.GLOVE_VECTOR_DOCS_FILE,
                          vectorDimension=100,
                          vectorEncoding='FLOAT32')
    # Warning -- Do not break the order of arguments
    # TODO -- Fix the following by using argparser
    concurrentSearches = True

    # create a competitor named baseline with sources in the ../trunk folder
    comp.competitor('baseline', 'baseline',
                    vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
                    index=index, concurrentSearches=concurrentSearches)

    comp.competitor('candidate', 'candidate',
                    vectorDict=constants.GLOVE_WORD_VECTORS_FILE,
                    index=index, concurrentSearches=concurrentSearches)
    # use a different index
    # create a competitor named my_modified_version with sources in the ../patch folder
    # note that we haven't specified an index here, luceneutil will automatically use the index from the base competitor for searching
    # while the codec that is used for running this competitor is taken from this competitor.
    # start the benchmark - this can take long depending on your index and machines
    comp.benchmark("baseline_vs_candidate")

Could you tell me how I can fix that?

@jbellis
Contributor Author

jbellis commented May 18, 2023

@tang-hi you're right, that explains the discrepancy. The change at #12303 should fix that

@msokolov
Contributor

@tang-hi you need to switch the sourceData to use wikivector1m - not sure why the script got left that way. The sourceData defines not only the data (documents) but also the tasks that are run. We want to exclude the faceting-related tasks. Also you will get more reliable results from a larger index (1m docs)

@tang-hi
Contributor

tang-hi commented May 19, 2023

@msokolov, thank you! I have successfully run the test and it confirms what I mentioned earlier. I believe that #12303 by @jbellis could resolve this issue.

baseline (this pull request) compared to the candidate version (version prior to this pull request).
Screenshot from 2023-05-19 22-29-16

baseline (version prior to this pull request) compared to the candidate version (the fix for this PR)
Screenshot from 2023-05-19 22-41-10

@msokolov
Contributor

Thanks everyone for testing and fixing. I had reverted this yesterday and I believe what we have on main now has recovered the performance we had before. I also ran luceneutil a few times and wasn't able to observe any change from clear() vs creating new NeighborQueue.

@jbellis
Contributor Author

jbellis commented May 19, 2023

The performance impact on building is more meaningful because that is where you are allocating large queues for multiple levels.
