Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix intermittent IT test failures for HarvestingClientsIT #10449

Merged
merged 12 commits into from
Apr 12, 2024
41 changes: 29 additions & 12 deletions src/test/java/edu/harvard/iq/dataverse/api/HarvestingClientsIT.java
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
package edu.harvard.iq.dataverse.api;

import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;
import java.util.stream.Collectors;

import io.restassured.RestAssured;
import static io.restassured.RestAssured.given;
Expand Down Expand Up @@ -41,6 +44,7 @@ public class HarvestingClientsIT {
private static String adminUserAPIKey;
private static String harvestCollectionAlias;
String clientApiPath = null;
List<String> globalIdList = new ArrayList();

@BeforeAll
public static void setUpClass() {
Expand All @@ -54,14 +58,25 @@ public static void setUpClass() {

}
@AfterEach
public void cleanup() {
public void cleanup() throws InterruptedException {
if (clientApiPath != null) {
Response deleteResponse = given()
.header(UtilIT.API_TOKEN_HTTP_HEADER, adminUserAPIKey)
.delete(clientApiPath);
clientApiPath = null;
System.out.println("deleteResponse.getStatusCode(): " + deleteResponse.getStatusCode());

int i = 0;
int maxWait = 20;
String query = "dsPersistentId:" + globalIdList.stream().map(s -> "\""+s+"\"").collect(Collectors.joining(","));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this necessary, btw - what was the problem with "metadataSource:" + nickName?

Doesn't matter really, as long as it works. But I can't help but feel like there's too much going on in this test class, vs. its useful value. (Strictly speaking, there is no need to harvest the first 7 datasets twice in a row; that second set could just contain the 1 dataset with the CVV, maybe?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still mildly curious why it was necessary to search by individual dataset ids here.

do {
if (UtilIT.search(query, normalUserAPIKey).prettyPrint().contains("count_in_response\": 0")) {
break;
}
Thread.sleep(1000L);
} while (i++ < maxWait);
}
globalIdList.clear();
}

private static void setupUsers() {
Expand Down Expand Up @@ -225,11 +240,11 @@ private void harvestingClientRun(boolean allowHarvestingMissingCVV) throws Inte
int i = 0;
int maxWait=20; // a very conservative interval; this harvest has no business taking this long
do {
// Give it an initial 1 sec. delay, to make sure the client state
// Give it an initial 2 sec. delay, to make sure the client state
// has been updated in the database, which can take some appreciable
// amount of time on a heavily-loaded server running a full suite of
// tests:
Thread.sleep(1000L);
Thread.sleep(2000L);
// keep checking the status of the client with the GET api:
Response getClientResponse = given()
.get(clientApiPath);
Expand Down Expand Up @@ -271,19 +286,21 @@ private void harvestingClientRun(boolean allowHarvestingMissingCVV) throws Inte

System.out.println("Waited " + i + " seconds for the harvest to complete.");

// Let's give the asynchronous indexing an extra sec. to finish:
Thread.sleep(1000L);
Response searchHarvestedDatasets = UtilIT.search("metadataSource:" + nickName, normalUserAPIKey);
searchHarvestedDatasets.then().assertThat().statusCode(OK.getStatusCode());
searchHarvestedDatasets.prettyPrint();
searchHarvestedDatasets.then().assertThat()
.statusCode(OK.getStatusCode())
.body("data.total_count", equalTo(expectedNumberOfSetsHarvested));
// Get all global ids for cleanup
JsonPath jsonPath = searchHarvestedDatasets.getBody().jsonPath();
int sz = jsonPath.getInt("data.items.size()");
for(int idx = 0; idx < sz; idx++) {
globalIdList.add(jsonPath.getString("data.items["+idx+"].global_id"));
}
// verify count after collecting global ids
assertEquals(expectedNumberOfSetsHarvested, jsonPath.getInt("data.total_count"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenwinship
The new intermittent error, the one introduced in #10464 the other day, is still here - the last Jenkins build has just failed again on this line with expected: <8> but was: <2>. Just a matter of calling the search engine while the new datasets may still be getting indexed.

Not your fault, strictly speaking, but since we are already working on this test, may as well. Let me just add a second of sleep here and merge it; I'm sure that's all it needs.


// Fail if it hasn't completed in maxWait seconds
assertTrue(i < maxWait);

// TODO(?) use the native Dataverses/Datasets apis to verify that the expected
// datasets have been harvested. This may or may not be necessary, seeing
// how we have already confirmed the number of successfully harvested
// datasets from the control set; somewhat hard to imagine a practical
// situation where that would not be enough (?).
}
}
Loading