Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating processes using the Active MQ interface #6183

Merged
merged 20 commits into from
Nov 14, 2024

Conversation

matthias-ronge
Copy link
Collaborator

You can use this function to create processes using the Active MQ interface. Examples:

Without a catalog (all data is passed):

import java.util.*;
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class Main {
    public static void main(String[] args) { try {

        // Serververbindung
        Connection connection = new ActiveMQConnectionFactory("tcp://localhost:61616").createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue("KitodoProduction.CreateNewProcesses.Queue");
        MessageProducer producer = session.createProducer(destination);
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

        // Daten
        String processTitle = "TestAMQ";

        MapMessage message = session.createMapMessage();
        message.setString("id", processTitle); // "Betreff" (für Logging)

        message.setInt("project", 1); // Projekt-ID
        message.setInt("template", 3); // Produktionsvorlagen-ID
        message.setString("title", processTitle); // Vorgangstitel
        // message.setInt("parent", 14); // Elternvorgang

        Map<String, Object> metadata = new HashMap<>();
        metadata.put("docType", "Monograph"); // DocType

        List<String> collections = Arrays.asList("Drucke", "Varia.Drucke"); // Kollektionen
        metadata.put("singleDigCollection", collections);

        Map<String, Object> author = new HashMap<>(); // Metadatengruppe
        author.put("RoleCode", "aut");
        author.put("FirstName", "Max");
        author.put("LastName", "Mustermann");
        metadata.put("Person", author);

        message.setObject("metadata", metadata);

        // Daten senden
        producer.send(message);

        // Ende
        session.close();
        connection.close();

    } catch (Exception e) { e.printStackTrace(); }}
}

Or, use a catalog import:

import java.util.*;
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class Main {
    public static void main(String[] args) { try {

        // Serververbindung
        Connection connection = new ActiveMQConnectionFactory("tcp://localhost:61616").createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue("KitodoProduction.CreateNewProcesses.Queue");
        MessageProducer producer = session.createProducer(destination);
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

        // Daten
        String katKey = "1752355024";

        MapMessage message = session.createMapMessage();
        message.setString("id", katKey); // "Betreff" (für Logging)

        message.setInt("project", 1); // Projekt-ID
        message.setInt("template", 3); // Produktionsvorlagen-ID

        // Import-Einstellungen
        Map<String, Object> import1 = new HashMap<>();
        import1.put("importconfiguration", 1);
        import1.put("value", katKey);

        message.setObject("import", Arrays.asList(import1));

        // Daten senden
        producer.send(message);

        // Ende
        session.close();
        connection.close();

    } catch (Exception e) { e.printStackTrace(); }}
}

@matthias-ronge
Copy link
Collaborator Author

Metadata groups are supported. See also in the “send class” above:

Map<String, Object> author = new HashMap<>();
author.put("RoleCode", "aut");
author.put("FirstName", "Max");
author.put("LastName", "Mustermann");
metadata.put("Person", author);

@aetherfaerber
Copy link

We have already tested this branch several times and it works well for us so far.

@solth what would be needed for this to be merged?

@solth
Copy link
Member

solth commented Sep 23, 2024

@solth what would be needed for this to be merged?

For starters, there are conflicts that need to be resolved before this pull request can be merged. Additionally, it is a draft, so it seems @matthias-ronge is still working on it and hasn't deemed it ready for review, yet, himself.

@aetherfaerber
Copy link

aetherfaerber commented Sep 27, 2024

I must limit my previous statement to some extent.

We had tested creating processes by importing metadata froam a catalog. It works fine. We have now tested creating a process and passing its metadata vie ActiveMQ and have not been able to get it to work. We used the code given above but the only metadata field that was not empty aftewards was DocType. Of course we adapted the example to include only metadata fields that we have defined in our ruleset.
Could this be caused by our ruleset or does the functionality not work properly?

Also @matthias-ronge could you please comment on the previous statement? Is this a draft or a real PR? Can you add documentation and resolve the merge conflicts?

@matthias-ronge
Copy link
Collaborator Author

I had initially set the pull request as a draft to see if it would be sufficient to all your testings as well. Then I forgot to change it.

I will resolve the merge conflicts and I will take a closer look to the metadata, to see if this is a bug in the program or in the example code. Expect my reply soon.

@aetherfaerber
Copy link

aetherfaerber commented Oct 2, 2024

I will resolve the merge conflicts and I will take a closer look to the metadata, to see if this is a bug in the program or in the example code. Expect my reply soon.

Thank you for looking into it. FWIW, here is the sample code we used, adapted to our ruleset:

import java.util.*;
import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class Main {
    public static void main(String[] args) { try {

        // Serververbindung
        Connection connection = new ActiveMQConnectionFactory("tcp://localhost:61616").createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination destination = session.createQueue("KitodoProduction.CreateNewProcesses.Queue");
        MessageProducer producer = session.createProducer(destination);
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

        // Daten
        String processTitle = "TestAMQ9";

        MapMessage message = session.createMapMessage();
        message.setString("id", processTitle); // "Betreff" (für Logging)

        message.setInt("project", 2); // Projekt-ID
        message.setInt("template", 1); // Produktionsvorlagen-ID
        message.setString("title", processTitle); // Vorgangstitel

        Map<String, Object> metadata = new HashMap<>();
        metadata.put("document_type", "ThematicFile");
        metadata.put("delivery", "Lieferung42"); 
        metadata.put("archiveName", "archive42"); 
        metadata.put("ArcinsysID", "iid42");
        metadata.put("stockID", "stock42");
        metadata.put("stockUnitID", "stockunit42");
        metadata.put("unitID", "unit42");

        message.setObject("metadata", metadata);

        // Daten senden
        producer.send(message);

        // Ende
        session.close();
        connection.close();

    } catch (Exception e) { e.printStackTrace(); }}
}

The created process has no metadata besides the doctype.

@matthias-ronge
Copy link
Collaborator Author

I was able to reproduce the error. The code forgot to specify the location of the metadata storage, so when saving, it didn't know where to put it, and discarded it. I have now fixed that; the metadata is currently always stored in dmdSec, others are not yet possible.

@matthias-ronge matthias-ronge marked this pull request as ready for review October 2, 2024 11:37
@solth
Copy link
Member

solth commented Oct 8, 2024

@aetherfaerber would you or a colleague of yours be willing and confident to review this pull request? That includes checking the code as well as testing the functionality (which you already did, if I am not mistaken).

@aetherfaerber
Copy link

@aetherfaerber would you or a colleague of yours be willing and confident to review this pull request? That includes checking the code as well as testing the functionality (which you already did, if I am not mistaken).

Yes, we already tested the functionality and can also confirm that the fix to the problem I reported earlier works well. Everything works as expected.

While investigating the error we already had a brief look at the code. We can also try to check it but are not sure what this should include. We can offer to go through it and check for unclear variable names, missing comments, possible unrelated changes and the like. An in-depth analysis on code architecture/efficiency/standards is probably out of our league here.
What is expected?

Do the CQ-Pipelines in this repository already check for conformance to the guidelines at https://www.kitodo.org/fileadmin/groups/kitodo/Dokumente/Kitodo-EntwicklerLeitfaden_2017-06.pdf ?

Copy link

@aetherfaerber aetherfaerber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have read through the code changes and had a close look at various important aspects. We approve the changes.

@solth
Copy link
Member

solth commented Nov 7, 2024

@aetherfaerber thank you very much for your review. And sorry I forgot to reply to your question earlier. Those are the correct, current coding guidelines you mentioned, even though they are quite old and long overdue for reevaluation.

@solth
Copy link
Member

solth commented Nov 7, 2024

@BartChris does this pull request resolve #4837 for you?

Comment on lines +67 to +69
* process must be found in the client’s processes. If no parent process is
* specified, but a metadata entry with a use="higherLevelIdentifier" is
* included in the data from the catalog, the parent process is searched for
Copy link
Collaborator

@BartChris BartChris Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit:

I think i read the code wrong. We are probably always searching for a parent (even if important level is set to 1), sorry.

This is happening here:

// always try to find a parent for last imported process (e.g. level ==
// importDepth) in the database!
if (Objects.nonNull(parentID) && level == importDepth) {
checkForParent(parentID, template.getRuleset(), projectId);
}
.

However i am still wondering if we actually making the link to the parent process during the ActiveMQ import if we only provide metadata with a higherlevelidentifier. The parent process is searched while calling the importProcessHierarchy logic but i am not seeing the place in the ActiveMQ code, where we actually use the identified parent to establish the connection between child and the existing parent. I think this is missing from the code.

---outdated:
Is it really enough to provide a metadata import with a higherLevelIdentifier? In case a import configuration is provided, the import is explicitely called with an import level of 1 (IMPORT_WITHOUT_ANY_HIERARCHY):

 List<TempProcess> processHierarchy = importService.importProcessHierarchy(
                order.getImports().get(which).getValue(), order.getImports().get(which).getKey(),
                order.getProjectId(), order.getTemplateId(), IMPORT_WITHOUT_ANY_HIERARCHY,
                rulesetManagement.getFunctionalKeys(FunctionalMetadata.HIGHERLEVEL_IDENTIFIER));

This leads to the code for loading the parent to actually never be called.
https://github.com/kitodo/kitodo-production/blob/master/Kitodo/src/main/java/org/kitodo/production/services/data/ImportService.java#L552-L557

So we are not creating any parents as described in the spec, but we are also not searching for any parent by the recordIdentifier.

Copy link
Collaborator

@BartChris BartChris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this is a great addition to have. I only read through the code as @aetherfaerber alreaded tested the functionality and i had problems getting the suggested Java code for creating processes running. I would opt to merge this in, as it is useful either way, but i have some remarks.

  1. As noted in my comment i am not really sure, wether it is enough to just have the metadata xml imported with one key serving as higherlevelIdentifier to establish a connection to an (existing) parent
    This is also specified by the issue description:

If no parent process is specified, but a metadata entry with a use="higherLevelIdentifier" is included in the data from the catalog, the parent process is searched for using the metadata entry with use="recordIdentifier". It must already exist for the client. No parent process is implicitly created. The child process is added at the last position in the parent process.

The ImportService has the parent process stored in the variable parentTempProcess, but i am not seeing that we are making use of it in the logic for the ActiveMQ import. Based on the code, the connection to the parent is only made, if the client provides explicitely the parentid.
Or am i overlooking something and it works, when just providing the higherlevelIdentifier in the xml? Have you tested that @aetherfaerber?

  1. We should probably strive to add an end to end test here to check wether it is possible to create the processes as expected using ActiveMQ. Or maybe at least have a test to check wether the data is provided in the right way.

  2. In the Import Service class we already have a method "importProcess" which is used to import processes using CSV. I would like to see, that we duplicate less logic between the ActiveMQ-Processor and the ImportService as i see the CSV import and the ActiveMQ import as very similar. They enable to create processes by providing data, just using different interfaces. So i would prefer if we can - maybe at a later point in time - let the ActiveMQ process reuse the importprocess-method.

What i would like to see in general (not necessarily as part of this PR) is, that the Processor has a way to return the created Kitodo process ID via the ActiveMQ response. Otherwise external systems have a way to create a Kitodo process but have no knowledge on the internal ID used for this process by Kitodo (Which might be useful for further interaction).

@solth
Copy link
Member

solth commented Nov 14, 2024

I agree with @BartChris that we should try to avoid code and method duplication as much as possible. Unfortunately, the various ways to import metadata into Kitodo has already resulted in quite some redundant code duplicates. I think we need a focused effort to consolidate all the different parts of the import code, but that should be dealt with in a separate project.

I also agree that it would be very helpful for external applications if the interface returned the ID of created processes to keep track of the Kitodo processes in the application and to have a way to reference the correct processes in subsequent interactions. I would propose, though, to see that as an incremental expansion of this first version of the function that can be implemented later.

However, the absence of tests is a problem. Since we want this function in Kitodo 3.8 I would agree to merge the PR as it's functionality has been tested and is working, but @matthias-ronge should supply tests via a separate pull request at a later stage.

@solth solth merged commit 6b86067 into kitodo:master Nov 14, 2024
5 checks passed
@henning-gerhardt
Copy link
Collaborator

henning-gerhardt commented Nov 14, 2024

Even this new way must be documented in the developers part of the wiki which is not done yet. The latest changes of the ActiveMQ part in the wiki are missing which must be avoided as no one could use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants