Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

638 Strange behaviour of CUDS remove #659

Merged
merged 5 commits into from
Jun 24, 2021

Conversation

kysrpex
Copy link
Contributor

@kysrpex kysrpex commented Jun 16, 2021

Not putting the wrapper in the added buffer solves the problem.

@kysrpex kysrpex self-assigned this Jun 16, 2021
@kysrpex kysrpex linked an issue Jun 16, 2021 that may be closed by this pull request
kysrpex and others added 4 commits June 16, 2021 09:32
… baked in the unit tests. Notice how when `wrapper.add` is used, the wrapper is expected in the updated buffer instead of the added buffer. When `wrapper.add` is not used, the wrapper is not expected in neither buffer.
@kysrpex
Copy link
Contributor Author

kysrpex commented Jun 16, 2021

I had a closer look at the buffer mechanism, and I would say that indeed as I believe it makes no sense. Take a look at this file. It contains the only example in the OSP-core code of what being in each buffer actually means. I am looking at the version before I introduced the buffer performance updates to make sure that the bug is there as well, as this bug is 20-25 days old.

The get_triples method is also relevant.

    def get_triples(self, include_neighbor_types=False):
        """Get the triples of the cuds object."""
        o_set = set()
        for s, p, o in self._graph.triples((self.iri, None, None)):
            yield s, p, o
            o_set.add(o)
        if include_neighbor_types:
            for o in o_set:
                yield from self._graph.triples((o, RDF.type, None))

Added buffer (link)

    def _apply_added(self, root_obj, buffer):
        # Perform the SQL-Statements to add the elements
        # in the buffers to the DB.

        for added in buffer.values():
            triples = self._substitute_root_iri(added.get_triples())
            self._add(*triples)

In plain English, for every CUDS object in the added buffer, get all the triples (from memory, the CUDS only exists in memory) that have the CUDS IRI as subject. Put them in the backend.

In even simpler English: copy the CUDS from the memory to the backend.

Updated buffer (link)

    def _apply_updated(self, root_obj, buffer):
        # Perform the SQL-Statements to update the elements
        # in the buffers in the DB.
        for updated in buffer.values():
            pattern = (updated.iri, None, None)
            self._remove(next(self._substitute_root_iri([pattern])))
            triples = self._substitute_root_iri(updated.get_triples())
            self._add(*triples)

In plain English, for every CUDS object in the added buffer, remove all triples that have the CUDS IRI as subject from the backend. Then get all the triples (from memory, the CUDS only exists in memory) that have the CUDS iri as subject. Put them in the backend.

In even simpler English: remove the CUDS from the backend. Copy the CUDS from the memory to the backend.

Deleted buffer (link)

    def _apply_deleted(self, root_obj, buffer):
        # Perform the SQL-Statements to delete the elements
        # in the buffers in the DB.
        for deleted in buffer.values():
            pattern = (deleted.iri, None, None)
            self._remove(next(self._substitute_root_iri([pattern])))

In plain English, for every CUDS object in the added buffer, remove all triples that have the CUDS IRI as subject from the backend.

In even simpler English: remove the CUDS from the backend.


So, guess what happens if you put the wrapper in the added buffer? All its triples get added to the backend, while the old ones are NOT removed. Ideally, one would "load the wrapper from backend" and return the loaded wrapper instead of a new one, but there are several checks in place to prevent this and the solution is more complex.

Simply never putting wrappers in the added buffer seems to be good enough, since they are always supposed to be the root of the session, so their existence only makes sense when they are attached to something. And when they are attached to something, they go to the updated buffer, where what happens actually makes sense according to the code above.

@kysrpex kysrpex marked this pull request as ready for review June 16, 2021 14:12
@paulzierep
Copy link

I can confirm that the problem persists.

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session:
    wrapper = cuba.Wrapper(session=session)
    wrapper.add(cuba.Entity(), rel=cuba.activeRelationship)
    session.commit()

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session:
        wrapper = cuba.Wrapper(session=session)
        
        for cuds in wrapper.iter():
            wrapper.remove(cuds)
        
        pretty_print(wrapper)
        session.commit()

>>>>>
- Cuds object:
  uid: 2ab207e6-e0a3-4fb3-abfa-0c73694fa0b7
  type: cuba.Wrapper
  superclasses: cuba.Entity, cuba.Wrapper
  description: 
    The root of all wrappers. These are the bridge to simulation engines and databases.
<<<<<

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session:
    wrapper = cuba.Wrapper(session=session)
    
    for cuds in wrapper.iter():
        wrapper.remove(cuds)
    
    pretty_print(wrapper)

>>>>>

- Cuds object:
  uid: 9c6dc4bc-85c4-409c-94f0-890ae00de568
  type: cuba.Wrapper
  superclasses: cuba.Entity, cuba.Wrapper
  description: 
    The root of all wrappers. These are the bridge to simulation engines and databases.

   |_Relationship cuba.activeRelationship:
     -  cuba.Entity cuds object:
        uid: a53536c1-8a03-4afa-83b2-438e8f654718
<<<<<

@kysrpex
Copy link
Contributor Author

kysrpex commented Jun 17, 2021

Can you try with a different PRIVATE_DATASPACE_DB_NAME that has been never used?

@kysrpex
Copy link
Contributor Author

kysrpex commented Jun 17, 2021

As you said that it also fails locally, I am leaving this here for tomorrow.


In [10]: from osp.wrappers.sqlite import SqliteSession as SqliteWrapperSession
    ...: from osp.core.namespaces import cuba
    ...: import os
    ...: 
    ...: try:
    ...:     os.remove('A.db')
    ...: except FileNotFoundError:
    ...:     pass
    ...: 

    ...: except FileNotFoundError:
    ...:     pass
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     wrapper.add(cuba.Entity(), rel=cuba.activeRelationship)
    ...: 
    ...:     session_A.commit()
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     for cuds in wrapper.iter():
    ...:         wrapper.remove(cuds)
    ...:         #wrapper.session.delete_cuds_object(cuds)
    ...: 
    ...:     session_A.commit()
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     pretty_print(wrapper)
    ...: 
WARNING 2021-06-17 18:42:10,086 [osp.core.session.db.db_wrapper_session]: Some CUDS objects are unreachable from the wrapper object: cuba.Entity: fd31a7e0-aa65-46f8-8d1a-53ce64d6660c.
If you want to be able to retrieve those CUDS objects later, either add them to the wrapper object or to any other CUDS that is reachable from it.
- Cuds object:
  uid: 307c119e-ea68-424c-bab8-6d7f766a209d
  type: cuba.Wrapper
  superclasses: cuba.Entity, cuba.Wrapper
  description: 
    The root of all wrappers. These are the bridge to simulation engines and databases.

In [12]: from osp.wrappers.sqlite import SqliteSession as SqliteWrapperSession
    ...: from osp.core.namespaces import cuba
    ...: from osp.core.utils import pretty_print
    ...: import os
    ...: 
    ...: try:
    ...:     os.remove('A.db')
    ...: except FileNotFoundError:
    ...:     pass
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     wrapper.add(cuba.Entity(), rel=cuba.activeRelationship)
    ...: 
    ...:     session_A.commit()
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     for cuds in wrapper.iter():
    ...:         #wrapper.remove(cuds)
    ...:         wrapper.session.delete_cuds_object(cuds)
    ...: 
    ...:     session_A.commit()
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     pretty_print(wrapper)
    ...: 
- Cuds object:
  uid: 78f001e8-96dd-4c91-9094-bd73482a6277
  type: cuba.Wrapper
  superclasses: cuba.Entity, cuba.Wrapper
  description: 
    The root of all wrappers. These are the bridge to simulation engines and databases.

In [13]: from osp.wrappers.sqlite import SqliteSession as SqliteWrapperSession
    ...: from osp.core.namespaces import cuba
    ...: from osp.core.utils import pretty_print
    ...: import os
    ...: 
    ...: try:
    ...:     os.remove('A.db')
    ...: except FileNotFoundError:
    ...:     pass
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     wrapper.add(cuba.Entity(), rel=cuba.activeRelationship)
    ...: 
    ...:     session_A.commit()
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     for cuds in wrapper.iter():
    ...:         wrapper.remove(cuds)
    ...:         wrapper.session.delete_cuds_object(cuds)
    ...: 
    ...:     session_A.commit()
    ...: 
    ...: with SqliteWrapperSession("A.db") as session_A:
    ...:     wrapper = cuba.Wrapper(session=session_A)
    ...: 
    ...:     pretty_print(wrapper)
    ...: 
- Cuds object:
  uid: 99eb3c78-a06e-4f48-a069-20e209fcc0e2
  type: cuba.Wrapper
  superclasses: cuba.Entity, cuba.Wrapper
  description: 
    The root of all wrappers. These are the bridge to simulation engines and databases.

Source for copy and paste:

from osp.wrappers.sqlite import SqliteSession as SqliteWrapperSession
from osp.core.namespaces import cuba
from osp.core.utils import pretty_print
import os

try:
    os.remove('A.db')
except FileNotFoundError:
    pass

with SqliteWrapperSession("A.db") as session_A:
    wrapper = cuba.Wrapper(session=session_A)

    wrapper.add(cuba.Entity(), rel=cuba.activeRelationship)

    session_A.commit()

with SqliteWrapperSession("A.db") as session_A:
    wrapper = cuba.Wrapper(session=session_A)

    for cuds in wrapper.iter():
        wrapper.remove(cuds)
        wrapper.session.delete_cuds_object(cuds)

    session_A.commit()

with SqliteWrapperSession("A.db") as session_A:
    wrapper = cuba.Wrapper(session=session_A)

    pretty_print(wrapper)

@paulzierep
Copy link

I can confirm, that it works for the SqliteWrapperSession. I think I did not load the branch correctly yesterday. I will test it with the DataSpaceSession asap.

@paulzierep
Copy link

Unfortunately, it does break something when using the DataspaceSession:

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
    wrapper = cuba.Wrapper(session=session_A)
    wrapper.add(cuba.Entity(), rel=cuba.activeRelationship)
    session_A.commit()

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
    wrapper = cuba.Wrapper(session=session_A)
    for cuds in wrapper.iter():
        wrapper.remove(cuds)
        wrapper.session.delete_cuds_object(cuds)

    session_A.commit()

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
    wrapper = cuba.Wrapper(session=session_A)
    pretty_print(wrapper)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-154a42e075a2> in <module>()
     19 with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
     20     wrapper = cuba.Wrapper(session=session_A)
---> 21     pretty_print(wrapper)

/home/paul/anaconda3/envs/simphony/lib/python3.6/site-packages/osp/core/utils/pretty_print.py in pretty_print(cuds_object, file)
     25         pp += "\n  values: " + _pp_values(cuds_object)
     26     pp += "\n  description: \n    %s\n" % cuds_object.oclass.description
---> 27     pp += _pp_subelements(cuds_object)
     28 
     29     print(pp, file=file)

/home/paul/anaconda3/envs/simphony/lib/python3.6/site-packages/osp/core/utils/pretty_print.py in _pp_subelements(cuds_object, level_indentation, visited)
     72         sorted_elements = sorted(
     73             cuds_object.iter(rel=relationship, return_rel=True),
---> 74             key=lambda x: (str(x[0].oclass), str(x[1]),
     75                            x[0].name if hasattr(x[0], "name") else False)
     76         )

/home/paul/anaconda3/envs/simphony/lib/python3.6/site-packages/osp/core/cuds.py in iter(self, rel, oclass, return_rel, *uids)
    404                 yield r
    405             else:
--> 406                 yield from ((r, m) for m in mapping[r.uid])
    407 
    408     def _recursive_store(self, new_cuds_object, old_cuds_object=None):

AttributeError: 'NoneType' object has no attribute 'uid'

@kysrpex
Copy link
Contributor Author

kysrpex commented Jun 24, 2021

Unfortunately, it does break something when using the DataspaceSession:

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
    wrapper = cuba.Wrapper(session=session_A)
    wrapper.add(cuba.Entity(), rel=cuba.activeRelationship)
    session_A.commit()

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
    wrapper = cuba.Wrapper(session=session_A)
    for cuds in wrapper.iter():
        wrapper.remove(cuds)
        wrapper.session.delete_cuds_object(cuds)

    session_A.commit()

with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
    wrapper = cuba.Wrapper(session=session_A)
    pretty_print(wrapper)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-12-154a42e075a2> in <module>()
     19 with DataspaceSession(conn_text, connect_kwargs=connect_kwargs, dataspace_name=PRIVATE_DATASPACE_DB_NAME) as session_A:
     20     wrapper = cuba.Wrapper(session=session_A)
---> 21     pretty_print(wrapper)

/home/paul/anaconda3/envs/simphony/lib/python3.6/site-packages/osp/core/utils/pretty_print.py in pretty_print(cuds_object, file)
     25         pp += "\n  values: " + _pp_values(cuds_object)
     26     pp += "\n  description: \n    %s\n" % cuds_object.oclass.description
---> 27     pp += _pp_subelements(cuds_object)
     28 
     29     print(pp, file=file)

/home/paul/anaconda3/envs/simphony/lib/python3.6/site-packages/osp/core/utils/pretty_print.py in _pp_subelements(cuds_object, level_indentation, visited)
     72         sorted_elements = sorted(
     73             cuds_object.iter(rel=relationship, return_rel=True),
---> 74             key=lambda x: (str(x[0].oclass), str(x[1]),
     75                            x[0].name if hasattr(x[0], "name") else False)
     76         )

/home/paul/anaconda3/envs/simphony/lib/python3.6/site-packages/osp/core/cuds.py in iter(self, rel, oclass, return_rel, *uids)
    404                 yield r
    405             else:
--> 406                 yield from ((r, m) for m in mapping[r.uid])
    407 
    408     def _recursive_store(self, new_cuds_object, old_cuds_object=None):

AttributeError: 'NoneType' object has no attribute 'uid'

Just a comment for @yoavnash 's review: this does not break any "additional" thing. It just does not fix the problem, this error seems to be the same that appeared on the issue.

@kysrpex kysrpex merged commit 6ad48a4 into dev Jun 24, 2021
@kysrpex kysrpex deleted the 638-Strange_behaviour_of_CUDS_remove branch June 24, 2021 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Strange behaviour of CUDS remove
3 participants