Dev #26

Merged
merged 55 commits into from
Oct 31, 2019

Conversation

dityas
Owner

@dityas dityas commented Oct 31, 2019

Working POMDP solver and IPOMDP solver.
POMDP policies and solutions can be exported to JSON and dot files.
The same functionality for IPOMDPs is just another commit away.

To be fixed:

thinclab.belief
  • Belief and InteractiveBelief should have uniform API for belief updates and other belief ops
  • SSGAExpansion should start from alpha vector policy consisting of reward functions instead of performing a full belief expansion first
  • All expansion strategies should be compatible with both POMDPs and IPOMDPs
thinclab.decisionprocesses
  • Unify POMDP and IPOMDP DD vars and the way dynamics are stored
  • Maintain a static belief update method which calls thinclab.belief functions
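
As a rough illustration of what a uniform belief-update API with a static update method could look like, here is a minimal sketch over flat probability maps; the class name `BeliefOps` and the flat representation are assumptions for illustration (thinclab operates on decision diagrams), not the project's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a uniform, static belief-update API.
public class BeliefOps {

    // b'(s') ∝ O(o | s') * sum_s T(s' | s, a) * b(s), then normalize.
    public static Map<String, Double> beliefUpdate(
            Map<String, Double> belief,
            Map<String, Map<String, Double>> transition,   // T[s][s'] for the taken action
            Map<String, Double> obsLikelihood) {           // O[s'] for the seen observation

        Map<String, Double> next = new HashMap<>();
        double norm = 0.0;

        for (String sPrime : obsLikelihood.keySet()) {
            double mass = 0.0;
            for (Map.Entry<String, Double> e : belief.entrySet())
                mass += transition.get(e.getKey()).getOrDefault(sPrime, 0.0) * e.getValue();
            double val = obsLikelihood.get(sPrime) * mass;
            next.put(sPrime, val);
            norm += val;
        }

        // Normalize (no guard here for impossible observations where norm == 0).
        for (String sPrime : next.keySet())
            next.put(sPrime, next.get(sPrime) / norm);
        return next;
    }
}
```

The point of the sketch is the to-fix item above: both Belief and InteractiveBelief could route their updates through one static entry point like this.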

dityas added 30 commits October 11, 2019 20:03
Bring the thesis branch up to date with new changes in master
The camper attacker domain for L0 is built
A minimal attacker domain has been started in l0. The idea is to have
ATT&CK techniques as actions taken by the attacker. Need to fix
CREDS_DISCOVERY. It should succeed only when HAS_CREDS has been
confirmed through observations.
The minimal persistence domain is solvable. The policy is not
particularly sophisticated, but it looks like it will work for data
exfils.
The IPOMDP solver oscillates even for fixed belief tree depths. The
POMDP solver was checked to see if it behaved similarly. It converges.
This implies a probable bug in the IPOMDP solver code and not in the
algorithm itself.
A lot of the attributes in the POMDP class are not used once the policy
is obtained. These are set to null to save memory.
The L1 domain file for the defender is created. But testing it would
require a verified L1 solver. Also, the Bellman error was being printed
to up to 9 decimal places; this is cut down to 3 to make space for
other important metrics.
The value iteration solver is now compatible with the general
OnlineSolver API. Some tests still need to be added to account for
unforeseen args and edge cases
The L1 solvers had to implement oneStepNZPrimeBelStates and dpBackup
every time. Since these functions could also be useful outside the
solver itself, they have been refactored into other classes as static
methods.
The OnlineSolver keeps track of the Bellman error values for the few
most recent iterations and declares approximate convergence when the
variance of the error is low.
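
The variance-based convergence test described in this commit can be sketched as follows; the `ConvergenceCheck` class, window size, and threshold are illustrative assumptions, not the actual OnlineSolver internals:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: declare approximate convergence when the variance
// of the most recent Bellman errors falls below a threshold.
public class ConvergenceCheck {

    private final Deque<Double> recentErrors = new ArrayDeque<>();
    private final int window;
    private final double varianceThreshold;

    public ConvergenceCheck(int window, double varianceThreshold) {
        this.window = window;
        this.varianceThreshold = varianceThreshold;
    }

    // Record the Bellman error of the latest iteration, keeping only
    // the most recent `window` values.
    public void record(double bellmanError) {
        recentErrors.addLast(bellmanError);
        if (recentErrors.size() > window) recentErrors.removeFirst();
    }

    // Converged once the window is full and the error variance is small.
    public boolean hasConverged() {
        if (recentErrors.size() < window) return false;
        double mean = recentErrors.stream()
                .mapToDouble(Double::doubleValue).average().orElse(0.0);
        double var = recentErrors.stream()
                .mapToDouble(e -> (e - mean) * (e - mean)).average().orElse(0.0);
        return var < varianceThreshold;
    }
}
```

Using the variance rather than the error magnitude alone lets the solver stop when the error plateaus, even if it oscillates around a nonzero value.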
IPOMDP objects are now serializable. All the enclosed member objects
also implement serialization. The parser objects for both POMDPs and
IPOMDPs are nulled out after parsing is done.
The OnlinePolicyTree object computes a static policy tree for the IPOMDP
for a given horizon. The implementation has been tested on
OnlineValueIteration and OnlineIPBVI solvers
The OnlinePolicyTree represents the beliefs at the same horizon as hash
sets to avoid making extra nodes for repeated beliefs.
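
The deduplication idea relies on equal beliefs hashing identically, so repeated beliefs at the same horizon collapse to one node. A minimal sketch, assuming beliefs are represented as probability maps (the real implementation works over DDs):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of per-horizon belief deduplication via a hash set.
public class BeliefDedup {
    public static void main(String[] args) {
        // Two expansions that happen to reach the same belief...
        Map<String, Double> b1 = Map.of("s0", 0.25, "s1", 0.75);
        Map<String, Double> b2 = Map.of("s1", 0.75, "s0", 0.25);
        Map<String, Double> b3 = Map.of("s0", 0.5, "s1", 0.5);

        // ...collapse to one node, because equal maps compare and hash equal.
        Set<Map<String, Double>> horizonBeliefs = new HashSet<>();
        horizonBeliefs.add(b1);
        horizonBeliefs.add(b2);
        horizonBeliefs.add(b3);
        System.out.println(horizonBeliefs.size()); // prints 2
    }
}
```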
The OnlinePolicyTree object can now export the policy to a JSON string.
The getDotString and getJSONString methods give the dot format and JSON
formatted tree.
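
A minimal sketch of what a getDotString-style export can look like; the edge-map representation and the labels here are hypothetical, not the actual OnlinePolicyTree internals:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of dot-format export for a policy tree, where nodes
// are actions and edges are labeled with observations.
public class DotExport {

    // Edge list: parent action -> (observation label -> child action)
    public static String getDotString(Map<String, Map<String, String>> edges) {
        StringBuilder dot = new StringBuilder("digraph PolicyTree {\n");
        for (Map.Entry<String, Map<String, String>> node : edges.entrySet())
            for (Map.Entry<String, String> edge : node.getValue().entrySet())
                dot.append(String.format("  \"%s\" -> \"%s\" [label=\"%s\"];%n",
                        node.getKey(), edge.getValue(), edge.getKey()));
        dot.append("}\n");
        return dot.toString();
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> edges = new LinkedHashMap<>();
        edges.put("LISTEN",
                Map.of("growl-left", "OPEN_RIGHT", "growl-right", "OPEN_LEFT"));
        System.out.println(getDotString(edges));
    }
}
```

The resulting string can be rendered directly with Graphviz (`dot -Tpng`), which is the usual payoff of exporting policies in dot format.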
Fix the test domain file paths in the OpponentModel tests. Also log at
debug level when context is changed
dityas added 25 commits October 22, 2019 13:51
The POMDP solvers previously implemented in the POMDP class itself are
now separate classes compatible with the OfflineSolver API.
The policy can be better represented as trees rather than graphs. Trees
also allow for each node to maintain belief states. The StructuredTree
class provides a base for implementing PolicyTrees and BeliefTrees
The LookAheadTree is no longer used for belief expansion. Instead, the
FullInteractiveBeliefExpansion does the same and is compatible with the
general BeliefExpansion API.
The OpponentModel API built the belief tree offline. This was
inefficient in terms of memory and computation since it included zero
probability nodes in the belief tree. This has been fixed with online
belief tree creation in MJ.
The newly implemented solver objects and belief search classes are made
serializable. Old tests are removed
Most of the implementations in the legacy code are no longer used. These
are commented out to see if the solvers still work
The older solvers implemented in the POMDP class are replaced with the
newer solver implementations
The factored interactive belief built using sumouts might be wrong.
Unfactoring cannot be done by simply multiplying back the factors.
The older unit tests do not assert any values from the output; they
simply print the objects to the terminal. A few broad but strict
assertions have been added to ensure correctness.
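
As an illustration of a broad but strict assertion, here is a hypothetical check that a belief-update result is a valid probability distribution, rather than just printing it; the helper and values are not the project's actual tests:

```java
// Hypothetical sketch: assert a property of the output instead of printing it.
public class BeliefSumTest {

    // A belief must be a valid probability distribution.
    static boolean isNormalized(double[] belief) {
        double sum = 0.0;
        for (double p : belief) {
            if (p < 0.0 || p > 1.0) return false;
            sum += p;
        }
        return Math.abs(sum - 1.0) < 1e-9;
    }

    public static void main(String[] args) {
        double[] updated = {0.25, 0.75};  // stand-in for a belief-update result
        if (!isNormalized(updated))
            throw new AssertionError("belief update returned an invalid distribution");
        System.out.println("ok");
    }
}
```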
The older DDTree functionality was mainly aimed at creating SPUDD
files from the UI. But more recently, it is being used to generate and
persist actual DDs from symbolic perseus.
Remove old API files and refactor relevant parts into other packages.
PolicyNode has been moved into the thinclab.policy package.
@dityas dityas merged commit c602008 into master Oct 31, 2019