Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OBC Watchdog Monitoring and Watchdog Timeout Scenario #36

Merged
merged 16 commits into from
Apr 1, 2023

Conversation

xszymonzur
Copy link
Member

OBC Watchdog Timeout and OBC Watchdog Timeout Scenario from Systems Analysis to Physical Architecture with 2 Functional Data Flows and a Physical Architecture View.

Close #6, close #30, close #31

OBC Watchdog Timeout and OBC Watchdog Timeout Scenario from Systems
Analysis to Physical Architecture with 2 Functional Data Flows and a
Physical Architecture View.

Close #6, close #30, close #31
@xszymonzur xszymonzur added model-feature Functional modelling, at any level obc Within the scope of the On-Board Computer Subsystem labels Jan 27, 2023
@xszymonzur xszymonzur requested a review from chgio January 27, 2023 16:36
Copy link
Member

@chgio chgio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The System Analysis level is all good, very clean.
here's only an alternative to the Scenario: your ALT block doesn't have to be like an IF-ELSE statement, it could also be a simple IF statement. in this case specifically, you could also draw "Check MCU's Health" outside the ALT block, and then in the ALT block only draw the subsequent exchange. it's more DRY, but readability gains are debatable and marginal at best.

Logical Architecture is where your modelling really shines!
I love how you've already included checking both the OBC's own health as well as other subsystems' health under the same parent function, again this keeps the System level really clean, and expands the Logical level by just the right amount to prepare a "hook" for other capabilities which are obviously semantically related.
The current state is fine, but considering the upcoming Dumb Comms Refactoring, the functions "Check/Restore Communications Subsystem MCU's health" would have to be allocated to the OBC -- or removed altogether! -- which would also illustrate the health-checking service for other subsystems.
At any rate, great work in term of flexibility too.

Physical Architecture is just something else though.
Masterfully-sized breakdown, perfect allocation with new hardware and software nodes, extremely detailed scenarios even with comments! And I'm in absolute awe at your work with the Functional Chains and how you managed to get forks and loops through the Description diagrams!
The only feedback would be to not err on the side of awesomeness: don't get me wrong, these "virtuoso" representations are amazing, and they do a very thorough job at illustrating properties and requirements -- but this is a relatively simple feature, and there are already 3 instances of the same function in a single scenario and definitely non-trivial-to-read functional chain. Consider only including a smaller window of time in your scenarios (just enclose everything in a LOOP block) and being looser with your functional exchanges (maybe let an exchange begin from a certain entity just to get a linear functional chain, even though it should semantically be from another) -- but it's probably better to over-model than under-model.

All in all, excellent work!

Your punishment for committing all this beauty in a single chunk will be to manually re-model this directly on main whenever we get to use Team for Capella, so this will probably never get merged. It's that, or resolving the conflicts with Capella's diff/merge tool, and I know which one I'd pick.
Sorry about that 😅

@chgio chgio assigned chgio and xszymonzur and unassigned chgio Mar 21, 2023
@chgio chgio self-requested a review March 21, 2023 10:34
@chgio chgio added this to the System-Wide FDIR Capability milestone Mar 21, 2023
Add:
- [ES] Healthy MCU Scenario
- [ES] Watchdog Timeout Scenatio

- [PFCD] Healthy OBC Watchdoging
- [PFCD] OBC Watchdog Timeout

- [PAB] Physical Architecture View of Subsystems Recovery

- [PDFB] Physical Function Dataflow View of Subsystems Recovery

Edit root [PFBD] and [PCBD]

Fix: Remove duplicates of Auto Mode Change in lower levels
@xszymonzur xszymonzur force-pushed the model/szy-obc-watchdog-timeout-disconnect branch from 5b9d2cc to f48479a Compare March 28, 2023 09:25
Copy link
Member

@chgio chgio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work as in the last review!

Oversights

There's only one actual oversight since the last review:

  • Include entities from [PAB] Health Monitoring into [PAB] Root: great work highlighting the health monitoring involvements within this diagram, but it should also appear in the Root Physical Architecture. The purpose of Root diagrams is to include everything about the whole system, so the same stuff that's in this subset should be in its superset. I know the Root is very messy, so good luck fitting your work in there! 😅

Improvements

The changes since the last review (#41) have also opened up an opportunity for a modelling improvement, which I'm incredibly excited about:

  • Add watchdog timing requirements as Capella Requirements: your work is a perfect example of why we're using Capella and doing all this: specifying your Exchange Scenarios has successfully elicited some new requirements, which you've cleverly picked up and diligently noted down. Now that we're using the actual extension implementing Requirements as Capella objects, let's make it formal! 🤩
    • Add it to [RQT] OBC Subsystem Requirements; sounds like 05 FDIR/A01-OBC-FLT-P-XXX would be a good fit
    • See if you can substitute the relevant notes in [ES] diagrams with a reference to the actual requirement

Nitpicks

Finally, here are just some extremely nitpicky semantic points, which have also only become evident since specifying the CloudView architecture:

  • Substitute "MCU" with generic entity references: this is a very interesting point, as this feature could very well be imagined to concern MCUs; however, conceptually, MCUs don't really exist at any level above Physical, and not even for all Subsystems at that level. Thinking about it, the OBC's function of acting as a software watchdog for other Subsystems holds regardless of their Physical-level allocation to Nodes -- only the applicable corrective actions do (System Analysis: Other S/S Soft Reset #1, System Analysis: Other S/S Hard Reset #2, System Analysis: S/S Bug-Squashing Firmware Reprogram #3), but that's a problem for those issues and for System Analysis: Multi-Tier Reset Policy #38. The following sounds better to me:
    • System level: "MCU" -> "System"
    • Logical level: "MCU" -> "Subsystem"
    • Physical level: "MCU" -> "Component"
  • Remove "own" in functions allocated to the OBC: this couples the name of the function to its allocation; the fact that it's the OBC that checks and restores its own health is conveyed by the allocation of these functions to the OBC, but theoretically they should equally make sense if another Subsystem was responsible for that.
  • Include parent functions from Logical level in [PDFB] Health Monitoring: since you got me started with Dataflow diagrams, I've struggled to see what sets them apart from just Architecture diagrams without components... I think one approach is highlighting the realisation of Functional Chains by inserting each involved function into its higher-level parent, which is not possible in any other diagram. This way, Architectures show the impact of FCs on the components allocating the involved functions, and Dataflows show the impact of FCs on the parents of the involved functions. Yes, it does get quite messy, but it's the Physical level, what diagram doesn't at that point?
  • (Optional) Extend health checks to non-OBC Subsystems to all other Subsystems: we're in for a penny, let's be in for a pound! (I had to look up how to say this in English lol) If you want to get ahead and set the stage for System Analysis: Multi-Tier Reset Policy #38 in advance, please extend health-checking functions from just Comms to every other Subsystems; otherwise, you might as well remove the one involving Comms too, as its specification does lie outside of the scope of this PR 😄

All in all, this is some truly amazing modelling work! Let's make it even better ❤️

@xszymonzur
Copy link
Member Author

Thank you for the review, I really do appreciate the feedback – your remarks are very useful and help me develop so please do point out when you see some space for improvement.

About the extension of health-checking and the #38, I’m currently away from home till 3rd April and am not sure whether I’ll be able to complete it before the 5th April, so let’s move it to some other PR if you don’t mind. I’d love to get onto it later though :).

@xszymonzur xszymonzur requested a review from chgio March 30, 2023 21:46
Giorgio Ciacchella added 2 commits April 1, 2023 11:35
atomise functional requirements:
- A01-OBC-FLT-F-010 ->
	- F-010
	- F-011
	- F-012

and make the original notes on absolute and relative timings more
explicit in performance requirements:
- A01-OBC-FLT-P-001 ->
	- P-001
	- P-002
rename scenarios, adding level qualifiers for clarity:
- System: "... System Scenario"
- Logical: "... Logical Scenario"
- Physical: "... Physical Scenario (Variant)"
Copy link
Member

@chgio chgio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flawless; ready to merge.

All requested changes, even the more nitpicky ones, have been punctually addressed.
I've only added explicit level qualifiers to your scenario names for clarity, and atomised your requirements a little to make them more explicit about your original observations on relative timing: time between kicks shorter than reset period, time from boot to first kick shorter than reset period. So good!!

Give the CI its 10 minutes and then flex your deployment!

@chgio chgio added this pull request to the merge queue Apr 1, 2023
Merged via the queue into main with commit 33e87c8 Apr 1, 2023
@xszymonzur xszymonzur deleted the model/szy-obc-watchdog-timeout-disconnect branch April 1, 2023 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
model-feature Functional modelling, at any level obc Within the scope of the On-Board Computer Subsystem
Projects
None yet
2 participants