
Paper Discussion 8a: Preserving Physical Safety Under Cyber Attacks #67

Open
nikorev opened this issue Mar 2, 2020 · 10 comments
Labels
paper discussion s20 The discussion of a paper for the spring 2020 class.

Comments

@nikorev

nikorev commented Mar 2, 2020

Please add all comprehensive and critical reviews below

Notable Questions
Still updating

  • @AkinoriKahata, Akinori Kahata, Critical: I don't understand why a TEE has to be divided into a secure zone and a non-secure zone. In my understanding, if the architecture can create a secure zone, all of the code could be executed there. Why does the TEE alternate between the secure and non-secure zones?
  • @lrshpak, Lily Shpak, Critical: After an attack the system needs time to stabilize the plant; does the time it takes to stabilize the system give the attacker another opportunity to attack?
  • @rebeccc, Becky, Comprehension: The whole section on the safety controller. I don't know whether it's important for me to understand the math behind it; the note at the end of the section was helpful, but I don't see how it keeps the plant safe.
  • @albero94, Alvaro Albero, Comprehension: In a TEE there are two VMs, one secure and the other not, and part of the software runs on each. I don't know much about this environment, so what are the limitations of the secure VM? Could you run all your software there without a problem? How do you decide what to run there, and why?
  • @grahamschock, Graham Schock, Critical: The paper assumes that the ROT, the separate hardware module that periodically sends restart signals, cannot be hacked. If we assume the rest of the device can be compromised, how sure are we of the security of this module?
  • @hjaensch7, Henry Jaensch, Critical: How would updates work on a system like this, if the trusted code is in read-only memory?
  • @pcodes, Pat Cody, Comprehension: Is the performance tradeoff between the restart-based approach vs. spending extra money on TEE-equipped hardware worth it?
@AkinoriKahata

Reviewer: Akinori Kahata
Review type: Critical

  1. The problem being solved.
  • Conventionally, most targets of cyberattacks have been IT systems; recently, however, cyber-physical systems (CPS) have also become targets. As the name suggests, damage to a cyber-physical system affects physical safety and can threaten people's lives. Although vulnerabilities can never be fully eliminated from a computer-security point of view, a CPS still needs to ensure safe operation somehow.
  2. The main contributions.
  • To ensure safety, this paper examines two mechanisms for maintaining a secure execution environment: restart-based secure execution and a Trusted Execution Environment (TEE). Through an evaluation of both, the researchers show that each can work under the given conditions and highlight their differing properties: the TEE's performance is better than restart-based secure execution, but the TEE is more expensive. The key point the researchers make is to choose appropriately based on the objective of the CPS.
  3. Questions.
  • I am curious how much the costs differ between a TEE and restart-based secure execution.
  • I don't understand why a TEE has to be divided into a secure zone and a non-secure zone. In my understanding, if the architecture can create a secure zone, all of the code could be executed there. Why does the TEE alternate between the secure and non-secure zones?
  4. Critiques.
  • According to the article, if the restart time is much shorter than the speed of the plant dynamics, the system can use restart-based secure execution. This is abstract; the authors should discuss how much shorter is enough. Ultimately it depends on the system, but a benchmark discussion would be helpful for developers.

@lrshpak

lrshpak commented Mar 7, 2020

Reviewer: Lily Shpak
Review Type: Critical

Problem Being Solved

This paper attempts to address the vulnerabilities associated with cyber-physical systems. Malicious actors often use vulnerabilities in the software to attack the physical system, pushing it out of the zone considered its admissible states.

Main Contributions

The authors of this paper address this problem by creating a secure execution interval, or SEI. With SEIs, every cycle the system checks the state of the physical plant to make sure it is still in an admissible state. The authors state that this system can dynamically react to a change in state that would make the system no longer admissible.
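The per-cycle check described above can be sketched roughly as follows (hypothetical names and a one-dimensional toy admissible set; this is not the paper's implementation):

```python
# Toy sketch of a secure-execution-interval (SEI) check. Each cycle, the
# protected component reads the plant state and verifies it against the
# admissible set before handing control back to the (possibly compromised)
# mission controller. All names and bounds here are illustrative.

ADMISSIBLE = (-10.0, 10.0)  # assumed 1-D admissible state interval

def in_admissible(state: float) -> bool:
    """True if the plant state lies inside the admissible set."""
    lo, hi = ADMISSIBLE
    return lo <= state <= hi

def sei_step(state: float) -> str:
    """Decide which controller runs for the next interval."""
    if in_admissible(state):
        return "mission"  # safe: mission controller may keep executing
    return "safety"       # near/outside the boundary: safety controller takes over

print(sei_step(3.0))   # mission controller keeps running
print(sei_step(12.5))  # safety controller intervenes
```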

Questions

  1. After an attack the system needs time to stabilize the plant, does the time it takes to stabilize the system give the attacker another opportunity to attack?
  2. Does calculating the trigger point for the next SEI to run take a lot of overhead?
  3. In the 3DOF helicopter, if the SEI is on the helicopter and someone shuts the helicopter down, will the SEI still work?

Critiques

  1. A lot of the math they use to explain how their system works is confusing to me. I think they could have explained the math a bit more.
  2. I liked how they lay out all of their assumptions early in the paper. I also liked how they explain what type of system they are protecting against, because cyber-physical systems are a broad range of systems.

@bushidocodes

Reviewer: Sean McBride

Review Type: Critical Review

Problem Being Solved:

How can the baseline safety of the physical plant of a cyber-physical system be preserved even when the software is fully-compromised?

Main Contributions:

  1. Theorizes a methodology of tracking changes of state over time within a set of constraints, using inertia to calculate secure execution intervals over which a mission controller can execute while leaving enough time for the safety controller to intervene and restore the plant to a safe state.
  2. Implements several mechanisms by which a safety controller can restore a mission controller to a known good state from read-only memory. One is based on hard resets from ROM; the other uses TrustZone-based virtualization, offering faster restore times.
  3. Experimentally evaluates the system using a physical quadcopter and a simulated factory.
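The window calculation in contribution 1 can be illustrated with a toy linear plant (an assumed model `x' = a·x + b·u`, not the paper's actual dynamics): simulate the worst-case attacker input forward and measure how long the plant takes to leave the admissible set; that time, minus the restart/recovery cost, bounds the interval the mission controller may run.

```python
# Rough sketch (toy model, not the paper's math): estimate how long a
# fully compromised controller needs to push a scalar linear plant
# x' = a*x + b*u out of the admissible set |x| < x_limit, by simulating
# the worst-case input with forward-Euler integration.

def safe_window(x0, a, b, u_max, x_limit, dt=0.001, horizon=10.0):
    """Seconds until |x| exceeds x_limit under worst-case input u_max."""
    x, t = x0, 0.0
    while t < horizon:
        if abs(x) >= x_limit:
            return t                      # plant has left the admissible set
        x += (a * x + b * u_max) * dt     # forward-Euler step under attack
        t += dt
    return horizon                        # never escapes within the horizon

# A stable plant (a < 0) converges to x = b*u_max/(-a) = 1.0, far inside
# the limit, so the attacker cannot destabilize it within the horizon:
print(safe_window(x0=0.0, a=-1.0, b=1.0, u_max=1.0, x_limit=5.0))  # 10.0
```

The idea, as the paper's reviewers summarize it, is that physical inertia makes this escape time strictly positive, which is what leaves room for a restart.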

Questions:

  1. As a quadcopter gets closer and closer to the ground (for example, when landing), I imagine the SEI would get smaller and smaller until perhaps some sort of "live lock" situation. How then do you land?
  2. If a quadcopter is instructed to land and flushes its mission controller, it doesn't have the memory to know what the mission is, so it forgets it's trying to land. Does this mean it then begins ascending again to get a greater safety buffer? How does it land?
  3. What are examples of systems that are definitely not amenable to modeling with linear dynamics?

Critiques:

  1. The assumption that sensor data cannot be spoofed seems questionable to me. If this data could be spoofed, it could actually manipulate the safety controller to misinterpret location on the state map and take corrections that actually force the system into an unsafe state (i.e. crashing the quadcopter)
  2. It seems mildly strange to make a quadcopter the main actual implementation of a system that claims to model linear dynamics. Helos don't get lift via forward momentum, so their future position is much less predictable.
  3. I would have appreciated greater discussion about how a system computes relative to multiple possible unsafe states. What if a quadcopter is flying indoors and can crash into both a ceiling and a floor?

@rebeccc

rebeccc commented Mar 9, 2020

Reviewer: Becky Shanley
Review Type: Comprehension

Problem Being Solved
This paper attempts to utilize the properties of physical plants to ensure their safety. Since the safety requirements of these plants are essential to their operation, the plants are vulnerable to both physical and cyber-attacks.

Main Contributions
This paper provides an analytical framework that utilizes physical properties to compute, at run time, safe operational windows during which safety is guaranteed. It identifies the operational window by leveraging the fact that, due to physical inertia, total destabilization of the plant is time-consuming.

Questions

  1. The whole section on the safety controller. I don't know if it's important for me to understand the math behind it and the note at the end of the section was helpful but I don't get how it keeps the plant safe.
  2. In figure 1 on page 6289, I'm confused. Since the entire restart-based sequence takes the entire safe-flight zone, and if the grey is the only section where restarts are not enabled, wouldn't triggering a restart during another restart cycle (not in the grey) put the plane in a dangerous zone?

@albero94

albero94 commented Mar 9, 2020

Reviewer: Alvaro Albero
Review Type: Comprehensive

Problem being solved

Cyber-physical systems (CPS) require stronger security measures than other systems, as the damage caused by compromising them can be greater due to their physical characteristics. In this paper the authors demonstrate a way to ensure safety in CPS.

Main Contributions

Leveraging the fact that due to inertia an attacker with full control of a system cannot destabilize it instantly, the authors develop a solution to recover control of the system within a secure period of time. First, they design an analytical method to estimate the window of time an attacker would need to damage the system. Second, they implement a system reset that is periodically executed within that period of time or an alternative that uses trusted execution environments (TEE). Finally, they test their solution in a prototype implementation.

Questions

  • Is there a methodology to determine when the system-reset approach is enough and when a TEE is necessary? I know there is a tradeoff between cost and capabilities, but how can you decide when the extra cost is worth it?
  • Based on the last paragraph of 4. Methodology, when the system is pushed to a limit the SC needs to run long enough before ending the SEI and estimating the next one. Can an attacker compromise the system again during this period and damage it, or is this time very short?
  • In a TEE there are two VMs, one secure and the other not, and part of the software runs on each. I don't know much about this environment, so what are the limitations of the secure VM? Could you run all your software there without a problem? How do you decide what to run there, and why?

@grahamschock

Reviewer: Graham Schock
Review Type: Critical

Problem Being Solved
Cyber-physical systems have dangerous physical consequences when compromised. This is even more disastrous when human physical well-being can be the victim of an attack, and even more pronounced when these devices are interconnected with other devices and with the internet. There are many ways for a hacker to gain access to a cyber-physical system and many disastrous things they can do once inside it. This shows that the security of cyber-physical systems needs to be improved.

Contributions
In order to ensure the safety of a cyber-physical system, the paper develops methods to ensure basic safe operation of a system with respect to cyber attacks. This paper does not go over how to stop an attacker from entering a system; instead, it asks how we can ensure that a cyber-physical system will not crash. To do this, the paper uses the concept of physical inertia: it is impossible to crash these systems instantly. To support these methods and models, the paper details a drone example where the physical state can be modeled mathematically, and therefore the time a hacker needs to destroy the system can be modeled as well. With this information we can thwart the attacker by periodically restarting the system.

Questions

  • The paper assumes that the ROT, the separate hardware module that periodically sends restart signals, cannot be hacked. If we assume the rest of the device can be compromised, how sure are we of the security of this module?
  • The paper relies on the idea of a full restart and a clean software reinstallation. However, what if the attacker is able to get into the system at a privilege level below the operating system and below the boot manager? For example, what if the attacker got into something like the Intel Management Engine and controlled everything?
  • From my basic physics knowledge, physical inertia gets inherently more complex when we add rotation and torque. I wonder how different types of inertia affect the formula or the usability of the system.

Critiques

  • There were some major assumptions about the architecture the device needs to have. One of the biggest is that the operating system needs to be in read-only memory. Is this applicable to actual cyber-physical systems? It would make things like updates a lot harder.

  • I wish there was more discussion of the tradeoffs of the restart implementation. Whenever I restart my computer the battery drains noticeably, and power is an especially difficult issue for cyber-physical systems.

@hjaensch7

Reviewer: Henry Jaensch

Review Type: Critical Review

Problem Being Solved

Attacks against cyber-physical systems have the opportunity to cause physical damage to systems like plants. While there are other attack vectors, this paper attempts to address attacks that aim to cause physical damage to the system.

Main Contributions

This paper recognizes that CPSs have consistent physical properties that can allow software a window of time to clean itself before physical damage occurs. Any attacks on a CPS will take a certain time T to cause physical damage. This paper proposes two solutions that use the knowledge of this time T to identify and correct errors before physical damage is done.

Questions

  1. Does the length of the Secure Execution Interval change from system to system and at some point is the SEI too small to get anything useful done?

  2. What are the resource requirements of TrustZone technology, and can a smaller, more resource-constrained system support them?

  3. How would updates work on a system like this, if the trusted code is in read only memory?

Critiques

  • The restart model requires stateless controller operation. How many existing controllers are able to be restarted regularly and still provide useful work?

  • Since the SEI is calculated dynamically based on sensor readings, and state is not preserved, a sensor that relies on a low-pass filter will consistently have bad readings.

@Others

Others commented Mar 9, 2020

Reviewer: Gregor Peach

Review Type: Critical Review

Problem Being Solved

When a cyber physical system has a problem, it is not only the software that is affected, it is also the hardware and the real world. This magnifies the effect of attacks, leading to damage to people and/or property.

Contributions

This paper proposes a system based on "safe states" and timers. If you're in a "safe state", then you have N seconds/steps before you crash. If we set a timer for N minus the restart time as soon as you leave a safe state, and then restart, we can ensure the system is always working correctly (assuming we can prevent the program from halting the restart). That is a very simplified view of the contribution of the paper.
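The timer arithmetic in that summary can be sketched in a couple of lines (hypothetical numbers; the real system would arm this deadline in tamper-proof hardware such as the ROT):

```python
# Minimal sketch of the restart-timer idea: on leaving a safe state, the
# plant has `time_to_unsafe` seconds before a crash becomes possible, so a
# forced restart must fire at least `restart_time` seconds before then.
# The values below are illustrative, not from the paper.

def arm_restart_timer(time_to_unsafe: float, restart_time: float) -> float:
    """Seconds until the forced restart must fire (never negative)."""
    return max(0.0, time_to_unsafe - restart_time)

print(arm_restart_timer(8.0, 2.0))  # 6.0 -- restart completes with time to spare
print(arm_restart_timer(1.0, 2.0))  # 0.0 -- restart must fire immediately
```

The clamp to zero reflects the boundary case the reviewers raise: if the remaining safe time is shorter than the restart itself, the restart has to happen immediately.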

Questions

  • How does algorithm 2 work?
  • How do we ensure the state model of the system is accurate?
  • How do they ensure a high assurance core of control?

Critiques

  • There were some significant constraints on the system setup. I wish they had addressed whether those were usual in practice, or if future work could remove them.
  • It would have been nice to see a third example.

@pcodes

pcodes commented Mar 9, 2020

Reviewer: Pat Cody
Review Type: Comprehensive

Problem Being Solved

When a cyber-physical system is compromised, it carries the risk of damaging the plant, environment, and humans. Normally, these systems are designed to mitigate and prevent intrusions, but there will always be an unforeseen vulnerability, especially when connected to the internet.

Main Contributions

This paper contributes a formal guarantee of a system's baseline safety by creating the notion of Secure Execution Intervals (SEI), a technique to prevent end-point devices from causing physical damage to a plant. Even if a device is compromised, the attacker won't have enough time to cause any physical harm before the next interval. SEI works either via restarts, or using a Trusted Execution Environment (TEE).

Questions

  • Is the performance tradeoff between the restart-based approach vs. spending extra money on TEE-equipped hardware worth it?
  • While restarting the device clears the current attack, has any analysis been done on the effects of a persistent attack? That is to say, if a device is under siege a restart might fix the problem, but do continued attacks prevent the device from working at all (albeit without causing physical harm)?

@RyanFisk2

Reviewer: Ryan Fisk

Review Type: Comprehension

Problem

Recent attacks on cyber-physical systems have shown that software vulnerabilities can be leveraged to cause physical harm to those systems. The ability to cause physical damage creates a risk to human safety, and connecting these systems to the internet will only make the problem worse.

Contribution

This paper demonstrates a way to preserve the physical functionality of the embedded system during a cyber attack using secure execution intervals. This method prevents an attacker from having control of the system for long enough to cause any physical damage.

Questions

  1. How does the system check the validity of the running program? Could that system be manipulated to not trigger a reset?

  2. Would the resets cause data to be lost from the input devices?

  3. How long does it take to check each program for validity and is this scalable for multiple applications and devices?

@gparmer gparmer added the paper discussion s20 The discussion of a paper for the spring 2020 class. label Mar 15, 2020