421: Paint the Iceberg Yellow

Transcript from 421: Paint the Iceberg Yellow with Chris Hobbs and Elecia White.

EW (00:00:06):

Welcome to Embedded. I am Elecia White, and this week we have something a bit different for you. I got to sit down with Chris Hobbs, author of "Embedded Software Development for Safety-Critical Systems," as part of my Classpert lecture series.

EW (00:00:27):

I have a quick warning for you. Around minute 55, we start talking about lawyers and self-harm. If that's a topic that would bother you, please feel free to skip it as soon as you hear the word lawyer. In the meantime, I hope you enjoy our conversation.

EW (00:00:46):

Welcome. I'm Elecia White, and I'm going to be talking with Chris Hobbs, author of Embedded Software Development for Safety-Critical Systems. Chris, thank you for talking with me today.

CH (00:01:00):

Thank you for inviting me.

EW (00:01:03):

Could you tell us about yourself as if we met at an embedded systems conference?

CH (00:01:10):

Okay. Yes. As you say, I'm Chris. I work for BlackBerry QNX. I work in our kernel development group. That is the group that works on the heart of the operating system, if you like. The operating system itself is certified for use in safety-critical systems. And I have particular responsibility for that side of things, ensuring that what we produce meets the safety requirements that are placed on it.

EW (00:01:42):

What exactly is a safety-critical system?

CH (00:01:47):

Well, I noticed on the introduction slide you had there that you spoke about software in aircraft, software in nuclear reactors, software in pacemakers, and yes, those are all safety-critical. More prosaically, we also have safety-critical software in Coca-Cola machines these days.

CH (00:02:08):

It came as a bit of a surprise to me as well, actually, when I first met this one, but in the olden days, Coca-Cola machines used to dispense cans or tins of drink, and that was straightforward. Now, the more modern ones actually mix your drink, so you could choose, "I want Diet Coke, cherry flavor, something, something," and the software mixes your drink for you.

CH (00:02:30):

One of the ingredients is caffeine. Caffeine is poisonous. So if the software goes wrong and puts too much caffeine in your drink, it'll poison you. So suddenly Coca-Cola machines and soft drinks machines in general become safety-critical.

CH (00:02:47):

And so, yes, what used to be a relatively small field with railway systems, nuclear power stations, and what have you, is now expanding. The other one I've worked on recently is these little robot vacuum cleaners that run around your floor. When you go to bed, you leave the robot vacuum cleaner running around.

CH (00:03:08):

If they reach the top of the stairs, they're supposed to stop. And that's a software-controlled thing. If they don't stop and they go over the stairs, then of course they could kill some child or something sitting at the bottom of the stairs. So suddenly robot vacuum cleaners are also safety-critical.

CH (00:03:25):

And so anything that could potentially damage a human being or the environment we consider to be a safety-critical system. And it is, as I say, a growth area at the moment.

EW (00:03:36):

Some of those sound like normal embedded systems, I mean the robot vacuum, particularly, and the Coke machine. But how are safety-critical systems different from just pedometers and children's toys?

CH (00:03:52):

Yes, I think that's the big question. And part of the answer, I think, comes down to the culture of the company that is building the system. The safety culture of a number of companies, particularly in the aviation industry, has been questioned recently, as I'm sure you've realized.

CH (00:04:15):

And it is this safety culture that underlies what is required to produce a system that is going to be safety-critical. And also another difference is lifetime. I mentioned there, it's not just human life that we deal with for safety. It's also the environment.

CH (00:04:34):

So for example, at the bottom of the North Sea off the east coast of Britain, there are oil rigs. And buried down at the bottom of the oil rig at the seabed, there are these embedded devices, which if the pressure builds up too much, will actually chop off the pipe that goes up and seal the oil well to prevent an environmental disaster.

CH (00:05:01):

If that happens, then it costs millions, and millions, and millions of dollars to get that back again. But of course, if it happens, it's going to happen. But replacing the software in that, or even upgrading the software in that system, is extremely difficult and extremely costly.

CH (00:05:21):

So unlike a child's toy or something like that, you may have software here which is required to work for 30, 40 years without attention. So that's another difference, I think, between it and the toy software.

EW (00:05:39):

So what do you do differently?

CH (00:05:44):

Yes, that's an interesting question. One of the international standards, ISO 26262, which is one of the standards for road vehicles basically, has in its second part, part two there, some examples of what makes a good safety culture in a company and what makes poor safety culture. The trouble is, of course, we are software people.

CH (00:06:09):

I mean, my work, I spend my life programming computers. We are not used to this unmeasurable concept of a safety culture. So all we can look for is examples.

CH (00:06:21):

And this is a subset, there's a page full of these in ISO 26262, but just to take a couple of examples here, heavy dependence on testing at the end of the development cycle to demonstrate the quality of your product is considered poor safety culture.

CH (00:06:40):

This is also an important one. The reward system favors cost and schedule over safety and quality. We've seen that recently in a large U.S. aircraft manufacturer, which put countdown clocks in conference rooms to remind engineers when their software had to be ready.

CH (00:06:58):

And at companies I've worked for, if you are working on a non-safety-critical system, then your bonus each year depends on whether you deliver on time. If you are working on a safety-critical system, your bonus never depends on whether you deliver on time.

CH (00:07:15):

So ... the reward system in a good safety culture will penalize those who take shortcuts. So I think the fundamental answer to your question of what is different when I'm doing safety work is the safety culture within the company.

CH (00:07:41):

And now, how you apply that safety culture to produce good software, that's another question. But yes, the safety culture is fundamental to the company and to the development organization.

EW (00:07:55):

So ... I mean, why do we even get so many failures? We hear about things failing. We hear about cars with unintended acceleration -

CH (00:08:05):

Yes.

EW (00:08:05):

- and oil rigs failing.

CH (00:08:08):

Yeah.

EW (00:08:08):

Is it just about not having the right culture?

CH (00:08:12):

It is in part about having the wrong culture and not having the right culture. But something else that's been observed fairly recently actually is the concept of SOTIF, Safety of the Intended Functionality. Are you familiar with this concept, or perhaps I could give a quick description?

EW (00:08:32):

Only from your book. Go ahead.

CH (00:08:33):

Okay. So the idea here is that, a lot of people have a lot of better examples, but this is the example I give, traditionally, the way we have looked at a safety-critical system is that a dangerous situation occurs when something fails or something malfunctions.

CH (00:08:52):

So the idea is, "This thing failed, therefore something happened, and someone got hurt." There was a study done, not that long ago, particularly in the medical world, where they discovered that 93% of dangerous situations occurred when nothing failed. Everything worked exactly as it had been designed to work.

CH (00:09:16):

And I've got an example here. I mean, it's one I made up. I can give you a more genuine one, if you wish, but let's assume that we're in an autonomous car. It's traveling on the road. There's a manual car right close behind us. Okay. A child comes down the hill on a skateboard towards the road. Okay.

CH (00:09:39):

The camera system will pick that up and give it to the neural network, or the Bayesian network, or whatever is doing the recognition. It recognizes that it's a child. Remember, this will never be 100%, but it recognizes that it's going to be a child with, say, 87% probability.

CH (00:09:59):

It could also be a blowing paper bag wandering along the road, or it could be a dog. But yeah, the camera system has correctly recognized that this child on the skateboard is a child. The analysis system correctly measures its speed as being 15 kilometers an hour.

CH (00:10:17):

Great. The decision system now rejects the identification as a child, because children do not travel at 15 kilometers an hour unless on bicycles. This child we've identified is not on a bicycle, no wheels there. So it's not a child. It is probably the blowing paper bag, which we identified earlier. And remember, that was done correctly.

EW (00:10:40):

As a human, I'm like, "No, that's not true. If you identified it as a child, then there has to be another reason. You can't just ignore the information coming in." So how do we end up in that box?

CH (00:10:55):

Well, this is the problem, that nobody thought when they were putting the system together of a child on a skateboard. Children only go at 15 kilometers an hour if they're on bicycles. So we didn't consider that. And I'll come to that in a moment, because that is also really important. We didn't consider that situation.

CH (00:11:15):

So the decision system says, "Either I'm going to hit a paper bag, or I'm going to apply the brakes hard and possibly hurt the person in the car behind me," so correctly, it decides not to brake. So the point there is that everything did exactly what it was designed to do.

CH (00:11:34):

Nothing failed. Nothing malfunctioned. Nothing went wrong. Every subsystem did exactly what it should do. But you're right in your indignation there that we forgot that children can travel at 15 kilometers an hour if they're on a skateboard.
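To make that failure mode concrete, here is a minimal Python sketch of the kind of plausibility gating Chris describes. The class names, thresholds, and probabilities are invented for illustration; they are not from any real perception or decision system.

```python
# Minimal sketch of the "everything worked, nothing failed" trap (SOTIF).
# All names and thresholds are invented for illustration.

from dataclasses import dataclass

@dataclass
class Detection:
    label: str          # e.g. "child", "paper_bag", "dog"
    confidence: float   # classifier confidence, never 1.0
    speed_kmh: float    # measured speed of the object
    on_bicycle: bool    # was a bicycle also detected?

def plausibility_filter(d: Detection) -> Detection:
    """Decision-stage rule: 'children do not travel at 15 km/h unless on bicycles'.
    Each line does exactly what it was designed to do; the flaw is that
    nobody thought of skateboards when the rule was written."""
    if d.label == "child" and d.speed_kmh > 10 and not d.on_bicycle:
        # Re-label as the next most likely benign explanation.
        return Detection("paper_bag", d.confidence, d.speed_kmh, d.on_bicycle)
    return d

def should_brake(d: Detection, follower_close_behind: bool) -> bool:
    """Don't brake hard for debris if that risks the car behind us."""
    if d.label in ("child", "dog"):
        return True
    return False

# The camera and analysis stages worked: child recognised at 87%, speed 15 km/h.
obs = Detection(label="child", confidence=0.87, speed_kmh=15.0, on_bicycle=False)
decision = should_brake(plausibility_filter(obs), follower_close_behind=True)
print(decision)  # False -- a dangerous outcome with no component failure
```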

CH (00:11:50):

Now, the concept of an accidental system, back in 2010, there was a study done where they took a ship in the North Sea off the east coast of Britain, a large ship. And they sailed it to an area where they were jamming GPS. They just wanted to see what would happen to its navigation system. That's good.

CH (00:12:16):

What happened, of course, was that the navigation system went wrong. You can find pictures on the internet of where this ship thought it was. It jumped from Ireland to Norway to who knows where; the ship was jumping around like a rabbit. That was expected. If you jam the GPS, then you expect that you're not going to get accurate navigation.

CH (00:12:37):

What was not expected was that the radar failed. So they went to the radar manufacturer and said, "Hey, why did your radar fail just because we jammed the GPS?" And he said, "No, we don't use GPS. There's no GPS in our radar." They went to the people who made the components ... for the radar. And one of them said, "Oh yeah, we use GPS for timing."

EW (00:13:07):

And that's super common. I mean, GPS has a one-pulse-per-second signal. I've used it in inertial measurement systems. It's really nice to have.

CH (00:13:15):

Yep, absolutely. And the trouble here was that that component manufacturer didn't know that their system was going to go into a radar. The radar manufacturer didn't know that that component was dependent on GPS. So we had what's called an accidental system. We accidentally built a system dependent on GPS.

CH (00:13:37):

Nobody would've tested it in the case of a GPS failure, because no one knew that it was dependent on GPS. The argument runs that a lot of the systems we're building today are so complicated and so complex that we don't know everything about how they work.

EW (00:13:58):

I understand that with the machine learning, but for your example, somebody routed the wire from the GPS to the component.

CH (00:14:07):

Presumably the component had it all integrated into it, and therefore they just plugged the component in. So this was an accidental system.

CH (00:14:16):

And the idea is that we cannot predict all of the behavior of that system. And that's SOTIF, Safety of the Intended Functionality: everything worked, nothing failed at all, but we got a dangerous situation. Everything in that radar worked exactly as it was designed to work, but we hadn't thought through the total consequences of what it was.

CH (00:14:47):

There are a lot of examples of SOTIF. Nancy Leveson gives one which is apparently a genuine example. The U.S. military had a missile at one air force base that they needed to take to a different air force base.

CH (00:15:04):

So obviously what you do is you strap it on the bottom of an aircraft and you fly it from one base to the other. But that would be a waste of time and a waste of fuel. So they decided what they would do was put a dummy missile also on that aircraft.

CH (00:15:20):

And when it got up to altitude, it would intercept another U.S. aircraft and they would fire the dummy missile at the other aircraft just for practice. I think you can see what's going to happen here. So, yes, it took off with the two missiles on. It intercepted the other U.S. aircraft. The pilot correctly fired the dummy missile.

CH (00:15:42):

That caught you. You were thinking otherwise there. But the missile control system was designed so that if you fired missile A, but missile B was in a better position to shoot that aircraft down, it would fire missile B instead. And in this case, there was an antenna in the way of the dummy missile.

CH (00:16:04):

So the missile control software decided to fire the genuine missile. It destroyed the aircraft. The pilot got out, don't worry. It's not a sad story. But again, everything worked perfectly. The pilot correctly fired the dummy missile. The missile control software did exactly what it was supposed to do. It overrode the pilot and fired the other missile.

EW (00:16:29):

See that one makes a lot more sense to me, because it was trying to be smart.

CH (00:16:35):

Yes.

EW (00:16:36):

And that's where everything went wrong. So much of software is trying to be clever, and that's where everything goes bad.

CH (00:16:43):

Yeah. And I think you could make that argument with my example with the child on the skateboard, that the system was being clever by saying, "No, that's not a child. Because it can't be a child if it's traveling at 15 kilometers an hour and it's not on a bicycle." So again, the software was trying to be smart and failing.

EW (00:17:03):

But that one, it was trying to be smart in a way that doesn't make sense to me, because things change in the future. Kids get, I don't know, magnetic levitation skates, and suddenly they're zipping all over. But yeah ... Okay. So you mentioned the safety culture, but what about tactics? How do we avoid these things? I mean, I've heard about risk management documentation.

CH (00:17:35):

Yeah. So typically we would start a project, any project that's going to be certified in any way, with what we call a hazard and risk analysis. You have to be a bit careful about these terms hazard and risk, because they differ from standard to standard. The way I use it is, the iceberg is the hazard. It's a passive thing. It's just sitting there.

CH (00:17:58):

The risk is something active. The ship may run into it. Other standards, on the other hand, would say the hazard is a ship running into the iceberg. So we have to be a bit careful about terminology, but to me, the iceberg is the hazard. The risk is running into it.

CH (00:18:17):

And so we do a hazard risk analysis on the product, and using brainstorming, and there is an ISO standard on this, identify the hazards associated with the product, and identify the risks associated with them. We then have to mitigate those risks.

CH (00:18:43):

And anything that's mitigating the risk becomes a requirement, a safety requirement, on the system. So if you take the iceberg, we may decide to paint the iceberg yellow to make it more visible. Okay, silly idea, but, yeah. The iceberg, we're going to paint it yellow.

CH (00:19:00):

So there is now a safety requirement that says the iceberg must be painted yellow. Okay. And there is then still a residual risk, that painting icebergs yellow doesn't help at night, because you can't see the iceberg anyway.

CH (00:19:17):

So we start with the hazard risk analysis, and that is the fundamental point, because it's that that is defining what the risks are, and what we're going to do to mitigate them, and what requirements we make. So typically there will be a requirement.
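For readers who like to see the bookkeeping, here is a minimal sketch of how the iceberg example might be captured as data. The field names are invented for this sketch, not taken from any standard or from BlackBerry QNX's process.

```python
# Hypothetical record of the hazard-and-risk analysis Chris describes.
# Field names are invented; real standards use their own (and conflicting) terms.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Mitigation:
    safety_requirement: str   # every mitigation becomes a safety requirement
    residual_risk: str        # what is still left after the mitigation

@dataclass
class HazardRecord:
    hazard: str               # the passive thing (the iceberg)
    risk: str                 # the active thing (running into it)
    mitigations: List[Mitigation] = field(default_factory=list)

iceberg = HazardRecord(
    hazard="Iceberg in the shipping lane",
    risk="Ship collides with the iceberg",
    mitigations=[
        Mitigation(
            safety_requirement="The iceberg must be painted yellow",
            residual_risk="Yellow paint does not help at night",
        ),
    ],
)

# The safety requirements that flow into the development process:
for m in iceberg.mitigations:
    print(m.safety_requirement, "| residual:", m.residual_risk)
```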

CH (00:19:38):

... So that's the start of your development, setting up the hazard and risk analysis. At the other end, one of the things we have to deliver is our justification for why we believe we have built a sufficiently safe system. And there are two things in that of course. What is sufficiently safe? And how do you demonstrate that you have met that? Yes.

EW (00:20:06):

So back to the skateboard and child. The hazard is the child and the risk is the chance of hitting it, hitting the child, -

CH (00:20:17):

Correct.

EW (00:20:17):

- if I understand your terminology. And -

CH (00:20:19):

Yes. Using my terminology. Yeah.

EW (00:20:21):

- we might mitigate that by saying anything that we aren't sure about, we're not going to hit. And then the residual risk still is we might brake suddenly and thereby hurt the person behind us?

CH (00:20:37):

Yes. And that is the residual risk. If you remember, there was an incident a while back in the U.S. with a woman at night walking across the road pushing a bicycle. She was hit by an autonomous car, an Uber. The car had initially been programmed in the way you stated, "If I do not recognize it, then I will stop."

CH (00:21:01):

They found it was stopping every 10 minutes, because there was something there that hadn't been anticipated. So they changed it to say, "We'll stop if we see one of these, one of these, one of these, one of these, or one of these." And a woman pushing a bicycle was not one of those. A woman riding a bicycle would've been.

EW (00:21:22):

Okay. So I just want to go to their team and say, "You can't do this." I want to throw down a heavy book and say, "You need somebody on your team who thinks through all of these problems," who actually has, well, I'm not going to say insulting things, but, "who has the creativity of thought to consider the risks," that clearly that team did not have.

EW (00:21:49):

How do we get people to that point, to that, "I'm not in the box. I want to think about everything that could happen, not just - "

CH (00:21:59):

Yes.

EW (00:22:00):

" - what does happen?"

CH (00:22:01):

What does happen. There are two things happening there. One is, last month, actually, a SOTIF standard came out with a sort of semi-structured way of considering the safety of the intended functionality.

CH (00:22:18):

But also, this safety case that I mentioned earlier, the thing that identifies why I believe my system is adequately safe, is actually going to try to answer that question. And this is one of the things that's happening to our standards at the moment, that most of what are called safety standards at the moment that we have are prescriptive.

CH (00:22:49):

They tell you what you must do. "You must use this technique, this technique, and this technique. You must not do this. You must not do that." The trouble with that is that it takes 10, 15 years to bring in a standard. And in that time, techniques change. The software world is changing very, very rapidly.

CH (00:23:07):

So basically by doing that you are burning in the need to apply 10, 15-year-old technology, which is not good. So there's a series of standards coming out now, like UL 4600, which are what are called goal-based standards, G-O-A-L, as in football goal, which say, "We don't care how you build your system."

CH (00:23:34):

"We don't care what what you've done, but demonstrate to me that your system is adequately safe." And that, I think, is where your imagination comes in, that sitting down and imagining awkward situations where, "Well, what would happen if this were to happen?"

CH (00:23:58):

UL 4600 gives a couple of good examples actually for autonomous cars, basically. And it gives one example of an autonomous car. There is a building that's on fire. There are fire engines outside. There are people milling around, lots of people in the road. Someone has used their cellphone to call their car.

CH (00:24:24):

Their autonomous car has arrived to pick them up. Now, what it's doing, of course, is it's having to go on the wrong side of the road. There are fire hoses. There are people. There are what have you. Could your autonomous car handle that situation?

CH (00:24:41):

And you're right. We need people of imagination to look at these situations. Now, the person who produced UL 4600 has also published a number of papers that say that a lot of these incidents that your car may meet during its life are long tail.

CH (00:25:05):

It is unlikely that any car, any particular car, will meet the situation of an aircraft landing on the road in front of it. But inevitably over the course of 10 years, a car somewhere will meet the incident of an aircraft landing on the road in front of it.

CH (00:25:24):

So do we teach every car how to handle an aircraft landing on the road in front of it, given that probably only one car in its entire lifetime is likely to meet that situation? So being imaginative, as you say, is great, but we have a limited amount of memory that we can give to these systems in the car.

CH (00:25:49):

Have you met a child coming down a hill towards your road on a skateboard when you've been driving? Probably not.

EW (00:25:58):

I mean, I've seen kids on roller blades, which is even less identifiable, and I live on a hill, so yes, they get going pretty fast.

CH (00:26:07):

Yeah.

EW (00:26:08):

As far as the creativity part, in your book, you mentioned that there are some standards that are starting to ask some of the questions we should be thinking about, like, "What happens if more happens?" Do you recall what I'm talking about?

CH (00:26:25):

Yep. There is an ISO standard for doing a hazard and risk analysis. I must admit that initially, it was pointed out to me by one of our assessors, and I thought it was pretty useless. But we applied it, because our assessor told us to and found it's fairly useful.

CH (00:26:44):

I can look up its number, I can't remember off the top of my head, but yes, what it does is it structures the brainstorming. So when you are in a room trying to identify hazards and risks, you are brainstorming, "Well, what could go wrong with this?"

CH (00:27:01):

"Well, maybe a child could come down a hill on a skateboard, or maybe this, or maybe this." And what the standard does is it gives you keywords, specific keywords, like less, fewer, none. So what would happen if there were no memory available on this system? It's basically a structured way of doing a brainstorming.

CH (00:27:27):

So we use that quite extensively now to do exactly that. Take a keyword such as none, too early, too late. "What if the camera system gives this input too late or too early," and things like that. But it is only really a way of structuring a brainstorming session.
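As a flavour of what that structured brainstorming looks like, here is a small sketch that crosses guidewords with system inputs to generate workshop prompts. The guidewords echo the ones Chris mentions; the exact list and wording in the ISO standard differ, so treat this purely as an illustration.

```python
# Hypothetical guideword-style prompt generator for a hazard/risk workshop.
# Guidewords and inputs are invented examples, not quoted from the standard.

GUIDEWORDS = ["none", "less", "more", "too early", "too late", "wrong value"]

INPUTS = [
    "camera frame delivered to the recognition network",
    "free memory available to the decision system",
    "brake command sent to the actuator",
]

def brainstorm_prompts(inputs, guidewords):
    """Yield one question per (input, guideword) pair for the team to discuss."""
    for item in inputs:
        for word in guidewords:
            yield f"What happens if the {item} is '{word}'?"

for prompt in brainstorm_prompts(INPUTS, GUIDEWORDS):
    print(prompt)
```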

EW (00:27:51):

I always like the question, "Let's assume that something catastrophically failed. What was it, - "

CH (00:27:57):

Yes.

EW (00:27:57):

" - that sort of backwards looking. But this creativity is important for figuring out a diverse risk analysis. But there's so much paperwork. I mean, I'm happy to be creative, and I've done the paperwork for FDA and FAA. But that paperwork is kind of, let's just say, boring to write. Why do we have to do it?

CH (00:28:28):

Right. ... Well, a lot of it can be semi-automatically generated. I think that's one of the points to be made. Producing the paperwork doesn't actually make your system any better as you appreciate. I'm a pilot. I own a small aircraft.

CH (00:28:49):

And for example, the landing light bulb is just a standard bulb that I could go down to the local car shop and buy a replacement for. But I'm not allowed to, even though it has the same type number, and the same this, that, and the other as a car bulb, I'm not allowed to buy that.

CH (00:29:10):

I have to buy an aviation grade one that comes with all of the paperwork. Where was it built? When was it done? Who sent it to whom? Who sold it to whom? And of course that bulb is five times the cost of going down to the local shop and buying one.

CH (00:29:28):

But it comes with that paperwork. And a lot of that paperwork can be generated, as I say, semi-automatically. This thing that I keep referring to, that we produce during development but deliver at the end, the safety case that tries to justify that we are adequately safe, we have templates for that.

CH (00:29:55):

And we would expect someone to apply that particular template rather than producing the paperwork from scratch every time. Now, I sometimes go as a consultant into a startup company, particularly a medical startup that's been a spinout from a university.

CH (00:30:15):

The university people know all about the medical side of it. They know nothing about the software side of it. And they've got some student-produced software that was written by some students three years ago who've now disappeared. And yes, there is a lot of back paperwork to be done. But in general, once you're onto the system, that paperwork should be semi-automatic.

EW (00:30:42):

But every standard is different. I mean, I remember the DO-178B and the FDA documentation. They had different names for everything. You mentioned risk and hazard mean different things in different standards.

EW (00:30:59):

Are we getting to the point where everybody's starting to agree? ... I mean, you work for QNX. It's a real-time operating system, so you have to do both, don't you?

CH (00:31:12):

Yep. And yep. We're used in railway systems. We're used in industrial systems. We're used in autonomous cars. We're used in aircraft systems, medical systems. It is awkward. There is a group in the UK at the moment, in the Safety-Critical Systems Club, that is trying to put together a standardized nomenclature.

CH (00:31:39):

My colleague, Dave Banham, is part of that group. I honestly don't hold out much hope for what they're doing. My feeling is that it's happened before, where we've had 10 standards on something, and we then tried to consolidate them into one. And we ended up with 11 standards.

CH (00:31:59):

Yeah. That seems to be the typical way of going. But yeah, Dave and his group are really trying to produce a common nomenclature and vocabulary for use. But no, each standard at the moment is different and uses the terms differently. It's annoying, let's say.

EW (00:32:21):

So going back to risk analysis, how do we determine if a failure is going to happen? How do we put a number on the probability of something going wrong?

CH (00:32:36):

Right. So remember, when we talk about this, that study I mentioned right at the beginning argues that only something like 7% of dangerous situations occur because something failed. SOTIF is the other side and is supposed to handle the other 93%. But you're right.

CH (00:32:55):

Almost all of the standards that we are dealing with at the moment assume failure. So we have to assess failure. So a standard like IEC 61508, which is the base standard for lots of other standards, assigns a safety integrity level to your product.

CH (00:33:15):

And the safety integrity level is dependent on your failure rate per hour. So for example, SIL 3, safety integrity level 3, means a failure rate of less than 10^-7 per hour, one in 10 million per hour. So how do you assess that, is the question? And the answer is it is not easy.
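As a rough illustration of the arithmetic behind those bands, here is a tiny sketch mapping a dangerous-failure rate onto the commonly quoted IEC 61508 high-demand/continuous-mode SIL bands. The band edges below are the figures usually cited alongside the standard, and the field history in the example is made up; check the standard itself before relying on any of it.

```python
# Rough sketch: map a dangerous-failure rate (per hour of continuous operation)
# onto the commonly quoted IEC 61508 high-demand/continuous-mode SIL bands.
# Consult the standard before using these edges for real work.

def sil_for_failure_rate(failures_per_hour: float) -> str:
    bands = [
        (1e-9, 1e-8, "SIL 4"),
        (1e-8, 1e-7, "SIL 3"),
        (1e-7, 1e-6, "SIL 2"),
        (1e-6, 1e-5, "SIL 1"),
    ]
    for low, high, sil in bands:
        if low <= failures_per_hour < high:
            return sil
    return "outside the SIL bands"

# Hypothetical field history: 200 million device-hours, 5 reported dangerous
# failures (and, as Chris notes, an unknown number that were never reported).
observed_rate = 5 / 200_000_000
print(observed_rate, sil_for_failure_rate(observed_rate))  # 2.5e-08 -> SIL 3
```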

CH (00:33:38):

It is obviously a lot easier if you have an existing product. When I first came to QNX 12, 13, oh heavens, 13 years ago, they had a product with a long history and a good history, which had been very carefully kept. So we could look at the hours of use and the number of reported failures.

CH (00:34:04):

The problem we've got is we don't know what percentage of failures were being reported. Our software is almost certainly in your car radio, for example, but if your car radio stopped working, you would turn it off, turn it on again. And we didn't get to hear about that failure.

CH (00:34:20):

If every car in the world had our software and the radio failed, we would hear about it, but would we count that as one failure or a million failures? So there were problems with that. And the way I've done this is with a Bayesian fault tree. Building up the Bayesian fault tree gives us the opportunity to go in either direction.

CH (00:34:46):

I mean, you mentioned earlier, the system has failed, what caused it, which is down from top to bottom, if you like. The Bayesian fault tree also allows us to go the other way. "If this were to fail, what effect would that have on the system?" And so you can do sensitivity analyses and things like that.
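A full Bayesian fault tree is more than a transcript can hold, but here is a toy sketch of the two directions Chris mentions: propagating basic-event probabilities up to a top event, and nudging one leaf to see how sensitive the top event is. All event names and probabilities are invented.

```python
# Toy fault tree: top event "OS drops an event", fed by an OR of two branches,
# one of which is an AND. Probabilities are invented; a real Bayesian fault
# tree would also carry conditional probability tables and field data.

def p_or(*ps):   # independent events
    out = 1.0
    for p in ps:
        out *= (1.0 - p)
    return 1.0 - out

def p_and(*ps):
    out = 1.0
    for p in ps:
        out *= p
    return out

def top_event(p_overload, p_mask_bug, p_isr_late):
    # Event missed if (system overloaded AND interrupt masked too long)
    # OR the interrupt service routine is scheduled too late.
    return p_or(p_and(p_overload, p_mask_bug), p_isr_late)

baseline = top_event(p_overload=1e-4, p_mask_bug=1e-3, p_isr_late=1e-7)

# Sensitivity in the other direction: what if the overload probability doubles?
perturbed = top_event(p_overload=2e-4, p_mask_bug=1e-3, p_isr_late=1e-7)
print(baseline, perturbed)
```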

CH (00:35:06):

So the place to start, again, I think, is in what ways could this fail? And if we take an operating system, it doesn't matter whether it's Linux or a QNX operating system, we identified that really there are only three ways in which an operating system can fail.

CH (00:35:25):

An operating system is an event handler. It handles exceptions. It handles interrupts. It handles these sorts of things. So really there's only three ways it can fail. It can miss an event. An event occurs, an interrupt comes, but because it's overloaded, it doesn't get it. It doesn't notice it.

CH (00:35:48):

It can notice the event and handle it incorrectly, or it can handle the event completely correctly, but corrupt its internal state so that it will fail on the next one. And basically it's then a trawl through the logs, and failure logs, and what have you, to find how often those failures are occurring and whether they are reducing, or whether they go up at every release.

CH (00:36:18):

My colleague Waqar and I presented a paper at a conference last year where we were applying a Bayesian network on that to see if we could predict the number of bugs that would appear in the field based on some work done by Fenton and Neil.

CH (00:36:32):

It was an interesting piece of work that we did there, I think. But yes, it is not a trivial exercise, particularly as a lot of the standards believe that software does not fail randomly, which of course it does. Yes.

EW (00:36:52):

I have definitely blamed cosmic rays, and ground loops, and random numbers on some of my errors, I'm sure. But I have a question from one of our audience members, Timon, "How do you ensure safe operation with an acceptable probability in a system that is not fully auditable down to the assembly level, for example, a complex GUI, or a machine-learning-driven algorithm?"

CH (00:37:18):

Yes. Particularly the machine-learning algorithm, I think, is a really good example there. I mean, we all know examples of machine-learning systems that have learned the wrong thing. One I know for certain, because I know the people involved. There was a car company in Southern Germany.

CH (00:37:45):

They built a track, a test track for their autonomous vehicles, and their autonomous vehicles learned to go around the test track perfectly, absolutely perfectly, great. They then built an identical test track elsewhere in Germany, and the cars couldn't go around it at all, although the test track was identical.

CH (00:38:08):

And what they found when they did the investigation was that the car on the first test track had not learned the track. It had learned the advertising hoardings. So it would turn right when it saw an advert for Coca-Cola, turn left at another advert, and so on. And the track was identical in the second case, but the adverts weren't.

CH (00:38:28):

So there was a system that could have been deployed. It was working absolutely perfectly yet it had learned completely the wrong things. And yeah, this question that you have here is, to some extent, impossible to answer. Because we have these accidental systems. The systems that we're building are so complex that we cannot understand them.

CH (00:38:54):

There's a term here, intellectual debt, which Jonathan Zittrain produced. An example of that, nothing to do with software, is that we've been using aspirin for pain relief, apparently since 18-something. We finally understood how aspirin worked in 1965 or something, somewhere around there.

CH (00:39:19):

So for a hundred years, we were using aspirin without actually understanding how it worked. We knew it worked, but we didn't know how it worked. The same thing is true of the systems that we're building with machine learning. They seem to work, but we don't know how they work.

CH (00:39:40):

Now, why is that dangerous? Well, it's dangerous with aspirin, because in that intervening period where we were using it, but didn't know how it worked, how could we have predicted how it would work and interact with some other drug? With our machine learning system, yes, it appears to work. It appears to work really well.

CH (00:40:02):

But how can we anticipate how it will work with another machine-learnt subsystem when we put these two together? And this is a problem called intellectual debt. It's not my term, it's Zittrain's term. But we are facing a large problem. And machine learning is a significant part of that.

CH (00:40:26):

But yeah, we are never going to be able to analyze software down to the hardware level. And the techniques that we have used in the past to try to verify the correctness of our software, like testing, dynamic testing, now are becoming increasingly ineffective. Testing software these days does not actually cover much. I call it digital homeopathy.

EW (00:40:55):

But machine learning is in everything. I mean, I totally agree with you. I've done autonomous cars, autonomous vehicles, and it will learn whatever you tell it to. And it's not what you intended usually.

CH (00:41:08):

Yeah. So now when you take that software and combine it with another machine-learnt system, which you also don't understand fully, to anticipate how those two will interact becomes very difficult.

EW (00:41:24):

I was a little surprised in your book that you had Markov modeling for that very reason, that it's not an auditable heuristic. How do you use Markov modeling?

CH (00:41:36):

Yeah. ... The standards, IEC 61508, ISO 26262, EN 50128 for the railways, they are prescriptive standards, as I said earlier. And they give methods and techniques, which are either not recommended, or are recommended, or highly recommended.

CH (00:42:05):

And if a technique is highly recommended in the standard and you don't do it, then you've got to justify why you don't do it, and you've got to demonstrate why what you do do is better. So in a lot of cases, it's simply easier to do it.

CH (00:42:22):

We faced this at QNX for a number of years. There was a particular technique which we thought was not useful. We justified the fact that it was not useful. In the end, it got too awkward to carry on arguing that it was not useful as a technique.

CH (00:42:37):

So we hired a bunch of students. We locked them in a room and got them to do it. It was stupid, but it just took it off our back so that next time we went for certification, we could say, "Yes, we do it. Tick, give us a tick box, please." And there's a lot of things that are in that.

CH (00:42:55):

I have used Markov modeling for various things, for anomaly detection here and there. But really the systems we're using these days are not sufficiently deterministic to make Markov modeling particularly useful. The processors we're running on, the SoCs we're running on, come with 20 pages or more of errata, which make it completely non-deterministic.

CH (00:43:33):

There was a phrase I heard at a recent conference. There is no such thing as deterministic software running on a modern processor. And I think that's a correct statement. So yeah, I would not push Markov modeling. It's in my book, because the standards require it. Maybe it won't be in the third edition.

EW (00:43:53):

The standards require it? I mean, which standards? How? Why?

CH (00:43:58):

Yeah. Yep.

EW (00:43:58):

I mean, they can't disallow machine learning, because it's not auditable and then say, "Oh, but Markov modeling is totally reasonable." The difference is minor.

CH (00:44:08):

Yeah. The problem here is these standards, as I said, are out of date. I mean, the last edition of IEC 61508 came out in 2010. There may be a new version coming out this year. We're expecting a new version this year. So that's 12 years between issues of the standard.

CH (00:44:31):

And I'm not sure how to say this politely. A lot of people working on the standards are people who did their software development even longer ago. So they are used to a single-threaded, single-core, run-to-completion, executive-type model of a processor and of a program.

CH (00:44:55):

And so they are prescribing techniques which really are not applicable anymore. And I suspect Markov modeling is one of those. This is where I think this move towards goal-based standards like UL 4600 is so useful. I don't care what techniques are used.

CH (00:45:17):

I don't care what language is used. I don't care about this, that, and the other; demonstrate to me now that your system as built is adequately safe. And I think that's a much better way of doing stuff. It makes it harder for everybody. It makes it harder for the assessor, because the assessor hasn't got a checklist.

CH (00:45:36):

"Did you do this? Yep. Did you do this? Yep. Did you do this? Did you not do that?" It makes it harder for the developers, people like ourselves, because we don't have the checklist to say, "Well, we have to do this." But it is a much more honest approach. Demonstrate to me with a safety case that your system is adequately safe.

EW (00:45:55):

Are there tools for putting those sorts of things together? Are there tools to ensure you tick all of the boxes, or is that all in the future?

CH (00:46:09):

Again, we can talk about the two sides. The tick box exercise, yep, there are various checklists that you can download. IEC 61508 has a checklist and what have you. The better approach, the other one I talk about, the safety case approach -

EW (00:46:30):

I think we did the safety case approach for FAA. I think DO-178B is more in that line, where we basically had to define everything and then prove we defined it. Is that what you mean by a safety case or goal-driven?

CH (00:46:46):

So the idea here is that we put together an argument for why our system is sufficiently safe. Now there's a number of notations for doing this. This one is called Goal Structuring Notation. And I'm using a tool called Socrates. There's a number of tools. This is a Socrates tool.

CH (00:47:05):

So what I can do here is, this is our top-level claim, we are claiming ... that our product is sufficiently safe for deployment.

CH (00:47:16):

And we are basing that claim on the fact that we have identified the hazards and risks, that we have adequately mitigated them, that we have provided the customer with sufficient information that the customer can use the product safely, and the fact that we developed the product in accordance with some controlled process.

CH (00:47:38):

Now we're only claiming the product is sufficiently safe if it is used as specified in our safety manual. And then of course we can go down, ... customer can use the product safely, jump to the subtree here, and zoom in. So what I'm claiming here is that if in 25 years time, the customer comes back with a bug, we can reproduce that problem.

CH (00:48:07):

We have adequate customer support, documentation is adequate, and so on. So the idea here is two things. First of all, all that UL 4600 and the more modern standards require is this argument based on evidence. The first thing to do is you must put the argument together before you start to look for evidence.

CH (00:48:33):

Otherwise you get confirmation bias. There's a little experiment I do on confirmation bias. You've probably done this exercise yourself, but it's one I do with people a lot. ... I'm doing a slide presentation, let's say. And I say, "On the next slide, I've written a rule for generating the next number in this sequence."

CH (00:48:55):

"You're allowed to guess numbers in order to discover the rule." Now I'm not going to ask you to do this, Elecia, because I don't want you to look like an idiot at the end of this.

EW (00:49:07):

No, I read your book. I know what to guess.

CH (00:49:09):

Okay, great. So what happens is people guess 12, 14, 16, and I say, "Great. Yep. Those numbers all work."

EW (00:49:18):

On your slide, do you have 2, 4, 6, 8, and 10? Okay.

CH (00:49:22):

So you must now guess what the rule is for generating the next number in this sequence. And so to do so, you can guess numbers. And so typically people will guess 12, 14, 16, and I will say, "Yep, they work. So what's the rule?" And they say, "Well, it's even numbers." And I say, "Nope, not even numbers. Guess a few more."

CH (00:49:44):

And they go, "18, 20, 22?" "What's the rule?" "Well, plus two." "Nope, it's not plus two." And this goes on for some time. I've been up all the way to 40, and 44, and things with some customers until somebody guesses 137, just to be awkward. And I say, "Yep, that works." And it then leads us to what the rule is.

CH (00:50:10):

The rule has to be that each number must be larger than the previous one. The problem that this identifies is what's called confirmation bias, which we're all, as human beings, subject to. If you think you know the answer, you only look for evidence that supports your belief.
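For readers who want to try Chris's confirmation-bias exercise on themselves, here is a minimal sketch of the game. The hidden rule is the one revealed in the anecdote (each number must be larger than the last), and the guesses mirror the session he describes.

```python
# The 2, 4, 6, 8, 10 game: the hidden rule is only "each number is larger than
# the previous one", but guessers who believe "even numbers" or "plus two"
# keep proposing numbers that confirm their theory rather than test it.

def fits_rule(sequence, candidate):
    """Hidden rule: the candidate must simply be larger than the last number."""
    return candidate > sequence[-1]

sequence = [2, 4, 6, 8, 10]
for guess in [12, 14, 16, 18, 20, 22, 137]:
    ok = fits_rule(sequence, guess)
    print(f"guess {guess}: {'works' if ok else 'does not work'}")
    if ok:
        sequence.append(guess)

# Every guess that fits the "even numbers" theory works, which feels like
# confirmation; only guessing something the theory forbids (an odd number
# such as 137) actually tests it.
```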

CH (00:50:33):

If you believe that this is even numbers, you only look for evidence that it is even numbers. This was identified by Francis Bacon back in the 17th century. It was rediscovered, if you like, fairly recently, and we applied this to some of our safety cases.

CH (00:50:54):

And we started finding all sorts of additional bugs. Instead of asking people, "Produce me an argument to demonstrate that this system is safe," now, if you ask someone to do that, what sort of evidence are they going to look for? They're going to look for evidence that the system is safe.

CH (00:51:10):

So we said, "Look for evidence that the system is not safe, and then we will try to eliminate those." By doing that, we found an additional 25 or so problems that we had never noticed in our safety cases previously. So we took that to the standards bodies that produced the standards for this Goal Structuring Notation.

CH (00:51:34):

And now that doubt has been added to the standard. So basically the idea is, we put together an argument. We argue the customer can use the product safely. We argue that we have identified the hazards and risks, that we have done this.

CH (00:51:54):

And we take that to all the stakeholders and say, "If I had the evidence for this argument, would that convince you?" And typically they'll say, " Well, it's good, but we'd also like this. And we'd also like that." And you can build that. So we build the argument. Only then do we then go and look for evidence.

CH (00:52:18):

So here, for example, we come through, the residual risks are acceptable. And the sub-claim of that is that there is a plan in place to trace the residual risks during the customer's use of the product. If the customer uses our product in an environment we did not expect, and there are new risks, then we have a plan in place to trace those.

CH (00:52:43):

So now please show me your evidence that that is true. So first of all, we put together the argument, we agree on the argument structure, and only then do we go to look for the evidence. And what we'd like to be able to do is put doubt on that.

CH (00:53:01):

So, okay. You've got a plan, but has that plan ever been used? Has that plan actually been approved? Do your engineers actually know about that plan? Put all of the doubt you possibly can into this safety case. And I think then we have a justification for saying this product is sufficiently safe for deployment.

CH (00:53:24):

And as you were saying, nothing to do with the fact that I used this technique, or I used Markov modeling, or I did this or the other, it is the argument that says why I think my product is sufficiently safe.
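Goal Structuring Notation tools like Socrates are graphical, but the shape of the argument can be sketched as data. The fragment below mirrors the claims Chris walks through and adds a "doubts" field for the counter-evidence step; it is a hypothetical sketch, not the GSN or Socrates file format.

```python
# Hypothetical sketch of a goal-structured safety argument: claims decompose
# into sub-claims, leaves point at evidence, and each node carries the doubts
# we deliberately raise against it.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Claim:
    text: str
    subclaims: List["Claim"] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    doubts: List[str] = field(default_factory=list)

safety_case = Claim(
    text="The product is sufficiently safe for deployment (per the safety manual)",
    subclaims=[
        Claim(
            text="Hazards and risks have been identified and adequately mitigated",
            subclaims=[
                Claim(
                    text="Residual risks are acceptable",
                    evidence=["Plan for tracing residual risks during customer use"],
                    doubts=[
                        "Has the plan ever been exercised?",
                        "Has the plan been approved?",
                        "Do the engineers know the plan exists?",
                    ],
                )
            ],
        ),
        Claim(text="The customer has enough information to use the product safely"),
        Claim(text="The product was developed under a controlled process"),
    ],
)

def unresolved_doubts(claim: Claim):
    """Walk the argument and list every doubt, so none gets quietly forgotten."""
    yield from ((claim.text, d) for d in claim.doubts)
    for sub in claim.subclaims:
        yield from unresolved_doubts(sub)

for claim_text, doubt in unresolved_doubts(safety_case):
    print(f"{claim_text} :: {doubt}")
```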

EW (00:53:38):

In your book, you had a disturbing section with a lawyer, talking about liability for engineers.

CH (00:53:50):

Oh, yes. I don't know whether I mentioned the anecdote with [Ott Nortland], possibly I did. For those who haven't read the book, I was at a conference a few years ago, a safety conference, and we were all standing around chatting as you do. And one of the people there was [Ott Nortland], who's well-known in the safety area.

CH (00:54:12):

And he said something that hit me, hit me like a brick. He said he had a friend who is a lawyer. That lawyer often takes on cases for engineers who are being prosecuted because their system has hurt somebody or the environment. And [Ott] said that his friend could typically get the engineer off and found innocent if the case came to court.

CH (00:54:45):

But often the case does not come to court, because the engineer has committed suicide before the case reaches court. I'm not sure if that's the anecdote you were thinking of, Elecia, but as you can imagine, it stopped the conversation as we all sort of started to think about the implications of this.

CH (00:55:07):

But it does make you realize that the work we're doing here is real, that people do get hurt by bad software, and the environment does get hurt by bad software. So, yeah, it is real. And there's a lot of moral questions to be asked as well as technical questions.

EW (00:55:31):

And as somebody who has been in the ICU and surrounded by devices, I want that documentation to have been done. This risk analysis, it can be very tedious. For all that we're saying, there's a creativity aspect to it. But all of this documentation, well, the standards prescribe what you're supposed to do. The goal is to make sure you think things through.

CH (00:56:01):

Yes, ... there are various ways of looking at the standards. I was in the ICU earlier this year. I broke my wrist on the ice. And although I knew about it intellectually, I was horrified to see it practically, the number of Wi-Fi and Bluetooth connections coming from these devices that were all around me. Were those systems designed to work together in that way? I don't know. But yeah, there are different ways to look at the standards.

CH (00:56:39):

I don't like prescriptive standards, as I've probably indicated in the course of this. However, the prescriptive standards do give guidelines for a number of types of people.

CH (00:56:55):

As I say, that startup, that spinout from the university that has had no product experience basically, they really could use those standards, not as must do this, must do this, must do that, but as a guideline on how to build a system. And I think no question there, these are good guidelines in general.

CH (00:57:16):

They may be a little out of date, but they're good guidelines. They're certainly better than nothing. The other way of looking at these safety standards is that, although each one says on the cover that it is a functional safety standard, building a safe product is trivially easy.

CH (00:57:37):

The car that doesn't move is safe. The train that doesn't start moving is safe. The aircraft that doesn't take off is safe. As soon as we make the product useful, we make it less safe. As soon as we let the car move, it becomes less safe.

CH (00:57:55):

So I like to think of these standards sometimes as usefulness standards. They allow us to make a safe system useful. And I think if you approach them in that manner, then it answers, I think in part, your concern about your devices in your intensive care unit and what have you, how they can be used.

CH (00:58:19):

But yes, certainly some level of confidence like that safety case I spoke about, that the product is sufficiently safe for deployment in a hospital environment, with other pieces of equipment with Wi-Fi and Bluetooth around it, and used by staff who are semi-trained, in a hurry, and tired at the end of a long day, that should be documented and demonstrated. Yeah, I'd agree wholeheartedly.

EW (00:58:49):

And documented and demonstrated because we want engineers and managers to think about that case.

CH (00:59:00):

Yes. But this, again, comes back to what you were saying earlier about imagination. My wife often says, when she looks over my shoulder at some of these things, "You need someone who is not a software engineer to be thinking up the cases where this could be deployed, where it could be used.

CH (00:59:25):

"Because you engineers, Chris, you are not sufficiently imaginative." It's not something that engineers do, is be imaginative in that way. And so, yeah, it is a problem, but ultimately we are not going to be able to foresee and take into account every situation.

CH (00:59:43):

But certainly, if you look at those medical devices in the hospital, you'll find most of them have some sort of keypad on so that the attendant, or nurse, or doctor can type in a dose or something. If you look at them, you'll find half of the keypads are one way up like a telephone.

CH (01:00:00):

The others are the other way up like a calculator, either zero is at the top, or zero is at the bottom. Now in the course of a day, a nurse will probably have to handle 20 of these devices, all of which are differently laid out with different keyboard layouts and all that sort of stuff. That should have been standardized. That is setting you up to make a mistake.

EW (01:00:29):

It's not that we're making things safe that way, we're actually designing them in a way that will cause a problem in the name of intellectual property and lack of standards. Because following a standard is a pain. I mean, it's not as much fun as designing software from the seat of the pants.

CH (01:00:55):

Nope. It is much more fun to sit down and start coding. Yeah, I agree wholeheartedly. I am a programmer. I work all day in Ada, and Python, and C. And yep, it is much more fun and much less efficient to sit down and start coding.

EW (01:01:18):

I have a question from a viewer. Rodrigo DM asked, "How do you develop fault-tolerant systems with hardware that is not dedicated to safety-critical design, for example, an Arm M0+?"

CH (01:01:35):

Yeah. So as I've said, the hardware is always going to be a problem. The only way, really, you can do that, if what you're looking for is a fault-tolerant system, is duplication or replication, preferably with diversification. This is painful because, of course, it costs more money. Because you're going to put two of them in.

CH (01:02:06):

So you've got to say either, "I am going to have sufficient failure detection, that I can stop the machine and keep it safe, or I'm going to have replication. I'm going to put two processors in," or whatever. I had a customer a while back who was taking replication or diversification to an extreme.

CH (01:02:32):

They were going to put one Arm processor, one x86 processor, ... one processor running Wind River, one processor running QNX, and so on. And I asked the question, "Why are you diversifying the hardware?" The answer was, "Well, because the hardware has bugs. We know that. There's 20 pages of errata."

CH (01:02:56):

And I said, "Well, yeah, but these are heisenbugs. These are not bohrbugs. These are random bugs." The last time I can remember a bohrbag, a solid bug in a processor was that x86 Pentium processor that couldn't divide, if you remember the Pentium bug back in 1994, 1995.

CH (01:03:19):

If the processor is going to fail, it is likely to fail randomly, in which case two processors of the same type are almost certainly going to be sufficient, or even one processor running something like virtual synchrony, which would detect the fact that the hardware error has occurred and then take the appropriate action, which may be simply running that piece of software again.
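Here is a minimal sketch of that replication-with-comparison idea: run the same safety-relevant computation twice and refuse to act if the results disagree, retrying once on the assumption that the fault was a random hardware upset. The function names and the fallback behaviour are invented; real systems add voting, timeouts, diversification, and a defined safe state.

```python
# Minimal sketch of replicated execution with comparison: compute twice (on two
# cores, two processors, or simply twice in time) and compare. With a
# deterministic function and no fault injection the replicas always agree;
# this only shows the structure of the pattern.

def braking_force(speed_kmh: float, distance_m: float) -> float:
    # Placeholder for the real safety-relevant control computation.
    return max(0.0, (speed_kmh ** 2) / (2.0 * max(distance_m, 0.1)))

def replicated(compute, *args, retries: int = 1):
    """Run the computation twice; on mismatch (a presumed random hardware
    fault), retry, and finally fall back to the safe state."""
    for _ in range(retries + 1):
        a, b = compute(*args), compute(*args)
        if a == b:
            return a
    raise RuntimeError("replicas disagree: go to the safe state (stop the machine)")

print(replicated(braking_force, 50.0, 20.0))
```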

CH (01:03:47):

And I know there's a couple of companies, particularly in Southern Germany, using coded processing to do the safety-critical computation so that you can check implicitly the correctness of whether the computation has been done correctly. And the argument is, "You can have as many hardware problems as you like which don't affect my safety."

CH (01:04:14):

"I don't care. Bit-flipping memory is going to occur every hour, but that's fine if it doesn't affect my computation. So if I use something like coded processing, where I can check that the computation was done correctly to within one in a million, say, then I don't care about those hardware problems."

CH (01:04:34):

But again, you've got to justify that. And that is the way I've seen one of our customers do it, with coded processing using then non-certified hardware.
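Coded processing comes in several forms; one of the simplest to illustrate is an AN-code, where values are multiplied by a constant A so that a random corruption is very unlikely to land on another valid codeword. This sketch is illustrative only and is not how the companies Chris mentions implement it.

```python
# Toy AN-code sketch: encode integers as value * A. Sums of codewords are still
# multiples of A, so a random corruption of memory or an ALU result is detected
# with probability roughly (A - 1) / A. Real coded-processing schemes add
# per-variable signatures and timestamps; this is only the core idea.

A = 58659  # arbitrary constant for the demo; choosing a good A is part of a real design

def encode(x: int) -> int:
    return x * A

def decode(cx: int) -> int:
    if cx % A != 0:
        raise RuntimeError("coded-processing check failed: corrupted value")
    return cx // A

# Safety-relevant computation done entirely on coded values:
ca, cb = encode(120), encode(35)
c_sum = ca + cb                 # addition commutes with the coding
print(decode(c_sum))            # 155

# A random bit flip in the coded result is (almost always) caught:
corrupted = c_sum ^ (1 << 7)
try:
    decode(corrupted)
except RuntimeError as e:
    print(e)
```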

EW (01:04:46):

And back to Timon, do you have tools for doing risk analysis? Are there specific things you use?

CH (01:04:59):

I don't know of a good tool. If anybody does, then please let me know. Over the years we have built tools to allow us to do this, but they're internal Python scripts to do this, that, and the other. No, I don't know of a good tool for doing risk analysis. Sorry about that.

EW (01:05:20):

I'm kind of sad. Yes.

CH (01:05:21):

Yeah, yeah. That's right.

EW (01:05:24):

Part of the goal of much of the documentation involves traceability, where, as you were showing, you have a safe product, and that breaks into multiple things that include the safety manual, which breaks into multiple things. Do you have a tool for that?

CH (01:05:44):

For the traceability?

EW (01:05:45):

Yes.

CH (01:05:46):

Yeah. Yeah. So this is something that most of the processes demand. ASPICE, for example, demands this, as does CMMI. The tracing of a requirement to the design, to the lower-level requirements, to the code, to the test cases, and the verification evidence that you have for that.

CH (01:06:09):

What we have found there, this is going to sound silly. This is going to sound really silly, but we use LaTeX for all of our document preparation. The beauty of LaTeX is that it's textual. It is just ASCII text. So basically we can embed comments in our documents.

CH (01:06:32):

And then at the end of the product, when we have to produce the complete table of traceability, we simply run a Python script over those documents. It reads those structured comments, and it produces the document automatically.

CH (01:06:48):

So, if on the other hand we were using some proprietary documentation tool, like Microsoft's Word or something like that, I don't believe we could do that. And I'm not sure how you would do that. You would have to keep that as a separate document manually. But the nice thing about LaTeX, just ASCII text, you can run the Python script on it.

CH (01:07:09):

You can pull out all this stuff and produce these really, really, really boring documents that tell you that safety requirement number 1234 is in paragraph 3.2.7.4 of the design documentation, and it relates to lines 47 to 92 of this particular C module and so on. Because all of that is just ASCII text. So that's the way we've done it.
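The mechanism Chris describes can be sketched in a few lines: structured comments embedded in the LaTeX (and C) sources, and a script that harvests them into a traceability table. The "TRACE:" comment syntax and the file layout here are invented for this sketch, not BlackBerry QNX's actual convention.

```python
# Hypothetical traceability harvester in the spirit Chris describes: the LaTeX
# sources carry structured comments, and a script collects them into a table.

import re
from collections import defaultdict
from pathlib import Path

# Example comment inside design.tex:
#   % TRACE: SR-1234 -> design 3.2.7.4
# Example comment inside scheduler.c (harvested the same way):
#   /* TRACE: SR-1234 -> scheduler.c:47-92 */
TRACE_RE = re.compile(r"TRACE:\s*(?P<req>SR-\d+)\s*->\s*(?P<where>.+?)\s*(\*/)?$")

def harvest(paths):
    """Return {requirement id: [places it is satisfied or verified]}."""
    table = defaultdict(list)
    for path in paths:
        for line in Path(path).read_text(errors="ignore").splitlines():
            m = TRACE_RE.search(line)
            if m:
                table[m.group("req")].append(m.group("where"))
    return table

if __name__ == "__main__":
    sources = list(Path(".").rglob("*.tex")) + list(Path(".").rglob("*.c"))
    for req, places in sorted(harvest(sources).items()):
        print(req, "->", "; ".join(places))
```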

EW (01:07:34):

Well, I meant to only keep you for about an hour, and we've gone a bit over. One more question about languages. You mentioned Ada, which is a language that has the idea of contracts and provability. And you mentioned C, which has the reputation of being the wild, wild west of unsafety.

CH (01:07:58):

Yes.

EW (01:07:59):

Which languages do you like? Which languages should we be using? And how do we figure that out?

CH (01:08:07):

Yep. This is interesting. The standards themselves deliberately ... recommend not using C. So if you look in IEC 61508, it gives a list of languages that it likes and dislikes. And it specifically says, "Don't use C." What it does say is you can use C if you use a subset of C and that subset is checked by some form of static analysis.

CH (01:08:39):

So for example, you might take the MISRA subset of C and you might use Coverity or something like that, or Klocwork to check that you are using that. And to be fair, our operating system is written in C.

CH (01:08:56):

Every time I go to a conference, I find that there's a whole load of people there selling products to try to make C better, to check that C is doing this and that you are not doing this in C and you're not doing that. And I feel that we are getting to the point where we've got to stop putting lipstick on the C pig and go elsewhere. Now, where do you go elsewhere? Now, that's a good question.

CH (01:09:30):

I just put up on the screen a list of some of the things that I feel you could discuss about what you need in a language, Ada and SPARK Ada in particular. Yes, we have the formal proving, and we have a customer I'm working with at the moment who is using SPARK Ada. And that's great.

CH (01:09:47):

The other one on the horizon at the moment, well, D was on the horizon for a while, but now it's Rust that seems to be coming up. I had a bad experience with Rust. I teach a course, and this is one of the slides from it. A few years ago, I wrote a Rust version of a very bad C program that I have, one with a race condition, and Rust would not compile it.

CH (01:10:14):

That was great. That was exactly what I wanted. Rust refused to compile this badly structured program. A couple of months later, I was giving that course again, and I said, "Look. Watch, and I'm going to compile this with Rust and show you how it doesn't accept it." It accepted it. The compiler had been changed.
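(For readers who haven't written one deliberately, here is a sketch of the kind of data race Chris is describing, written in Python to match the other examples in this transcript rather than in the C he actually used. Python will happily run it and can lose updates; Rust's ownership rules are designed to reject the equivalent shared-mutable-state pattern at compile time, which is the behaviour Chris was counting on.)

```python
# Sketch of a data race: two threads do an unsynchronised read-modify-write on
# shared state, so the final count depends on thread scheduling.
import threading

counter = 0

def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        value = counter   # read
        value += 1        # modify
        counter = value   # write -- another thread may have written in between

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With a lock around the read-modify-write this would always print 400000;
# without one, updates are typically lost and the number varies from run to run.
print("expected 400000, got", counter)
```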

CH (01:10:33):

I repeated the same thing about six months later. And this time Rust gave me a warning message about something different. That's the problem with Rust. It's not yet stable. And as I'm sure you're aware, the Rust documentation itself actually says that this is the best documentation we have. It is not yet really suitable. It's not stable.

CH (01:10:59):

So basically you are missing a long history for the compiler and linker, and a stable history of the product. And yeah, there's a lot of other things you can talk about on languages. As I say, I write a lot of Ada and particularly SPARK Ada, and we're working closely with AdaCore on that. But AdaCore is now, of course, supporting Rust as well. And I think Rust may be the future eventually, once it stabilizes.

EW (01:11:32):

Once it stabilizes.

CH (01:11:32):

Yeah.

EW (01:11:33):

That's always been my caveat as well.

CH (01:11:37):

Yeah.

EW (01:11:39):

And you said at the top that there are plenty of opportunities in this area. If someone wants to work in safety-critical systems, one, how do they get into it, and two, what skills do they need to develop first?

CH (01:11:57):

That's a really good question. Yes. Yes, it is a growth area. And what's more, it is an interesting area, because of all of the things we've been discussing today: the language support, accidental systems and how we handle them, whether we should be looking at SOTIF, or whether we should be looking at failure.

CH (01:12:20):

There's a lot of research going on, a lot of interesting stuff going on. The problem is that it's basically an old man's game. And yeah, when I go to conferences, which I do quite regularly, I think I probably lower the average age of the people by attending, which is really worrying. Yeah.

CH (01:12:43):

And most of the people giving the presentations, most of the people at conferences, are men. And I think that's got to change. There was a really useful thing at the Safety-Critical Systems Congress last year, where a young woman stood up and gave a presentation on how they're intending to make this more inclusive, but it hasn't happened yet.

CH (01:13:09):

The trouble is education. I was giving an IEEE talk, some years ago now, and I had an audience full of academics. And I said, "Okay. Which of you teach some form of computing?" And most of them put their hands up. "Okay. Which of you teach embedded computing?" And I saw a few put their hands up.

CH (01:13:29):

"How many of you teach anything to do with safety-critical embedded programming?" And there's one university, University of Waterloo, a chap from there put his hand up. So this is not being taught in the universities and therefore it is coming up the hard way.

CH (01:13:48):

And so I think the way to do it is you just go and get in. We are looking for people at the moment. Everybody is looking for people. ... There are three levels of skill that people need. There's skill in software engineering in general. There is skill in the particular vertical area, whether that's railway trains, or medical devices, or whatever. And there is then skill in the safety-critical stuff.

CH (01:14:20):

And I think any company that's looking for people is going to be looking for at least two of those. You're not going to get all three. And so, yeah, you can read books like mine. It's not going to really help that much. You've got to go out and do it.

CH (01:14:40):

So I think becoming familiar with the embedded software world, as Elecia teaches, and what have you, and then becoming familiar with a vertical market, whether that's aviation, or autonomous cars, or something like that, and then go and just apply.

EW (01:15:01):

And do you have any thoughts you'd like to leave us with?

CH (01:15:04):

Well, I think to some extent it was what I've just said, but I think it's worth just repeating. This is a growth area. This is an exciting area. There's lots of research going on at the moment, on digital clones and all sorts of things.

CH (01:15:25):

This is an area where we need young people who are going to take it to the next level. And so let people like myself retire, even get out of the industry, and start lowering the average age at conferences. Yes. Yeah.

EW (01:15:44):

Our speaker has been Chris Hobbs, author of "Embedded Software Development for Safety-Critical Systems." Chris, thank you for being with us.

CH (01:15:53):

Well, as I say, thank you for the invitation. I enjoyed myself. And if there are any further questions, and if they can be made available in some way, I'm very happy to try to address them of course.

EW (01:16:05):

Alright. We will figure that out. I'd like to thank Felipe and Jason from Classpert for making this happen, and our Making Embedded Systems class mentor, Erin, for putting up some of those helpful links to the standards we talked about.

EW (01:16:26):

I'm Elecia White, an instructor for Classpert, teaching the course Making Embedded Systems. And I have a podcast called embedded.fm where we will probably be hearing this interview. Thank you so much for joining us. And we hope you have a good day.