Episode 3
Beans, Band-Aids, and Bullets: How Your Data Can Work for You
Listen in as Carolyn and Willie find out the true power of data. Sean Applegate, CTO of SwishData, explains how data can be utilized across an entire mission to empower the warfighter.
Episode Table of Contents
- [01:00] Smart People With Great Ideas
- [07:29] Your Data Can Work for You up to Some Extent
- [14:15] Securing the Application So Your Data Can Work for You
- [22:08] Eight Guiding Principles
- [28:03] Applicability of AI
- Episode Links and Resources
Smart People With Great Ideas
Carolyn: I'm Carolyn Ford and this week, my guest co-host is Willie Hicks, public sector CTO at Dynatrace. I'm super excited that we get to talk to Sean Applegate, CTO of SwishData.
Sean: I'm excited to be here, it should be a blast.
Carolyn: Honestly, this is the best part of my week. This is the best part of what I do. I love talking to really smart people with great ideas about how technology can better our lives and how the government specifically can do that. So, Sean, you've written a lot of stuff. You're a pretty prolific writer, blogs, articles, and a recent blog that I saw, I'm not going to lie, it kind of broke my head. It was a lot of technical stuff, but there were a couple of things in it that were kind of gotchas for me. I'd love for you to drill down into them a little bit.
The name of the blog is Optimizing Mission Outcomes with Intelligent Insights. In one of the beginning paragraphs, you say transforming the DoD to a data-centric organization requires that data is visible, accessible, understandable, linked, trustworthy, interoperable, and secure. So I would love for you to dive into that.
Sean: I would say the one thing that DoD is noticing, and you'll see this with some of their DevSecOps reference architectures is it requires culture change. Whether that's the business leaders or the mission leaders, the contractors, the developers, the people running infrastructure, or delivering a service.
Your Data Can Work For You But It Has To Be Accessible
Sean: They've identified that the data has to be accessible across all of those different parts of the mission. Getting that data collectively together is extremely important. It's valuable for both mission velocity and a competitive advantage around the world, whether that's DoD or civilian agencies, we see that as well. So data is critical, but you have to be able to find it first.
Carolyn: If you've got the data, what do you mean it's not accessible? Do you mean like across agencies or across groups?
Sean: A lot of it is making it accessible not just within your command, but outside the command, so it's trusted. For example, take an application performance management issue. I'm delivering an application, and I have lots of stuff on the application. I may not have a lot of stuff on the user community, or maybe somebody wants to analyze the success of my mission. That mission can be measured lots of different ways.
How do I merge those data points together so I can make a business decision from them that's very impactful? That may be something at a very strategic echelon such as the Pentagon. Or maybe very tactical, down at the tip of the spear, the unit deployed overseas. I need to make a decision right this minute. How do we do that? That's very complex.
Carolyn: One of the things that gets bounced around a lot these days, you guys have both talked about AIOps. Using AIOps to get us to this place, all of these things that you list. Can you talk about how AIOps enables this?
Your Data Can Work for You Through Problem Solving Complex Things
Sean: On the AIOps side, what we find is, it allows our human workers to better focus on problem-solving and the complex things you can't easily do with a computer. The AI piece allows us to typically make linkages. If you think of linked data, it's the dependencies between data points or systems. In many cases, when we look at application performance management, a user might have an issue on the front end. We have to go, what was that issue? Is it the network, is it the desktop? Is it the application web front end, or is it deep back in the database?
Being able to draw that picture out in the end so you can analyze what their dependencies are and understand them, and then do the root cause analysis to figure out where the problem is at is absolutely critical so we can solve those problems faster. And that's really what it's about: solving problems quicker, or building better performing team systems so that we can achieve our mission and make citizens happy or empower the warfighter at the tip of the spear.
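As a rough illustration of the dependency analysis Sean describes, here is a minimal Python sketch that walks a dependency map and reports the deepest unhealthy components as likely root causes. The component names, the `DEPENDENCIES` map, and the `find_root_causes` helper are all hypothetical; real AIOps platforms discover these dependencies automatically and use far richer signals.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: each component lists what it depends on,
# following the path from the user's desktop down to the database.
DEPENDENCIES: Dict[str, List[str]] = {
    "user_desktop": ["network"],
    "network": ["web_front_end"],
    "web_front_end": ["database"],
    "database": [],
}


def find_root_causes(start: str, unhealthy: Set[str],
                     deps: Dict[str, List[str]]) -> List[str]:
    """Return unhealthy components reachable from `start` whose own
    direct dependencies are all healthy (the likely root causes)."""
    root_causes: List[str] = []
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        children = deps.get(node, [])
        # A component is a candidate root cause only if nothing it
        # depends on is also reported unhealthy.
        if node in unhealthy and not any(c in unhealthy for c in children):
            root_causes.append(node)
        stack.extend(children)
    return sorted(root_causes)


# The front end looks broken, but the database beneath it is the real problem.
print(find_root_causes("user_desktop", {"web_front_end", "database"}, DEPENDENCIES))
```

In this made-up scenario the web front end and the database both report problems, and the sketch points at the database because the front end depends on it.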
Carolyn: I was going to ask you to jump in because the whole linkage thing, that's kind of boring when I want to talk about AI. I want to talk about the Terminator or data from Star Trek.
Willie: And Jarvis.
Carolyn: Exactly. Willie, first, level set us on AI. I'll try to keep my fantasies out of the podcast and then respond to what Sean said.
Willie: Just kind of a level set on AI. It's more of a question. All through these conversations with people, there's always a lot of misconceptions about AI, what AI is.
A Narrow Approach to AI
Willie: A lot of names get thrown around: machine learning, versus are we talking about more of a discrete AI model. All of these different types of AI models that might be out there. We haven't reached what you're talking about with this data-centric view and kind of utilizing AI. I see that more as a very narrow approach to AI. Very narrowly focused on that skillset and not what people think of as general AI, which we haven't reached yet. You hear about IBM Watson, that's a long way from doing the Terminator. That's kind of what I see. Is that the case?
Sean: That's a good summary of it. If we looked at solving problems quickly with technology or making things better with technology, if you consider that artificial intelligence, you can really do things with basic AI today. So if you thought of, I found a problem, the next logical step might be, can I fix it automatically? Or can I build a little bot that can go fix it automatically? We're starting to see that with things like robotic process automation, for things that maybe aren't easily scriptable.
But the citizen developer might be able to build that process into their day-to-day job. That job might be IT operations or application development or running some infrastructure. We've done that in the past with some existing government clients, maybe writing something to analyze complex analysis. When you think of site reliability engineering, you could write some really basic AI or machine learning scripts. Where you could analyze dependencies across functions that you need to monitor in your job that are unique.
Your Data Can Work for You up to Some Extent
Sean: That industry can't do themselves, and you can take something like TensorFlow or PyTorch, do some analysis of basic data sets, and roll your own to some extent. Or unlock those things in a cloud, if you have access to something like an Azure or an AWS where you can do some AI things in the cloud, but you can easily get the data that's accessible and understandable and snap it into that fairly easily.
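To give a sense of how small a "roll your own" analysis of a basic data set can be, here is a minimal sketch that flags outliers in response-time measurements using a z-score. It deliberately uses Python's standard `statistics` module as a stand-in for TensorFlow or PyTorch, and the sample numbers, the `find_anomalies` helper, and the threshold are assumptions made up for illustration.

```python
from statistics import mean, pstdev
from typing import List


def find_anomalies(values: List[float], threshold: float = 2.0) -> List[int]:
    """Return the indexes of values that sit more than `threshold`
    population standard deviations away from the mean."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical response times in milliseconds; one request is clearly slow.
response_times_ms = [120, 118, 125, 122, 119, 121, 900, 117]
print(find_anomalies(response_times_ms))
```

The threshold is set to 2.0 rather than the more common 3.0 because, with only eight samples, a single outlier can never be more than about 2.65 standard deviations from the mean.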
Willie: Just touching on then taking that to what we're talking about from your blog. I did not read the subsequent DoD document, which was about 700 plus pages. Thank you for that breakdown. What I saw there and why I think AI clicked is that you're talking about at first, the data. We're just talking about data, but we're talking about the full, you're talking about infrastructure. You're talking about the comms between all of these data points. How we secure access, everything needs to be CAC enabled and authenticated and all that good stuff.
But also it came to mind, I've been reading and talking a lot lately about the fall of JEDI, which sounds like a movie. But the new JWCC, the new contract that is coming out as a replacement for JEDI. A lot of that is going to drive this new Joint All-Domain Command and Control, the JADC2 initiative, where it's all data-centric. That's tying a lot of data from the battlespace and the tactical edge with a lot of what I don't think we talked about yet.
Sensor Data
Willie: Even a lot of this is going to be sensor data. We started talking about IoT. You started talking about bringing all of this data in from what could be tens of thousands, hundreds of thousands, maybe millions of data points. Something has to parse all of that information to get the right relevant information to the battle commanders and the people who need it, whether that's allied forces or whoever subscribes to that data. That's where AI also needs to be leveraged, especially when we're talking about what you wrote in your article. Am I close to accurate?
Sean: Absolutely. In fact, there's a program called Advana, which is the DoD program. There's a bunch of approved tools for big data and AI that are included in that. One of the biggest challenges is how do you do that at scale? They're at the very early part of that journey where they figure out, how am I going to do this? They are doing it at the OSD level today, but how do I do that at a mid-tier command? Or how do I do it at the edge of the battlespace? How do you do it in a jet as you're flying and you have to get telemetry off the jet at the end of the mission and analyze it as part of the mission?
Those are not small challenges when you think of the massive amount of data across the Department of Defense. How do you make sense of that as a community? Part of that is getting that data into a data lake or a data warehouse somewhere where people can access it. And then do things with it that are valuable because that data has value to it.
Your Data Can Work for You During Time-Sensitive Situations
Sean: Often that's a time-sensitive situation where you need to analyze it within minutes or days, not months or years.
Carolyn: JC2. I love the idea of it. Define JC2 for me, Willie.
Willie: The joint all domain kind of command and control initiative, or it's JADC2. It's this idea born out of the communication signaling part of the military where they have all of this data and Sean is spot on. I'm not a military person, not trying to pretend like I'm military. But I can see, the vision here is that the wars of the future are not just going to be fought with bullets and putting steel on target. But it's also who's going to have control of the data and that space, and who's going to be able to find answers and execute on a mission faster.
This is a new arms race, this is why AI is so important. This is why you see all of this talk about AI and how the Department of Defense and the US need to be really focused on our AI capabilities. We can bring to bear the technology we're going to need to analyze all of this data from the data space. To Sean's point, this data could be coming from land, sea, under the sea, from literally soldiers on the ground who are wearing sensors.
It could be coming from satellites, weather. All of this data has to go in. Again, not being military, but I understand that all of these pieces have to be aligned to get a good view of the battlespace, to understand where you need to have your troops and how you need to have them there.
How Best Decisions Are Made
Willie: What do they need to be equipped with? All of these things need to be understood so the best decisions can be made, and the advantage goes to whoever can make those decisions the fastest. To Sean's point, there used to be a time we could make these decisions over days. I'm sure back in World War II, if you look at the planning of D-Day and things like that.
Things that were in the works for days and months, now we might have hours or minutes to make a decision. Humans just can't; it is just impossible to parse all of that data and make a good decision without decision support services from something like an AI. Does that make sense?
Carolyn: Yes and what you said, I translate it as we're going to have this central command and control for all DoD, maybe even some fed civ agencies. Here's why I say it kind of scares me and coming back to you, Sean, how do we secure it? How do we trust it? If we've got it, it's coming in from everywhere.
Sean: Generally speaking, if you look at the zero trust piece of it first, let's break that down. There's the zero trust architecture, so NIST SP 800-207, and there's the DoD zero trust reference architecture, which came out about three months ago. There are seven pillars. But if you break them down to the most basic functions, it's about securing a device, making sure it's not compromised before you let it in the environment, and securing the users and the applications. Typically what we find is securing users and devices are the easier pieces.
Securing the Application So Your Data Can Work for You
Sean: Then there's securing the application, and when we say securing, we mean not just compliance, but actually knowing and measuring that it is secure in real time. That means finding the open-source module or the function or method that is insecure, so the developer can rapidly fix it on their own. It’s where AI can definitely help us find that because it can measure those things with APM technologies, with integrated security. We can trigger and tell the DevOps team or the NoOps team, you have an issue.
Go take care of it immediately. If they're managing a small service or a function, they can go fix that in a couple of hours and it's fixed. We're starting to see that in the Platform One environment in DoD, where they're patching containers every day, within 24 hours. You have a team patching the Tomcat container for the web front end every 24 hours. But the rest of DoD subscribes to that container, that hardened container.
They're getting that patch and leveraging that fix without having to do actual work themselves. Getting that team that runs that container, that owns the security for that container DoD-wide, to where they can patch it as fast as possible and know the exact function or method they have to fix is important.
More importantly, if a large percentage of DoD applications rely on a core set of containers being shared in the community, you also have to make sure those containers operate and perform meticulously. If I have a team that's supporting that, they have to make sure it runs well. They QA it properly, they pull their performance testing left into their dev cycles.
The Integrity of Data
Sean: When they publish it every 24 hours, which is a lot of publishing a year, that it is running smoothly and not having any problems. Then the question becomes, how do those teams then monitor those in production at scale, if they're across hundreds of applications, for example, across DoD.
Carolyn: I heard a couple of things from another show. My past life is in insider threat. For you to say that it's harder to secure the app than the user, we can debate that later. To be secured, what I heard was AI DevSecOps. We're baking it in at this ground level, we're using AI to do it. That means the integrity of the data, the integrity of the containers, is built in from birth.
Sean: That's a general way to approach it. It depends on what you mean by birth, but yes. If you mean birthing a new baby every 24 hours, sure. It comes from the top down. Because the team's going to turn over, in this case, we're really treating them more like cattle. If you want to use the DevOps term, we're going to not treat it like a pet. But to that team that manages that one container or those five containers that a lot of people use, that's a very important asset in their life that they have to care for and feed and nurture.
Those things come in lots of different flavors. But if you're a developer, you have to own everything about that container, that function that you're going to share with the rest of DoD and the community. So how do we make sure it runs well and it's secure? Then the question might be, would you?
How Your Data Can Work for You From a Data Accessibility Standpoint
Sean: From a data accessibility standpoint, I'd like to know how those containers are working not just in your application, but in other applications around DoD. You can make a lot more decisions and support it better if you can then access data across the organization and pull and work together across, say, a 10-application team. You're supporting them in ways that they care about that affect the mission.
Carolyn: Once you plug your container into the mothership, then you can send sensors out and see how it's integrating, assimilating to everybody else.
Sean: Sure, if you want to get into the sensors for a minute. If we talk about applications, specifically containers, you can either go with some type of open sensor for application performance management to get things like metrics, logs, and transactions out of it. You could use something like OpenTelemetry or Fluentd or Telegraf or StatsD. There are lots of options that are open source that are supportable across different application performance management platforms. Or if you want to use the word observability platforms, those as well.
The question for the government might be, why wouldn't you embed those sensors inherently, for a DevOps team or an SRE team, as part of your build cycle? They are there, and then you can leverage them across lots of observability platforms. Then an organization can pick the one that they like the best. Or maybe they can pick the one with the most advanced AI functionality.
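To make the idea of an embedded open sensor concrete, here is a minimal Python sketch that formats a metric in the StatsD line format (`<name>:<value>|<type>`) and sends it over UDP using only the standard library. The metric names and the `format_statsd_metric` and `send_metric` helpers are hypothetical, and the host and port are assumptions (8125 is the conventional StatsD port); a real build would more likely use an existing StatsD or OpenTelemetry client library.

```python
import socket


def format_statsd_metric(name: str, value: float, metric_type: str) -> str:
    """Build one StatsD line: <name>:<value>|<type>,
    where the type is c (counter), g (gauge), or ms (timer)."""
    if metric_type not in {"c", "g", "ms"}:
        raise ValueError(f"unsupported StatsD metric type: {metric_type}")
    return f"{name}:{value}|{metric_type}"


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one metric line over UDP. This is fire-and-forget: the
    application does not wait for, or depend on, the collector."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


# Hypothetical metric: the web front end took 320 ms to answer a request.
line = format_statsd_metric("web_front_end.response_time", 320, "ms")
print(line)
```

Calling `send_metric(line)` would hand that line to whatever collector is listening. Because the format is open, the same emitted data can feed different observability platforms without changing the application.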
One of the Hardest Problems
Sean: That's why sometimes getting the data into your platforms from lots of systems in DoD is one of the hardest problems. Because they've got to go through an authority to operate (ATO) process to get that approved. It's great, but making changes to it after the ATO sometimes can be a little challenging. If you build your sensors in as part of that process inherently, it becomes a lot easier to get the data out later, in a more open fashion, potentially.
Willie: That actually begs a question I had and this is a slight tangent, but I'm just curious. You triggered a thought in my mind, Sean. Are you seeing from your customers and the clients you work with, are they baking this into the product? So there's these new concepts around like OpenTelemetry and building this type of telemetry into the application. It can be exposed at runtime and be pulled in by any number of these tools that you're talking about. You can get a complete, more full picture of the landscape. Is that something you're seeing as well?
Sean: We are seeing it in different agencies. Some of the civilian agencies have been more focused on things like distributed tracing and writing things into their application code. While that's a noble effort, it leaves a lot of the infrastructure not covered. It becomes hard to connect the dots from your application code or the web front end, which normally the app guys have pretty well covered. But then, if you look at cloud infrastructure or on-prem hardware infrastructure, it leaves it uncovered in most cases. If you consider the network, another piece of that you want to pay attention...