Saturday, May 29, 2021

What Should Go Into a CMDB

 

It’s not every day that information technology work leads you into philosophy, but designing a configuration management database will do it. Spend a little while thinking about what is known or even knowable about the services you’re trying to provide, maybe you’ll end up asking “what does existence even mean?” Fear not, there are some practical guiding principles to follow. 

First, some background: what is the purpose of a Configuration Management Data Base (CMDB)? Why are we even trying to align Configuration Items (CIs) to the Services that they provide? The intention is to provide visibility into what affects what. The purpose of that visibility is to ensure reliability of changes, by understanding what is affected and planning in advance to minimize service interruption.

Of course, that goal is not simple to achieve. If it were really simple, it would just be a notebook or a spreadsheet, right? But the modern organization has a complex web of interdependent systems. A flat sheet of items misses so much of reality that it fails to make life any easier. On the other end of the complexity spectrum, there’s a very careful and intentional attempt to describe everything important. Let’s pretend that’s possible for a moment, which I doubt… it’s still unlikely that your organization can do it at a justifiable cost. If the system fails to provide return on investment, then it’s scrapped with good reason.

So, the CMDB should only track what is truly necessary to affordably meet the goal of visibility driving change safety. Common advice is to begin with a map of CIs to service offerings. But, is that a detailed 1:1 map? Sort of, maybe. That’s a fine directional goal, but you should not insist on or try to attain a complete CI to service offering mapping on day one. You might start off asking “which service offering does this CI map to in the service catalog”, but it’s all too easy to wander into epistemology. Business leaders are asking for things that can’t be easily made visible and so you end up doing service mappings by proxy… which doesn’t work. 

If you can’t draw a clear line from “this attribute on this entity maps to that attribute on that entity to support this metric indicating that service”, then it’s dreams going into the system, not measurements. So only put in things in that you have to because it’s obvious that they work to support a service. For instance, let’s say we’re talking about a job processing farm. The NAS volumes support the VMs that perform the jobs that test the designs. You can say “this volume supports that job” clearly, but you will struggle to say “this NAS failure is costing that business unit an on-time delivery of that deal” because you don’t have sufficient context in your software. In an organization with sufficient complexity to be considering a CMDB, the business map isn’t clear and the technology map is only a little better. 

What is a CI anyway, and how are you going to identify it? Is a group of ephemeral containers providing services worth a CI each, or one for the service, or none? If you’ve got a big expensive server and you replace its motherboard so the serials and MACs change, do you update all your records? What if you just changed its hostname? Does it still support the same services that it used to? How do you reference the systems you run on a cloud Infrastructure as a Service (IaaS)? Do you need a CI for each of the Software as a Service (SaaS) functions that you use?

The simplest answer to those questions is to manually define and track systems, ignoring identifying attributes like serials and MACs. However, this leaves recalls, warranty tracking, and depreciation out of the CMDB and in another tool, reducing the cost justification for the CMDB. Maybe that works for your organization, but it’s a decision that has got to be done across the organization so that you can present meaningful numbers in your KPIs and compliance audits.

What types of devices should go into the database? Does a network switch count, since it’s critical for making your organization’s software service operate? How about the AC power or HVAC systems that are critical for keeping the switch work? How about the roof that keeps the rain off the switch? Again, the philosophy of everything being interconnected is fascinating but you’ve got a job to do. So, a CI is any item that you can actively manage. I take an absolutist approach here: if there is not a software tool providing visibility and control to the device, then it is not a CI worth tracking.

My advice is to draw a CI line at the management interface. Only put a thing into the data system if it has an agent on it or an interface that enables a management system that you own to collect data. Again, three quarters of my career is in those agents, so call it self-serving if you like. But if the device can’t update your systems on its own; then someone has a job to maintain the “time-saving” tool. If you can afford to allocate people to maintaining data cleanliness in a CMDB so ITAM reports make sense and ITSM service requests go smoothly, that’s great, but it sounds like questionable use of resources at any scale. As Corey Quinn writes, “What people lose sight of is that infrastructure, in almost every case, costs less than payroll.”

Next, treat every collected attribute like it’s a personal insult. For every report, ask how little can you possibly collect and still solve the problem. Is there already data collected that will work? This is the land of practically unbounded high velocity data sets, and scale is going to be a problem. Habits built with a small number of entities will fall apart quickly as your organization grows. 

So that’s CIs… now for the services that they provide. Definitionally, a service offering is something your stakeholders and customers can directly use. But, remember that the point of the CMDB is to de-risk changes, meaning you have to map things that might get changed to that service. Service: Our self-hosted website is up. CIs to provide that: dozens of servers, plus all sorts of unknowables like network and power and HVAC and a functioning civil society. That said, while a one-to-many service:CI map is more complex, it’s the only model that is at all realistic. Business users requesting changes and reporting problems should have no need to select CIs and suggest solutions, so the complexity of looking down the tree from service offering shouldn’t matter. IT operators requesting changes and reporting problems do need to see what those changes affect, and looking up the tree to affected services is theoretically useful. However, those two statements rely on an accurate service:CI map, and that accuracy is sorely lacking. More likely, the business requestor is aware of problematic CIs because they’ve been troubleshooting the problem on their own, and the IT operator is not aware of affected services because the CI has been multi-tasked or repurposed. Therefore, incident triage and handling often include preliminary discovery of the functional map, if possible.

Exceptions to that grim state of affairs can exist: just as I recommend the definition of a CI is limited to an automatically discoverable entity, I also recommend that the definition of a service is limited to machine readable labels. Tagging or labeling CIs with the services that they support allows the CI data collection mechanism to be used to support CI to service mapping. Better, this allows an organization to begin with manual process (apply this label to anything you spin up with this account in these availability zones) and then grow to automation (if spinning up from this image and running this process, then include that label). That way the organization does not have to begin with perfect in order to get somewhat better.

VMBlog Post on Decentralization

 linking to this piece I wrote for VMblog 

Why Decentralized Work Calls for Decentralized Data

Sunday, May 9, 2021

Planning your Year

The plan is maybe accurate, but probably not. The act of planning is a useful time to step back, evaluate strategic position and rethink investment. Using financial targets is a mechanism. If target is thirty percent growth, does the product org have a realistic answer for how that will happen? If that answer is couched in details like “in eleven months we’ll ship the grand frobulator version 2.13.32.005”, then it’s maybe not so realistic. 

For a non-SaaS software offering it takes a quarter for product changes to affect sales numbers, and a year to know if you’re going to get renewals or just have a flash in the pan. It takes a quarter to deliver a change to an existing product. So if you have to make product changes to affect this year’s numbers and you’re just planning them now, you’ve got all your eggs in Q4. It’s key to realize that product time scales are very different from sales time scales. And if your team isn’t already big enough to deliver the features, you’re even worse off because hiring and developing engineers is even slower than making features.

Whatever plan you offer for making product in the second half of this year is setting up for next year’s sales. Why can’t you just parallelize those efforts? Well, customers are rightly suspicious of features that are planned but not shipped. The world of enterprise software sales is also still relatively small and interconnected, so your own team cares about their reputations. They won’t be happy selling sizzle unachievably ahead of steak.

Software as a Service doesn’t change those dynamics as much as people might assume that it does, but it does help to smooth out a spiky cash flow. Whether you’re selling term or perm, on-prem or as a service, trying to move your current revenue numbers with expected product features is very prone to forecast failure.

Using a Technical Edge in Products

"Don't worry about people stealing an idea. If it's original, you will have to ram it down their throats." -- Howard Aiken

Innovative technologies are different than what came before. Product buying patterns are based on what has come before. If you have an innovative technology you will need to bend your customer’s stated needs and your technology’s capabilities to fit each other. Product market fit is not easy or pretty.

“But wait,” you might be thinking, “isn’t listening to the customer’s need the most important thing in product management?” Well, sort of. It’s not listening to what the customer says they need, it’s listening to what they need. The customer’s stated need is shaped by several components:

  • Their actual need to do something, such as pass a compliance audit or roll out a new service
  • Their experiences with other products that they’ve used for this purpose in the past
  • Their team’s appetite for innovation versus surety
  • Their purchasing process.

That last one is extra challenging; if your product fits into the buyer’s signing authority, then great, but if you’ve got to make it through an RFP-gated procurement office, your innovative product is going to be checklist compared with the last twenty years of status quo. The only way through is for your buyer to spend their political capital and force the issue.

If your innovative technology can help the customer solve their real need to do something faster and cheaper than legacy technology, then lean into that and give your buyer the weapons they need. You can get past experience and conservatism and even a purchasing department if you’re really helping the customer save money and time.

I’ll spare the full list of examples, but there are many; trust me that technologies now considered boring infrastructure, such as gratuitous ARP NIC teaming, were once radical and difficult to sell.

Two Types of Questioning


 Answers to questions can easily fit into two flavors: operationalized and free-form. Classify the use cases: there’s the questions you know how to ask, and the questions you don’t know to ask yet.

A question that you know how to ask is operationalized. You’re looking for yes, no, or broken, or perhaps a count. The operational nature of the question means you can improve operations of your system by using the questions. 

You may need to use a series of operationalized questions to drill down to success.

* A good operationalized pattern uses multiple questions to reduce the target set at each step: “are there systems with problems?” -> “there are systems with problems” -> “Show me the systems with the problems” -> “here are the types of problems on the systems with the problems” -> click “Show me the ones with the problem I know how to fix” -> “here are the systems with that problem” -> “deploy a fix for that problem”. This is good because it’s efficient: each step is small, and each step is hitting fewer targets. Whether your problem domain is management, computation, or surveying humans, you'll use fewer resources if you ask fewer questions of fewer targets.

* A bad operationalized pattern == “give me all the data and i’ll search for answers in it”. This is misuse of an effective tool: search through raw data is powerful for discovering what you don’t know to ask, but it can be the wrong tool for daily repetition tasks. It works, but it costs more time and money than necessary.

Noted, it is possible to take the progressive questions pattern entirely too far, as is shown by Microsoft “click to see more” Teams. A forced wizard flow where it isn’t necessary is an anti-pattern. Progressive disclosure of necessary data can become an anti-pattern.

A question that you don’t know how to ask is free-form. You’re looking for weirdness, patterns, outliers, intuition. Is there an anomalous behavior pattern on a subset of systems? That’s hard to answer without a big data lake and a Stats101 textbook, so you stream data at the lake and see what kind of stuff can be found. Algorithms can help, but you’ll also certainly need human analysts. And the findings from that data lake, you will probably want to convert to operationalized questions.

This is a lifecycle of discovery, a process of learning. Operationalized questions grow stale over time, and need to be replaced. Part of the job as an analyst is to maintain the tools.

The Mystical Art of Getting New Things Done

The process isn’t actually mystical: it’s selling an idea to everyone who must work together to execute it. People who would use it. People who will fund it. People who will build it. People who will sell it. Eventually, people who will buy it.

Tell people what you want to do. Lots of people. Product people, engineering people, sales, marketing, finance. Whoever will listen. Even if they're not helping you move the ball forward, they're helping you practice and learn. Explain why it needs to be done, how it will work, what will indicate that it’s working, and who would need to do it. Use a written artifact (document or slide deck) to summarize what you’ve been saying.

Answer their questions. If they ask things you can’t answer, get back to them when you can in writing, and update your written artifact.

Ask everyone you talk to who else you should talk to. Keep going until you’ve got resources. Keep going until they build the right thing. Keep going until customers use it. Keep going until the goals are met.

So far so good, right? Except, that is the process for people who’ve already got the expectation that they drive new ideas to executed success. Product leaders. Maybe that's a product manager in your organization, or an engineering manager, or a service owner. If that isn’t your role, then you’re also adding a second sales job.

First, you need to understand what you want. Are you asking to have the job of executing your idea? Or are you asking someone else to execute on your idea?

If you want the job, it’s actually much simpler to proceed. You’re explaining why you are the right person to make this happen. Details of how to proceed will depend on your organization. Do you have to get the role before you can act the part? Then you need to interview for the role. Does your organization reward acting outside of your lane? Then you need to make sure everyone involved sees that you’re doing so, acting as an unpaid product manager (presumably without abandoning the responsibilities you are paid for). Regardless of the mechanics, you’re effectively campaigning for a job as a product leader so you need to do what is necessary in your organization to get that job along with execution of your idea.

If you don’t want that for your career, then you are seeking a product leader to own your idea. Again, a second thread from driving the idea to execution. Thing is, few product leaders are sitting around bereft of ideas and trying to figure out what to have the engineers build. So, why should your idea be what gets done now? Another challenge you’re going to need to consider is ownership. Are you truly ready to give up ownership of your idea, or do you want to retain some stake? Can you find an accommodation with your product leader partner?

Friday, April 23, 2021

Data Value and Volume are Inversely Proportional

In 2006, Clive Humby coined the phrase “Data is the new oil”. This is often misinterpreted as “Data powers the economy”, particularly by folks who sell data processing and storage, but it’s useful to see what someone who actually uses data says. In 2013 Michael Palmer, of the Association of National Advertisers, expanded on Humby’s quote: “Data is just like crude. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc to create a valuable entity that drives profitable activity; so must data be broken down, analysed for it to have value." 

Much like refined oil (or processed ore for that matter), the volume of material is reduced as the value is increased. This concept should be intuitive to anyone who’s ever manually sifted a pile of data into a thin report, but it’s sometimes lost in contexts where locating a needle in the haystack is the analyst’s goal. Adding human effort reduces volume.

Corollary: if the project requires massive amounts of data storage, it might be worth asking how much value is going to be in there? Is the purpose to store as cheaply as possible and rarely retrieve? Or maybe the plan is to figure out value later

There’s an interesting confluence of partially aligned incentives between people who have to retain large amounts of data, people who want to retain and access large amounts of data for analytics, and people who just want answers to problems today. “Storage and compute are practically free”, which is why Amazon Web Services is worth 1.5 trillion USD in 2021. If you don't have to pay the bills, collecting raw data for real time analytics sounds great. Otherwise, you'll need to consider this project's needs for data modeling.