Monday, July 18, 2022

Supporting a Product’s Cluster Headaches


Cluster headaches are problem reports that make everyone else in the organization ask “are we seeing that too?” and pile on. Alice suspects a memory leak and files a report. Bob tags his customers to it as well. They start discussing general performance questions in a chat or a meeting with a broad audience. Charlene through Zachariah pile on with more detail, some of which is relevant. A few days later, the root cause of Alice’s problem is found in an environment-specific misconfiguration, but by now you’ve got an executive asking when you’re going to fix the memory leaks.

As a product manager, you may find these incidents annoying. They distract and disturb the engineers and stir up trouble with the field. However, people are people and they’re going to pattern match. It’s not in your best interest to pour cold water on field people trying to help. Instead, look for ways to use these incidents to drive improvement.

  • The best disinfectant is sunlight. Open conversation about the troubleshooting process keeps everyone aware (Slack is great for this, ideally with daily summation to the ticket, but some teams use long-term meetings instead). As it becomes clear that Alice’s ticket is not what everyone else thought, their willingness to pile on decreases. There is a limit to effective openness though: people can misinterpret comments and egos can get bruised. The time lag of email-based ticket comments is particularly bad for this. Someone may need to referee and keep conversation productive.
  • Recognize that there is a problem. As a development team, perhaps you can look at this situation as a symptom of something to resolve. The X to this Y may be that there are legitimate concerns about resource utilization, and that troubleshooting those concerns is difficult. Can your team do something to improve that experience? Adding metrics and alerting on known bad states is almost always useful.
  • Where there’s smoke, there’s often fire. If cluster headaches keep popping up around the same component, that’s a signal of fear, uncertainty, and doubt. Increase enablement for that component, and listen to what the field says. If they don’t trust it they won’t sell it, so you will not be successful until they understand and trust it.

Saturday, July 9, 2022

Planning R&D Time

Granted that reality will disrupt the plan, I still find it useful to do a quarterly planning exercise. I'm fond of doing this with a zero-based budget of person-time that has already had maintenance requirements removed. So you've got N full-time equivalent (FTE) people. They're going to be sick, on vacation, or in training for 25% of the quarter, and they're going to spend 40% of the remainder in meetings. If you have 10 people (keeping math simple), that’s 1420 hours of development in a quarter. Divide that in half to get the budget for feature work: 710 hours for maintenance, 710 for roadmap feature cards. Now, scope and prioritize the features you will try to deliver in this quarter. Scope doesn’t have to be perfect; an engineering manager’s estimate is plenty at this stage.
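
As a rough sketch of that arithmetic, here is the same exercise in Python. The baseline of 480 hours per person per quarter (12 weeks of 40 hours) is an assumption for illustration, as are the feature names and estimates, so the totals are not meant to reproduce the figures in the paragraph above. The function and class names are hypothetical too.

```python
from dataclasses import dataclass


def development_hours(people, hours_per_person, away_pct, meeting_pct):
    """Hours left for development after time away and meetings.

    Percentages are whole numbers (25 means 25%) so the arithmetic stays
    in integers; the result is rounded down to a whole hour.
    """
    return (people * hours_per_person
            * (100 - away_pct) * (100 - meeting_pct)) // 10_000


@dataclass
class Feature:
    name: str
    estimate_hours: int  # rough engineering-manager estimate


def plan_features(budget_hours, features):
    """Walk features in priority order and commit each one that still fits.

    Features that do not fit in the remaining budget are deferred, which
    makes over-allocation visible instead of silently accepted.
    """
    committed, deferred = [], []
    remaining = budget_hours
    for feature in features:
        if feature.estimate_hours <= remaining:
            committed.append(feature)
            remaining -= feature.estimate_hours
        else:
            deferred.append(feature)
    return committed, deferred, remaining


# Assumed inputs: 10 people, 480 hours each, 25% away, 40% of the rest in meetings.
total = development_hours(people=10, hours_per_person=480,
                          away_pct=25, meeting_pct=40)
feature_budget = total // 2                  # roadmap feature cards
maintenance_budget = total - feature_budget  # untracked at the roadmap level

# Hypothetical backlog, already sorted by priority.
backlog = [
    Feature("Feature A", 400),
    Feature("Feature B", 500),
    Feature("Feature C", 300),
    Feature("Feature D", 150),
]
committed, deferred, remaining = plan_features(feature_budget, backlog)

print("development hours:", total)
print("feature budget:", feature_budget, "maintenance budget:", maintenance_budget)
print("committed:", [f.name for f in committed])
print("deferred:", [f.name for f in deferred])
print("unallocated feature hours:", remaining)
```

Only the feature half of the budget is itemized here. The maintenance half is a single number on purpose, since it is not meant to be broken down or defended at the roadmap level.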

This exercise can be done in all sorts of ways; the tools and structure don’t matter. I’ve participated as everything from company leader to middle manager to product manager to subject matter expert. Maybe it’s a bunch of people in a room with Post-its and masking tape on a wall. Maybe it’s a series of screen share calls or physical meetings with a spreadsheet or a project tracking tool. Maybe meetings are with the whole leadership team, or subsets, or mixed. I have some dislike for a spreadsheet approach: dedicated project management tools or Post-its on a wall make it harder to accidentally over-allocate resources. Nothing can stop a team determined to tell themselves fibs about capacity though.

Here is the hard part: you have to leave the maintenance work untracked at the roadmap level. It’s whatever engineers feel is necessary. That budget isn’t open to negotiation or justification; it is a requirement of selling software that needs to be maintained. When a leadership team breaks this rule, the maintenance work is no longer protected from the budgeting process. That invariably leads to it getting deferred, because hope of making money from a new feature is more pleasant than fear of losing money from decreasing quality. That deferred maintenance eventually comes back to haunt the organization, producing the highly prioritized “product get-well initiative”.

My advice to give engineers freedom to maintain is not absolute, to be clear: maintenance occurs within a budget of time. Want to refactor everything or start over in a new language? You should have to convince the entire organization to sign off on that. But you shouldn’t have to fight the budget wars in order to fix bugs.

Monday, July 4, 2022

Internal Transfers

Role, tech stack, and culture. A new hire for a role must come up to speed on all three areas, and will be coming from behind on at least one. Some organizations are lucky enough to have a broadly common set of cultural or role expectations so that people can easily transfer skills from elsewhere. Some operate in a popular tech stack that many others use as well. However, no organization is completely identical to any other, and a new hire must adjust to stated and unstated requirements.

An internal hire is already familiar with two of the three areas. Consequently they can be way faster to come up to speed. They are also a known cultural quantity, since the hiring team has had a lot more opportunities to interact than any interview cycle can provide. This means internal transfers are more likely to be a net positive than external hires when they are possible. Hiring managers will prefer them, all other things being equal.

There are arguments against “allowing” “your people” to transfer to other teams, which are largely nonsense. It's far better for the company to transfer and grow people than it is to lose them when they're ready to move beyond their role, and very few organizations have realistic growth opportunities that involve staying in one team. Put differently, if kept in one role an individual contributor (IC) can grow to be an IC over more stuff (broad instead of deep), an IC with more influence (formal guidance role), a people lead of some sort, or a combo of those things. But not everyone can meet all of their needs without a substantive role change, and there are not enough slots to allow that many role changes. Let's say you hire really well and grow 30% YOY. If half of your organization’s ICs feel ready to move up a notch in a year, at best you can serve half of them, and that's if your team gets an equal share of the growth pie. Much more likely that there's one or two growth roles per team per year and a half dozen people who should get one. If your organization doesn't have a way for them to grow inside, maybe they're going to go outside. If that IC moves to another team in your organization instead, you have much more potential to replace them on your schedule with a hire of your choice. Note the assumption of a backfill req; the ability to backfill hasn’t got anything to do with where the IC left to.

An internal hire is not guaranteed to be successful though, particularly when the role is more challenging or not well understood. Product management roles have a notably high level of bounce-off from internal hires in my experience. 

There is also the potential that an internal candidate won't pass the interviews. If it's made clear how and why they didn't pass and therefore where they might improve, they may be reenergized to serve in their current role while preparing for another run at this or another new role. Without that clear feedback, the person could feel like they’re just being unfairly blocked from a hierarchy. In the absence of information many people assume the worst. They may become unmotivated or immediately move to seeking external alternatives.

Managers need to ensure the team goals are met and need to support the people who report to them in seeking their own goals. You may have wants like “minimize disruption” and “do more projects” but it's important to note those are preferences rather than requirements. As a manager participating in an internal transfer, it is important to recognize the transition promptly. Set a timeline for stopping the old work and starting the new, then honor it.

Sunday, July 3, 2022

Executive Dashboards

Executive dashboards go through the Tuckman model… like team members, they have to be understood before they are accepted. 

Forming: a need for a dashboard is recognized and that dashboard is introduced to the executive staff. It may or may not be challenged, but its place is not certain.

Storming: if not at introduction then soon after, the dashboard’s ability to accurately reflect reality will be challenged. Where is this data from? How is it generated and collected? What biases does it reflect? How much delay does it contain? What role will Goodhart’s Law play if we rely on it? Perfection is usually recognized as unattainable, but most executives will want to know what risks are encoded in the tool. This process is ideally conducted offline as preparation for staff or board meetings; it’s a sign of ill health if regular meetings are derailed with storming about the tools.

Performing: once the dashboard is understood, the executive team can work with it without a deep dive into the data that backs it, until something changes. Conditions of reality are altered, there are additions or subtractions in the executive team, or the tools used for reporting change. Then the model starts over.

As below, so above; as above, so below. A similar process can be observed in relationships between field (sales and support) and factory (product and engineering). The list of hot issues is a dashboard.

The best choice for discoverability is a spreadsheet: links to data are relatively easy to follow, and formulas and lookups are readily followed by executives with varied backgrounds. Unfortunately, spreadsheets make it easy to break data links and introduce staleness. Worse, their legibility makes them corruptible; anyone working with one can accidentally change the behavior. There are mitigations and workarounds to all problems but they add brittleness. More modern data reporting systems can reduce risk of corruption by using role-based access control at more granular levels. They can also help with staleness by being closer to data collection. Unfortunately, most of these systems lose the discoverability of a spreadsheet, so in my experience the most common dashboard tool for planning is still a spreadsheet.