Sunday, July 22, 2018

Tools and the Analyst

also posted as a Twitter thread

Let’s say I’m responsible for a complex system. I might have a lot of titles, but for a big part of my job I’m an analyst of that system. I need tools to help me see into it and change its behavior. As an analyst with a tool, I have some generic use cases the tool needs to meet.

  • Tell me how things are right now
    • What is the state?
    • Is it changing?
  • Tell me how things have been over time
    • What is the state?
    • Is there any change in progress?
    • Is the state normal?
    • Is the state good/bad/indifferent/unknown?
  • Tell me what I'm supposed to know
    • What is important?
    • What should I mitigate?
    • What can I ignore?
  • Alert me when something needs me
    • What is the problem?
    • What is the impact?
    • Are there any suggested actions?
  • How much can I trust this tool?
    • Do I see outside context changes reflected in it?
    • How does the information it gives me compare with what I see in other tools?
  • How much can I share this tool?
    • Do I understand it well enough to teach it?
    • Can I defend it?

As a generic set of use cases, this is equivalent to the old sysadmin joke, “go away or I will replace you with a small shell script”. A tool that can provide that level of judgement is also capable of doing the analyst’s job. So a lot of tools stop well short of that lofty goal and let the user fill in a great deal.

  • Alert me when a condition is met
  • Tell me how things are right now
  • Tell me how things have been over time

Maybe the analyst can tie the tool’s output to something else that tries to fill in more meaningful answers, or maybe they just do all of that in their own heads. This is fine at the early adopter end of Geoffrey Moore’s chasm, and many vendors will stare blankly at you if you ask for more.

After all, their customers are using it now! And besides, how could they add intelligence when they don’t know how you want to use their tool? They don’t know your system. But let’s get real: the relationships between customers, vendors, tools, analysts, and systems are not stable.

The system will change, the customer’s goals will change, and the analyst won’t stay with this tool. Even if everything else stays stable, experienced analysts move on to new problems and are replaced by new folks who need to learn.

The result is that tools mature and their user communities shift, growing into mainstream adopters and becoming a norm instead of an outlier. By the time your tool is being introduced to late adopters, it needs to be able to teach a green analyst how to do the job at hand.

How’s that going to work? Here are a few ideas:

0: ignore the problem. There’s always a cost:benefit analysis to doing something, and nature abhors a vacuum. If a vendor does nothing, perhaps the customer will find it cost-effective to solve the problem instead.
Look at open source software packages aimed at narrow user communities, such as email transfer. Learning to use the tools is a rite of passage to doing the job. This only works because of email-hosting services, though.
Because email is generally handled by a 3rd party today, the pool of organizations looking at open source mail transfer agents is self-selected to shops that can take the time to learn the tools.

1: ship with best practices. If the product is aimed at a larger user community, ignoring the problem won't work well. Another approach is to build in expected norms, like the spelling and grammar checkers in modern office suites.
An advanced user will chafe and may turn these features off, but the built-in and automated nature has potential to improve outcomes across the board. That potential is not always realized though, as users can still ignore the tool’s advice.
An outcome of embarrassing typos is one thing, but an outcome of a service outage is another. Since there is risk, vendors are incentivized to provide anodyne advice and false-positive-prone warnings, which analysts rapidly learn to ignore.

2: invest into a services community and partner ecosystem. No one can teach as well as a person who learned first. Some very successful organizations build passionate communities of educators, developers, and deployment engineers.
Organizations with armies of partners have huge reach compared with more narrowly scoped organizations. However, an army marches on its stomach, and all these people have to be paid. The overall cost and complexity for a customer goes up in line with ecosystem size.

3: invest into machine intelligence. If the data has few outside context problems, a machine intelligence approach can help the analyst answer qualitative questions about the data they’re seeing from the system. Normal, abnormal: no prob! Good, bad: maybe.
It takes effort and risk is not eliminated, so it’s best to think of this as a hybrid between the best-practice and services approaches. Consultants need to help with the implementation at any given customer, and the result is a best practice that needs regular re-tuning.

Perhaps we are seeing a reason why most technology vendors don’t last as independent entities very long.

Monday, July 2, 2018

Dev and Test with Customer Data Samples

The short answer is don’t do it. Accepting customer data samples will only lead to sorrow.


  • Not even if they gave it to you on purpose so you can develop for their use case or troubleshoot their problem.
  • The person who wants you to do that may not be fully authorized to give you that data and you may not be fully authorized to keep it. What if both of you had to explain this transaction to your respective VPs? In advance of a press conference?
  • Even if the customer has changed the names, it’s often possible or even easy to triangulate from outside context and reconstruct damaging data. That’s if they did it right, turning Alice into Bob consistently. More likely they wiped all names and replaced them with XXXXXX, in which case you may be lucky, but probably have garbage instead of data.
  • Even if everyone in the transaction today understands everything that is going on… Norton’s Law is going to get you. Someone else will find this sample after you're gone and do a dumb.

Instead of taking data directly, work with your customer to produce a safe sample.


REDUCE THE DATA

At first, you may look at a big data problem as a Volume or Velocity issue, but those are scaling issues that are easily dealt with later. Variety is the hardest part of the equation, so it should be handled first.

Are you working with machine-created logs or human-created documents? 


Logs


  1. Find blocks of data that will trigger concern. Since we do not care about recreating a realistic log stream, we can choose to focus only on these events. If we do want the log stream to be realistic for a sales demo, we will need to consider the non-concerning events too, but finding the parts with sensitive data helps you prioritize.

  2. Identify single line events.
    • TIME HOST Service Started Status OK
  3. Identify multi-line transactional events.
    • TIME HOST Received message 123 from Alice to Bob
    • TIME HOST Processed message 123 Status Clean
    • TIME HOST Delivered message 123 to Bob from Alice
  4. Copy the minimum amount of data necessary to produce a trigger into a new file.
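
As a rough sketch of steps 2 through 4, the following Python fragment copies only the lines belonging to a triggering transaction into a new file. The trigger pattern, the transaction pattern, and the file names are hypothetical, modeled on the example events above; substitute whatever identifies a transaction in your own logs.

    import re

    # Hypothetical patterns modeled on the example events above; adjust for your logs.
    TRIGGER = re.compile(r"Received message (\d+) from (\S+) to (\S+)")
    TRANSACTION = re.compile(r"message (\d+)")

    def extract_minimal_sample(in_path, out_path):
        """Copy only the lines needed to reproduce a trigger into a new file."""
        with open(in_path) as src:
            lines = src.readlines()

        # Pass 1: collect the IDs of transactions that contain concerning data.
        wanted_ids = set()
        for line in lines:
            match = TRIGGER.search(line)
            if match:
                wanted_ids.add(match.group(1))

        # Pass 2: keep every line belonging to those transactions, nothing else.
        with open(out_path, "w") as dst:
            for line in lines:
                match = TRANSACTION.search(line)
                if match and match.group(1) in wanted_ids:
                    dst.write(line)

    extract_minimal_sample("customer.log", "raw_sample.log")

The result is the same as doing it by hand: a new file containing the minimum needed to reproduce the trigger, and nothing else.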


Documents


  1. Find individual blocks of data that will trigger concern: names, identifiers, health information, financial information.

  2. Find patterns and sequences of concerning data. For instance, a PDF of a government form is a recognizable container of data, so the header is a sufficient indicator that you’ve got a problem file. A submitted Microsoft Word resume might have any format, though.

  3. Copy the minimum amount of data necessary to produce a trigger into a new file.
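
For documents, the first pass is the same idea, just pattern matching over extracted text. Here is a minimal sketch, assuming a plain-text document and a few illustrative patterns; these are nowhere near exhaustive, and real sensitive-data detection is a much larger problem than three regexes.

    import re

    # Illustrative patterns only; real sensitive-data detection needs far more than this.
    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "FORM_HEADER": re.compile(r"Form W-2|Form 1040"),  # a recognizable container
    }

    def find_triggers(text):
        """Return the concerning blocks found in a document, keyed by pattern name."""
        hits = {name: pattern.findall(text) for name, pattern in PATTERNS.items()}
        return {name: found for name, found in hits.items() if found}

    # Hypothetical file name; any extracted document text works here.
    with open("submitted_resume.txt") as doc:
        print(find_triggers(doc.read()))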


TOKENIZE THE DATA

Simply replacing the content of all sensitive data fields with XXXXXX will work for a single event, but it destroys context. Transactions make no sense, interactions between systems make no sense, and it’s impossible to use the data sample for anything but demoware. If you need to produce data for developing or testing a real product, you need transactions and interactions.


  1. In the new files, replace the sensitive data blocks with tokens. 
    1. Use a standard format that can be clearly identified and has a beginning and end, such as ###VARIABLE###.
    2. Be descriptive with your variables: ###USER_EMAIL### or ###PROJECT_NAME### or ###PASSWORD_TEXT### make more sense than ###EMADDR###, ###PJID###, or ###PWD###. Characters are cheap. Getting confused is costly.
    3. Note that you may need to use a sequence number to distinguish multiple actors or attributes. For example, an email transaction has a minimum of two accounts involved, so ###EMAIL_ACCOUNT### is insufficient. ###EMAIL_ACCOUNT_1### and ###EMAIL_ACCOUNT_2### will produce better sample data. 

  2. Use randomly generated text or lorem ipsum to replace non-triggering data as needed. Defining "as needed" can seem more art than science, but as a rule of thumb it's less important in logs than in documents.
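
Here is a minimal sketch of that replacement pass for one data type, email addresses, assuming the raw sample file produced earlier. It keeps the numbering consistent, so the same address always becomes the same token. The regex and file names are illustrative only.

    import re

    EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # illustrative, not RFC-complete

    def tokenize_emails(text):
        """Replace each distinct address with a numbered token, consistently."""
        mapping = {}

        def replacement(match):
            address = match.group(0)
            if address not in mapping:
                mapping[address] = f"###EMAIL_ACCOUNT_{len(mapping) + 1}###"
            return mapping[address]

        return EMAIL.sub(replacement, text), mapping

    with open("raw_sample.log") as src:
        tokenized, mapping = tokenize_emails(src.read())

    with open("tokenized_sample.log", "w") as dst:
        dst.write(tokenized)

    # `mapping` records which real address became which token. Keep it for your own
    # validation if you must, but do not ship it alongside the sample.

Repeat the same pass for each token type in your sample.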

GENERATE THE TEST FILES

Raw samples as above are now suitable for processing in a tool like EventGen or GoGen. This allows you to safely produce any desired Volume, Velocity, and Variety of realistic events without directly using customer data or creating a triangulation problem.
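
If you want a feel for what those tools do without learning their configuration formats, here is a generic sketch that expands a tokenized template into synthetic events. The token names and generator functions are hypothetical, and this is not EventGen or GoGen syntax.

    import random
    import re
    import time

    TOKEN = re.compile(r"###([A-Z_]+?)(?:_(\d+))?###")

    # Hypothetical generators; EventGen and GoGen express the same idea in their
    # own configuration formats rather than in Python.
    GENERATORS = {
        "TIME": lambda: time.strftime("%Y-%m-%dT%H:%M:%S"),
        "HOST": lambda: f"host{random.randint(1, 20):02d}",
        "EMAIL_ACCOUNT": lambda: f"user{random.randint(1, 500)}@example.com",
        "MESSAGE_ID": lambda: str(random.randint(100, 999)),
    }

    def generate(template_line, count):
        """Expand one tokenized template into `count` synthetic events."""
        for _ in range(count):
            # Resolve each numbered token once per event so _1 and _2 stay distinct.
            cache = {}

            def fill(match):
                key = (match.group(1), match.group(2))
                if key not in cache:
                    cache[key] = GENERATORS[match.group(1)]()
                return cache[key]

            yield TOKEN.sub(fill, template_line)

    template = "###TIME### ###HOST### Received message ###MESSAGE_ID### from ###EMAIL_ACCOUNT_1### to ###EMAIL_ACCOUNT_2###"
    for event in generate(template, 5):
        print(event)

The real tools layer scheduling, volume, and output handling on top of this basic substitution idea, which is what lets you dial in the Volume and Velocity you need.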