Monday, April 16, 2018

Crash notifier

Say you're on macOS, working with some C++ software that might crash when it hits bugs... and say you don't want to tail or search the voluminous logs just to find out whether there are crashes to investigate. The crashes will show up in the Console application, but leaving it open destroys your performance.

The programs I care about all have names starting with XM. This one-liner pops a notification and then opens the crash report in Console, which I can close when I'm done.

Jacks-MacBook-Pro:SCM-Framework jackcoates$ crontab -l | grep Crash
*/60 * * * * find ~/Library/Logs/DiagnosticReports -cmin -60 -name 'xm*' -exec osascript -e 'display notification "{}" with title "Extremely Crashy!"' \; -exec open {} \;
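If the one-liner gets unwieldy, the same logic fits in a small script that cron can call. This is my expansion, not the original post's: it keeps the same directory, sixty-minute window, and xm* pattern, and falls back to printing when osascript isn't available (or when DRY_RUN is set), which also makes it testable off-macOS.

```shell
#!/bin/sh
# Sketch of the crash-notifier one-liner as a standalone script.
# Same assumptions as the crontab entry: report directory, 60-minute
# window, and the xm* name pattern.

notify_crashes() {
    report_dir="${1:-$HOME/Library/Logs/DiagnosticReports}"
    # Quote the glob so the shell doesn't expand it before find sees it.
    find "$report_dir" -cmin -60 -name 'xm*' 2>/dev/null |
    while IFS= read -r report; do
        # Always log what we found; cron will mail this output.
        echo "crash report: $report"
        if [ -z "$DRY_RUN" ] && command -v osascript >/dev/null 2>&1; then
            # Pop a notification, then open the report in Console.
            osascript -e "display notification \"$report\" with title \"Extremely Crashy!\""
            open "$report"
        fi
    done
}

notify_crashes "$@"
```

Install it with a crontab line like `0 * * * * /usr/local/bin/crash-notify.sh` and you get the same behavior with room to grow.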

Thursday, April 12, 2018

Where's the Product in AI?

AI & ML products are harder than they look

Copying from Twitter to Blogger like dabbing paint on a cave wall

AI tech is obviously overhyped, and conflated with ideas from science fiction and religion. I prefer using terms like Machine Intelligence and Cognitive Computing, just to avoid the noise. But if we strip away the most unrealistic stuff, there are some interesting paths forward.

The biggest problem is in defining strong semantic paths from the available data to valid use cases. Many approaches founder on assumptions that the data contains value, that the use case can be solved with the data, or that producer and consumer of data use terms the same way.

Given a strong data system, there is a near term opportunity to build AI-powered toolsets that help customers learn and use the data systems that are available. This is a services heavy business with tight integration to data collection and storage.

This has to be driven by human intelligence, and is therefore services-heavy, because data and use cases differ from one budget-owning organization to the next. Data systems look similar for the low-value stories, but the high-value stuff is specific to each organization.

That services work should lead to the real opportunity for cognitive computing, which is augmenting human intelligence in narrow fields. If there is room to abstract the data system, there's room to normalize customers to a tool. Then you've got a product plan, similar to SIEMs.

Put products into fields where the data exists, use cases are clear, the past predicts the future, pattern matching and cohort grouping are effective, the problem has enough value in it to justify effort, and outside context problems don't completely derail your model. Simple!

If you can describe the world in numbers without losing important context, then I can express complex relationships between the numbers.

There's a question going unasked, though: given a data system that successfully models the problem, how much did the advanced system improve on a simpler approach? Is the DNN just figuring out that 95th-percentile outliers are interesting?

If a problem can be solved with machine intelligence, great. If the same problem could be solved with basic statistics, that's cheaper to build, operate, and maintain. It'll be interesting to see how this all shakes out.
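To make the "basic statistics" alternative concrete, here is a baseline sketch of my own (not from the post): flag everything above the 95th percentile using nothing but sort and awk. If a fancier model can't beat this on your data, it isn't paying its way.

```shell
p95_outliers() {
    # Print the values on stdin that exceed the 95th percentile --
    # the cheap statistical baseline any fancier model has to beat.
    sort -n | awk '{ a[NR] = $1 }
        END {
            t = a[int(NR * 0.95)]   # 95th-percentile threshold
            for (i = 1; i <= NR; i++)
                if (a[i] > t) print a[i]
        }'
}

# Example: in 1..100, the values above the 95th percentile are 96-100.
seq 1 100 | p95_outliers
```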

Update: an interesting take on this from Benedict Evans:

Wednesday, April 11, 2018

Splunk Apps and Add-ons

What are Splunk Apps and Add-ons? What's the difference?

If you're still confused... it's not just you. The confusion traces back to fundamental disagreements on approach that are encoded into every product the company has ever shipped, so it's tough to recommend a meaningful change.

Splunk apps are folders in $SPLUNK_HOME/etc/apps. They're containers that you put Splunk objects into. You can put anything in them: code, knowledge management configuration, dashboard elements, libraries, binaries, images, whatever. If you just want to put some stuff together and run it on your laptop, you're done at this point. Put things in a folder for organization. Or don't. Whatever.
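For the curious, a typical app folder looks something like this (my_app and its contents are illustrative; default/app.conf is what makes the folder show up as an app in the UI):

```
$SPLUNK_HOME/etc/apps/my_app/
    default/
        app.conf            # name, version, visibility in Launcher
        savedsearches.conf  # knowledge objects ship in default/
        data/ui/views/      # dashboards (Simple XML)
    local/                  # site-specific overrides; never ship these
    metadata/
        default.meta        # permissions and export behavior
    bin/                    # scripts and binaries
    lookups/                # CSV lookup files
```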

If you want to distribute components in a large environment, if you want to depend on shared components, if you want to avoid huge multi-function monoliths, then you start dividing apps into different types. This is why you see the terms "App" and "Add-on" in Splunk. The App refers to the visible front-end app that a user will interact with. The Add-on refers to administrator-only components. This is where the Splexicon definitions start and stop.

There are multiple types of Add-ons. Their definitions are not entirely well established, and have come and gone in official documentation. Right now, it's here, but don't be surprised if that breaks.

Since I helped to write these definitions in the first place, I feel confident in stating what they should be. However, these rules are breached as often as they are observed, and Splunk themselves are the most likely to ignore all of this guidance. If you want to follow the best possible practice, buy Kyle Smith's book and read that. Here are the possible types:

IA: Input Add-on

This includes and configures data collection inputs only. In practice, these are rare and the functionality is usually stuffed into a TA.
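A minimal IA can be little more than an inputs.conf. Everything below (the acmefw product, the path, the index) is hypothetical:

```ini
# inputs.conf -- hypothetical IA that only collects a vendor's log
[monitor:///var/log/acmefw/traffic.log]
sourcetype = acmefw:traffic
index = network
disabled = 0
```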

TA: Technology Add-on

This includes and configures knowledge management objects. In practice, many TAs also include data collection inputs. A TA might translate the field names provided by a vendor into the field names your users expect, as well as recognize and tag specific event types.
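As an illustration (the vendor, sourcetype, and field names are all made up), a TA's knowledge management work often amounts to a few stanzas like these:

```ini
# props.conf -- rename the vendor's field names to what users expect
[acmefw:traffic]
FIELDALIAS-src = srcip AS src_ip
FIELDALIAS-dest = dstip AS dest_ip

# eventtypes.conf -- recognize a specific kind of event...
[acmefw_blocked]
search = sourcetype="acmefw:traffic" action=blocked

# tags.conf -- ...and tag it so generic searches can find it
[eventtype=acmefw_blocked]
network = enabled
communicate = enabled
```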

SA: Supporting Add-on

This includes supporting libraries and searches needed to manage a class of data. Let's say we're building a security monitor and considering whether authentication attempts seem malicious or not. An SA could include lookup and summary generators to normalize and aggregate the data from many authentication systems and ask generic questions for reporting and alerting.

  • Example: 
  • Goes on Search Heads
  • You should absolutely have savedsearches.conf
  • It would make sense to include lookups and some dashboards, prebuilt panels, modular alerts, modular visualizations
  • Some SAs include all the IA functionality mentioned above.
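For example, the summary generator in a hypothetical authentication SA might be a scheduled search like this (stanza name, tags, and lookup file are invented for illustration):

```ini
# savedsearches.conf -- scheduled search that normalizes and aggregates
# authentication events into a lookup that other searches can question
[Authentication - Summary Gen]
search = tag=authentication action=failure | stats count BY user, src | outputlookup auth_failures.csv
cron_schedule = */15 * * * *
enableSched = 1
```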

DA: Domain Add-on

This includes supporting libraries and searches needed to manage a domain of problems. Let's say we're considering PCI requirement 4, focused on antivirus software being present, configured, and not reporting infections. A DA might include lookup and summary generators to prepare those answers, dashboards to investigate further, and correlation searches to alert on problems.
  • Example: (the "dirty dozen" PCI requirements that can be measured from machine data are each represented with a DA)
  • Goes on Search Heads
  • You should absolutely include dashboards, prebuilt panels, modular alerts, modular visualizations
  • It would make sense to include lookups and savedsearches.conf
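A correlation search in a hypothetical antivirus DA might look like this (stanza name, tags, and the email address are invented; the alert_* keys are standard savedsearches.conf alerting settings):

```ini
# savedsearches.conf -- correlation search for the antivirus domain
[AV - Infection Detected - Rule]
search = tag=malware tag=attack | stats count BY signature, dest
cron_schedule = 0 * * * *
enableSched = 1
alert_type = number of events
alert_comparator = greater than
alert_threshold = 0
action.email = 1
action.email.to = secops@example.com
```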
And so finally, the App.

The front end that ties it all together and makes it usable. If it's done well, users have no idea everything before this was ever involved. This goes on search heads only.