Sunday, May 31, 2015

Velocity Takeaways

I had the privilege of attending the 2015 Velocity Conference in Santa Clara.  It was an amazing show with a ton of great speakers and even more exhibitors.  I always find the interactions at the booths are where a lot of knowledge transfer happens.  I did hear a pretty distinct pattern in a lot of the offerings though.  The pitch went something along the lines of "we get you the metrics to find root cause".  This messaging sounds pretty amazing at face value!  I believe it is imperative to determine what the speaker means by root cause, and how many scenarios they are ready for.  To determine what types of problem patterns a monitoring solution could potentially solve, I look at three components: data collection, data analysis, and data presentation.  This article will cover the first.


Data Collection

Monitoring solutions can only solve problems they can see.  Are they collecting end user clicks?  If not, then they cannot determine the impact to end users.  Seems simple enough, but you should have an in-depth conversation about the mechanism by which they collect those statistics.

Some tips on collection

1.  If they do not have a library or agent somehow injected into the running process, they will not get root cause on the call stack.  Examples:
PurePath view from Dynatrace Application Monitoring

  • sync issues
  • CPU consumption hogs
  • exceptions
  • correlating log messages to called transaction

2.  If they do not have network based capture, they will not get network issue resolution.  Example:
Network health breakdown from Dynatrace DCRUM

  • Retransmission issues
  • Packet loss
  • Network redirects  

3.  End user device capture was the biggest opportunity for business at Velocity this year.  The key callouts for this to be possible are:
Visit capture screen from Dynatrace User Experience Management

  • SDK for native devices
  • Ability to see non-web based transactions (think traditional thick client requirements)
  • JS agents injected into browser/mobile browser*
    *Watch out for W3C-based capturing if you are using AngularJS.  W3C timings will not give you the same visibility there as they do with other frameworks.



Final Thoughts

All in all, the conference was a ton of fun.  I was able to see what others in the industry are releasing into this space.  I even got a demo of New Relic's and AppDynamics' portals.

Although a lot of organizations were touting the "root cause" messaging, I only saw some glimmers of others demonstrating a true root cause analysis.  The more complex applications have become, the more powerful monitoring tools have started to pull away in this fight.  I have yet to find the silver bullet for performance analysis, and probably never will, but the journey is very entertaining!

Monday, May 18, 2015

DevOps is Not Just a Tool

I was forwarded an interesting article last week and it really made me think.  The article in question is written by John Allspaw and entitled "An Open Letter To Monitoring/Metrics/Alerting Companies".  The main point John makes is that regardless of how advanced and powerful a piece of software is, there will always be a need for a user on the other side of the screen making sense of the data.  Software at its core is just a product.  I agree with John that it is disingenuous for a company to pitch their product as "the end all be all solution for better troubleshooting".  There are other components in making a true DevOps practice a reality.

I am a huge fan of the CNBC show The Profit.  If you have never watched it, the host, Marcus Lemonis, travels around and invests his personal capital in struggling businesses.  He dedicates his time and effort to getting businesses on an exponential growth path.  Marcus always refers to a simple formula for success.  The formula consists of the three P's: Product, Process, and People.  A true DevOps environment will also include the three P's.


Product

Is it all there?

The tools of an organization working on DevOps practices must have the ability to facilitate cross-team conversations.  This means that the tool (or tools) implemented must present data in multiple ways that can be interpreted by different teams.  So logically, the tool must be able to do one thing above all else: collect data.  Analyzing incomplete pictures will always lead to incomplete results.  Be wary of solutions that will always "filter out noise".  The hardest problems to solve always seem to be in the "noise" of the environment.


Process

It's ok to look down

Why does your company do things the way it does?  Why do you have those status calls?  Why do you have an internal forum?  If you cannot answer these questions, I do not want to talk to you.  You should have a clear understanding of the process of your organization and the "why" behind that process.  In simpler terms, process is meant for the sake of progress.  In a DevOps world the process should facilitate communication between all the groups involved in the SDLC.  I have seen some of the best organizations put Ops members on the weekly Dev touch bases and vice versa.  Production problems suck.  They are stressful, a huge drain on the business, and a black eye on the brand.  If an organization takes the steps required to allow honest discussions regularly, you will see a noticeable drop in P1 production issues.


People

No computer will replace you

The center of the DevOps movement is the people.  Bottom line.  To build the best applications, you need the best people.  I have worked with some amazing engineers, and some not so amazing engineers.  I have sadly had to walk away from potential deals because I knew the group that would be responsible for understanding the data was not capable of doing so.  The reason why I walk away from those situations is the same reason a chainsaw company will not sell its product to a 10-year-old.  The child is not ready and will likely hurt itself as opposed to getting any value.


Final Rant

I hate the fact that organizations look down on services when evaluating potential purchases.  Requiring services does not mean the product is hard to understand.  It means your organization requires some assistance in perfecting your individual products, process, and people.  You cannot honestly think your team will instantly become better just by installing something in your environment.  Just like learning anything new, it will require changes on all three fronts to truly become a DevOps shop.



Friday, May 15, 2015

Tough Questions: Dynatrace

DISCLAIMER: All views and opinions are based off of individual experiences and are not reflective of anyone or anything other than myself.  The information stated below has either been obtained from personal engagements or is available via other publications.

If I did not work for such a great company, I might get called into the principal's office for this one.  Dynatrace has taken a unique journey since its inception in 2007.  The customer list is over 6,000 and includes nine of the top ten retailers and banks (who is holding out?).  As of right now, Dynatrace is the market share leader when compared to the other two companies discussed in these articles.


The platform is comprised of Application Monitoring, User Experience Management, Synthetics, and Data Center RUM.


How does Ruxit fit into this equation?

A company's future is an important item to learn about whenever making a large purchase.  I am personally working with a company who is now scrambling to find an alternative since the collapse of OpTier.  With Ruxit being introduced into the same APM market as Dynatrace Application Monitoring, questions around internal competition often come up.  Companies should be confident in choosing the right solution, and in that solution's future.


How come you do not have an analytics platform?

New Relic and AppDynamics have one.  Why is there not a Dynatrace Analytics?  With as much "me tooing"* as is going around in this industry, it only makes sense that Dynatrace is also getting into the big data crunching spot that the other two vendors are stepping into.

*When a company releases a feature that companies like, and then a competitor releases a similar feature.  AKA keeping up with the Joneses.


How is your platform integrated?

Since Dynatrace has agent, agentless, and synthetic platforms, correlation across these multiple data collection points will be a necessity for understanding the data collected.  Obviously there are integrations that exist today, but ensure that you understand how those connections work and what data is shared between the tools.


Final Rant

It has been a very fun and interesting journey working for Dynatrace.  Nearly every day I have the pleasure of working alongside some of the most talented coworkers, partners, and clients.  I started this blog because this company has shown me there is a better way.  It is our job as fellow performance nerds to constantly question and push for improvements regardless of where the opportunity may be.



Back to intro
Continue to New Relic
Continue to AppDynamics


Wednesday, May 13, 2015

Tough Questions: AppDynamics

DISCLAIMER: All views and opinions are based off of individual experiences and are not reflective of anyone or anything other than myself.  The information stated below has either been obtained from personal engagements or is available via other publications.

AppDynamics was founded in 2008 and had the first version launch in 2010.  Since launch they have made a ton of noise and have created a pretty impressive list of customers.  AppDynamics is enterprise focused but can do deployments ranging from start-ups to large dotcoms.  The two main founders started AppDynamics after leaving CA Introscope, so they have really been in this space for over a few decades.  Their main target audience appears to be production operations teams, and they focus heavily on getting C-level conversations for enterprise-wide deployments.

The platform is comprised of APM, Mobile Real-User Monitoring, Browser Real-User Monitoring, Database Monitoring, Server Monitoring, and Application Analytics.  There is a lot of buzz around a new Hadoop based architecture the platform is moving to, but I am going to rate the company as is today.

How do you correlate different data points?

Whoa, deja vu!  Same thing as before.  Make sure they explain how they are able to show, when a user clicks on a button or link, how the back end requests are tied together.  Fact of the matter is, APM and touch-point based capture (browser/mobile) are two separate components relying on separate hardware.  This is also true for their back end correlation.  You will not get end to end capture for a majority of transactions (like 99.9% in high traffic environments).


How many agents do I need to install?

This is sort of like part two of the question above.  AppDynamics requires a separate process to grab host metrics.  Not the end of the world.  More an annoyance when it comes to upgrading and managing a complex environment.


How do you monitor mobile/web server/mainframe?

This is the one they should seriously be called out on every single time.  Their "agents" for these components are not agents.  AppDynamics subscribes to counters or relies on other vendor integrations to get at this data.  If you want counters; write a script, do not pay for that type of stuff.


How do you scale?

Be careful asking this question.  AppDynamics could quote that they have 10,000 plus agent deployments.  Remember that additional agent they need for host metrics?  Want to guess what makes up those ridiculous-sounding agent-to-controller numbers?  If they claim they have X-thousand agents reporting to a single controller server, demand to talk to a customer that currently has that deployed.  They often name-drop companies they are no longer at as well, or at least not at the scale they like to mention (AHEM... NETFLIX).


Final Rant

Some of the items they are working on are also very questionable.  The whole DevOps movement at its core is meant to facilitate communication across all teams involved in the SDLC.  They are taking that too seriously with the "Virtual War Room".  APM is not meant to "make a better war room".  APM at ITS core is meant to "Virtually Eliminate The War Room" (TO: Marketing, From: Bill (You're Welcome)).  Cool, you made WebEx version 1.0.  Congratulations.  But this really will not ultimately help solve harder problems.  It is a better way to communicate, yes, but it is not a core fundamental of diagnosing performance problems.


Back to intro
Continue to New Relic
Continue to Dynatrace

Tough Questions: New Relic

DISCLAIMER: All views and opinions are based off of individual experiences and are not reflective of anyone or anything other than myself.  The information stated below has either been obtained from personal engagements or is available via other publications.

I am going to start off with the company I do have a lot of respect for.  New Relic emerged on the APM scene with the launch of its SaaS portal in 2013 (it was founded in 2008).  The main talking points of anyone positioning New Relic will all be focused on data.  It is pretty simple to see that their main target is what they have aptly named "data nerds".  Data science is a huge driver in arenas such as big data.  Data and understanding end users help increase conversion rate, quickly identify performance problems, and all around make organizations more efficient.

New Relic positions its platform into 7 components.  They are APM, Mobile, Insights*, Server, Browser, Plugins, Synthetics.  Insights is the component I am personally excited to see go GA since it will potentially be big data analytics as a service.  So without further ado, here are some of the questions you should ask New Relic:


How do you correlate different data points?

This can be phrased in multiple different ways.  Another way of thinking of this would be: if a user calls in and reports a problem, can I find that end to end transaction?  The answer will be "no" to that second question.  Most vendors struggle with this and use time-stamp based correlation to "follow" call stacks or service calls from one area to another.
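
To make the time-stamp point concrete, here is a minimal Python sketch of my own.  It is not any vendor's implementation, and every name in it (`Event`, `correlate_by_time`, the 0.5 second window, the sample clicks and calls) is a made-up assumption for illustration.  It shows why matching purely on time gets ambiguous as soon as two users click at nearly the same moment:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str         # hypothetical label, e.g. "user-a:checkout"
    timestamp: float  # seconds on a shared clock (an assumption in itself)


def correlate_by_time(clicks, backend_calls, window=0.5):
    """Map each click to the back end calls that started within `window`
    seconds after it.  Nothing ties a call to a click except timing."""
    matches = {}
    for click in clicks:
        matches[click.name] = [
            call.name
            for call in backend_calls
            if 0 <= call.timestamp - click.timestamp <= window
        ]
    return matches


def ambiguous_calls(matches):
    """Return the back end calls that were claimed by more than one click."""
    owners = {}
    for click_name, calls in matches.items():
        for call in calls:
            owners.setdefault(call, []).append(click_name)
    return {call: who for call, who in owners.items() if len(who) > 1}


# Illustrative data only: two users click 100 ms apart.
clicks = [Event("user-a:checkout", 10.00), Event("user-b:checkout", 10.10)]
backend_calls = [Event("POST /order #1", 10.05), Event("POST /order #2", 10.30)]

matches = correlate_by_time(clicks, backend_calls)
print(matches)
print(ambiguous_calls(matches))
```

With this sample data the second back end call falls inside both users' windows, so timing alone cannot say which click it belongs to.  That is the gap to probe when you ask the "how" question.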

How do you license data?

With the onset of Insights (again, I saw the platform and it is pretty slick), there are often questions around what can be done with this big data analytics.  As of right now, you will need to pay for each of the 7 components listed above.  It does appear that New Relic will be double dipping when it comes to Insights.  If you have an APM agent (which you will have to pay for) you will also have to pay for storage of those specific calls in Insights.  It's your data, but you gotta pay the toll for using it?

How do you grab custom metrics?

Everyone says they can grab custom metrics.  Have these guys show you what is needed.  The biggest headache I hear from New Relic customers is the complexity it requires to do something as simple as grab a method argument or create a custom dashboard.  Outside of the Insights platform, I have even heard grumblings around alerting.  This is a real scenario that happened at a customer.  Let's call the customer George:

George: "I just bought New Relic. Can Dynatrace do custom dashboards and alerts?"
Bill:  (Looks confused because this is peanuts when it comes to features/functionality of Dynatrace) "Yes."
George: "Great! Prove it!"
Bill: "K."
15 minutes later
George: "I'll buy"

One of the cornerstones New Relic stands on is its focus on data, but without Insights (AKA additional costs) the platform does not appear to do any sort of customized dashboard.

How do you integrate with non-production environments?

This is key for companies who want to promote better code to production (otherwise known as every company).  One of the biggest cons I see with the platform as a whole is its focus on production environments.  This is due to the fact that the data collection is only powerful for large scales of data, which typically do not happen in non-production environments.  Yes, New Relic may have integrations with CI tools or even load generation tools.  But fundamentally, is the value shown enough to warrant an additional license purchase?  Always be wary of vendors who give away test licenses for free.  That means they have conceded that the value prop of their solution just is not there.


Final Rant

New Relic can help identify a significant amount of performance problems BUT I have not seen it used in a triage scenario.  The problem is, as much as they preach their data collection, a lot of the data is filtered out and not stored within the environment.  I am of the opinion that we should not trust the computer overlords with the ability to determine what is important to store or not.  Not yet anyways.


Back to intro
Continue to AppDynamics
Continue to Dynatrace

Tough Questions: The Intro

DISCLAIMER: I am a current employee of Dynatrace, but all views and opinions expressed in this article are my own and not necessarily shared with the Dynatrace company or its affiliates as a whole.

"How are you different than your competition?"  This is the question I am guaranteed to get asked during every single engagement.  It is also the question I hate the most.  REALLY?!  I view my job as showing organizations there is a better way for application development.  When I am engaged in an account I am not thinking "how can I show my tool is superior to another they are checking out".  I am trying to prove that if you introduce this new wrench into your toolbox, life can be so much easier!

On the other hand, it is a fair question for someone investigating any solution to ask.  However, in all fairness, I personally could say whatever I want at that moment.  The real proof of the differences should not be based on what an individual representative of any company can state, but on what they can show (I get the irony in the fact that I am sharing these details over a blog post).  In this next series of blog posts, I am going to go over the tough questions all companies should be asking New Relic, AppDynamics, and Dynatrace.  Why these three?  Well, not only are they Gartner's leading APM vendors (all in the top right of the magic quadrant), they are also the three for which I feel comfortable discussing the major concerns most companies encounter while using the solution.

If you are investigating any of these solutions, I strongly encourage you to ask each vendor the listed questions pertaining to its solution.  We all have our skeletons and all do some sleight of hand when it comes to topics we are not 100% confident in.  I am hoping to put some light on all points of concern for each APM solution provider to help the industry make the best decision possible.


I started writing this and realized there is a common pattern in some areas.  Here is a pro tip: never ask "Can" questions.

  • Can your product follow transactions?
  • Can your product integrate with Jenkins?
  • Can your solution monitor SAP?
  • Can you give visibility into web servers, browser, mobile app, or mainframe?
  • etc.  


Instead ask "How" questions.

  • How do you follow transactions?
  • How does your integration with Jenkins look?
  • How can your solution monitor SAP?
  • How do you get visibility into the web servers, browser, mobile app, or mainframe?


Every vendor can answer yes to a "Can" question.  A "How" question forces visibility into the technical way a vendor is going to accomplish that feat.  Some (to most) of the time, the answer to a "How" question will show that the vendor will not do what you want.

Continue to New Relic
Continue to AppDynamics
Continue to Dynatrace

Friday, May 8, 2015

Becoming Fit for End Users

Dynatrace has the solution that is meant to improve any company’s digital performance.  In order to accomplish this feat, multiple levels of routines are available to any company looking to improve their current application physique.  There are a lot of individuals and teams who are not happy with what they see when looking at the shape of their application.  Others are content, but still have thoughts of, “What can I do to improve my confidence?”  Dynatrace is your one stop shop regardless of whether you need a complete lifestyle change or just want to tone discrete areas of your application body.


Gym Membership

Ruxit

Ruxit is the platform for teams who have decided, “Something needs to change”.  This starting step requires the admission that the current way of monitoring the environment is no longer capable of keeping up.  Simple commitments need to be made in order to start the path to a healthier environment.  With Ruxit’s repeatable payment options, it is the perfect choice for organizations that would like to become more efficient.  This step normally opens up the door to further development and understanding of the delivery process within an organization.  It can lead to deeper workouts into teams’ applications, but can also be sufficient for certain groups.



Stretching

Synthetics

Every good routine starts off with a warm up.  Whether it is testing the muscles’ current ability, or preparing for the workout ahead, anyone who is serious about getting fit needs to prepare before the “real load” is added.  Synthetics cover this aspect of digital performance fitness quite well.  Even long after the workout is complete, synthetics allow the team to understand the health of the system.  Synthetics can pinpoint issues within a system even when there are no real users interacting with it.  Which would you rather have: an alert that says the system went down at 2AM, or a horde of end users calling in complaining they cannot log in at 8AM?  By this point, stretching exercises are a common practice in the field, and some people may say that it is commoditized.  Dynatrace Synthetics are specifically designed to capture end user experience and are fully integrated with the rest of the workout.
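
For anyone who has never seen a synthetic check, the idea boils down to a scripted request run on a schedule, with an alert when it fails or gets slow.  Below is a bare-bones Python sketch of that idea.  It is an assumption-laden illustration, not how Dynatrace Synthetics works: `run_check`, `CheckResult`, the 2 second threshold, and the example URL are all hypothetical, and a real product would drive a full browser from many locations.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    url: str
    ok: bool
    status: Optional[int]
    elapsed_s: float
    error: Optional[str] = None


def http_fetch(url, timeout=10.0):
    """Issue a real GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def run_check(url, fetch=http_fetch, max_elapsed_s=2.0):
    """Run one synthetic check: healthy means HTTP 200 within the time budget."""
    start = time.monotonic()
    status, error = None, None
    try:
        status = fetch(url)
    except urllib.error.HTTPError as exc:  # server answered with 4xx/5xx
        status, error = exc.code, str(exc)
    except OSError as exc:  # DNS failure, refused connection, timeout, ...
        error = str(exc)
    elapsed = time.monotonic() - start
    ok = status == 200 and elapsed <= max_elapsed_s
    return CheckResult(url, ok, status, elapsed, error)


def refuse(url):
    """Stand-in fetcher that simulates the 2AM outage."""
    raise ConnectionRefusedError("connection refused")


# Stand-in fetchers keep this sketch runnable without network access.
healthy = run_check("https://example.com/login", fetch=lambda url: 200)
down = run_check("https://example.com/login", fetch=refuse)
print(healthy.ok, down.ok, down.error)
```

Swap the stand-in fetchers for the default `http_fetch`, put the call on a scheduler, and you get the 2AM alert instead of the 8AM phone calls.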


Cardio

DCRUM

Endurance is key to any body that wants to be fit.  Being able to run long distances requires awareness of the complete body, from stride form to proper breathing habits.  DCRUM offers the “far and wide” approach as well.  DCRUM is meant for ensuring that all components keep working through the long workouts where the applications are relying on multiple systems to work.  DCRUM will not only worry about core services like application tiers, but also items such as network devices and packet communication.  DCRUM provides companies the ability to automatically determine the specific fault domain where the issue is happening.  Maybe the issue resides in a sore load balancer.  Or maybe there is a problem with packet flow after a few hours.  DCRUM can help identify these issues, so steps can be taken to isolate the problem to that tier.  This is to prevent the generic “I have to stop because I can’t go on” excuse.  This prevents the war room approach and focuses efforts on the problematic tier as opposed to unnecessary diagnostic efforts.  DCRUM is language agnostic and focuses on easily covering the whole environment.


Weight Training

Dynatrace Application Monitoring

Focusing on key muscle groups will bring noticeable changes to one’s physique.  An individual can customize routines to focus on areas they consider a part of their core.  For digital assets, this could be a deep look and workout of a specific Java tier.  Dynatrace Application Monitoring is meant for finding out the deep why of problems.  If the ankle is sore, is it because of a problem in the code, or the communication it receives from the web server?  Every member of Dynatrace Application Monitoring will have certain weaknesses the team would like to address.  Once those weaknesses become strengths, other parts of the application body can be focused on.  Dynatrace Application Monitoring has the ability to focus on major muscle groups like Java, .NET, PHP, Node.JS, Message Brokers, Web Servers, Native, Mainframe, and Databases, and can go all the way to tracking end user sessions from native mobile apps, mobile web, and desktop interactions.  Multiple muscle groups can be included in a single workout.  The individual can notice the differences and compare the problematic areas of not only the front end services, but also the interaction between the front and back end services, in never before seen depth.


Personal Trainers

Enablement Services

You will eventually plateau.  There is a limit to one individual’s knowledge of fitness and performance.  This is why Dynatrace offers an Enablement Services program.  Our professional team is not designed to do the workout for you.  They are field experts that have multiple clients and can help coach your organization in current best practices and new routines that will take your digital assets to the next level.  Sometimes they may be brash, but they are always there to push you beyond what you may think you are capable of.  They are offered for all of the workout types listed above and come in many different packages to meet your current requirements.  To reiterate: Enablement Services are not designed to rack the weights and do the lifting for you.  Enablement Services are available to show you proper form and technique, so you yourself can take your application to the next level.


The Obstacle Course

Production


Fundamentally you are not building applications to have the quickest response time or the most efficient application processes.  A digital asset exists to serve end users.  This is why production is separated out into a category of its own.  No one in their right mind would say “I am going to run ten miles today” if their longest run up until that point is less than five.  This also means an organization should not say, “I am going to launch an application that one million users will use” without proper preparation.  This is where Dynatrace really separates itself from any other partner today.  We are focused on making you sweat in development, test, and performance environments, but do so in a way that maximizes your time.  Dynatrace wants to see you call a single developer when a load test throws a failure, as opposed to thinking the whole body needs to investigate what the issue is.  Do not misinterpret this article as saying Dynatrace is not ready for production.  In fact, Dynatrace is amazing for production awareness.  You will still be monitoring your heart rate in production with DCRUM, just as you will still be aware of your individual muscles’ performance from Dynatrace Application Monitoring, and you will still have time at the water stations for a quick stretch to see how the system is doing from Synthetics.  Dynatrace will still always be there, cheering you on from the sideline as you pass competition and shatter expectations, even the ones you had of yourself.

Defining Performance


Merriam-Webster, the standard for dictionaries, has 6 definitions for performance.  Performance can refer to either a positive or a negative execution of a certain act.  Performance is not limited to the technology world and fundamentally consists of two components regardless of what is being measured.  The two key points, and their underlying determinants, are Speed and Consistency.

SPEED

(Trae Waynes 40 yard dash at the combine.  Courtesy of MLive.com)

The first questions regarding most tests revolve around time.  The time it took to run the 40 is a cornerstone of an athlete's determined worth as a player.  There are similar metrics used to determine the speed at which transactions flow through applications.  The problem that occurs in most organizations is: where should these timers start?  There are teams that strictly care about speed of delivery for single components, for the network, or for the infrastructure, and the overall end to end 40 time is often not tracked.  In today's world, the speed that matters is all relative to the end user.  Internal RTT inside the datacenter could be sub-millisecond, but users will still leave your site if they experience problems from their device.  End to end delivery is what matters most, and if you are blind to the delivery of content to devices, you are never seeing the 40's finish line.


CONSISTENCY

"60 percent of the time, it works every time" - Brian Fantana (Anchorman 2004)

Your neighbor tells you about this great pizza place down the street.  You decide to give them a try.  Upon calling them up, you order a Hawaiian pizza.  If the delivery person shows up one second later with some General Tso's, you (some part of you) will still be disappointed.  Same goes for application delivery.  Even if some sort of response is generated, your brand is impacted when that response is not consistent from session to session or device to device.  When inconsistencies do arise it is imperative to find out what is consistent about the inconsistencies.  Today, can you determine if the issues are related to a certain host, process, time of day, geo region of customer, or device of end user?  Without these metrics, there are users who are leaving your brand and you might not even be aware of the issues.
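
Finding "what is consistent about the inconsistencies" is mostly a grouping exercise.  Here is a small Python sketch of one way to approach it, assuming you already have one record per transaction with a pass/fail flag and a few dimensions attached.  The record layout, the function names, and the sample data are all hypothetical and only there to illustrate the idea:

```python
from collections import Counter


def failure_rates(transactions, dimension):
    """Return the failure rate for each value of one dimension (e.g. per host)."""
    totals, failures = Counter(), Counter()
    for txn in transactions:
        totals[txn[dimension]] += 1
        if not txn["ok"]:
            failures[txn[dimension]] += 1
    return {value: failures[value] / totals[value] for value in totals}


def most_suspicious(transactions, dimensions):
    """Pick the dimension whose worst value stands out most from its best value.

    Returns (dimension, (worst_value, spread)), where spread is the gap
    between the highest and lowest failure rate inside that dimension.
    """
    spreads = {}
    for dim in dimensions:
        rates = failure_rates(transactions, dim)
        worst = max(rates, key=rates.get)
        spreads[dim] = (worst, rates[worst] - min(rates.values()))
    return max(spreads.items(), key=lambda item: item[1][1])


# Illustrative records only, not real measurements.
transactions = [
    {"host": "web-1", "device": "desktop", "ok": True},
    {"host": "web-1", "device": "mobile", "ok": True},
    {"host": "web-2", "device": "desktop", "ok": False},
    {"host": "web-2", "device": "mobile", "ok": False},
    {"host": "web-1", "device": "mobile", "ok": True},
    {"host": "web-2", "device": "desktop", "ok": True},
]

print(failure_rates(transactions, "host"))
print(most_suspicious(transactions, ["host", "device"]))
```

In this made-up data the failures split evenly across devices but pile up on one host, so the host is the thing the inconsistencies have in common.  The sketch only works if those dimensions were collected in the first place, which is the whole point.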



WORLD TODAY

Performance is a very hot topic today.  The performance fires were stoked by the explosion of mobile and tablet based interfaces available in the market.  There are huge opportunities for teams and organizations to become thought leaders in this new world, but it does take quick and calculated change.  Data will drive these changes, and performance will drive users' feelings towards the brand.