Friday, August 14, 2015

Containers: The Concept is Small, but the Impact is Big!

Due for a Disrupt

Next year AWS turns 10 years old.  The cloud is getting extremely mature, and there are multiple cloud providers touting the individual features that make each of them unique in value.  There is a huge following (myself included) that believes "the next cloud" will be containers.  The AWS for containers will be Docker.  I have attended a few Meetups regarding Docker specifically, and I am always impressed with the innovations that are constantly being released.  Docker version 1.8 was released this week, and I see it as the first release that makes the solution enterprise friendly.  There are a few items to call out specifically in this release that make getting started with Docker that much easier, most notably Docker Toolbox.  One of the items I do not see discussed enough is the overall value containers bring to each group within the software development life cycle.  Containers are not just a new way to develop and deploy applications.  Containers provide significant value to each team that interacts with an application.  Let us investigate how containers impact development, deployment, operations, and business teams.


A Developer's Dream

Take a look at any downloaded application, and you will see something like this: 
To get each one of these icons, developers have to create specialized libraries specific to each operating system.  Imagine a world where this is not needed.  Imagine the reduction in work required by a developer to produce an application that can be run on any operating system.  Imagine if a developer did not have to push out fix packs to support a new security patch.  This would fundamentally cut down on the time required to reach all potential users of that application.  Do you want to develop faster?  Containers are the option for you!

Microservices Deploy Faster

It only takes one syntax error to break a deployment.  Too many times I have been sitting on a deployment call where someone fat-fingered a host name or forgot to open a port and the whole deployment failed.  Microservices are the answer to these problems, and containers effectively ensure that microservices are deployed correctly.  If there is an issue with a single configuration, it is easily identifiable from the container's perspective and will only impact one microservice as opposed to corrupting the whole push.  There are a lot of players developing container management solutions, and in turn they are making microservices extremely easy to manage and control.
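
To make the "one bad configuration only blocks one microservice" idea concrete, here is a minimal sketch in Python.  It is not taken from Docker or any container management product; the function names, the config keys (`host`, `port`) and the service names are all hypothetical and chosen only for illustration.

```python
def validate_service(name, config):
    """Return a list of problems found in one service's config (hypothetical schema)."""
    problems = []
    host = config.get("host", "")
    port = config.get("port")
    # A fat-fingered host name: empty, or containing a space.
    if not host or " " in host:
        problems.append(f"{name}: invalid host {host!r}")
    # A forgotten or impossible port.
    if not isinstance(port, int) or not (1 <= port <= 65535):
        problems.append(f"{name}: invalid port {port!r}")
    return problems


def plan_deployment(services):
    """Split services into those safe to deploy and those blocked by config errors."""
    deployable, blocked = [], {}
    for name, config in services.items():
        problems = validate_service(name, config)
        if problems:
            blocked[name] = problems  # only this service is held back
        else:
            deployable.append(name)
    return deployable, blocked


if __name__ == "__main__":
    services = {
        "cart": {"host": "cart.internal", "port": 8080},
        "search": {"host": "search internal", "port": 9200},  # typo in host
    }
    deployable, blocked = plan_deployment(services)
    print("deploy:", deployable)
    print("blocked:", blocked)
```

The point of the split is that a typo in one service shows up against that service alone, while the rest of the push can still go out.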

Growing Elastically for Production

A question I always ask customers is: "How many concurrent users do you have?"  When I do get an answer (which is very infrequent), it is often wrong and not true for every scenario.  Virtualization was the original answer to the requirement for elastic environments.  This did work for a time, but now environments are becoming too large to manage.  An application developed in containers can grow horizontally and vertically to account for any event (even catastrophic outages).  Getting ready for Black Friday?  Add more containers.  Want to cut down on cloud charges during non-peak hours?  Scale down your containers.  Want to embrace continuous deployment/integration?  Containers allow for the smallest of incremental changes and are built to handle true A/B testing in any environment!  Containers allow for 100% availability of any application that can still be constantly improved without impact to the end user!
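
As a rough illustration of the scale-up/scale-down idea, the sketch below computes how many container replicas a given number of concurrent users would call for.  The function name, the per-container capacity, and the floor and ceiling are all hypothetical values picked for the example, not figures from any particular orchestrator.

```python
import math


def desired_replicas(concurrent_users, users_per_container=500,
                     min_replicas=2, max_replicas=100):
    """Return a replica count for the given load (all defaults are assumptions)."""
    if concurrent_users < 0:
        raise ValueError("concurrent_users must be non-negative")
    needed = math.ceil(concurrent_users / users_per_container)
    # Clamp so we never drop below a safe floor or exceed the budgeted ceiling.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    print(desired_replicas(40_000))  # a Black Friday style peak
    print(desired_replicas(300))     # quiet overnight traffic
```

In practice the hard part is the input, which is exactly why the concurrent users question matters: the calculation is only as good as the load number you feed it.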

Protecting the Brand

I have the impression that the business is the hardest group to convince that containers are the best strategy going forward.  Docker's image of a whale carrying shipping containers does not paint a picture that reflects simple and easy.

However, there is significant value to the brand when utilizing containers.  One of the terms business/feature folks always seem to know is Six Sigma, which they tend to read as the guarantee that the application will be running practically all of the time.  Containers are THE way to ensure that end users are not impacted by hiccups within the delivery chain.  On top of this, with the reduction of development time required for cross-OS support, new features can be added to the product more efficiently and can be market tested with a subset of users before being deployed to everyone.  If you are the key feature driver of the application, and have ever asked "why can't we get this in quicker", you should have your team investigate containers!

The Better Way is Here!

Will containers become the standard for application development?  I strongly believe the answer is yes.  Living in the software world, I often see key solutions that benefit a specific group.  Containers are a technology that has a key value prop regardless of what team you sit on.  I for one am going to hitch my brain to this Docker wagon and see how much value I can derive!


Thursday, July 2, 2015

The Problem With Big Data

Over the past few years, Big Data has been a marquee term in the technology space.  Multiple organizations have tried to tackle the fundamental problems associated with Big Data analytics.  Heavy investments have come into key areas, mainly focused on scalability and request efficiency.  Even recently, IBM made a huge investment in the Big Data space.  Ultimately, Big Data technologies are meant to answer questions based on noticed trends that would be impossible for a user or group of users to determine themselves.  Although these technologies are quite impressive, they often lack an easy way to COLLECT metrics without involving someone familiar with the component's specific outputs.  This is the problem with Big Data: "how do I get the data I want to be consumed by the Big Data solution" is still a question no one has an industry-recognized solution for.


"Just Get Us the Data, and We'll Take it From There"

If you read my previous article, you know I am borderline obsessed with the phrase "there has to be a better way".  There are plenty of examples of Big Data tools utilizing common communication frameworks and protocols.  The real effort in this process is figuring out how an application owner pulls out that data and submits it to the Big Data solution with minimal, effective effort.  Many times, I see organizations put the burden of outputting the data on developers, who either write REST interfaces into the application or, even worse (from a performance perspective), write out the data to log files.  Both of these efforts end up solving the problem, but they could introduce security issues, alongside the fact that you are asking a developer to implement a brand new component into the solution for one-off requests.  Try this exercise:

  1. Think of a question you would like to ask your application
  2. Write down all of the metrics or points of data required to come up with an answer
  3. Picture all of the individuals required to get at each one of those points of data
  4. What if one of those individuals defines a metric differently, what would be the impact to the answer?
  5. Can anyone maliciously use this data if accessed?

These are just the topics you need to cover to answer one question.  The only way to prevent this from becoming unbearable is to get to the same results using a different path.


If You are Relying on Logs, You are Doing IT Wrong

Log files are up for interpretation.  Did a developer come up with the string that is written?  Then there are probably edge cases where that output is not right.  Basically, stop writing log files to get at specific points of data.  There are solutions out there that will instrument the application and provide a much richer context of what is going on within the stack.  This context could never be captured in a string written to the file system.  Instead of writing one-off messages to solve one-off problems, work on implementing a single framework that can be utilized across the organization to get at all points of data.
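
To show what "a single framework" could look like in the smallest possible form, here is a hedged Python sketch.  It is not any vendor's instrumentation API; `MetricRegistry`, `MetricDefinition`, and the metric names are hypothetical.  The idea is simply that every metric is defined once, with one unit and one meaning, and a second team cannot quietly redefine it.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    unit: str
    description: str


class MetricRegistry:
    """One shared place where metrics are defined and recorded (illustrative only)."""

    def __init__(self):
        self._definitions = {}
        self._samples = []

    def define(self, name, unit, description):
        new = MetricDefinition(name, unit, description)
        existing = self._definitions.get(name)
        # Refuse a conflicting definition so two teams cannot mean different things.
        if existing is not None and existing != new:
            raise ValueError(f"metric {name!r} is already defined differently")
        self._definitions[name] = new
        return new

    def record(self, name, value, **tags):
        if name not in self._definitions:
            raise KeyError(f"metric {name!r} has not been defined")
        sample = {
            "name": name,
            "value": float(value),
            "unit": self._definitions[name].unit,
            "tags": tags,
            "timestamp": time.time(),
        }
        self._samples.append(sample)
        return sample

    def samples(self, name):
        return [s for s in self._samples if s["name"] == name]


if __name__ == "__main__":
    registry = MetricRegistry()
    registry.define("checkout.duration", "ms", "Server-side time to complete checkout")
    registry.record("checkout.duration", 182, region="us-east")
    print(registry.samples("checkout.duration"))
```

Compared with a free-form log string, each sample here carries its unit and tags with it, so nobody downstream has to guess what the developer meant.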


The Three Points

Coming full circle, a real Big Data implementation is comprised of three components (similar to monitoring tools): data collection, data analysis, and data presentation.  I get the feeling there are a large number of players trying to corner the latter two areas.  The data collection area is still extremely green in my eyes, and I am eagerly waiting for someone to really make a play for answering that question.

Sunday, May 31, 2015

Velocity Takeaways

I had the privilege of attending the 2015 Velocity Conference in Santa Clara.  It was an amazing show with a ton of great speakers and even more exhibitors.  I always find the interactions at the booths are where a lot of knowledge transfer happens.  I did hear a pretty distinct pattern in a lot of the offerings, though.  The pitch went something along the lines of "we get you the metrics to find root cause".  This messaging sounds pretty amazing at face value!  I believe it is imperative to determine what the speaker means by root cause, and how many scenarios they are ready for.  To determine what types of problem patterns a monitoring solution could potentially solve, I look at three components.  The three are: data collection, data analysis, and data presentation.  This article will cover the first.


Data Collection

Monitoring solutions can only solve problems they can see.  Are they collecting end user clicks?  If not, then they cannot determine the impact to end users.  Seems simple enough, but you should have an in-depth conversation about the mechanism by which they collect those statistics.

Some tips on collection

1.  If they do not have a library or agent somehow injected into the running process, they will not get root cause on the call stack.  Examples:
[Figure: PurePath view from Dynatrace Application Monitoring]

  • sync issues
  • CPU consumption hogs
  • exceptions
  • correlating log messages to called transaction

2.  If they do not have network based capture, they will not get network issue resolution.  Example:
[Figure: Network health breakdown from Dynatrace DCRUM]

  • Retransmission issues
  • Packet loss
  • Network redirects  

3.  End user device capture was the biggest opportunity for business at Velocity this year.  The key call outs for this to be possible are:
[Figure: Visit capture screen from Dynatrace User Experience Management]

  • SDK for native devices
  • Ability to see non-web based transactions (think traditional thick client requirements)
  • JS agents injected into browser/mobile browser*
              *Watch out for W3C-based capturing if you are using AngularJS.  W3C timings will not get the same visibility as with other frameworks.



Final Thoughts

All in all, the conference was a ton of fun.  I was able to see what others in the industry are releasing into this space.  I even got a demo of the New Relic and AppDynamics portals.

Although a lot of organizations were touting the "root cause" messaging, I only saw glimmers of others demonstrating a true root cause analysis.  The more complex applications have become, the more the powerful monitoring tools have started to pull away in this fight.  I have yet to find the silver bullet for performance analysis, and probably never will, but the journey is very entertaining!

Monday, May 18, 2015

DevOps is Not Just a Tool

I was forwarded an interesting article last week and it really made me think.  The article in question is written by John Allspaw and entitled "An Open Letter To Monitoring/Metrics/Alerting Companies".  The main point John makes is that regardless of how advanced and powerful a piece of software is, there will always be a need for a user on the other side of the screen making sense of the data.  Software at its core is just a product.  I agree with John that it is disingenuous for a company to pitch their product as "the end all be all solution for better troubleshooting".  There are other components in making a true DevOps practice a reality.

I am a huge fan of the CNBC show The Profit.  If you have never watched it before, the host, Marcus Lemonis, travels around and invests his personal capital in struggling businesses.  He dedicates his time and effort to getting businesses on an exponential growth path.  Marcus always refers to a simple formula for success.  The formula consists of the three P's: Product, Process, and People.  A true DevOps environment will also include the three P's.


Product

                           Is it all there?

The tools of an organization working on DevOps practices must have the ability to facilitate cross-team conversations.  This means that the tool (or tools) implemented must present data in multiple ways that can be interpreted by different teams.  So logically, the tool must be able to do one thing above all else: collect data.  Analyzing incomplete pictures will always lead to incomplete results.  Be wary of solutions that will always "filter out noise".  The hardest problems to solve always seem to be in the "noise" of the environment.


Process

                         It's ok to look down

Why does your company do things the way it does?  Why do you have those status calls?  Why do you have an internal forum?  If you cannot answer these questions, I do not want to talk to you.  You should have a clear understanding of the process of your organization and the "why" behind that process.  In simpler terms, process is meant for the sake of progress.  In a DevOps world the process should facilitate communication between all the groups involved in the SDLC.  I have seen some of the best organizations put Ops members on the weekly Dev touch bases and vice versa.  Production problems suck.  They are stressful, a huge drain on the business, and a black eye on the brand.  If an organization takes the steps required to allow honest discussions regularly, you will see a noticeable drop in P1 production issues.


People

                                    No computer will replace you

The center of the DevOps movement is the people.  Bottom line.  To build the best applications, you need the best people.  I have worked with some amazing engineers, and some not so amazing engineers.  I have sadly had to walk away from potential deals because I knew the group that would be responsible for understanding the data was not capable of doing so.  The reason why I walk away from those situations is the same reason a chainsaw company will not sell its product to a 10-year-old.  The child is not ready and will likely hurt itself as opposed to providing any value.


Final Rant

I hate the fact that organizations look down on services when evaluating potential purchases.  Requiring services does not mean the product is hard to understand; it means your organization requires some assistance in perfecting your individual products, processes, and people.  You cannot honestly think your team will instantly become better just by installing something in your environment.  Just like learning anything new, it will require changes on all three fronts to truly become a DevOps shop.



Friday, May 15, 2015

Tough Questions: Dynatrace

DISCLAIMER: All views and opinions are based off of individual experiences and are not reflective of anyone or anything other than myself.  The information stated below has either been obtained from personal engagements or is available via other publications.

If I did not work for such a great company, I might get called into the principal's office for this one.  Dynatrace has taken a unique journey since its inception in 2007.  The customer list is over 6,000 and includes nine of the top ten retailers and banks (who is holding out?).  As of right now, Dynatrace is the market share leader when compared to the other two companies discussed in these articles.


The platform is comprised of Application Monitoring, User Experience Management, Synthetics, and Data Center RUM.


How does Ruxit fit into this equation?

A company's future is an important item to learn about whenever making a large purchase.  I am personally working with a company that is now scrambling to find an alternative since the collapse of OpTier.  With Ruxit being introduced into the same APM market as Dynatrace Application Monitoring, questions around internal competition often come up.  Companies should be confident in choosing the right solution, and in that solution's future.


How come you do not have an analytics platform?

New Relic and AppDynamics have one.  Why is there not a Dynatrace Analytics?  With as much "me tooing"* as is going around in this industry, it only makes sense that Dynatrace is also getting into the big data crunching space that the other two vendors are stepping into.

*When a company releases a feature that companies like, and then a competitor releases a similar feature.  AKA keeping up with the Joneses.


How is your platform integrated?

Since Dynatrace has agent, agentless, and synthetic platforms, correlation across these multiple data collection points will be a necessity for understanding the data collected.  Obviously there are integrations that exist today, but ensure that you understand how those connections work and what data is shared between the tools.


Final Rant

It has been a very fun and interesting journey working for Dynatrace.  Nearly every day I have the pleasure of working alongside some of the most talented coworkers, partners, and clients.  I started this blog because this company has shown me there is a better way.  It is our job as fellow performance nerds to constantly question and push for improvements regardless of where the opportunity may be.





Wednesday, May 13, 2015

Tough Questions: AppDynamics

DISCLAIMER: All views and opinions are based off of individual experiences and are not reflective of anyone or anything other than myself.  The information stated below has either been obtained from personal engagements or is available via other publications.

AppDynamics was founded in 2008 and launched its first version in 2010.  Since launch they have made a ton of noise and have created a pretty impressive list of customers.  AppDynamics is enterprise focused but can do deployments ranging from start-ups to large dot-coms.  The two main founders started AppDynamics after leaving CA Introscope, so they have really been in this space for a few decades.  Their main target audience appears to be production operations teams, and they focus heavily on getting C-level conversations for enterprise-wide deployments.

The platform is comprised of APM, Mobile Real-User Monitoring, Browser Real-User Monitoring, Database Monitoring, Server Monitoring, and Application Analytics.  There is a lot of buzz around a new Hadoop based architecture the platform is moving to, but I am going to rate the company as is today.

How do you correlate different data points?

Whoa, deja vu!  Same thing as before.  Make sure they explain how, when a user clicks on a button or link, the back-end requests are tied together.  Fact of the matter is, APM and touch-point based capture (browser/mobile) are two separate components relying on separate hardware.  This is true for their back-end correlation as well.  You will not get end-to-end capture for a majority of transactions (like 99.9% in high traffic environments).


How many agents do I need to install?

This is sort of like part two of the question above.  AppDynamics requires a separate process to grab host metrics.  Not the end of the world.  More an annoyance when it comes to upgrading and managing a complex environment.


How do you monitor mobile/web server/mainframe?

This is the one they should seriously be called out on every single time.  Their "agents" for these components are not agents.  AppDynamics subscribes to counters or relies on other vendor integrations to get at this data.  If you want counters, write a script; do not pay for that type of stuff.
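
For a sense of how little is involved in "write a script", here is a minimal Python sketch that grabs a handful of host counters using only the standard library.  The function name and the choice of counters are my own for illustration; a real environment would pick whatever counters matter to it.

```python
import json
import os
import shutil
import time


def collect_counters(path=os.path.abspath(os.sep)):
    """Collect a few basic host counters for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    counters = {
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": usage.total,
        "disk_used_bytes": usage.used,
        "disk_free_bytes": usage.free,
    }
    # os.getloadavg() only exists on Unix-like systems, so treat it as optional.
    if hasattr(os, "getloadavg"):
        try:
            counters["load_avg_1m"] = os.getloadavg()[0]
        except OSError:
            pass  # load average could not be obtained on this host
    return counters


if __name__ == "__main__":
    # Emit one JSON document that any downstream tool could ingest.
    print(json.dumps(collect_counters(), indent=2))
```

Counters like these tell you that a host is busy; they do not tell you which transaction made it busy, which is the gap a real agent is supposed to fill.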


How do you scale?

Be careful asking this question.  AppDynamics could quote that they have 10,000-plus agent deployments.  Remember that additional agent they need for host metrics?  Want to guess what makes up those ridiculous-sounding agent-to-controller environments?  If they claim they have X-thousand agents reporting to a single controller server, demand to talk to a customer that currently has that deployed.  They also often name-drop companies they are no longer at, or at least not at the scale they like to mention (AHEM... NETFLIX).


Final Rant

Some of the items they are working on are also very questionable.  The whole DevOps movement at its core is to facilitate communication across all teams involved in the SDLC.  They are taking that too seriously with the "Virtual War Room".  APM is not meant to "make a better war room".  APM at ITS core is meant to "Virtually Eliminate The War Room" (TO: Marketing, From: Bill (You're Welcome)).  Cool, you made WebEx version 1.0.  Congratulations.  But this will not ultimately help solve harder problems.  It is a better way to communicate, yes, but it is not a core fundamental of diagnosing performance problems.



Tough Questions: New Relic

DISCLAIMER: All views and opinions are based off of individual experiences and are not reflective of anyone or anything other than myself.  The information stated below has either been obtained from personal engagements or is available via other publications.

I am going to start off with the company I do have a lot of respect for.  New Relic emerged on the APM scene with the launch of its SaaS portal in 2013 (it was founded in 2008).  The main talking points of anyone positioning New Relic will all be focused on data.  It is pretty simple to see that their main target is what they have aptly named "data nerds".  Data science is a huge driver in arenas such as big data.  Data and understanding end users help increase conversion rates, quickly identify performance problems, and all around make organizations more efficient.

New Relic divides its platform into 7 components.  They are APM, Mobile, Insights*, Server, Browser, Plugins, and Synthetics.  Insights is the component I am personally excited to see go GA since it will potentially be big data analytics as a service.  So without further ado, here are some of the questions you should ask New Relic:


How do you correlate different data points?

This can be phrased in multiple ways.  Another way of thinking of this would be: if a user calls in and reports a problem, can I find that end-to-end transaction?  The answer to that second question will be "no".  Most vendors struggle with this and use time-stamp based correlation to "follow" call stacks or service calls from one area to another.
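
To see why time-stamp based correlation struggles, here is a small Python sketch.  It is not how any particular vendor implements correlation; the field names (`time`, `start`, `correlation_id`), the one-second window, and the sample data are all assumptions made up for the example.  It compares matching a click to back-end requests by time window against matching on an identifier carried through the whole transaction.

```python
def correlate_by_timestamp(click, backend_requests, window_seconds=1.0):
    """Return every back-end request that started within the window after the click."""
    return [
        r for r in backend_requests
        if 0 <= r["start"] - click["time"] <= window_seconds
    ]


def correlate_by_id(click, backend_requests):
    """Return only the requests tagged with the click's own correlation ID."""
    return [
        r for r in backend_requests
        if r.get("correlation_id") == click["correlation_id"]
    ]


# Two users hit checkout within a fraction of a second of each other.
CLICK = {"user": "alice", "time": 100.00, "correlation_id": "a-1"}
BACKEND = [
    {"service": "checkout", "start": 100.05, "correlation_id": "a-1"},
    {"service": "checkout", "start": 100.20, "correlation_id": "b-7"},
]

if __name__ == "__main__":
    print("by timestamp:", len(correlate_by_timestamp(CLICK, BACKEND)), "candidates")
    print("by id:", len(correlate_by_id(CLICK, BACKEND)), "candidate")
```

With the time window, both requests look like they could belong to the click, and the ambiguity only grows with traffic.  With a shared identifier there is exactly one match, which is what you need when a specific user calls in with a specific problem.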

How do you license data?

With the onset of Insights (again, I saw the platform and it is pretty slick), there are often questions around what can be done with this big data analytics.  As of right now, you will need to pay for each of the 7 components listed above.  It does appear that New Relic will be double dipping when it comes to Insights.  If you have an APM agent (which you will have to pay for) you will also have to pay for storage of those specific calls in Insights.  It's your data, but you gotta pay the toll for using it?

How do you grab custom metrics?

Everyone says they can grab custom metrics.  Have these guys show you what is needed.  The biggest headache I hear from New Relic customers is the complexity it requires to do something as simple as grab a method argument or create a custom dashboard.  Outside of the Insights platform, I have even heard grumblings around alerting.  This is a real scenario that happened at a customer.  Let's call the customer George:

George: "I just bought New Relic. Can Dynatrace do custom dashboards and alerts?"
Bill:  (Looks confused because this is peanuts when it comes to features/functionality of Dynatrace) "Yes."
George: "Great! Prove it!"
Bill: "K."
15 minutes later
George: "I'll buy"

One of the cornerstones New Relic stands on is its focus on data, but without Insights (AKA additional costs) the platform does not appear to do any sort of customized dashboard.

How do you integrate with non-production environments?

This is key for companies who want to promote better code to production (otherwise known as every company).  One of the biggest cons I see with the platform as a whole is its focus on production environments.  This is because the data collection is only powerful for large volumes of data, which typically do not occur in non-production environments.  Yes, New Relic may have integrations with CI tools or even load generation tools.  But fundamentally, is the value shown enough to warrant an additional license purchase?  Always be wary of vendors who give away test licenses for free.  That means they have conceded that the value prop of their solution just is not there.


Final Rant

New Relic can help identify a significant number of performance problems, BUT I have not seen it used in a triage scenario.  The problem is, as much as they preach their data collection, a lot of the data is filtered out and not stored within the environment.  I am of the opinion that we should not trust the computer overlords to determine what is important to store.  Not yet anyways.

