Monday, 19 December 2016

Three Object Stores: the Good, the Bad and the Ugly

AWS S3 Object Storage
I have been looking at the different Object storage cloud services.  Don't worry if you don't know what Object storage is, just think of it as a way to store lots of files in buckets or containers.  These buckets or containers are only one level deep.  This makes them very cheap, and it's the better way to store your documents online when they are write once and read many.  So documents getting lots of reads and writes should stay in SharePoint, but documents that mostly get reads, like a web site or a massive backup or archive store, fit well here.
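To make "only one level deep" concrete, here is a minimal sketch of a bucket as a flat mapping from full key to data. It is a toy in-memory model, not how any provider implements storage, and the names `bucket`, `put_object` and `list_keys` are made up for illustration.

```python
# Toy in-memory model of a bucket: one flat mapping from key to bytes.
# Illustration only; the function names here are hypothetical.
bucket = {}

def put_object(key, data):
    """Store (or wholly replace) an object under its full key."""
    bucket[key] = data

def list_keys(prefix=""):
    """Return keys that start with prefix; 'folders' are only a naming habit."""
    return sorted(k for k in bucket if k.startswith(prefix))

put_object("archive/2016/report.pdf", b"pdf bytes")
put_object("archive/2015/report.pdf", b"pdf bytes")
put_object("site/index.html", b"<html></html>")

print(list_keys("archive/"))
```

The slashes in the keys look like folders, but the store only ever sees one level: a key and its object. Writing means replacing the whole object, which is why write once, read many data suits it.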

I won't go on and on like I sometimes do at home about all the great things Object storage can do.  I will just give you an overview as kind of a power user and designer.

Above is the AWS S3 storage for Object storage.  It sets the standard; look at all those ready-to-configure features.  You can directly host a static web page, and that means JavaScript web pages, people.  And rules to move data to lower-cost storage are right there. What is there not to love?
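As a rough idea of what a "move to lower-cost storage" rule looks like, the sketch below builds a lifecycle rule as a plain Python dictionary. The prefix, rule ID and day counts are assumptions chosen for illustration; the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, but nothing here calls AWS.

```python
def build_lifecycle_rules(prefix, days_to_infrequent_access, days_to_glacier):
    """Build a lifecycle configuration that moves ageing objects to cheaper tiers.

    All values passed in are illustrative assumptions, not recommendations.
    """
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_to_infrequent_access, "StorageClass": "STANDARD_IA"},
                    {"Days": days_to_glacier, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }

rules = build_lifecycle_rules("backups/", 30, 90)
# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=rules)
# where "my-bucket" is a placeholder bucket name.
print(rules["Rules"][0]["Transitions"])
```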

IBM Bluemix Object Storage, and little else

IBM Bluemix has a nice Object Storage with a nice overview interface that tells you how much the Enterprise is storing.  But beyond that it's not much really.

Okay I know this is Azure, and it seems I will need to pay something now

I don't know what to think about storage in Azure.  I think the above is how you would set up Azure's alternative to AWS Object storage.  Really I am not sure, and I am going to have to pay something to find out.  This, I think, comes as part of Microsoft's double strategy of hybrid Cloud and Platform as a Service patterns.  Though Object store is a Platform, it is a pretty low-level one with not much consulting revenue down the road, and Microsoft's indifference to this space shows.

What I first thought when I started looking at Cloud storage is that, outside of SQL Server, Microsoft had one answer: Office 365, especially OneDrive and SharePoint Online.

To be fair, OneDrive does offer an answer, and an answer that probably will work most of the time.  The focus is more on the customers.  Object storage is in a way a competitor to Office and SharePoint.  I could very well skip rolling out full Office 365 and give my users Object storage spaces instead.  I would myself first load massive shared files of little understood value in Object storage rather than OneDrive.

And at this point, Object Storage is Amazon Web Services. 

And the future of data store? 

Okay, here is the prize, a wild speculation on the kind of data store you could make here.  A massive data store with the files stored entirely in JSON.  A node set of bots.  The bots can handle all channels of request, from simple enterprise search and ECM storage to voice requests or IM requests, to email help systems.  You can have your company's main data in semi-structured form, accessible via a wide range of channels and enhanced with cognitive technology or running Spark machine learning applications.

Server-less, the ultimate Cloud destination

The last post on Cloud conversion dealt with massive migration of business logic, and I suggested you try to manage complexity by breaking your information worker tasks into those you don't understand and are custom, and those that are general tools leveraged by users. General tools that give users freedom should be managed by SaaS, the most famous being Office 365 and Salesforce. The others should be migrated to IaaS en masse to reduce the complexity of migration. If you are not happy with this simple model you need to budget for a 6-month to 1-year application rationalization and conversion project.

I am going to skip that for a later post and concentrate now on how to make new things with a Cloud.

You will be migrating your existing systems to IaaS VMs, but for new solutions you will find that the best thing to do is PaaS, where you consume code or data storage without having to worry about infrastructure.

To do this you use the state of the art framework called Serverless Architecture. This will be nothing new to experienced web developers. If you create something on GoDaddy you get some code and you load it up. You configure a MySQL or NoSQL store or log file for data storage, and you just let it run.

Traditionally this has been the test and dev world for developers, and as soon as an application worked it was often moved to an Enterprise hosting solution where a set of servers needed to be defined to run the different layers of the application. A coder who just wrote a bunch of code now had to think about breaking logic and data up to sit in layers, and the specifications for the layers. Other experts had to harden the servers, and as you could now install .exe code, people rushed to create add-ons to give the servers better performance at the cost of vendor or even server lock-in.

Well with Amazon products like Lambda for storing code logic, S3 Object storage to create scalable storage without servers, and DynamoDB, Redshift and RDS you can build a scalable solution without ever defining the servers.

Yes, servers still exist, but it is AWS that worries about the servers. You also have the added benefit that no one on the planet can connect the servers that are running with the code you have created. This provides a great deal of security, as human attacks on the server farm can't get your stuff, or even find it.

Personally I think the best part of Serverless design is that experienced web developers will have to learn next to nothing. All the AWS tools are easier to use than the alternative server based solution. Coders can concentrate on defining the logic, the data and being sure that the solution is responsive enough by design. AWS will provision new CPUs and SSD as needed, you won't even see it.

This is far cheaper and easier to maintain than legacy systems that sit on boxes.

And what is the negative? Well, there is a degree of vendor lock-in with deploying systems this way that does not happen when you install code on VMs in the Cloud running standard OSs.

I personally am not so worried about this because, as the effort to create new code becomes shorter, Enterprises will do what web developers who hosted on ISPs did long ago: develop quickly and make solutions that are easy to migrate. Also, with the expanding pace of digital transformation, the life-cycle of any bit of code will start to be shorter than your hosting agreements with AWS.

There are some political issues to worry about. As a manager you have to be a bit more aware that certain people might not be comfortable with change. Don't ask for an assessment of server-free design from the people who make a living hardening servers; they will find everything wrong with it.

Rather it might be smart to start building your team around security, architecture, code and UX; with the traditional role of networks and infrastructure being reduced over time.

You should start reducing the time and effort between a desire in your staff and a tool. Everyone sort of becomes a developer. You don't have to wait 6 weeks to get some boxes built or permission to expand the VM estate; you just press some buttons.

Make sure that when you start you have an empowered Serverless team with a budget, they need to be able to say yes to things. They also need to be fail first and fail fast kind of people as you don't have to buy expensive servers to try something.

So what is it all about?

Well, after telling the why, now it's time for the what.

Put simply, server-less is building your solution based on SaaS or PaaS elements rather than hosting it on Virtual Machines.

So, for example, say you have some code you want to run in a web server in the Cloud. The dumb thing to do is to buy an EC2 instance of the size you need and load the code on to that. It costs more and you have to do a great deal of configuration. Using containers like Docker reduces the cost, but you now have to manage a container service and still manage your containers as you would on-premise.

If you know AWS you might think to deploy your code in Elastic Beanstalk, which is a better option. You can store your persistent code on cheap S3 Object storage rather than EBS, and you only start up tiny web servers when the code is needed. But Elastic Beanstalk will still create load balancers, EC2 instances and S3 for you, and if your solution is not well written or is heavily used you may end up spending the same as just hosting it on EC2 with EBS attached storage.

Rather, load your code up to Lambda. You create a Lambda function, define the underlying engine and provide the code directly to it along with instances and interfaces.

For the front ends users need to face, rather than creating complex server-side pages, create HTML5 pages with JavaScript. You can use WebSockets to call your Lambda functions via an API gateway. This way you can store your site on S3 Object storage. Data can be stored on PaaS like DynamoDB, Redshift or RDS.
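To give a feel for how little code a Lambda function needs, here is a minimal sketch of a Python handler. It assumes the simpler case of a plain HTTP request arriving through an API gateway proxy integration (not WebSockets), and the `name` query parameter and greeting are invented for the example.

```python
import json

def lambda_handler(event, context):
    """Answer an HTTP request passed in by an API gateway proxy integration.

    Assumption: 'event' carries 'queryStringParameters', which may be None
    when the caller sends no query string.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical parameter
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }

if __name__ == "__main__":
    # Local smoke test with a hand-built event; no AWS account is needed.
    print(lambda_handler({"queryStringParameters": {"name": "S3"}}, None))
```

There is no web server, port or process management anywhere in the file. The JavaScript in the static page on S3 would simply call the gateway URL and read the JSON that comes back.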

So there you have it, fairly 

Friday, 16 December 2016

Office 365 vs AWS

I am not talking here of the task of AWS vs Azure, but the fully SaaS Office 365 vs the PaaS and IaaS of AWS.

These 'opinions' have been gained by hard experience, and though things are changing I think I am coming close to a general set of principles about how larger enterprises should work.

If you are a start-up you can stop reading here; a start-up has massively unique opportunities to save money with technology that a bank or government department will not have.  I am talking about medium to large established companies and agencies with established IT.

Firstly, just migrating existing VMs and containers to the Cloud as IaaS is not going to save you much money.  I have seen this at both ends: in the detailed planning phase for migration with both AWS and Microsoft Azure, and working with a live IaaS system that hosted all the servers of a major enterprise.

Right up front I will tell you that if you move all your servers to IaaS on either AWS or Azure and do nothing else, you are not going to see much difference between on-premise and Cloud.  For firms that consume managed services the two are going to be pretty much the same: you get a bill, and the bill is going to be pretty much the same.

So to get the benefit of Cloud you need to make transformations, and what I am seeing is that it is best to break your entire IT estate into 2 major groups.

I know officially what you are supposed to do is review the business case for each system you migrate over, deciding a final destination based on the application and rationalising the application before you migrate.

In a live talk this is when you ask a room full of CIOs and CTOs how many have a full understanding of the business cases for their applications and the latest architecture they use.

The reality is a data center is going to be full of lots of systems that are not fully understood by anyone in IT as a group.  You may have certain people who fully understand a system or group of systems, but as you move out you inevitably get a poorer and poorer understanding.

And let's be honest, you are unlikely to have staff, or be able to afford experts, with the skills to go to each team that manages a system, get a full understanding of the business and technical parts, and communicate it accurately to management.

The hard truth is that in established IT systems you can only do so much discovery, a full discovery is prohibitive by cost and time and frankly may not really even be possible.  If you have been around for more than 10 years you have a great deal of legacy stuff maybe one guy remembers and each time you talk to the one guy he keeps harping on about....well we all work with technical experts don't we.

What migration to the Cloud has to deal with is uncertainty.  Uncertainty is just a part of life; it seems to be baked into the very fabric of the Universe, and IT, if anything, only increases it as systems become more complex.

So what you need to think is 'what kind of uncertainty does this service face.'

What I would suggest as a first pass is looking at specific vs general uncertainty.

Specific uncertainty is what we usually think about.  A system sitting on a server is old, it has a wide range of users, many sharing the same login access, and it lacks updated documentation.   You know the system is doing something, but you can't be fully sure of what.  It was designed for a specific task that may be only partially understood, and users are unable or unwilling to communicate and cooperate with anyone wanting to change their systems.

We all know this.

But there is a very different kind of uncertainty that we don't fully grasp.  The uncertainty of what is in emails, SharePoint or Word documents, what is entered in CRM systems or lives in SAP.

We know a great deal about these systems as they are industry standards, but because they are such widely used applications you can't understand what is inside of them, how people are using them and what value they really create.

These two types of uncertainty should be managed in two different ways in the Cloud.

Traditionally AWS manages the first, you have a mass of VMs you know you need on your estate.  So you migrate them to the Cloud, which in itself saves no money.  So how do you save money?

You look at the usage patterns and start modifying up time, CPU count, storage, bursting and commercial things like reserved or spot instances to save money.  What you do here is take the high-level view, looking at how the systems perform as servers in order to assign more agile systems to them.  You may also migrate things to better platforms, for example moving from a larger EC2 Windows server running IIS and PHP to a Beanstalk environment which stores data on S3 and load balances micro servers to meet just-in-time demand.
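The up time part of this is simple arithmetic, sketched below. The hourly rate is a made-up number used only to show the calculation, not a real AWS price, and 730 is used as the average number of hours in a month.

```python
HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cost(hourly_rate, hours_per_day=24, days_per_week=7):
    """Estimate the monthly bill for one on-demand server given its schedule."""
    fraction_on = (hours_per_day * days_per_week) / (24 * 7)
    return hourly_rate * HOURS_PER_MONTH * fraction_on

rate = 0.20  # hypothetical hourly rate, not a quoted price

always_on = monthly_cost(rate)
office_hours = monthly_cost(rate, hours_per_day=12, days_per_week=5)

print(f"always on:    {always_on:.2f}")
print(f"office hours: {office_hours:.2f}")
print(f"saving:       {1 - office_hours / always_on:.0%}")
```

A server that only needs to run 12 hours a day on weekdays is on for 60 of the 168 hours in a week, so under these assumptions it costs a little over a third of the always-on bill. That saving only appears once you know the usage pattern, which is why the lift and shift on its own changes nothing.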

But what about SharePoint, Exchange, CRM, does this work?

Here I would say the different kind of uncertainty means a different kind of road to cost savings.  You know perfectly well how these systems are architected and what they do, and you could reduce costs, but you have an opportunity to leapfrog all of that and move straight to the most optimal Cloud hosting by moving to SaaS in the form of Office 365.

In these systems you know the kind of tasks that people do, but have no idea what those tasks mean to them. As you would need to run two companies to know what everyone in the first is doing all the time with data, you are probably best just to start defining the roles that can use standard software.  Actually, as the skills of workers increase they tend to use tools more than forms; Excel, Word, PowerPoint and SharePoint are flexible platforms.

Rather than migrating these servers, my experience is you will save way more money by going all in with Office 365, along with a major effort to see how many existing systems written with, say, .NET can be provided by SharePoint sites.

So to sum up:

  • If you have a lot of little boxes running code you don't fully understand, for reasons that are unique to a community, you are probably better off migrating them en masse to AWS and working on reducing their consumption later on by turning servers off, allowing autoscaling, and migrating to better platforms.
  • If you have a massive system that many or all employees use for various different reasons, one that is fully Enterprise, you are better off looking at buying a SaaS replacement.
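The two rules above can be written down as a first-pass sorting function. This is only a sketch of the rule of thumb, and the `System` fields and example systems are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class System:
    name: str
    is_general_tool: bool   # e.g. email, document sharing, CRM
    enterprise_wide: bool   # used by many or all employees

def first_pass_destination(system):
    """Apply the two-group rule: general enterprise tools go to SaaS,
    everything else is lifted to IaaS and optimised later."""
    if system.is_general_tool and system.enterprise_wide:
        return "SaaS"
    return "IaaS"

# Hypothetical estate used only to exercise the rule.
estate = [
    System("Exchange mail", is_general_tool=True, enterprise_wide=True),
    System("Legacy pricing batch job", is_general_tool=False, enterprise_wide=False),
]

for s in estate:
    print(s.name, "->", first_pass_destination(s))
```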

The main point is that you can only really know so much about the details of your IT, and only with so much accuracy. You need to plan first for moving things en masse without disruption, and then for how you save money.