Monday 17 January 2011

Get your binary files out of my database! Making the case for remote BLOB storage in SharePoint

First let me say that remote BLOB storage didn't start with SQL Server 2008 and SharePoint 2010. Third-party products have provided this feature to SQL Server for some time. So remote BLOB storage (RBS) is something you should be thinking about with almost any SharePoint solution.

In fact, looking back, there are a few times that RBS could have saved me from some serious problems I had concerning migration.

Rob D'Oria has written an excellent post laying out the case for RBS here. You should probably read Rob's article over my own. The only thing I would want to add is a very high-level view of why SharePoint kind of forces RBS.

Rob D'Oria noticed what I noticed: SharePoint has a unique and deeply disturbing habit of taking the documents you upload to document libraries and storing them in the database. When a document such as a Word file is stored in a database as a long string of binary data, it is called a BLOB (binary large object). BLOB is an ugly name because BLOBs are, simply put, ugly. You get no real benefit from putting the data in the database, and you end up paying a fortune for storage.

Every ECM I have worked with (Open Text, EMC, and FileNet) takes the documents and stores them as plain files on a file store. Open Text has some wonderful products for moving to cheaper and cheaper storage, right down to something called WORM (Write Once Read Many).

The important point is that BLOB storage is neither normal for ECM nor cheap. SharePoint has implemented an abnormal data storage model that is both expensive and strange. And it gets worse.

Most ECM systems use the database simply to track metadata and some other overhead. So with something like Open Text you can store all your files in a single massive location holding TBs of data while the database stays fairly small; all the major storage is on the local drive. If there is a database failure, you can restore the small database very quickly. The database does not grow very much as you add new files. And, perhaps most importantly, the size of the files does not directly drive up the size of the database.
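To make that split concrete, here is a minimal sketch of the pattern in Python: the bytes go to a folder on disk and the database keeps only a small metadata row. This is not SharePoint's RBS provider or any vendor's API. The table layout, the function names, and the choice to name files by their SHA-256 hash are all assumptions made up for illustration.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_catalog(db_path):
    """Create (or open) the small metadata database (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL,"
        " sha256 TEXT NOT NULL,"
        " store_path TEXT NOT NULL)"
    )
    return conn


def add_document(conn, store_dir, name, content):
    """Write the bytes to the file store; record only metadata in the database."""
    digest = hashlib.sha256(content).hexdigest()
    # Naming the file after its hash is an assumption, not how any product does it.
    target = Path(store_dir) / digest
    target.write_bytes(content)
    cur = conn.execute(
        "INSERT INTO documents (name, size_bytes, sha256, store_path)"
        " VALUES (?, ?, ?, ?)",
        (name, len(content), digest, str(target)),
    )
    conn.commit()
    return cur.lastrowid


def read_document(conn, doc_id):
    """Look up the path in the database, then read the bytes from the file store."""
    row = conn.execute(
        "SELECT store_path FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    return Path(row[0]).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "filestore"
        store.mkdir()
        db_file = Path(tmp) / "catalog.db"
        conn = open_catalog(db_file)
        add_document(conn, store, "report.doc", b"x" * 5_000_000)
        conn.close()
        # Compare how much landed in the database versus the file store.
        print("database bytes:", db_file.stat().st_size)
        print("file store bytes:", sum(p.stat().st_size for p in store.iterdir()))
```

Running it prints the size of the database file next to the size of the file store, so you can see that a large upload grows the file store and adds only one small row to the database.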

SharePoint is fundamentally flawed in placing all unstructured data in the SQL database, meaning you get very large databases simply from users loading up large documents. Since databases cannot index BLOB storage, this always felt like a radical fault in SharePoint, and it was one reason to front it with something like Open Text.

But BLOB storage is a SQL issue, and you can use a third-party product to support RBS with any supported version of SQL Server. So if you are using SharePoint and you find you have massive SQL Server databases that have grown over the years with no easy way to reduce them, or you have a requirement to build a site with a massive SQL Server database, I would point you to Rob D'Oria's article and say GO REMOTE BLOB STORAGE.
