Tuesday, June 17, 2014

My SQL Server Transaction Log is huge - should I switch to the simple recovery model?

BACKGROUND:
In Microsoft SQL Server you have data files (the .mdf and optional .ndf files) and log files (the .ldf files).

When a transaction occurs in SQL Server, the operation happens in memory, is recorded in the t-log file, and is only occasionally written to the data files during what is called a "checkpoint" operation.

When you perform a full or differential backup, it forces a checkpoint, which means all the modified data is copied from RAM to the data files, and that operation is noted in the t-log file. The backup then copies all of the data from the data files, plus only the transactions that happened during the backup process. The server's own t-log file is NOT touched.

However, when you perform a t-log backup, the contents of the t-log are backed up and the log is "truncated" - meaning that instead of adding to the end of the file, SQL Server can overwrite the existing contents. The t-log file will not fill up or grow; it circles around to reuse the same disk space.

PROBLEM:
If you do not take t-log backups, the t-log will either fill up (preventing transactions and effectively shutting down the database) or auto-grow (filling up a hard drive, often eventually creating t-logs that are bigger than the databases they serve, since they still hold every transaction since the creation of the database!).
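To see whether you are heading toward this problem, SQL Server has a couple of built-in checks:

```sql
-- Show the size of each database's t-log and the percentage in use:
DBCC SQLPERF(LOGSPACE);

-- If the log cannot be truncated, this column explains what is holding it up:
SELECT name, log_reuse_wait_desc FROM sys.databases;
```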

1) PREFERRED SOLUTION:
Take t-log backups as a part of your regularly scheduled backup process. If you do this regularly, your log backups will be small and fast. This will give you options to recover to any point in time, to recover individual corrupted files or even 8 KB data pages from a data file restore, and then bring them forward to the point of the rest of the database using t-log restores.
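As a minimal sketch of a scheduled log backup - the database name and path below are placeholders, and a real job would typically generate a unique file name per run:

```sql
-- Back up the transaction log (run on a schedule, e.g. every 15 minutes).
-- [MyDatabase] and the disk path are placeholders for your environment.
BACKUP LOG [MyDatabase]
TO DISK = N'D:\Backups\MyDatabase_log.trn'
WITH CHECKSUM;
```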

2) SIMPLE SOLUTION:
Switch the database from the full recovery model to the simple recovery model. In the simple recovery model the database will truncate the log every time there is a checkpoint of data from RAM to the data file.
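Switching models is a one-line change (the database name is a placeholder):

```sql
-- Switch from the full to the simple recovery model:
ALTER DATABASE [MyDatabase] SET RECOVERY SIMPLE;

-- Confirm the change:
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'MyDatabase';
```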

SIMPLE SOLUTION PROBLEM:
Because the t-log is constantly having contents deleted from it, SQL Server will not allow any t-log backups to be performed, including emergency tail-log backups of an offline database. This means that if you did a full backup last night and three hours into today your data file is corrupted, you are only going to be able to recover to last night - you do not (by definition) have full database recovery.

3) ONE ALTERNATE SOLUTION
If you have no plan for t-log backups and don't want them cluttering your backup media, but you still want the emergency tail-log backup option available in the full recovery model, you could do the following (often with an automated job or database maintenance plan): stay in the full recovery model, monitor for when the t-log gets full, and when that occurs take a t-log backup followed by a full database backup; on success of the full backup, delete the just-created t-log backup.
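That sequence might look like this in a maintenance job - names and paths are placeholders, and the cleanup of the log backup file would be a follow-on job step:

```sql
-- 1) Capture the nearly-full log:
BACKUP LOG [MyDatabase]
TO DISK = N'D:\Backups\MyDatabase_emergency_log.trn'
WITH CHECKSUM;

-- 2) Take a fresh full backup so the log backup is no longer needed:
BACKUP DATABASE [MyDatabase]
TO DISK = N'D:\Backups\MyDatabase_full.bak'
WITH CHECKSUM;

-- 3) On success, delete MyDatabase_emergency_log.trn
--    (e.g. via the job's next step or a maintenance plan cleanup task).
```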

4) ANOTHER ALTERNATE SOLUTION:
You could also stay in the full recovery model, perform a FULL backup, then detach the database, delete the .ldf file, and reattach the database - which recreates the missing log file. I will sometimes use this method when cleaning up a 4 GB t-log mess before returning to method #1.
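A sketch of that detach/reattach sequence - the file path and database name are placeholders, and this is only safe immediately after a good FULL backup with no open connections:

```sql
-- Detach the database (all connections must be closed first):
EXEC sp_detach_db @dbname = N'MyDatabase';

-- ...delete the .ldf file from disk here (outside SQL Server)...

-- Reattach using only the data file; SQL Server rebuilds a new log:
CREATE DATABASE [MyDatabase]
ON (FILENAME = N'D:\Data\MyDatabase.mdf')
FOR ATTACH_REBUILD_LOG;
```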

Let me know if you have any questions. Hope this helps!




Tuesday, June 10, 2014

Compare Seek Times Between RAM, HDD, and SSD Hard Drives

How much faster is RAM than a Hard Drive anyway?

Let's look at some specs, do some math, and make an analogy to help us relate!
First, let's compare the retrieval time from different media types:

  • RAM: 3 nanoseconds
  • Mobile HDD: 12 milliseconds = 12,000,000 ns
  • Standard HDD: 9 milliseconds = 9,000,000 ns
  • Server-class HDD: 3 milliseconds = 3,000,000 ns
  • Server-class SSD: 0.1 milliseconds = 100,000 ns

When we compare these media, we see that RAM is at least 1 million (and as much as 4 million) times faster than an HDD and roughly 33,000 times faster than an SSD. But how do you even conceive of what those kinds of numbers mean?

Let's play a game where nanoseconds become regular seconds just to get an idea of comparative speeds.

Imagine you are at a restaurant, and you ask the server for some ketchup (or catsup if you prefer) for your meal. The server looks at the adjacent empty table, grabs the available bottle of ketchup, and hands it to you at RAM speed: total time = 3 seconds.

 
 



Now if your server was operating like a high-speed server-class HDD, it would take them... 3,000,000 seconds = 34.7 days = over a MONTH to get you the ketchup!

Heaven forbid your server is using a Standard HDD, because that would be 9,000,000 seconds = 104 days = 3.4 months!



And if your server is grabbing your "ketchup" from a Mobile HDD, it would be even worse: 12,000,000 seconds = 139 days = 4.6 months!

Even optimized with a server-class SSD, your server is still going to be slower than you would want to wait: 100,000 seconds = 1.15 days.
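The arithmetic above is easy to double-check with a short script (the figures are the ones from the list, expressed in nanoseconds):

```python
# Latency figures from the list above, in nanoseconds.
latencies_ns = {
    "RAM": 3,
    "Server-class SSD": 100_000,
    "Server-class HDD": 3_000_000,
    "Standard HDD": 9_000_000,
    "Mobile HDD": 12_000_000,
}

SECONDS_PER_DAY = 86_400

for name, ns in latencies_ns.items():
    ratio = ns / latencies_ns["RAM"]
    # In the analogy, nanoseconds become seconds; convert that wait to days.
    days = ns / SECONDS_PER_DAY
    print(f"{name}: {ratio:,.0f}x RAM; analogy wait = {days:.1f} days")
```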




Hopefully the lesson is not lost on you:
Cache is KING!

Invest in your RAM - and look for options in your software to optimize RAM and cache usage!

That is all.