8/25/2011

Tip for Sweep

As you probably know, a manual sweep (invoked with gfix -sweep) is an important part of Firebird database maintenance, especially for big databases. Unfortunately, few people understand the internals of the sweep process. In this post we will not explain the magic of sweep, because that requires a long and detailed explanation; instead, we will give you a simple method to check that a sweep completed successfully and fulfilled its task.
After running gfix -sweep, connect with any Firebird client (such as isql.exe) and commit at least one transaction - this is necessary to move the other transaction markers forward after the sweep.
Then run gstat -h and check its output: all transaction markers should be aligned (i.e., with a minimal gap):

        Oldest transaction      16702
        Oldest active           16703
        Oldest snapshot         16703
        Next transaction        16704
If the gap is more than several transactions, sweep did not remove all possible garbage (in such cases the gap can reach hundreds or thousands of transactions). Unfortunately, sweep does not write any errors or messages to firebird.log, so it is hard to determine the reason for the failure.
Also, a big gap is an obvious signal to check the database statistics (produced by gstat -r and visualized by IBAnalyst).
The most common reason why sweep does not clear all record versions is a long-running writeable transaction (this is also the most common reason why automatic sweep does not work well), but there are other, more unpleasant possibilities, such as database corruption.
If you are sure that there were no connected users during the sweep, or you saw that the sweep finished unusually quickly for a big database (for example, several seconds for a database of 5+ GB), treat it as an alert and run validation (gfix -v -full) as soon as possible.
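
The marker check described above is easy to automate. Below is a minimal Python sketch that parses the text printed by gstat -h and reports the gap between the markers. The function names and the max_gap threshold of 10 are our assumptions for illustration only (the post says just "several transactions"), so tune the threshold for your own system.

```python
import re

# Labels as they appear in the gstat -h sample in this post.
MARKERS = {
    "Oldest transaction": "oit",
    "Oldest active": "oat",
    "Oldest snapshot": "ost",
    "Next transaction": "next",
}

SAMPLE_HEADER = """
        Oldest transaction      16702
        Oldest active           16703
        Oldest snapshot         16703
        Next transaction        16704
"""


def parse_markers(gstat_header):
    """Extract the four transaction markers from gstat -h text output."""
    found = {}
    for line in gstat_header.splitlines():
        # A header line looks like "<label><spaces><number>".
        m = re.match(r"\s*(.+?)\s+(\d+)\s*$", line)
        if m and m.group(1) in MARKERS:
            found[MARKERS[m.group(1)]] = int(m.group(2))
    missing = set(MARKERS.values()) - set(found)
    if missing:
        raise ValueError("markers not found: %s" % sorted(missing))
    return found


def marker_gap(markers):
    """Distance between the oldest and the newest marker."""
    return markers["next"] - min(markers["oit"], markers["oat"], markers["ost"])


def sweep_looks_complete(markers, max_gap=10):
    """True if all markers are aligned within max_gap (hypothetical threshold)."""
    return marker_gap(markers) <= max_gap


if __name__ == "__main__":
    m = parse_markers(SAMPLE_HEADER)
    print(m, marker_gap(m), sweep_looks_complete(m))
```

In practice you would feed parse_markers the captured output of gstat -h for your database, taken after the sweep and after committing one transaction. A False result only says the markers are not aligned; it does not tell you why.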

The disadvantage of this approach (in terms of recognizing a problem in the Firebird system area after a failed sweep) is that sweep is usually scheduled to run once per day, and it requires the explicit attention of an administrator, because only indirect signs appear. To monitor database health around the clock we use our FBDataGuard tool - it checks the same metadata that sweep touches, and it sends an alert immediately if something is wrong.




8/23/2011

You DON'T need CPUAffinity, nBackup, shadow and multifile databases


Right now at www.firebirdnew.org you can see a survey about Firebird features used in production environments.
Some of the answers look like danger signals, especially for those who run big Firebird databases.



So, what's wrong with these answers?
1. CPUAffinity. CPUAffinity can be used to bind Firebird SuperServer to a particular CPU/core. Today even desktop workstations have plenty of cores, so there is no reason to use SuperServer and limit Firebird to a single CPU - use the SuperClassic or Classic architecture to run Firebird at full throttle.


2. NBackup. At IBSurgeon we do not recommend that our clients use NBackup without external monitoring or as the only backup method. NBackup makes a straight copy of the database at the page level - it's fast, but, unlike gbak, it does not check the pages' contents. If you use only NBackup and do not perform the necessary database maintenance (at least a regular sweep, combined with monitoring of transaction markers), and someday your database becomes corrupted (due to a RAM problem or an abnormal shutdown, for example), NBackup will continue to make (and overwrite) "backups". Another danger is a "frozen" delta file: when the database is not unlocked correctly, all changes are written to the delta file and cannot be merged back because of a problem with the delta file. To use NBackup's strengths and avoid its pitfalls, we set up a special backup scheme for our clients.


3. Shadow is completely useless in modern production environments. It protects against only one type of failure - an occasional critical crash of an HDD (assuming that the shadow is configured correctly, on 2 separate HDDs). Use RAID5 (or RAID10) instead - it will be much faster and more convenient to maintain.


4. Multifile database. There is no real reason to use a multi-file database at present. With multi-file it is impossible to use NBackup, the database files are tied to their locations, and it gives no advantage at all. Multi-file databases are often set up so that the database can be copied to DVD, but you need to shut down Firebird during this operation.

Summary
Of course, the title of this post is a provocation: sometimes an administrator needs to use CPUAffinity, NBackup and other features (not shadow!) to achieve some specific result, but it should be done correctly and with a full understanding of the steps involved. NBackup is the most useful of these tools, and it is very often underestimated, both in its strengths and in the problems it can cause.

More information:
"Firebird's Big Database" presentation at slideshare: