Berkeley DB Reference Guide:
Berkeley DB Concurrent Data Store Applications

PrevRefNext

Architecting Data Store and Concurrent Data Store applications

When building Data Store and Concurrent Data Store applications, there are two special issues to consider whenever any thread of control exits for any reason with a Berkeley DB database or database environment open.

First, unexpected application or system failure may result in lost data, corruption or inconsistencies in Berkeley DB databases. When a thread of control exits while holding Berkeley DB resources, all databases (modified since the database was last flushed to disk), should be either:

Applications where this is unacceptable should consider the Berkeley DB Transactional Data Store product, which offers standard transactional guarantees such as recoverability after failure.

Second, unexpected application or system failure requires that any persistent database environment (that is, any database environment not created using the DB_PRIVATE flag), be removed to recover the Berkeley DB resources and release any locks or mutexes that may have been held to avoid starvation as the remaining threads of control block behind the failed thread's locks or mutexes.

The Berkeley DB library cannot determine when to remove and re-create a database environment; the application must make that decision. Furthermore, database environment removal must be single-threaded; that is, one thread of control or process must remove and re-create the database environment before any other thread of control or process attempts to join the Berkeley DB environment.

There are two approaches to handling this problem:

The hard way:
Applications can track their own state carefully enough that they know when the database environment needs to be removed and re-created. Specifically, the rule to use is that any persistent database environment must be removed any time the threads of control previously using the Berkeley DB environment did not shut the environment down cleanly before exiting the environment for any reason (including application or system failure).
The easy way:
It is almost invariably simpler to remove and re-create the database environment each time a thread of control accessing a database environment fails for any reason. This requires the application detect application or system failure, of course, and remove and re-create the database environment on application, when appropriate.

There are two common ways to build Data Store and Concurrent Data Store applications. The most common way is as a single, usually multithreaded, process. This architecture is simplest because it requires no monitoring of other threads of control. When the application starts, it opens and potentially creates the environment, and then opens its databases. From then on, the application can create new threads of control as it chooses. All threads of control share the open Berkeley DB DB_ENV and DB handles. In this model, databases are rarely opened or closed when more than a single thread of control is running; that is, they are opened when only a single thread is running, and closed after all threads but one have exited. The last thread of control to exit closes the databases and the environment.

An alternative way to build Berkeley DB applications is as a set of cooperating processes, which may or may not be multithreaded. This architecture is more complicated.

First, this architecture requires that the order in which threads of control are created and subsequently access the Berkeley DB environment be controlled because database environment removal must be single-threaded. The first thread of control to access the environment must remove any previously existing environment and re-create the environment, and no other thread should attempt to access the environment until the removal is complete. (Note this ordering requirement does not apply to environment creation without removal. If multiple threads attempt to create a Berkeley DB environment, only one will perform the creation and the others will join the already existing environment.)

Second, this architecture requires that threads of control be monitored. If any thread of control that owns Berkeley DB resources exits without first cleanly discarding those resources, removing the database environment is usually necessary. Before removing the database environment, all threads using the Berkeley DB environment must relinquish all of their Berkeley DB resources (it does not matter if they do so gracefully or because they are forced to exit). Then, the database environment can be removed and and the threads of control continued or restarted.

We have found that the safest way to structure groups of cooperating processes is to first create a single process (often a shell script) that removes and re-creates the Berkeley DB environment, verifies, rebuilds or removes the databases, and then creates the processes or threads that will actually perform work. The initial thread has no further responsibilities other than to monitor the threads of control it has created, to ensure that none of them unexpectedly exits. If one exits, the initial process then forces all of the threads of control using the Berkeley DB environment to exit, removes the database environment, verifies, rebuilds or removes the databases, and restarts the working threads of control.

If it is not practical to have a single parent for the processes sharing a Berkeley DB environment, each process sharing the environment should log their connection to and exit from the environment in a way that allows a monitoring process to detect if a thread of control might have acquired Berkeley DB resources and never released them. In this model, an initial "watcher" process removes and re-creates the Berkeley DB environment, verifies, rebuilds or removes the databases, and then creates a sentinel file. Any other process wanting to use the Berkeley DB environment checks for the sentinel file; if the sentinel file exists, the other process registers its process ID with the watcher and joins the database environment. When the new process finishes with the environment, it unregisters its process ID with the watcher. The watcher periodically checks to ensure that no process has failed while using the environment. If a process does fail while using the environment, the watcher removes the sentinel file, kills all processes currently using the environment, removes and re-creates the database environment, verifies, rebuilds or removes the databases, and re-creates the sentinel file.

Obviously, it is important that the monitoring process in either case be as simple and well-tested as possible because there is no recourse if it fails.


PrevRefNext

Copyright (c) 1996-2004 Sleepycat Software, Inc. - All rights reserved.