Berkeley DB Reference Guide:
Berkeley DB Replication

Building the communications infrastructure

The replication support in an application is typically written with one or more threads of control looping on one or more communication channels, receiving and sending messages. These threads accept messages from remote environments for the local database environment, and accept messages from the local environment for remote environments. Messages from remote environments are passed to the local database environment using the DB_ENV->rep_process_message method. Messages from the local environment are passed to the application for transmission using the callback function specified to the DB_ENV->set_rep_transport method.

Processes establish communication channels by calling the DB_ENV->set_rep_transport method, regardless of whether they are running in client or master environments. This method specifies the send function, a callback function used by Berkeley DB for sending messages to other database environments in the replication group. The send function takes an environment ID and two opaque data objects. It is the responsibility of the send function to transmit the information in the two data objects to the database environment corresponding to the ID, with the receiving application then calling the DB_ENV->rep_process_message method to process the message.
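One way a send function might package the two data objects for a byte-stream transport is a length-prefixed frame. The following is a minimal sketch under stated assumptions, not part of the Berkeley DB API: wire_buf is a hypothetical stand-in for the data and size fields of a DBT, and frame_pack, frame_unpack and the frame layout itself are illustrative choices.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the data/size fields of a DBT. */
typedef struct {
    void    *data;
    uint32_t size;
} wire_buf;

/* Store and load a 32-bit length in network byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Pack both objects into one malloc'd frame laid out as
 * [control length][control bytes][rec length][rec bytes].
 * Returns NULL on allocation failure; the caller frees the frame.
 */
unsigned char *frame_pack(const wire_buf *control, const wire_buf *rec,
                          size_t *frame_len)
{
    size_t len;
    unsigned char *frame;

    assert(control != NULL && rec != NULL && frame_len != NULL);
    len = 8 + (size_t)control->size + (size_t)rec->size;
    if ((frame = malloc(len)) == NULL)
        return NULL;
    put_u32(frame, control->size);
    if (control->size != 0)
        memcpy(frame + 4, control->data, control->size);
    put_u32(frame + 4 + control->size, rec->size);
    if (rec->size != 0)
        memcpy(frame + 8 + control->size, rec->data, rec->size);
    *frame_len = len;
    return frame;
}

/*
 * Point control and rec at the bytes inside a received frame (no copy).
 * Returns 0 on success, -1 if the frame is truncated or inconsistent.
 */
int frame_unpack(unsigned char *frame, size_t frame_len,
                 wire_buf *control, wire_buf *rec)
{
    uint32_t clen, rlen;

    assert(frame != NULL && control != NULL && rec != NULL);
    if (frame_len < 8)
        return -1;
    clen = get_u32(frame);
    if ((size_t)clen > frame_len - 8)
        return -1;
    rlen = get_u32(frame + 4 + clen);
    if ((size_t)rlen != frame_len - 8 - clen)
        return -1;
    control->data = frame + 4;
    control->size = clen;
    rec->data = frame + 8 + clen;
    rec->size = rlen;
    return 0;
}
```

On the receiving site, the two unpacked buffers would be the values handed to DB_ENV->rep_process_message. Rejecting malformed frames before that call keeps a truncated read from being delivered as a valid message.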

The details of the transport mechanism are left entirely to the application; the only requirement is that the data buffer and size of each of the control and rec DBTs passed to the send function on the sending site be faithfully copied and delivered to the receiving site by means of a call to DB_ENV->rep_process_message with corresponding arguments. Messages that are broadcast (whether over broadcast media or because the envid parameter of the send function specified to DB_ENV->set_rep_transport was DB_EID_BROADCAST) should not be processed by the message sender. In all cases, the application's transport media or software must ensure that DB_ENV->rep_process_message is never called with a message intended for a different database environment, or with a broadcast message sent from the same environment on which DB_ENV->rep_process_message will be called. The DB_ENV->rep_process_message method is free-threaded; it is safe to deliver any number of messages simultaneously, and from any arbitrary thread or process in the Berkeley DB environment.
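The delivery rule above can be expressed as a small filter that the receiving code applies before handing a message to DB_ENV->rep_process_message. This is only a sketch: msg_header is a hypothetical envelope that the application's own transport would add, it assumes the sites agree on a group-wide numbering of environments, and EID_BROADCAST is a stand-in for DB_EID_BROADCAST whose value here is illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for DB_EID_BROADCAST; the real constant comes from db.h. */
#define EID_BROADCAST (-1)

/* Hypothetical transport envelope carried alongside control and rec. */
typedef struct {
    int sender_id;   /* environment that sent the message */
    int target_id;   /* destination environment, or EID_BROADCAST */
} msg_header;

/*
 * Return 1 if the message may be passed to the local environment,
 * 0 if it must be dropped.
 */
int should_deliver(int local_id, const msg_header *h)
{
    assert(h != NULL);
    /* Never process our own messages, such as an echoed broadcast. */
    if (h->sender_id == local_id)
        return 0;
    /* Broadcasts from other environments are for everyone. */
    if (h->target_id == EID_BROADCAST)
        return 1;
    /* Directed messages are accepted only by their target. */
    return h->target_id == local_id;
}
```

A transport built on point-to-point connections may never see misdirected messages, in which case only the first check matters. A transport built on multicast or a shared bus would need all three.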

There are a number of informational returns from the DB_ENV->rep_process_message method:

DB_REP_DUPMASTER
When DB_ENV->rep_process_message returns DB_REP_DUPMASTER, it means that another database environment in the replication group also believes itself to be the master. The application should complete all active transactions, close all open database handles, reconfigure itself as a client using the DB_ENV->rep_start method, and then call for an election by calling the DB_ENV->rep_elect method.
DB_REP_HOLDELECTION
When DB_ENV->rep_process_message returns DB_REP_HOLDELECTION, it means that another database environment in the replication group has called for an election. The application should call the DB_ENV->rep_elect method.
DB_REP_ISPERM
When DB_ENV->rep_process_message returns DB_REP_ISPERM, it means a permanent record, perhaps a message previously returned as DB_REP_NOTPERM, was successfully written to disk. This record may have filled a gap in the log records that allowed additional records to be written. The ret_lsnp argument contains the maximum LSN of the permanent records written.
DB_REP_NEWMASTER
When DB_ENV->rep_process_message returns DB_REP_NEWMASTER, it means that a new master has been elected. The call will also return the local environment's ID for that master. If the ID of the master has changed, the application may need to reconfigure itself (for example, to redirect update queries to the new master rather than the old one). If the new master is the local environment, then the application must call the DB_ENV->rep_start method, and reconfigure the supporting Berkeley DB library as a replication master.
DB_REP_NEWSITE
When DB_ENV->rep_process_message returns DB_REP_NEWSITE, it means that a message from a previously unknown member of the replication group has been received. The application should reconfigure itself as necessary so it is able to send messages to this site.
DB_REP_NOTPERM
When DB_ENV->rep_process_message returns DB_REP_NOTPERM, it means a message marked as DB_REP_PERMANENT was processed successfully but was not written to disk. This is normally an indication that one or more messages, which should have arrived before this message, have not yet arrived. This operation will be written to disk when the missing messages arrive. The ret_lsnp argument will contain the LSN of this record. The application should take whatever action is deemed necessary to retain its recoverability characteristics.
DB_REP_STARTUPDONE
When DB_ENV->rep_process_message returns DB_REP_STARTUPDONE, it means that the client has completed its startup synchronization activities and is now processing live log messages from the master. Live log messages are messages that the master is sending due to operations, as opposed to resending log messages due to a request for log records from the client.
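
The return values above lend themselves to a single dispatch routine in the receiving thread. The sketch below models that routine without linking against Berkeley DB: the REP_* enumerators are hypothetical stand-ins for the DB_REP_* codes, lsn_t stands in for DB_LSN, and site_state, handle_rep_return and can_ack are illustrative names. It also assumes that a record may be acknowledged once the maximum permanent LSN reported with DB_REP_ISPERM is at or beyond that record's LSN, which is one plausible reading of the descriptions above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the DB_REP_* return codes. */
enum rep_ret {
    REP_OK = 0, REP_DUPMASTER, REP_HOLDELECTION, REP_ISPERM,
    REP_NEWMASTER, REP_NEWSITE, REP_NOTPERM, REP_STARTUPDONE
};

/* Stand-in for DB_LSN: a log file number and an offset within it. */
typedef struct { uint32_t file; uint32_t offset; } lsn_t;

/* Application-level bookkeeping for the local site (illustrative). */
typedef struct {
    int   local_id;
    int   master_id;
    int   is_master;
    int   need_election;
    int   startup_done;
    lsn_t max_perm;      /* highest LSN known to be on disk locally */
} site_state;

/* Order LSNs by file number, then by offset. */
int lsn_cmp(const lsn_t *a, const lsn_t *b)
{
    if (a->file != b->file)
        return a->file < b->file ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

/* Update local state for one return value; envid and ret_lsn are the
 * values reported by the call, where the return code supplies them. */
void handle_rep_return(site_state *s, enum rep_ret ret, int envid,
                       const lsn_t *ret_lsn)
{
    assert(s != NULL);
    switch (ret) {
    case REP_DUPMASTER:
        /* Real code: finish transactions, close handles, restart as a
         * client with rep_start, then call rep_elect. */
        s->is_master = 0;
        s->need_election = 1;
        break;
    case REP_HOLDELECTION:
        s->need_election = 1;          /* real code: call rep_elect */
        break;
    case REP_ISPERM:
        if (lsn_cmp(ret_lsn, &s->max_perm) > 0)
            s->max_perm = *ret_lsn;
        break;
    case REP_NEWMASTER:
        s->master_id = envid;
        s->is_master = (envid == s->local_id); /* then rep_start as master */
        s->need_election = 0;
        break;
    case REP_NEWSITE:
        /* Real code: set up a channel to the newly seen site. */
        break;
    case REP_NOTPERM:
        /* Record at *ret_lsn is not on disk yet; withhold any ack. */
        break;
    case REP_STARTUPDONE:
        s->startup_done = 1;
        break;
    case REP_OK:
        break;
    }
}

/* A record can be acknowledged once it is covered by max_perm. */
int can_ack(const site_state *s, const lsn_t *lsn)
{
    return lsn_cmp(lsn, &s->max_perm) <= 0;
}
```

Keeping the bookkeeping separate from the Berkeley DB calls makes the policy testable on its own; a real application would make the corresponding DB_ENV->rep_start and DB_ENV->rep_elect calls where the comments indicate.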

Copyright (c) 1996-2004 Sleepycat Software, Inc. - All rights reserved.