Database Abstraction

From KallestadWiki


Overview

In its current iteration, KDF is designed to support MySQL database implementations. The framework is designed such that a drop-in replacement to support PostgreSQL or Oracle should be fairly simple to develop.

Support for multiple database connections is also available, but is not desirable in high-traffic scenarios.

The Future

In the future, I imagine the following architecture for database interaction:

Three processes: The first is a management process built around a SELECT-based wheel. A multi-threaded SELECT-based wheel could make efficient use of multiple processors or cores. This process takes in DB-level transactions and adds each one to one of two Berkeley DB queues (BDB queues are very efficient and support multiple concurrent connections for reading and writing).

The second process is a slow DB interaction manager. This is a multi-threaded process that pops DB Commands out of a Berkeley queue and pushes the transactions through to the database. This process manages all commands that are not inserts or index-based single entry queries.

The third process is a fast DB interaction manager. Much like the slow DB interaction manager, it is a multi-threaded process that pops DB commands out of a Berkeley queue and pushes the interactions through to the database. This process manages the commands not handled by the slow DB interaction manager, that is, inserts and index-based single-entry queries.
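As a rough illustration of the split described above, the management process could classify each incoming command and place it on one of two queues. This is only a sketch under assumptions: Python's in-memory queue.Queue stands in for the Berkeley DB queues, and the names is_fast, route, and INDEXED_COLUMNS, along with the regular-expression heuristic for spotting an index-based single-entry query, are hypothetical and not part of KDF.

```python
import queue
import re

# Stand-ins for the two Berkeley DB queues (assumption: in-memory queues).
fast_queue = queue.Queue()
slow_queue = queue.Queue()

# Hypothetical list of columns known to be indexed.
INDEXED_COLUMNS = {"id"}

# Heuristic: a SELECT whose WHERE clause is a single equality test.
SINGLE_LOOKUP = re.compile(
    r"^\s*select\b.+\bwhere\s+(\w+)\s*=\s*\S+\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def is_fast(sql):
    """Return True for inserts and single-entry lookups on an indexed column."""
    if re.match(r"^\s*insert\b", sql, re.IGNORECASE):
        return True
    match = SINGLE_LOOKUP.match(sql)
    return bool(match) and match.group(1).lower() in INDEXED_COLUMNS


def route(sql):
    """Queue the command for the fast or slow manager and report which one."""
    if is_fast(sql):
        fast_queue.put(sql)
        return "fast"
    slow_queue.put(sql)
    return "slow"
```

Anything the heuristic cannot confidently recognise falls through to the slow queue, so a misclassification costs throughput rather than correctness.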

Memcached or a similar process can handle managed query caching to alleviate database load.
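The caching idea can be sketched as a read-through cache in front of the database. This is an assumption-laden illustration rather than a KDF API: a plain dictionary stands in for Memcached, and the names QueryCache, fetch, invalidate, and the run_query callback are hypothetical.

```python
import hashlib


class QueryCache:
    """Read-through cache: only cache misses reach the database."""

    def __init__(self, run_query):
        # run_query(sql, params) is a hypothetical callback that hits the DB.
        self._run_query = run_query
        # Assumption: a dict stands in for Memcached or a similar store.
        self._store = {}

    @staticmethod
    def _key(sql, params):
        # Hash the statement and its parameters into a fixed-length key.
        raw = repr((sql, tuple(params))).encode("utf-8")
        return hashlib.sha1(raw).hexdigest()

    def fetch(self, sql, params=()):
        key = self._key(sql, params)
        if key not in self._store:
            self._store[key] = self._run_query(sql, params)
        return self._store[key]

    def invalidate(self):
        # Crude invalidation: drop everything, for example after a write.
        self._store.clear()
```

A real deployment would need finer-grained invalidation and expiry; clearing the whole store on every write is only the simplest policy that stays correct.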

The reason for this forked architecture is twofold. First, it reduces the total number of database connections to a reasonable, predictable level, so that multiple separate KDF implementations can exist on the same machine. Second, it lets database transactions approach their maximum throughput by taking advantage of cached execution plans, memory buffers for high-volume transactional scenarios, and intelligent management of CPU resources.

Of course there will be abstraction layer overhead, but much like the rest of KDF, that cost should be offset by the increased efficiency it makes possible. Most if not all of the overlying application structure would benefit from that efficiency wherever complex object storage is a requirement.

The question of transactional integrity comes to mind, but I do have a few ideas in that department as well, from binding complete DB transactions into a single thread, to caching rollback information locally and flagging errors and completions before dumping the structures completely. Race conditions would exist for partially completed and subsequently rolled back transactions, so I'm leaning towards pushing transaction elements into a single thread, but that implies a pretty large layer of inefficiency. I'm sure that parallel commitment flagging is a subject that will be pursued in the DB community eventually as multi-core processors become the mainstay, but that will require driver rewrites across the board if I'm not mistaken. Who knows, maybe this issue has been addressed for 7 years and I just haven't gotten into the nitty-gritty yet.
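The single-thread option could look roughly like the following sketch, in which each queue entry is a complete transaction and one worker thread runs all of its statements on a connection that only that thread uses. The assumptions are significant: SQLite (via Python's standard sqlite3 module) stands in for the real database, queue.Queue stands in for the Berkeley queue, and run_transaction and worker are hypothetical names.

```python
import queue
import sqlite3
import threading


def run_transaction(conn, statements):
    """Run every (sql, params) pair on one connection; commit all or none."""
    try:
        for sql, params in statements:
            conn.execute(sql, params)
        conn.commit()
        return True
    except sqlite3.Error:
        # Any failure undoes the statements already executed in this job.
        conn.rollback()
        return False


def worker(db_path, jobs, results):
    """Pop whole transactions off the queue until a None sentinel arrives."""
    conn = sqlite3.connect(db_path)  # owned by this thread only
    while True:
        job = jobs.get()
        if job is None:
            break
        results.put(run_transaction(conn, job))
    conn.close()
```

Because a transaction never spans threads, a partial rollback cannot race with another thread's work on the same transaction; the price is that the statements inside one transaction run serially.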
