DB2(in Russian it is pronounced “dibi two”, tracing paper from English “dibi tu” is also common) - family software products in Information Management at IBM.

Most often, when referring to DB2, they mean the relational database management system DB2 Universal Database (DB2 UDB), developed and released by IBM.

The spelling "DB/2" is sometimes seen, but this spelling is incorrect: in the IBM notation, the number in the denominator of a fraction means the platform and "/2" means the product for the OS/2 operating system (or the PS/2 series of computers). For example, the version of DB2 for OS/2 was designated "DB2/2".

Implementations

The DB2 DBMS is currently available on the following platforms:

  • DB2 for Linux, UNIX and Windows v9 for AIX, HP-UX, Linux, Solaris, Windows platforms and beta for Mac OS X platform
  • DB2 for z/OS v9 for z/OS and OS/390 platforms
  • DB2 Server for VSE & VM v7 for z/VM and z/VSE platforms
  • DB2 for i for the IBM i platform (integrated into the system at the hardware and software level)

In the past, versions of the DB2 database server for OS/2, UnixWare, PTX have been released.

DB2 DBMS clients, in addition to the listed platforms, are released or have been released in various versions also for SINIX, IRIX, classic Mac OS and for MS-DOS, as well as in mobile version DB2 Everyplace for Windows CE, Palm OS, Symbian OS, Neutrino and virtual machine java.

Currently, in addition to the commercial products of the family, IBM also distributes a free distribution DB2 Express-C for Linux (x86, x86-64, POWER), Windows (x86, x86-64), Solaris (x86-64), Mac OS X (x86-64 beta) platforms. Free version has restrictions on the use of no more than one dual-core processor and 2 GB for the operation of the DBMS random access memory(the total number of processors and memory in the system can be any, but resources beyond the specified limits will not be used by the DBMS).

Story

DB2 has a long history and is considered by some to be the first DBMS to use SQL.

From 1975 to 1982, the DB2 prototype was developed at IBM under the name System Relational, or System R. The SQL language was first implemented in IBM System R, but this system was of a research nature, and the commercial product, including SQL, was first released by Oracle in 1979.

DB2 got its name in 1982 with the first commercial release for SQL/DS, and then for MVS called DB2. For a long time, along with "DB2", the "Database 2" variant, also a trademark of IBM, was used. Apparently, this was meant to be the second flagship IBM DBMS after the old hierarchical IMS DBMS.

The development of DB2 goes back to the early 1970s, when Dr. E. F. Codd, who worked for IBM, developed the theory of relational databases and published a data manipulation model in June 1970. To implement this model, he developed a relational database language and called it Alpha. IBM chose to outsource further development to a group of programmers outside of Dr. Codd's control. Violating some principles of the relational model, they implemented it as a "structured English language requests”, abbreviated as SEQUEL. Since SEQUEL was already a registered trademark, the name was shortened to SQL - "Structured Query Language" and has remained so to this day.

Thus, historically, DB2 evolved from DB2 for MVS (of which DB2 for z/OS is a descendant) and its sister SQL/DS for VM (of which DB2 Server for VSE & VM is a descendant). Subsequently, another development team at IBM implemented the OS/2 EE Database Manager server, which later evolved into DB2 v2 for OS/2, AIX and then Windows, and then into DB2 UDB (its descendant is DB2 for Linux, UNIX and Windows). Another team completed the integration of the DB2 architecture with the embedded AS/400 database (a descendant - DB2 for i). IBM is gradually moving towards the integration of all these branches.

Peculiarities

The distinguishing features of DB2 include the dialect SQL language, which determines, with rare exceptions, the purely declarative meaning of language constructs, and a powerful multi-phase optimizer that builds an effective query execution plan based on these declarative constructs. Unlike other SQL dialects, the DB2 SQL dialect has practically no hints to the optimizer, is poorly developed (and for a long time was generally absent) a stored procedure language, and thus everything is aimed at maintaining a declarative style of writing queries. At the same time, the DB2 SQL language is computationally complete, that is, it potentially allows you to define any computable correspondences between the source data and the result in a declarative form. This is achieved, among other things, through the use of table expressions, recursion and other advanced data manipulation mechanisms.

Due to IBM's focus on relational development and the firm's position in the computer industry, the DB2 SQL dialect has a significant impact on the ANSI/ISO SQL standards.

Stored procedures are not widely used in DB2, with conventional programming languages ​​traditionally used to write stored procedures. high level(C, Java, PL/I, Cobol, etc.), this allows the programmer to easily format the same code either as part of an application or as a stored procedure, depending on whether it is more appropriate to execute it on the client or on the server . DB2 also currently implements the SQL procedural extension for stored procedures, in accordance with the ANSI SQL/PSM standard.

The DB2 optimizer makes extensive use of statistics on the distribution of data in tables (if the data collection process was performed by the database administrator), so the same SQL query can be translated into completely different execution plans, depending on the statistical characteristics of the data it processes.

Since historically DB2 has evolved from multi-user systems on mainframes, a lot of attention in the DB2 architecture is given to issues of security and the distribution of roles of specialists maintaining DB2. In particular, unlike many other DBMSs, DB2 has separate roles for the DBMS administrator (responsible for configuring software components DB2 and their optimal execution in computer system) and a database administrator (responsible for managing data in a particular database).

The use, if necessary, of static SQL in programs and the concept of packages, unlike most other DBMS, allows the implementation of such a security model when the rights to perform certain operations can be granted to application programs in the absence of such rights for users working with these programs. In this case, this makes it possible to guarantee the impossibility of the user working with the database bypassing the application program, if the user has only rights to run the program, but not to independently manipulate data.

As part of the concept of increasing the level of integration of security tools in a computer system, DB2 does not have its own means of authenticating users, integrating with operating system tools or specialized security servers. Within DB2, only users authenticated by the system are authorized.

DB2 is the only general-purpose relational DBMS that has implementations at the hardware/software level (IBM i system; DB2 support is also implemented on IBM System z mainframe hardware).

Modern versions of DB2 provide enhanced support for using XML data, including operations on individual elements of XML documents.

Error processing

A useful feature of DB2 SQL Server is its ability to handle errors. The SQLCA structure is used for this purpose. SQL Communications Area- SQL link area) that returns error information to the application program after each execution of the SQL statement.

SQLCODE Structure Fields and Their Values

The main, but not always useful error diagnostics is contained in the field SQLCODE(data type - integer) inside the SQLCA block. It can take the following values:

  • 0 means success.
  • A positive number means success with one or more warnings. For example, +100 means no columns were found.
  • A negative number means failure with an error. For example, −911 means a detected expired lock wait interval (or deadlock) triggering a sequential rollback.

SQLERRM(data type - a string of 71 characters). Contains text string with a description of the error if the SQLCODE field is less than zero.

SQLERRD(data type - array, 6 integers). Describes the result of executing the last SQL statement:

  • 1 element - internal information;
  • 2nd element - contains the value of the SERIAL type field generated by the server for the INSERT statement, or an additional error code;
  • 3rd element - equal to the number of processed records;
  • 4th element - the approximate cost of executing this operator;
  • 5th element - offset of the error in the text record of the SQL statement;
  • 6th element - internal information.

Notes

Links

  • Program page on the IBM website
  • DB2 on developerWorks - DB2 articles and training
  • PlanetDB2 - DB2 Blogs

Literature

  • Date K. DB2 Relational DBMS Guide. - M.: Finance and statistics, 1988. - 320 p. - ISBN 5-279-00063-9
  • Zikopoulos P.K., Baklarz J., deRus D., Melnik R.B. DB2 Version 8: The Official Guide = DB2 Version 8: The Official Guide. - M.: KUDITS-OBRAZ, 2004. - 400 p. - ISBN 5-9579-0031-1
  • Smirnov S. N. Working with IBM DB2: Tutorial. - M.: Helios, 2001. - 304 p. - ISBN 5-85438-007-2 (recommended by UMO universities in the region information security as a teaching aid in the specialties "Integrated information security of automated systems" and "Computer security")
  • Susan Visser, Bill Wong. Teach Yourself DB2 Universal Database in 21 Days = Sams Teach Yourself DB2 Universal Database in 21 Days. - 2nd ed. - M.: Williams, 2004. - 528 p. - ISBN 0-672-32582-9
  • Hook J., Harbus R., Snow D. The Universal Guide to DB2 for Windows NT®. - New Jersey: Prentice Hall PTR, 1999. - P. 504. - ISBN 0-13-099723-4

Wikimedia Foundation. 2010 .

See what "IBM DB2" is in other dictionaries:

    IBM DB2- Developer(s) IBM Initial release 1983 (1983) ... Wikipedia

    IBM DB2- DB2 ist ein kommerzielles relationales Datenbank Management System (RDBMS) der Firma IBM, dessen Ursprünge auf das System R und die Grundlagen von E. F. Codd vom IBM Research aus dem Jahr 1970 zurückgeht. Inhaltsverzeichnis 1 Eigenschaften 1.1… … Deutsch Wikipedia

    IBM DB2- Développeur IBM Dernière version ... Wikipedia en Français

    IBM DB2 Commonstore- DB2 CommonStore Archiving software produced by IBM for managing e mail messages or SAP ERP data. Part of the IBM Information Management portfolio which builds upon the DB2 database platform. DB2 CommonStore is one of several products which are… … Wikipedia

At work, I had to deal with the IBM DB2 DBMS for some time. Because Since the system is commercial, there is not much information in Russian on the Internet, so I decided to describe some of the features of this DBMS.

Point of entry

Let's start with the entry point in the DBMS. In SQL SERVER end point is an instance, in which, of course, there can be separate databases, but the configuration and security model are the same for the entire instance. In DB2, the entry point looks like this - an instance (which corresponds to a specific port) - a database. At the same time, there is a configuration for the entire instance, and for a separate database.

You can view the instance configuration either using the db2 command:

Database Manager Configuration

Node type = Enterprise Server Edition with local and remote clients

Database manager configuration release level = 0x0b00

CPU speed (millisec/instruction) (CPUSPEED) = 2.912790e-07
Communications bandwidth (MB/sec) (COMM_BANDWIDTH) = 1.000000e+02

Max number of concurrently active databases (NUMDB) = 8
Federated Database System Support (FEDERATED) = YES
Transaction processor monitor name (TP_MON_NAME) =

Default charge-back account (DFT_ACCOUNT_STR) =

Java Development Kit installation path (JDK_PATH) = /home/db2inst1/sqllib/java/jdk32

Diagnostic error capture level (DIAGLEVEL) = 3
Notify Level(NOTIFYLEVEL) = 3
Diagnostic data directory path (DIAGPATH) = /home/db2inst1/sqllib/db2dump

Default database monitor switches
Buffer pool (DFT_MON_BUFPOOL) = OFF

Where the parameters will be specified, their meaning and decoding. An abbreviated version is also possible:

get dbm cfg

Or with a query:

Select name, value from sysibmadm.dbmcfg

From important parameters you can note:

  • authentication type (AUTHENTICATION)
  • default path for creating new databases (DFTDBPATH)
  • network server discovery (DISCOVER)
You can view the settings for a specific database like this:

connect to sample(sample - database name)

get database manager configuration

Or with approximately the same request as before:

select name, value from sysibmadm.dbcfg

Authentication

The big difference between DB2 and other DBMSs is the authentication model. There are no internal users like in SQL Server or MySQL. All authentication is performed by means external to the DBMS (dynamically loaded plugins) - by means of the operating system or external plugins (Kerberos, GSS API). The authentication type is set in the AUTHENTICATION parameter of the database manager configuration. By default, the SERVER value is set - the username and password are transmitted in clear text, and this pair is checked for correctness by means of the operating system. If the username and password are correct, then the CONNECT privilege is checked for the user or groups to which he is a member (including the special PUBLIC group, which includes all authorized users). These privileges can be viewed in the SYSCAT.DBAUTH table:

select GRANTEE from SYSCAT.DBAUTH where CONNECTAUTH = "Y"

A big configuration mistake is to include the CLIENT authentication type. In this case, DB2 trusts the connecting client to perform authentication, and if PUBLIC has the CONNECT privilege, then any user will be able to connect to the database and gain access to all the data that PUBLIC has. The username is taken from the operating system. That is, if we connect through Data Studio as an Administrator user, then all the privileges that this user. And in this case, there is no difference from which computer the access was made. This type authentication is recommended to be enabled only when there is a secure channel between the server and the client, and other clients will not be able to connect to the DBMS.

Authorization

Instance level privileges are written in the database manager configuration. These are the following privileges:

  • SYSADM
  • SYSCTRL
  • SYSMAINT
  • SYSMON
These privileges are set by specifying the group where the user will enter. In dbmcfg, these are the SYSADM_GROUP , SYSCTRL_GROUP , SYSMAINT_GROUP , and SYSMON_GROUP options, respectively.

Next, there are database-specific privileges. These are privileges such as database access (CONNECTAUTH), table creation (CREATETABAUTH), routine creation (EXTERNALROUTINEAUTH), and so on. These privileges can be viewed in the SYSCAT.DBAUTH view

And finally, access privileges to specific data - tables, subroutines, and so on. Everything here is quite trivial, but also with some peculiarities.

Table access privileges can be viewed in the SYSCAT.TABAUTH view. The type of the granted privilege is stored in separate columns, depending on the privilege itself (SELECTAUTH, DELETEAUTH, etc.). When granting a privilege using the GRANT command for the REFERENCES and UPDATE privileges, you can also specify the names of the columns to which the given privileges will be extended. In this case, information about this can be viewed in the SYSCAT.COLAUTH view

Privileges for routines (functions, procedures, and methods) can be viewed in SYSCAT.ROUTINEAUTH . Not everything is trivial here, depending on the SPECIFICNAME and TYPENAME fields, privileges can be granted to all subroutines of a given scheme.

If readers like the article, then I'm ready to talk about protecting data in DB2 using Label-Based Access Control

Overview of basic concepts and general description IBM DB2 database architectures for Linux, Unix and Windows platforms

Content series:

This content is part # of a series # of articles: Overview of DB2 LUW

http://www.?q=%D0%9E%D0%B1%D0%B7%D0%BE%D1%80+DB2+LUW&co=ru&lo=ru&ibm-submit.x=11&ibm-submit.y=13&sn= mh&lang=ru&cc=RU&en=utf&hpp=

Stay tuned for new articles in this series.

Raw Materials and contact information

Special thanks to Mark Barinshtein for taking the time to proofread the material of the articles, attention to detail and valuable comments.

The main part of the material of the articles is a free interpretation official documentation db2. The information presented has been restructured and reformulated to be concise and at the same time as clear as possible. References to the sources used in all cases are provided in the text of the articles and in the "Resources" section.

As part of a series of articles, it is planned to review the following issues:

  1. (the article you are currently reading)
  2. (installation, configuration, diagnostics, backup and recovery)
  3. advanced administration procedures (transfer of information, performance optimization, management of execution priorities);
  4. tools for building analytical data warehouses;
  5. in-memory analytics technologies - DB2 BLU;
  6. massively parallel analytical data processing with DB2 DPF (Database Partitioning Feature);
  7. distributed databases (failover configurations, data replication and federated data access);
  8. DB2 pureScale clustering capabilities for fault tolerance and scalability.

Articles in the series will be published as the relevant materials are prepared.

DB2 product family

The name "DB2" is used by IBM for a whole family of products that differ from each other both in the composition of the hardware and software platforms on which they are used, and in functionality, architecture and technological features. These differences are due to the tight integration of most DB2 family products with operating systems in which they function, as well as the specifics of these operating systems.

The DB2 product family currently includes:

  • DB2 for Linux, Unix and Windows, or DB2 LUW - a DBMS for systems running Linux (RedHat, SuSE, Ubuntu), UNIX (AIX, HP-UX, Solaris) and Microsoft Windows, which is the subject of this article and other articles in this series;
  • DB2 for z/OS- DBMS for the z/OS operating system used on IBM System z mainframes;
  • DB2 Server for VSE & VM- DBMS for operating z/VM and z/VSE used on IBM System z mainframes;
  • DB2 for i- DBMS for the System i operating system used on the IBM Power platform.

Each of the listed DBMS is architecturally adapted for the most efficient functioning in the corresponding operating systems, and includes its own specific set of tools and administration tools.

The terminology used in the documentation for the various DB2 family DBMSs is not uniform, and the same terms may have different meanings for different variants of DB2: for example, the terms "database" and "tablespace" have different meanings for DB2 LUW and DB2 for z/OS, due to architectural differences between these types of DBMS.

Thus, when working with information resources dedicated to DB2, it is necessary to clearly distinguish which of the products of the family is being discussed in order to avoid confusion and possible errors.

Deprecated DB2 LUW Functions

In one form or another, the DB2 LUW product has been on the market since 1989 (the year the OS/2 1.10 Extended Edition operating system was released, which included the Database Manager component - that is, the relational DBMS that was the basis for DB2 LUW).

During the long development of the product, some originally developed functions were rethought and replaced by another implementation, or completely excluded from the product due to their lack of need. Therefore, when working with materials prepared for older versions of DB2 (for example, version 9.7), be aware that some of the features described in these materials may be replaced in newer versions of DB2 (for example, 10.5 and 11.1). Detailed information about deprecated and replaced functions is given in .

The most notable changes for administrators and developers include:

  • replacement of obsolete graphical management tools "Control Center", "Task Center" and a number of others with the functions of the free IBM Data Studio package, as well as with the functions of the tools included in the free edition of the IBM Data Server Manager product;
  • deprecation of the DB2 Administration Server (DAS), which was required for legacy administration tools;
  • replacing the DB2 Governor and Query Patroller workload management tools with DB2 Workload Manager (DB2 WLM) functionality.

Purpose of preparing this article series

The main purpose of writing a series of review articles on IBM DB2 is to fill the lack of materials on this subject in Russian. Indeed, despite the availability of translations of a considerable part of the documentation into Russian and the available books on DB2, there is still a lack of available overview information that would allow interested readers to get an idea of ​​​​the DB2 features, built-in functionality and administration specifics.

However, it is not the intention of the author to prepare an overview of all products in the DB2 family (see the sidebar "The DB2 Product Family"), instead, it is planned to focus on the DB2 variant for Linux, Unix, and Windows operating systems - i.e. on a DB2 LUW product.

Readers interested in practical guide to get started with DB2, I recommend that you refer to the freely distributed book "", translated into Russian. This book provides many examples of common DB2 software operations, making it easy to get started with both the DB2 9.7 version described in the book and newer DB2 versions (10.5 and 11.1). When working with current versions software DB2, be aware that some functionality in version 9.7 has been deprecated and is not supported in new versions of the product (see the "Deprecated DB2 LUW Features" sidebar).

DB2 LUW functionality

IBM DB2 uses the currently accepted client-server architecture relational DBMS, with information storage on the server and connection of client applications to databases locally or via a network.

To provide concurrent access to data from parallel applications, DB2 uses a transactional locking and logging mechanism to provide standard ACID guarantees (Atomicity, Consistency, Isolation, Durability). This mechanism has come a long way of evolution to ensure maximum performance, reliability and minimize application execution delays.

DB2 provides support for all common industry standards for application data access, including a standard language SQL queries, ODBC and JDBC interfaces, work with typical text table formats, etc. In addition, DB2 includes advanced capabilities for storing and working with semi-structured data in XML formats, JSON/BSON. For developing stored procedures, DB2 provides support for a variety of procedural languages, including:

  • standard for DB2 language SQL PL,
  • the SQL/PL language used in Oracle's DBMS,
  • the ability to develop "external" stored procedures in Java, C, C++ and COBOL.

Distinctive features of DB2 are:

  • scalability, limited only by available computing resources, and the most economical use of computing resources;
  • powerful built-in means of differentiation and access control, which provides the possibility of granular restriction of access to information in the context of objects (tables, views), as well as implementing a model of mandatory access control;
  • advanced integrated data backup and recovery system;
  • availability of a full set of technologies for building "classic" analytical data warehouses (dividing tables into sections, materialized views, optimizing data caching and scanning tables and indexes, "internal" parallelism in the execution of complex queries, etc.);
  • support for building massively parallel analytical data processing (MPP) configurations from multiple servers connected via a communication network based on the DB2 Database Partitioning Feature (DB2 DPF);
  • maximum fault tolerance and near-linear scaling of DB2 pureScale cluster configurations, with data stored on shared disks;
  • DB2 BLU technology that implements support for modern in-memory "column" analytics without the use of manual optimization of the database structure.

To facilitate the migration of applications from other types of DBMS (primarily Oracle Database), DB2 provides advanced compatibility tools, including support for the necessary data types, stored procedures, and standard system views.

There are several editions of the DB2 LUW product, united by a single set of basic functions and differing from each other in the presence of restrictions on the used computing resources and support for advanced functionality. The DB2 Express-C edition, which is available at no cost, can be used for product evaluation, learning, and even small production deployments. The functionality and resource limitations of the various editions of DB2 LUW are detailed in the article "".

Structure of the database server

A DB2 database server is a computer on which the DB2 server software (DB2 engine) is installed and which provides structured information management services.

Access to DB2 services by applications is provided by the DB2 client software (IBM Data Server Driver Package) that communicates with the DB2 server in accordance with supported application connection methods (including ODBC, JDBC, OLE DB, ADO, CLI, and other methods). In most cases, the required client software is installed with the DB2 server, allowing applications hosted directly on the database server to connect to the DB2 server.

A DB2 database server can host multiple copies of DB2 software, with different software versions and installation directories. Multiple copies of DB2 software may reside independently on the same server, as long as there are no resource conflicts between them (including sufficient server computing resources and no conflicts over operating system logical resources: network names, port numbers, file system directories, and so on. ).

The direct provision of DBMS services is provided by the DB2 database manager (DB2 DBM) component. Each copy can have multiple instances of the DB2 database manager, or more briefly, DB2 instances. An instance is an independent environment in which databases can be created and applications can run. Each DB2 instance has its own configuration and provides access to a specific set of databases. DB2 instances are independent in the sense that the execution of operations on one instance does not affect others, except for resource constraints imposed by running multiple instances on the same physical or virtual server.

Starting and stopping DB2 services is done at the instance level, i.e. each DB2 instance can be in a running or stopped state. The parameters of a DB2 instance can define its resource limits (for example, in terms of memory usage). The resources of a DB2 instance are used to maintain the databases that exist within the instance.

A database is a collection of objects that make up a single information array (tables, views, indexes, etc.). Databases are independent entities and, accordingly, do not typically share objects with other databases (an exception may be distributed database configurations that use data access federation mechanisms).

A schematic example of the database server structure is shown in the figure.


In many cases, the DB2 database server contains a single DB2 installation with a single instance created serving a single database. With this configuration, all database server resources are used to run a single DB2 database.

Serving requests from connected applications on the database server side is performed by so-called DB2 agents. For each connected application within a DB2 instance, a coordinating (primary) agent is started, which can start several additional (auxiliary) agents as needed. Technically, each agent is a separate thread of execution, or (for older versions of DB2) a separate operating system process, with associated resources needed to run it.

DB2 configuration options

DB2 server configuration can be set at four different levels:

  • Environment Variables;
  • DB2 profile registry;
  • database manager configuration file (DBM CFG);
  • database configuration file (DB CFG).

Environment variables are set at the server operating system level and by means of the operating system. For Windows OS, these variables are actually global for the server, for Unix and Linux OS families, each instance can have its own specific environment variable settings.

DB2 profile registry settings can be set at the operating system level (globally) or at the instance level, with instance level settings overriding those set at the operating system level. Viewing and setting DB2 profile registry values ​​is done using the db2set command.

The database manager configuration file options are defined at the instance level, while the database configuration options are defined at the database level.

Many parameters are dynamic, i.e. changes made take effect immediately; however, there are settings that require you to stop and start the instance to change. This can be done on the command line using the db2stop and db2start commands. All applications must shut down before stopping the instance. You can use the db2stop force command to force an instance to stop.

The database manager configuration file includes settings that affect the instance and all the databases it contains. The database manager configuration file can be viewed or modified using command line(using the GET DBM CFG and UPDATE DBM CFG commands), as well as the IBM Data Studio tools.

The database configuration file includes options that affect a specific database. The database configuration file can be viewed or modified using the command line (GET DB CFG and UPDATE DB CFG commands) and IBM Data Studio.

A detailed description of the supported , as well as is given in the official DB2 documentation.

Organization of data storage

The smallest physical storage unit in DB2 is page. Allowed page sizes are 4K, 8K, 16K, and 32K. Database object information (such as table entries and index entries) is placed on pages.

The allocation of additional space for data storage is allocated by groups of pages, which are called extents. Performing extra space allocation operations at the extent level improves the performance of record inserts and updates.

The storage of information in DB2 databases is organized into objects called tablespaces. A tablespace is a named set of containers for storing information placed on the file system of a database server.

One or more tablespaces are created for each tablespace. containers(files or directories in the file system) to store information, as well as set the page size and area for caching data (buffer pool, see below), as well as a number of other parameters.

There are the following types of tablespaces:

  • ordinary: used to host user tables and indexes;
  • large: Used to host user tables and indexes, as well as large object (LOB) data and XML data. In modern versions of DB2, large tablespaces are used by default instead of regular ones;
  • temporary: used to store temporary information when executing queries (system temporary tablespaces) and temporary tables defined by applications (user temporary tablespaces).

The type of a tablespace is specified when it is created and cannot be changed except by deleting and re-creating the tablespace.

Tablespaces can also be classified by the type of control that is set when the tablespace is created:

  • system-managed tablespaces (SMS, System Managed Storage) - directories are used as tablespace containers, data files are created to place storage objects in directories. Space is not preallocated, files grow dynamically. When defined, containers are fixed at the time they are created;
  • table spaces managed by a database (DMS, Database Managed Storage) - pre-allocated files are used as table space containers, containers can be added, deleted or changed;
  • automatically managed tablespaces (automatic storage) - automatic detection of the type and location of the container depending on the type of tablespace (DMS for regular and large tablespaces, SMS for temporary tablespaces). Specific definitions of containers are not specified when creating a table space, the necessary containers are created automatically. The growth of existing containers and the addition of new containers are entirely controlled by DB2.

To enable automatic tablespace management, you first need to create a database with automatic storage enabled (which is the default) and bind a set of storage paths to it.

By default, DB2 writes sequentially to extents, "stripping" between containers. For example, if you have a table space with a page size of 4 KB and an extent size of 8 pages, and you use 3 immediate containers in a DMS table space, this means that 32 KB of data (4 KB x 8 pages in extent = 32 KB) will be written one disc before recording to the next begins.

Starting with DB2 Version 10.1, a new concept has been introduced to simplify data storage management − storage group(storage group). A storage group is a named collection of paths in the DBMS server file system that can be used to store data. The composition of storage groups in a database typically defines the set of types of storage devices available to store information. When a database is created, a default storage group is always automatically created in the database.

Each automatically managed table space is associated with one of the created storage groups, which determines the physical location of the data stored in the corresponding table spaces. It is possible to move a table space from one storage group to another using the ALTER TABLESPACE ... USING STOGROUP ... command.

Transaction Logging

IBM DB2 LUW, like most other modern relational DBMSs that provide ACID guarantees, uses the transaction log as one of the primary mechanisms to enforce these requirements.

Data modification operations performed by DB2 are recorded in the transaction log as a series of log entries. Each database maintains its own transaction log, which is a sequence of files on disk. The size of a single file is determined by the LOGFILSIZ parameter, the number of initially created files is determined by the LOGPRIMARY parameter. If necessary, DB2 can create additional files log, the maximum number of files created is controlled by the LOGSECOND parameter.

Information is written to the transaction log using a special buffer in RAM. The contents of the buffer are flushed to disk (to transactional log files) as the buffer fills up, as well as when transactions are confirmed or canceled (at the command of the application or upon abnormal closing of the connection with the application).

The transactional log file that is required to recover data after a crash is called the active log file. Active transaction log files must be available to the DB2 database manager at all times. Since the availability of transaction log files is critical to ensure the DBMS performance, a mechanism is provided for mirroring transaction logs in two file systems(configurable with the LOGMIRROR parameter).

If you choose the wrong size and number of transaction log files that do not correspond to the level of the current load, there may be situations of overflow of the transaction log due to an insufficient number of log files allowed to be created or a lack of available disk space. Depending on the database settings (see the BLK_LOG_DSK_FUL parameter), an appropriate error message may be returned to applications, or processing may be suspended until the situation is resolved by the administrator.

Also, situations of transaction log overflow can occur when there are long-running transactions that perform data modification operations. Even if such a long-running transaction makes a single, small database change that then goes uncommitted for a long time, the corresponding transaction log file remains active and cannot be reused.

There are two main modes of operation for DB2 transactional logging: circular logging and archive logging. In circular logging mode, DB2 cycles through the generated set of transactional log files. In archive logging mode, DB2 optionally copies transaction log files to the archive using the methods specified by the LOGARCHMETH1 and LOGARCHMETH2 parameters.

The circular logging mode ensures the restoration of the integrity of the database in the event of a crash of the DBMS server. Backing up such a database is possible only after shutting down all applications (that is, with the suspension of user access). Data recovery from backup only possible with bringing the database to the state at the time of the backup.

The archive logging mode also ensures that the integrity of the database is restored if the DBMS server crashes. Additionally, you can back up the database without interrupting user access and include active log files (needed to restore data integrity) in the backup. Restoring data from a backup can be supplemented by applying the changes made to the database since the backup was taken and bringing the database to a state at a selected point in time in the past (but not earlier than the time the backup was taken).

The archive logging mode requires additional resources to perform archiving operations, including increased I/O and additional disk space for storing archived transactional log files.

Organization of data caching

In order to reduce the amount of I/O performed, DB2, like other modern relational DBMSs, caches reads and writes performed on tablespaces. Caching is done using areas of RAM called buffer pools. Several different buffer pools (created by the CREATE BUFFERPOOL command) can be defined in DB2, with page size, size, and auto size control enabled flag. Each table space is mapped to a specific buffer pool, and one buffer pool can be shared among multiple table spaces.

When performing a read operation, it first searches for the desired page with data in the buffer pool. If a required page is found, the data is read from the buffer pool, otherwise the page is loaded from disk into the buffer pool. A mechanism is provided for asynchronous prefetching of pages into the buffer pool when a linear (predictable) nature of page accesses is detected. The prefetching mechanism in many cases reduces the waiting time for read operations of the necessary data from disk by performing reads in asynchronous mode.

When a write operation is performed, page adjustment is performed directly in the buffer pool. In this case, the page is not written to disk in a synchronous mode, and data safety is ensured by the transactional logging mechanism. Changed table space pages are written to disk asynchronously, in the background, and provides a reasonable minimization of the amount of work that may be required to restore the state of the database when it is abnormally (incorrectly) closed. Closing the database correctly (for example, during a regular shutdown of the DBMS server) ensures that all modified pages of all buffer pools are written to disk.

RAM usage

The DB2 memory model consists of various areas memory at the DB2 instance, database, application, and agent levels.

For a detailed description of DB2 storage areas, see below short description appointments of different areas.

A list of major DB2 storage areas is shown in the figure below (originally taken from ).

The total memory used by the DBMS instance includes:

  • Monitor Heap - memory area for monitoring operations and status, the size is regulated by the MON_HEAP_SZ parameter;
  • FCM Buffers - a memory area for interaction between the coordinating agent and its subagents, as well as for providing internal interactions in partitioned databases;
  • Audit Buffer is a memory area where audit records are placed before being flushed to the audit log.

At the database level, it is customary to distinguish between:

  • a global database area, often referred to as the "Performance memory" area, which includes various caching areas and a locking area;
  • an application data area, often referred to as the "Functional memory", which includes the various working memory areas of agents servicing database connections.

The global database area consists of the following main components:

  • Buffer pools - buffer pools, i.e. areas for caching tablespace data;
  • Lock list - an area for storing information about locks, the size of which is controlled by the LOCKLIST parameter;
  • Package cache - area for caching query execution plans, the size is regulated by the PCKCACHESZ parameter;
  • Catalog cache - area for caching the system catalog, which includes descriptions of all database objects, the size is regulated by the CATALOGCACHE_SZ parameter;
  • Utility heap - RAM for performing database maintenance operations (including backup and restore operations), the size is regulated by the UTIL_HEAP_SZ parameter;
  • Database heap - RAM for servicing database operations (including a transaction log buffer and a cache to speed up access to the system catalog, as well as an audit buffer at the database level), the size is regulated by the DBHEAP parameter.

The total size of the global database area is limited by the DATABASE_MEMORY setting.

The application data area includes:

  • Application Global Memory - common memory areas shared when processing application requests, the maximum amount is regulated by the APPL_MEMORY parameter;
  • Agent Private Memory - private memory areas used for the operation of individual agents serving connected applications.

Optionally, you can allocate areas of memory that are allocated for the DB2 driver to run on the application side. For local applications (those that use IPC rather than network access to connect to the database manager), the DB2 settings you set control how much RAM is allocated (primarily the ASLHEAPSZ setting).

RAM management when performing sort operations

Many types of DBMS operations require sorting of data, so the management of RAM used for sorting is given special attention.

If it is impossible to allocate the sort area entirely in RAM, the data for sorting is placed in the system temporary table space. The performance of queries that require such large sort operations can be significantly degraded.

Parameters that control the allocation of RAM for sorting:

  • SORTHEAP - memory limit for sort operation;
  • SHEAPTHRES - size limit of the agent's private memory allocated for the sort operation;
  • SHEAPTHRES_SHR - the limit on the amount of total RAM that can be used to perform sort operations (by all consumers in total) at any given time.

DB2 supports three basic sort memory management models:

  • Shared sorting area model (shared sort) - used by default, implies setting the SHEAPTHRES parameter to 0. RAM allocation for sorting is carried out from the global area of ​​the database.
  • Private sorting area model (private sort) - used when the SHEAPTHRES parameter is non-zero and there is no configured shared memory sorting. The allocation of RAM for sorting is carried out from the application data area (more precisely, from private areas belonging to agents).
  • Hybrid sorting model (hybrid sort) - used when the SHEAPTHRES parameter is non-zero and there is a configured shared sorting memory. Operations that require the use of shared sort memory are performed with memory allocation in the global area of ​​the database, other sort operations are performed with memory allocation in private areas of agents.

Using shared (global) memory to perform sort operations provides a number of important benefits:

  • more flexible management of RAM when executing queries, allowing you to increase the efficiency of using RAM;
  • the possibility of using a parallel version of the sorting algorithm due to the simultaneous access to the sorting memory area of ​​the coordinating agent and its subordinate DB2 sub-agents.

One of the following settings can be used to enable the use of shared memory when performing sort operations:

  • the general sorting area model is enabled by setting the SHEAPTHRES parameter to 0;
  • parallelism of operations execution is enabled by setting the INTRA_PARALLEL parameter to YES;
  • the DB2_WORKLOAD variable is set to ANALYTICS;
  • the DB2 Connection Concentrator feature is enabled (usually used when accessing DB2 for z/OS and DB2 for i databases, see the description of this feature in ).

Automatic memory allocation management

The presence of a large number of different areas of RAM and parameters that control their size can require significant effort to manually tune a DB2 server. Therefore, starting with version 9, IBM DB2 supports automatic management of the distribution of RAM between different areas using a self-tuning memory manager (STMM, Self-Tuning Memory Manager).

When bootstrapping is enabled, STMM dynamically allocates available memory resources to memory consumers in the database. STMM responds to changes in workload characteristics by adjusting memory configuration parameter values ​​and the size of buffer pools to optimize performance. To enable STMM, you must set the SELF_TUNING_MEM database configuration parameter to ON.

Automatic management of memory allocation is carried out for those memory areas for which it was explicitly allowed. When setting a configuration parameter value with the UPDATE DBM CFG and UPDATE DB CFG commands, to use STMM, the AUTOMATIC keyword is specified after the parameter value. The numerical value of the parameter specified in this case is used as the initial one, then STMM periodically adjusts the values ​​taking into account the current load, redistributing the RAM between different consumers.

STMM automatic management is supported for the following options:

  • INSTANCE_MEMORY is the total amount of RAM in the DB2 instance;
  • DATABASE_MEMORY - global database areas;
  • DBHEAP - an area for servicing database operations;
  • LOCKLIST – scope for keeping data on locks;
  • MAXLOCKS - the percentage of memory occupied by locks of one application to switch to lock escalation;
  • PCKCACHESZ - caching area for query execution plans;
  • SHEAPTHRES_SHR - general sort area;
  • SORTHEAP - sort area size for one operation;
  • APPL_MEMORY - area of ​​functional memory;
  • APPLHEAPSZ - private memory limit used by one agent;
  • STMTHEAP - limit on the size of the area used by the compiler of SQL and XQuery queries (per query);
  • STAT_HEAP_SZ is the maximum amount of RAM allocated for building statistics by the RUNSTATS utility and allocated from the functional memory area.

Types of database objects

This section provides an overview of the types of DB2 database objects. For a complete list of DB2 database object types and detailed information on each object type, see the DB2 documentation:

Scheme

Schemas are namespaces for collecting database objects. Schemes are primarily used for:

  • providing an indication of ownership of objects or links to the application;
  • logical grouping of related objects.

All DB2 database objects (except for generic synonyms) have fully qualified two-part names; schema is the first part of such a name:<имя_схемы>.<имя_объекта>

The fully qualified object name must be unique. If you connect to a database and create or access an object without specifying a schema, DB2 will use the user ID that made the connection to the database as the schema name. You can also use the SET SCHEMA statement to set the schema for the current session.

Schema creation can be done explicitly, by calling the CREATE SCHEMA statement, or implicitly, on the first attempt to create an object without specifying a schema name. In the latter case, the user must be granted the IMPLICIT_SCHEMA permission to successfully create the schema.

Synonyms can be created for most kinds of database objects, allowing the original objects to be referred to by a different name (perhaps placed in a different schema). Synonyms are created using the CREATE SYNONYM / CREATE ALIAS statement. It also supports working with public synonyms that are not tied to a specific schema. Access to public synonyms is carried out without specifying the scheme, regardless of the established current scheme of the session. Public synonyms are created using the CREATE PUBLIC SYNONYM / CREATE PUBLIC ALIAS command.

tables

A table is a collection of related data organized logically in columns and rows.

Each table row consists of the same set of named columns. When creating a table, each column is assigned a data type that limits the allowable column values ​​in table rows (database records) and determines the semantics of possible operations on the corresponding values ​​(including comparison, sorting, computational operations).

A table is created with the CREATE TABLE command, and deleted with the DROP TABLE command. It is supported to change the description of a table with the ALTER TABLE command, including adding and deleting columns, changing the data types of columns. After performing some operations of changing the table description, it is required to reorganize it (restructure the physical storage of the table for optimal access to it) using the REORG command.

The classification of DB2 built-in data types that can be used to define table columns is shown in the figure below.

In addition to one of allowed values supported data type, column values ​​can be empty, i.e. empty (NULL) value. The ability for a column to store null values ​​is determined when the table is created.

Most of the data types listed in the figure have a direct counterpart in other modern relational DBMSs and are described in detail in the DB2 documentation. The following is a brief description of data type features that are specific to DB2 or that may be difficult to use.

When working with string data, unlike some other types of DBMS, DB2 distinguishes between an empty string (string of zero length) and a NULL value of the string type. This feature affects the search order (using the equality predicate instead of the IS NULL expression) and the composition of the allowed column values ​​(if NULL values ​​are forbidden, an empty string can be stored in the column).

String values ​​of the GRAPHIC, VARGRAPHIC, and DBCLOB types differ from other string types in that they are always stored in UTF-16 encoding. When accessing the corresponding columns from the side of the client application, the DBMS provides data conversion to the encoding used by the client application.

Columns of type DATE (date) do not contain timestamps by default. In Oracle Database compatibility mode, DB2 additionally supports storing time attributes (hours, minutes, seconds) in DATE columns.

If you need to work efficiently with exact decimal numbers that include a fractional part (for example, in financial applications), it is advisable to use the DECFLOAT data type, which combines the exact representation of DECIMAL values ​​and the ability to efficiently calculate values ​​of the FLOAT type.

The BLOB data type provides the ability to store unstructured binary information (such as images or office documents) in a database. BLOB values ​​can be stored together with other record fields (if their size allows them to be compact enough), or separately, in a special storage object. In the latter case, the entry contains a reference to the stored BLOB value instead of the value itself. Storage of values ​​of CLOB and DBCLOB types is organized in a similar way.

The XML data type provides storage in table fields of structured hierarchical XML documents. For stored XML documents, attribute access operations are supported (without the need to parse the XML document when accessed), indexing of individual attributes, and other features.

In addition to built-in data types, DB2 supports user-defined data types that are defined based on built-in types. Working with user-defined data types is described in the DB2 documentation.

When creating a table, it is possible to specify rules for automatic filling of their values ​​for columns. A special case of autocomplete columns are identity columns, which are numeric columns that automatically generate a unique numeric value for each inserted row. Automatic filling can be carried out in one of two modes:

  • GENERATED ALWAYS - The value is always set by the DB2 server and cannot be explicitly set by the application;
  • GENERATED BY DEFAULT - The value is set by the DB2 server if the application did not specify an explicit assignment value when the record was inserted.

Also, at the table level, restrictions can be defined that set restrictions on the composition of attribute values. The following types of restrictions are supported:

  • primary key (PRIMARY KEY) - uniqueness constraint on a set of columns, mainly used to search for a single record, there can be only one primary key in a table;
  • uniqueness constraint (UNIQUE) – additional uniqueness constraint on a set of columns;
  • foreign key (FOREIGN KEY) – a reference in the form of a set of column values ​​pointing to a combination of columns of another table for which a foreign key or unique constraint is defined;
  • check (CHECK) - a logical condition that restricts possible values one or more columns in a post.

The restriction mechanism implements means of automatic control and ensuring the integrity of the database, including referential integrity of data (control of the presence in the "parent" table of records that are referenced through foreign keys of the records of "child" tables). Proper use of restrictions allows you to guarantee the formal correctness of filling the database and, to a certain extent, protect yourself from application and user errors when correcting data.

Since the restriction mechanism creates additional computational load when entering and updating data, in some cases its use is deliberately abandoned, placing the responsibility for the correct maintenance of the database on the application. At the same time, DB2 uses integrity constraint descriptions to determine the relationships between tables and select the most efficient query execution plan.

Temporary tables

To store temporary application data, DB2 provides a temporary table mechanism that provides full functionality for working with tabular data, but in the context of the current user session.

Temporary table access greatly improves performance by minimizing or eliminating system catalog access conflicts, and by eliminating row locking, logging (optional, depending on table creation mode), and permission checking.

Temporary tables also support indexes, that is, any standard index can be created on a temporary table. You can also collect statistics on such tables (using the RUNSTATS command) to obtain the information needed by the query optimizer.

Temporary tables reside in a user temporary table space, which must be defined before they are created.

There are two main varieties of temporary tables in DB2:

  • declared temporary tables (DGTT - Declared Global Temporary Tables);
  • created global temporary tables (CGTT - Created Global Temporary Tables).

Declared temporary tables are in-memory tables used by an application and automatically dropped when it terminates. Such tables can only be accessed by the application that created them, and they are not stored in any of the DB2 system catalog tables.

Each declared temporary table has a SESSION schema; this scheme must be specified by referring to it. The user ID used to declare temporary tables will have full privileges on those tables. Each application that declares a temporary table will have its own copy of that table.

While DGTTs allow you to declare a temporary table, the definition of such a table cannot be shared across connections or sessions. You must issue a DECLARE GLOBAL TEMPORARY TABLE statement every time you start a session.

When using Generated Global Temporary Tables (CGTT), the temporary table definition only needs to be created once because it is stored in the DB2 system catalog. This means that other connections can use the table definition instead of creating it again.

Although the structure of the CGTT table can be used immediately, data from different connections is independent of each other and disappears when the connection is closed.

Indices

An index is an ordered set of keys, each pointing to a table row. Indexes enforce row uniqueness (that is, implement the unique constraints discussed in the previous section) and improve performance. The following describes some of the characteristics that can be defined for indexes:

  • indexes can be built in ascending or descending order of column values;
  • index keys can be unique or non-unique;
  • indexes can be built on several columns (such indexes are called combined);
  • if index and table data are grouped in the same index sequence, such an index is called clustered (CLUSTERED INDEX).

Index creation is provided by the CREATE INDEX statement, deletion by the DROP INDEX statement. When creating an index, its type (unique / non-unique) and the composition of the columns for building the index are specified.

DB2 provides tools that provide automated index selection to optimize query execution. The most convenient way to work with these tools is organized in IBM Data Studio.

Sequences

Although sequence objects are independent of tables, they function similarly to identity columns and provide the generation of unique numeric sequences. The difference between sequences and identity columns is that identity columns generate unique numbers exactly in the specified column of the table, while sequence objects can be used to generate sequential numeric values, the logic of which is determined by the application.

The creation of sequences is provided by the CREATE SEQUENCE command, the next and current received values ​​are accessed using the NEXT VALUE FOR and PREVIOUS VALUE FOR operators. For compatibility with Oracle Database, the syntax for accessing sequence values ​​via pseudo-columns "NEXTVAL" and "CURRVAL" is also supported.

Representation

A view is a display of data in tables. Data for views is not stored separately, it is fetched when the view is run. Nested views are supported, i.e. views created from other views.

Views are created with the CREATE VIEW command, and deleted with the DROP VIEW command. To facilitate updating (replacing) views, the CREATE OR REPLACE VIEW syntax is provided, which provides the creation of a new view (if it does not already exist) or the replacement of an existing view with a new definition (if a view with the specified name has already been created).

triggers

A trigger is an object that automatically performs an operation on a table or view. A specific action on an object for which a trigger is defined causes the trigger to fire. Typically, a trigger is not considered an application object; accordingly, triggers are usually created not by developers, but by database administrators.

Stored procedures and functions

Stored procedures are database objects that contain SQL statements and business logic. Storing some of the application logic in the database improves performance by reducing the amount of traffic between the application and the database. In addition, stored procedures provide a central location to store program code, and, accordingly, other applications can use the same stored procedures. The CALL statement is used to call a stored procedure.

User Defined Functions (UDFs) are database objects that allow users to extend the SQL language with their own logic. A function always returns a value or values, usually as a result of the business logic included in the function. To call a function, use it as part of an SQL statement or with a VALUES statement.

In DB2, stored procedures and user-defined functions can be developed in several programming languages, including PL/SQL, SQL PL, Java, C, C++, COBOL.

System directory

One of the basic information resources The DBMS is a system catalog that stores and provides access to information about the structure of the database, including:

  • description of tables, columns and indexes;
  • description and text of views, triggers and stored procedures;
  • information about tablespaces and containers for storing data;
  • established permissions for access to database objects;
  • other database meta information.

Accessing the system catalog is required for many tasks, including automating database administration and maintenance tasks, application development, and more.

The most commonly used tables (really, views) that are part of the system catalog are:

  • SYSCAT.SCHEMAS - description of database schemas;
  • SYSCAT.TABLES - description of database tables;
  • SYSCAT.COLUMNS – description of table columns;
  • SYSCAT.INDEXES - description of indexes.

A detailed description and composition of columns for the above and other tables of the system catalog is given in .

Organization of parallel transactional processing

Transactions

A transaction (or unit of work) consists of one or more SQL statements that are treated as a separate unit when executed; in other words, the failure of one transaction statement causes the entire transaction to fail, with all statements executed up to the point of failure being rolled back.

The transaction ends with a COMMIT statement. The transaction can also end with a ROLLBACK statement or an emergency (abnormal) shutdown of the application, after which all changes made by the application to the database will be canceled. The start of a transaction is the first statement executed after the application's connection to the database was opened, or after the previous transaction completed. Each application connection to the database can have at most one active transaction.

As mentioned earlier, changes to the database are recorded in the transactional log. To ensure that changes made by a transaction being undone can be "rolled back", transaction boundaries are also recorded in the transaction log. At the same time, transactions that perform only data read operations are not written to the transaction log. Information about the beginning of a transaction is placed in the transaction log before the start of the execution of the first (for a given transaction) data write statement.

In the event of an error in the execution of a single statement that writes data, all changes made by that statement are undone using the transaction log data. The application, having received a diagnostic message about the refusal to execute the statement, can cancel the entire transaction (with the ROLLBACK statement) or perform some other actions with the database and, as a result, confirm the changes made (with the COMMIT statement).

An application can define additional rollback points within a transaction (using the SAVEPOINT statement) and undo changes made after a rollback point has been created (using the ROLLBACK TO statement). Using rollback points allows an application to selectively undo actions taken within a transaction, which can be useful when handling data integrity errors and other scenarios.

Locks

Parallel use means that multiple users can work on the same database objects at the same time. Data access must be properly coordinated to ensure data integrity and consistency.

To obtain consistent results of parallel transactions, control over the parallel use of shared resources is required. Such control is based on the use of locks.

The concepts of locking and concurrency are closely related. The lock temporarily prevents applications from performing other operations until the current operation has completed. The more actively locking is used in the system, the less opportunities for concurrency remain. On the other hand, the less often a lock is applied in a system, the more possibilities there are for concurrency.

A lock is acquired automatically as needed to maintain a transaction and is released when such a transaction is aborted (using a COMMIT or ROLLBACK command). Locks can be placed on tables or rows.

There are two main types of blocking:

  • Shared Lock (S) - Set when an application reads data and prevents other applications from making changes to the same row.
  • Exclusive lock (X) - Set when an application updates, inserts, or deletes a row.

If two and more applications need to perform an operation on the same object, one of them will have to wait to acquire the required lock. By default, the application will wait indefinitely. The application's lock timeout is controlled by the LOCKTIMEOUT database configuration parameter. The default value for this parameter is -1 (infinite wait).

You can use the CURRENT LOCK TIMEOUT session variable to set the lock timeout on a particular connection. By default, this variable is set to LOCKTIMEOUT. You can use the SET LOCK TIMEOUT statement to change this value.

In the case when two (or more) applications connected to the same database wait indefinitely for resources due to the incorrect sequence of accessing these resources, a deadlock situation occurs. The timeout period cannot expire because each application is holding a resource that the other application needs. In all cases, the deadlock problem is due to incorrect application structure or logic.

DB2 automatically detects deadlock situations by performing appropriate checks at the intervals specified by the DLCHKTIME parameter. When it detects that a deadlock has actually occurred, DB2 will use an internal algorithm to determine which of the two transactions should be rolled back and which should be continued.

Isolation levels

A detailed analysis of the problems that can occur if there is no concurrency control is given in the DB2 documentation, as well as in the literature on the theory of the functioning of relational DBMS. Possible types of problems include:

  • lost update (if one data block is changed simultaneously by different transactions, one of the changes is lost);
  • unreliable reading (reading of data added or changed by a transaction, which will not be subsequently confirmed);
  • non-repeating reading (when re-reading within the same transaction, previously read data is changed);
  • phantom reading (the same selections in one transaction give different sets of rows due to adding, deleting or changing rows by other transactions).

Application-side control of DB2's built-in protection against the problems listed above is done by setting the isolation level to be used. Isolation levels can be thought of as locking policies where, depending on the isolation level chosen, you can achieve different ways in which an application locks a database. The isolation level required by an application can be set at the session level and at the level of a single query or subrequest that is being executed.

DB2 provides the following levels of protection for data isolation:

  • unreliable reading (Uncommitted Read, UR);
  • cursor stability (Cursor Stability, CS);
  • reading stability (Read Stability, RS);
  • repeated reading (Repeatable Read, RR).

Inauthentic reading also called "dirty". This is the lowest isolation level that allows the highest degree parallel use. Rows are not locked during read operations, except when another application tries to drop or modify the table; update operations are performed in the same way as when using the next isolation level, the Cursor Stability level.

Using the Bad Read isolation level prevents the following problems:

  • lost update.

Cursor stability is the default isolation level. It provides the minimum degree of blocking. This isolation level locks the "current" row of the cursor. If the line is read-only, the lock is held until transition to new line or completion of the operation. If the row is updated, the lock is held until the operation completes.

Using the Cursor Stability isolation level prevents the following issues:

  • lost update;
  • incorrect reading.

Prior to DB2 9.7, when using the Cursor Stability isolation level, performing a write (UPDATE operation) closed a read (SELECT operation) access to the same row. The logical basis was that since the write operation makes changes to the row, the read should wait for the updates to complete in order to get the final committed value.

DB2 9.7 defaults to a different approach to the Cursor Stability isolation level for new databases. This new approach is implemented using "currently committed" (CC) semantics. When using CC semantics, a write operation does not close access to the same row for a read operation. Previously, this approach was possible using the UR isolation level; the difference with the current approach is that with UR the read operation receives invalid values, while with CC semantics it receives the values ​​currently accepted. The currently committed values ​​are the values ​​that were committed before the start of the write operation.

Reading stability provides a lock on all rows received by the application. For the given query, all rows that match the result set are blocked. Thus, using this isolation mode can result in the application acquiring a large number of locks, and, if the set limits are reached, lock escalation from the row level to the table level. However, at the Read Stability isolation level, the DB2 query optimizer does not explicitly acquire table-level locks in the execution plan of the queries it executes, even if the result set contains most of the records in the table.

Using the Read Stability isolation level prevents the following issues:

  • lost update;
  • incorrect reading;
  • non-repetitive reading.

Repetitive reading is the highest level of isolation. It provides the highest degree of locking and the least amount of concurrency. A lock is placed on the rows that are processed to build the result set; in other words, even rows that do not make it into the final result package can be blocked. Other applications cannot update, delete, or insert rows that will affect the result set until the operation in progress is completed. Repeated reading ensures that the same query, created by the application multiple times in a single operation, produces the same results each time.

The DB2 query optimizer, when using the Repeated Read isolation level, can include in the query execution plan explicit operations setting locks at the table level when the corresponding queries involve scanning all rows of the table (which means that each row of the table must be locked during query execution).

Using the Read Stability isolation level prevents all possible problems competitive access, but at the same time, the possible parallelism of the execution of operations is limited as much as possible.

Conclusion

This article has reviewed the main functionality IBM DB2 LUW DBMS, database server structure, configuration settings and data storage organization. In addition, the basic principles of the DB2 server, the supported types of database objects, and the organization of parallel DB2 transactional processing are considered.

At work, I had to deal with the IBM DB2 DBMS for some time. Because Since the system is commercial, there is not much information in Russian on the Internet, so I decided to describe some of the features of this DBMS.

Point of entry

Let's start with the entry point in the DBMS. In SQL SERVER, the endpoint is an instance, which of course can have separate databases, but the configuration and security model is the same for the entire instance. In DB2, the entry point looks like this - an instance (which corresponds to a specific port) - a database. At the same time, there is a configuration for the entire instance, and for a separate database.

You can view the instance configuration either using the db2 command:

Database Manager Configuration

Node type = Enterprise Server Edition with local and remote clients

Database manager configuration release level = 0x0b00

CPU speed (millisec/instruction) (CPUSPEED) = 2.912790e-07
Communications bandwidth (MB/sec) (COMM_BANDWIDTH) = 1.000000e+02

Max number of concurrently active databases (NUMDB) = 8
Federated Database System Support (FEDERATED) = YES
Transaction processor monitor name (TP_MON_NAME) =

Default charge-back account (DFT_ACCOUNT_STR) =

Java Development Kit installation path (JDK_PATH) = /home/db2inst1/sqllib/java/jdk32

Diagnostic error capture level (DIAGLEVEL) = 3
Notify Level(NOTIFYLEVEL) = 3
Diagnostic data directory path (DIAGPATH) = /home/db2inst1/sqllib/db2dump

Default database monitor switches
Buffer pool (DFT_MON_BUFPOOL) = OFF

Where the parameters will be specified, their meaning and decoding. An abbreviated version is also possible:

get dbm cfg

Or with a query:

Select name, value from sysibmadm.dbmcfg

Important parameters include:

  • authentication type (AUTHENTICATION)
  • default path for creating new databases (DFTDBPATH)
  • network server discovery (DISCOVER)
You can view the settings for a specific database like this:

connect to sample(sample - database name)

get database manager configuration

Or with approximately the same request as before:

select name, value from sysibmadm.dbcfg

Authentication

The big difference between DB2 and other DBMSs is the authentication model. There are no internal users like in SQL Server or MySQL. All authentication is performed by means external to the DBMS (dynamically loaded plugins) - by means of the operating system or external plugins (Kerberos, GSS API). The authentication type is set in the AUTHENTICATION parameter of the database manager configuration. By default, the SERVER value is set - the username and password are transmitted in clear text, and this pair is checked for correctness by means of the operating system. If the username and password are correct, then the CONNECT privilege is checked for the user or groups to which he is a member (including the special PUBLIC group, which includes all authorized users). These privileges can be viewed in the SYSCAT.DBAUTH table:

select GRANTEE from SYSCAT.DBAUTH where CONNECTAUTH = "Y"

A big configuration mistake is to include the CLIENT authentication type. In this case, DB2 trusts the connecting client to perform authentication, and if PUBLIC has the CONNECT privilege, then any user will be able to connect to the database and gain access to all the data that PUBLIC has. The username is taken from the operating system. That is, if we connect through Data Studio as an Administrator user, then all the privileges that this user has will be granted. And in this case, there is no difference from which computer the access was made. This type of authentication is recommended to be enabled only when there is a secure channel between the server and the client, and other clients will not be able to connect to the DBMS.

Authorization

Instance level privileges are written in the database manager configuration. These are the following privileges:

  • SYSADM
  • SYSCTRL
  • SYSMAINT
  • SYSMON
These privileges are set by specifying the group where the user will enter. In dbmcfg, these are the SYSADM_GROUP , SYSCTRL_GROUP , SYSMAINT_GROUP , and SYSMON_GROUP options, respectively.

Next, there are database-specific privileges. These are privileges such as database access (CONNECTAUTH), table creation (CREATETABAUTH), routine creation (EXTERNALROUTINEAUTH), and so on. These privileges can be viewed in the SYSCAT.DBAUTH view

And finally, access privileges to specific data - tables, subroutines, and so on. Everything here is quite trivial, but also with some peculiarities.

Table access privileges can be viewed in the SYSCAT.TABAUTH view. The type of the granted privilege is stored in separate columns, depending on the privilege itself (SELECTAUTH, DELETEAUTH, etc.). When granting a privilege using the GRANT command for the REFERENCES and UPDATE privileges, you can also specify the names of the columns to which the given privileges will be extended. In this case, information about this can be viewed in the SYSCAT.COLAUTH view

Privileges for routines (functions, procedures, and methods) can be viewed in SYSCAT.ROUTINEAUTH . Not everything is trivial here, depending on the SPECIFICNAME and TYPENAME fields, privileges can be granted to all subroutines of a given scheme.

If readers like the article, then I'm ready to talk about protecting data in DB2 using Label-Based Access Control

A relational database is a set of relationships whose names match the relationship schema names in the database schema. Today it is known big number various SQL database servers. Let's focus on the following four leading server DBMS - Oracle8i, IBM DB2, Microsoft SQL Server and Informix - and compare them in operation at each of the main stages of functioning.

Oracle8i. Oracle8i package, endowed with the most advanced set of functions for working with the Java language and accessing data over the Internet, a concurrent access optimization system. The only drawback of this DBMS is the complexity of administration, however, all the costs of its implementation and development will later pay off with efficient and reliable work. (complexity and high cost are debatable). Among the main properties of the Oracle DBMS, the following should be noted: Highest reliability. The ability to partition large databases into sections (large-database partition), which makes it possible to effectively manage gigantic gigabyte databases; Availability of universal means of information protection; Effective Methods maximum increase request processing speed; Bitmap indexing; Free tables (in other DBMS all tables are filled immediately upon creation); Parallelization of operations in a query. Availability of a wide range of development, monitoring and administration tools. Focus on Internet technologies. Solutions that are not inferior to the development of Oracle can only be found in IBM's DB2. Orientation to Internet technologies is the main motto of modern Oracle products. In this regard, the interMedia package, which provides processing of data in multimedia formats, and Jserver, a built-in tool for working with the Java language, which combines the capabilities of the Java language with the capabilities of relational databases, can be noted. Enterprise JavaBeans are the building blocks that make up Java Internet applications. Oracle is committed to the principle that all critical functions must be managed from single center, therefore, the proposed interMedia module provides users with the most advanced features for working with multimedia objects: Very advanced tools for processing audio clips; Still images; Video clips; Geographical data (with a whole set of functions related to determining the location included in the Locator module). Oracle8i implements today's best tools for object-oriented database design, including tabular structures that allow inheritance of properties and methods of other tabular database objects, which will avoid errors when building a database and facilitate their maintenance. It should also be noted that the multiversioning concurrency optimization system developed by Oracle is one of the most important characteristics of the Oracle architecture (a similar function is available only in the InterBase DBMS from InterBase from Inprise). This function allows you to eliminate the situation when one user has to wait until the other completes changes to the contents of the databases (that is, there are no read locks in Oracle). This feature allows Oracle8i to execute more transactions per second per user than any other database. In terms of performance when working in a WEB environment under LINUX, Oracle occupies an honorable second place after DBMS MySQL, while significantly surpassing all other DBMS in terms of reliability and security.

DBMS Microsoft SQL Server The most important characteristics of this DBMS are: ease of administration, the ability to connect to the Web, the speed and functionality of the DBMS server mechanism, the availability of tools remote access, The DBMS administrative toolkit includes a set of special wizards and tools for automatically setting configuration parameters. Also, this database is equipped with excellent replication tools that allow you to synchronize PC data with database information and vice versa. The OLAP server included in the package makes it possible to save and analyze all the data available to the user. In principle, this DBMS is a modern full-featured database that is ideal for small and medium-sized organizations. It should be noted that SQL Server is inferior to other DBMS under consideration in two important indicators: programmability and means of operation. When developing client database applications based on Java, HTML, the problem of insufficient SQL Server software tools often arises and it will be more difficult to use this DBMS than DB2, Informix, Oracle or Sybase systems. The global trend in the 21st century has become the almost universal transition to the LINUX platform, and SQL Server functions only in Windows environment. Therefore, the use of SQL Server is advisable only if the ODBC standard is used exclusively to access the contents of the database, otherwise it is better to use other DBMS.

IBM DB2 The IBM DB2 DBMS is the result of almost 30 years of development and research work by IBM. The latest version of this DBMS (6.x) features one of the most elaborate set of management and optimization tools and a database engine that can grow from a laptop running Windows 95 to an entire cluster of S/390 mainframes running OS/390. The DB2 package is available in two editions: DB2 Workgroup and DB2 Enterprise Edition. This DBMS implements all the innovative database engine technologies known from previous versions of DB2, such as parallel query processing, a full set of replication tools, query summary tables to improve database performance, object-oriented database design capabilities, and Java language features. In addition, the DB2 system is equipped with a complete set of multimedia extensions that allow you to save and manipulate text, sound and video fragments, images and geographic data. We can say that in terms of scalability, the database clustering technology developed by IBM specialists has no analogues. These extensions greatly facilitate the process of developing applications for the Web, as well as programs containing photographic images and voluminous text reports. The DB2 system is also quite competitive as an application development platform, as there is a Stored Procedure Builder tool that automatically converts SQL statement into the corresponding Java class and including it in the database structure. In DB2 6.1, interoperability with other DBMSs has been greatly improved by enabling the use of Microsoft's OLE DB specification, a new database access standard. DB2 administrative controls that are new version rewritten in Java and can be obtained from the Web deserve the highest praise. The main disadvantages of this DBMS are the relative complexity of administration and the lack (yet) of implementations for popular server operating systems, such as LINUX. In this DBMS, thanks to the Index Smart-Guide, it is possible to carry out tuning, forming optimal indexes for a given number of accesses, which characterizes the typical load on the database. DB2 is the only package that allows you to generate pivot tables, which significantly improves the efficiency of the DBMS as data warehouses. A PivotTable is a temporary workspace used by the database to store answers to frequently asked queries. The DB2 6.1 model is emerging as the most cost-effective high performance system. The administrative tools of this DBMS are quite appropriate for the level of tasks being solved, in addition, it provides exceptionally wide opportunities for working with multimedia data and for programming (which is clearly lacking in Microsoft SQL Server).

DBMS from Informix. Recently, there has been a transition from relational DBMS to object-oriented ones (which is clearly seen in the example of Oracle). Informix also following this concept announced a new Centaur DBMS solution based on the Informix Dynamic Server 7.3 relational database and the Informix Universal Data Option object-relational database and combining the high performance of Dynamic Server when working with data with universality and multimedia functions of the Universal Data Option. This implementation is intended for the development of Internet systems. It is expected that this DBMS will have a flexible development environment with scalability to match the intensive workloads characteristic of the Internet, and tools for working with new types of data that have become ubiquitous with the development of the Web. The Java tools implemented in the new system will allow developers to create stored procedures, user programs, and DataBlades components in this language, which Informix calls

custom database extensions. From the point of view of Inforix's customers, this will be a big step forward, since until now, when working with DataBlades, they could only use C and SPL, Informix's internal language for writing stored procedures. In addition, the Centaur package will be equipped with built-in ActiveX object handling. This will make it possible, for example, to create database stored procedures in the language Visual Basic; however, this requires the Centaur package to run in a Windows NT environment. Centaur will be an add-on to Informix Dynamic Server and work with the traditional database format for this package, so that users will have all the old functions at their disposal, and upgrading the system to the new version will not be very difficult. In addition, the Centaur package will retain all of the design and programming capabilities that made the Informix Universal Server system an outstanding engineering achievement. The new system will be equipped with facilities for object-oriented database design, creation of specialized tables and indexing programs; it will allow users to embed their own functionality in queries and not rely solely on standard means SQL. Conclusions. Having considered the main characteristics of architectures for building AIS, server operating systems and DBMS, in the future, as the architecture of AIS, we will choose the architecture of the Internet / Intranet, as the Linux server OS, as the Oracle 8i DBMS.

2) SQL SELECT clause. Built-in functions.

SELECT column FROM table WHERE column LIKE pattern

SELECT * FROM Store_Information WHERE store_name LIKE "%AN% ‘;

SELECT column_name FROM table_name WHERE column_name BETWEEN value1 AND value2

SELECT * FROM Persons WHERE LastName BETWEEN "Hansen" AND "Pettersen";

SELECT * FROM Persons WHERE LastName NOT BETWEEN "Hansen" AND "Pettersen";

SELECT Company, OrderNumber FROM Orders ORDER BY( sorting ) Company;

SELECT Company, OrderNumber FROM Orders ORDER BY Company, OrderNumber;

SELECT Company, OrderNumber FROM Orders ORDER BY Company DESC( reverse order ) ;

SELECT Company, OrderNumber FROM Orders ORDER BY Company DESC , OrderNumber ASC( right . order ) ;

SELECT * FROM Persons WHERE FirstName="Tove" AND LastName="Svendson";

SELECT * FROM Persons WHERE firstname="Tove" OR lastname="Svendson" ;

SELECT * FROM Persons WHERE (FirstName="Tove" OR FirstName="Stephen") AND LastName="Svendson" ;

SELECT store_name FROM Store_Information WHERE Sales > 1000 OR (Sales< 500 AND Sales > 275);

FunctionsSELECTfunction( column) FROMtable AVG - average value in the column; COUNT - number of values ​​in a column; MAX - the most great importance in a column; MIN - the smallest value in the column; SUM - sum of values ​​by column

Examples: SELECT AVG(Age) FROM Persons; SELECT COUNT(store_name) FROM Store_Information; SELECT COUNT(DISTINCT store_name) FROM Store_Information; SELECT MAX(Age) FROM Persons SELECT SUM(Sales) FROM Store_Information;

3) Serialization of transactions, conflicts of operations. Transaction serialization methods. Synchronization grips, granular synchronization grips. Transaction serialization methods. Predicate Synchronization Captures. Serialization based on timestamps.

To achieve isolation of transactions, the DBMS must use methods to regulate the joint execution of transactions. The plan (method) for executing a set of transactions is called serial if the result of the joint execution of transactions is equivalent to the result of some sequential execution of the same transactions. Transaction serialization- this is a mechanism for their implementation according to some serial plan. Providing such a mechanism is the main function of the DBMS component responsible for managing transactions. A system that supports transaction serialization provides real user isolation. The main implementation problem is choosing a method for serializing a set of transactions that does not overly restrict their parallelism. A trivial solution that comes to mind is really sequential execution of transactions. But there are situations in which it is possible to execute statements of different transactions in any order while preserving seriality. Examples include read-only transactions, as well as transactions that do not conflict on database objects. The following types of conflicts can exist between transactions: W-W - transaction 2 tries to modify an object modified by transaction 1 that has not ended; R-W - transaction 2 tries to modify an object read by transaction 1 that has not ended; W-R - Transaction 2 is trying to read an object modified by transaction 1 that has not ended. Transaction serialization practices are based on these conflicts.

Exist two basic approaches to serialization of transactions - based on synchronization captures of database objects and on the use of timestamps. The essence of both approaches is to detect transaction conflicts and eliminate them. The most common approach in centralized DBMSs (including systems based on the "client-server" architecture) is the approach based on adherence to the two-phase protocol of synchronization captures database objects. In general terms, the protocol is that before performing any operation in transaction T on a database object r, on behalf of transaction T, a synchronization capture of object r is requested in the appropriate mode (depending on the type of operation). The main modes of synchronization captures are: joint mode - S (Shared), meaning shared capture of an object and required to perform an object reading operation; exclusive mode - X (eXclusive), meaning exclusive capture of the object and required to perform insertion, removal and modification operations. Granular Synchronization Capture - an approach that sync captures can be queried against objects different levels: files, relations and tuples. The required object level is determined by the operation being performed (for example, to perform a delete operation on a relation, the entire relation must be the sync capture object, but to perform a tuple delete operation, that tuple). An object of any level can be captured in S or X mode. Predicate Synchronization Capture- this is not the capture of objects, but the conditions (predicates) that these objects satisfy. An alternative method of serializing transactions that works well in conditions of rare transaction conflicts and does not require the construction of a transaction wait graph. based on using timestamps. The main idea of ​​the method (of which there are many varieties) is as follows: if transaction T1 started before transaction T2, then the system provides such an execution mode as if T1 had been completely executed before T2 began.

To do this, each transaction T is assigned a timestamp t corresponding to the start time T. When performing an operation on an object r, transaction T marks it with its timestamp and the type of operation (read or change). Before performing an operation on object r, transaction T1 performs the following actions: Checks if transaction T, which marked this object, has ended. If T has ended, T1 marks the object r and performs its operation. If transaction T has not completed, then T1 checks if the operations are in conflict. If the operations are non-conflicting, the timestamp with a lower value remains or is affixed to the object r, and transaction T1 performs its operation. If operations T1 and T conflict, then if t(T) > t(T1) (that is, transaction T is younger than T), T is rolled back and T1 continues. If t(T)< t(T1) (T "старше" T1), то T1 получает новую временную метку и начинается заново. К недостаткам метода временных меток относятся потенциально более частые откаты транзакций, чем в случае использования синхронизационных захватов. Это связано с тем, что конфликтность транзакций определяется более грубо. Кроме того, в распределенных системах не очень просто вырабатывать глобальные временные метки с отношением полного порядка.