8 min read

The Core Database

Much of the data we see in Joomla! is stored in the database. A base installation has over thirty tables. Some of these are related to core extensions and others to the inner workings of Joomla!.

There is an official database schema, which describes the tables created during the installation. For more information, please refer to: http://dev.joomla.org/ component/option,com_jd-wiki/Itemid,31/id,guidelines:database/.

A tabular description is available at: http://dev.joomla.org/downloads/Joomla15_DB-Schema.htm.

We access the Joomla! database using the global JDatabase object. The JDatabase class is an abstract class, which is extended by different database drivers. There are currently only two database drivers included in the Joomla! core, MySQL and MySQLi. We access the global JDatabase object using JFactory:

$db =& JFactory::getDBO();

Extending the Database

When we create extensions, we generally want to store data in some form. If we are using the database, it is important to extend it in the correct way.

Table Prefix

All database tables have a prefix, normally jos_, which helps in using a single database for multiple Joomla! installations. When we write SQL queries, to accommodate the variable table prefix, we use a symbolic prefix that is substituted with the actual prefix at run time. Normally the symbolic prefix is #__, but we can specify an alternative prefix if we want to.

Schema Conventions

When we create tables for our extensions, we must follow some standard conventions. The most important of these is the name of the table. All tables must use the table prefix and should start with name of the extension. If the table is storing a specific entity, add the plural of the entity name to the end of the table name separated by an underscore. For example, an items table for the extension ‘My Extension’ would be called #__myExtension_items.

Table field names should all be lowercase and use underscore word separators; you should avoid using underscores if they are not necessary. For example, you can name an email address field as email. If you had a primary and a secondary email field, you could call them email and email_secondary; there is no reason to name the primary email address email_primary.

If you are using a primary key record ID, you should call the field id, make it of type integer auto_increment, and disallow null. Doing this will allow you to use the Joomla! framework more effectively.

Common Fields

We may use some common fields in our tables. Using these fields will enable us to take advantage of the Joomla! framework.

Publishing

We use publishing to determine whether to display data. Joomla! uses a special field called published, of type tinyint(1); 0 = not published, 1 = published.

Hits

If we want to keep track of the number of times a record has been viewed, we canuse the special field hits, of type integer and with the default value 0.

Checking Out

To prevent more than one user trying to edit one record at a time we can check out records (a form of software record locking). We use two fields to do this, checked_out and checked_out_time. checked_out, of type integer, holds the ID of the user that has checked out the record. checked_out_time, of type datetime, holds the date and time when the record was checked out. A null date and a user ID of 0 is recorded if the record is not checked out.

Ordering

We often want to allow administrators the ability to choose the order in which items appear. The ordering field, of type integer, can be used to number records sequentially to determine the order in which they are displayed. This field does not need to be unique and can be used in conjunction with WHERE clauses to form ordering groups.

Parameter Fields

We use a parameter field, a TEXT field normally named params, to store additional information about records; this is often used to store data that determines how a record will be displayed. The data held in these fields is encoded as INI strings (which we handle using the JParameter class). Before using a parameter field, we should carefully consider the data we intend to store in the field. Data should only be stored in a parameter field if all of the following criteria are true:

  • Not used for sorting records
  • Not used in searches
  • Only exists for some records
  • Not part of a database relationship

Schema Example

Imagine we have an extension called ‘My Extension’ and an entity called foobar. The name of the table is #__myextension_foobars. This schema describes the table:

Field

Datatype

NOT NULL

AUTO INC

UNSIGNED

DEFAULT

id

INTEGER

X

X

X

NULL

content

TEXT

X

 

 

 

checked_out

INTEGER

X

 

X

0

checked_out_time

DATETIME

X

 

 

0000-00-00 00:00:00

params

TEXT

X

 

 

 

ordering

INTEGER

X

 

X

0

hits

INTEGER

X

 

X

0

published

TINYINT(1)

X

 

X

0

This table uses all of the common fields and uses an auto-incrementing primary keyID field. When we come to define our own tables we must ensure that we use thecorrect data types and NOT NULL, AUTO INC, UNSIGNED and DEFAULT values.

The SQL displayed below will create the table described in the above schema:

CREATE TABLE `#__myextension_foobars` (
`id` INTEGER UNSIGNED NOT NULL DEFAULT NULL AUTO_INCREMENT,
`content` TEXT NOT NULL DEFAULT '',
`checked_out` INTEGER UNSIGNED NOT NULL DEFAULT 0,
`checked_out_time` DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
`params` TEXT NOT NULL DEFAULT '',
`ordering` INTEGER UNSIGNED NOT NULL DEFAULT 0,
`hits` INTEGER UNSIGNED NOT NULL DEFAULT 0,
`published` INTEGER UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY(`id`)
) CHARACTER SET `utf8` COLLATE `utf8_general_ci`;

Date Fields

We regularly use datetime fields to record the date and time at which an action has taken place. When we use these fields, it is important that we are aware of the effect of time zones. All dates and times should be recorded in UTC+0 (GMT / Z).

When we come to display dates and times we can use the JDate class. The JDate class allows us to easily parse dates, output them in different formats, and apply UTC time-zone offsets.

For more information about time zones, please refer to http://www.timeanddate.com.

We often use parsers before we display data to make the data safe or to apply formatting to the data. We need to be careful how we store data that is going to be parsed. If the data is ever going to be edited, we must store the data in its RAW state. If the data is going to be edited extremely rarely and if the parsing is reversible, we may want to consider building a ‘reverse-parser’. This way we can store the data in its parsed format, eradicating the need for parsing when we view the data and reducing the load on the server. Another option available tous is to store the data in both formats. This way we only have to parse data when we save it.

Dealing with Multilingual Requirements

Unlike ASCII and ANSII, Unicode is a multi-byte character set; it uses more than eight bits (one byte) per character. When we use UTF-8 encoding, character byte lengths vary.

Unfortunately, MySQL versions prior to 4.1.2 assume that characters are always eight bits (one byte), which poses some problems. To combat the issue when installing extensions we have the ability to define different SQL files for servers, that do and do not support UTF-8.

In MySQL servers that do not support UTF-8, when we create fields, which define a character length, we are actually defining the length in bytes. Therefore, if we try to store UTF-8 characters that are longer than one byte, we may exceed the size of the field. To combat this, we increase the length of fields to try to accommodate UTF-8strings. For example, a varchar(20) field becomes a varchar(60) field. We triple the size of fields because, although UTF-8 characters can be more than three bytes, the majority of common characters are a maximum of three bytes.

This poses another issue, if we use a varchar(100) field, scaling it up for a MySQL server, which does not support UTF-8, we would have to define it as a varchar(300) field. We cannot do this because varchar fields have a maximum size of 255. The next step is slightly more drastic. We must redefine the field type so as it will accommodate at least three hundred bytes. Therefore, a varchar(100) field becomes a text field.

As an example, the core #__content table includes a field named title. For MySQL severs that support UTF-8, the field is defined as:

`title` varchar(255) NOT NULL default ''

For MySQL severs that do not support UTF-8, the field is defined as:

`title` text NOT NULL default ''

We should also be aware that using a version of MySQL that does not support UTF-8 would affect the MySQL string handling functions. For example ordering by a string field may yield unexpected results. While we can overcome this using postprocessing in our scripts using the JString class, the recommended resolution is to upgrade to the latest version of MySQL.

LEAVE A REPLY

Please enter your comment!
Please enter your name here