11 min read

In this article by Allan MacGregor, author of the book Magento PHP Developer’s Guide – Second Edition, we cover details about EAV models, its usefulness in retrieving data, and the advantages it provides to the merchants and developers.

EAV stands for entity, attribute, and value and is probably the most difficult concept for new Magento developers to grasp. While the EAV concept is not unique to Magento, it is rarely implemented on modern systems. Additionally, a Magento implementation is not a simple one.

Magento PHP Developer's Guide - Second Edition

(For more resources related to this topic, see here.)

What is EAV?

In order to understand what EAV is and what its role within Magento is, we need to break down parts of the EAV model:

  • Entity: This represents the data items (objects) inside Magento products, customers, categories, and orders. Each entity is stored in the database with a unique ID.
  • Attribute: These are our object properties. Instead of having one column per attribute on the product table, attributes are stored on separate sets of tables.
  • Value: As the name implies, it is simply the value link to a particular attribute.

This data model is the secret behind Magento’s flexibility and power, allowing entities to add and remove new properties without having to make any changes to the code, templates, or the database schema.

This model can be seen as a vertical way of growing our database (new attributes and more rows), while the traditional model involves a horizontal growth pattern (new attributes and more columns), which would result in a schema redesign every time new attributes are added.

The EAV model not only allows for the fast evolution of our database, but is also more effective because it only works with non-empty attributes, avoiding the need to reserve additional space in the database for null values.

If you are interested in exploring and learning more about the Magento database structure, I highly recommend visiting www.magereverse.com.

Adding a new product attribute is as simple going to the Magento backend and specifying the new attribute type, be it color, size, brand, or anything else. The opposite is true as well and we can get rid of unused attributes on our products or customer models.

For more information on managing attributes, visit http://www.magentocommerce.com/knowledge-base/entry/how-do-attributes-work-in-magento.

The Magento community edition currently has eight different types of EAV objects:

  • Customer
  • Customer Address
  • Products
  • Product Categories
  • Orders
  • Invoices
  • Credit Memos
  • Shipments

The Magento Enterprise Edition has one additional type called RMA item, which is part of the Return Merchandise Authorization (RMA) system.

All this flexibility and power is not free; there is a price to pay. Implementing the EAV model results in having our entity data distributed on a large number of tables. For example, just the Product Model is distributed to around 40 different tables.

The following diagram only shows a few of the tables involved in saving the information of Magento products:

Magento PHP Developer's Guide - Second Edition

Other major downsides of EAV are the loss of performance while retrieving large collections of EAV objects and an increase in the database query complexity. As the data is more fragmented (stored in more tables), selecting a single record involves several joins.

One way Magento works around this downside of EAV is by making use of indexes and flat tables. For example, Magento can save all the product information into the flat_catalog table for easier and faster access.

Let’s continue using Magento products as our example and manually build the query to retrieve a single product.

If you have phpmyadmin or MySQL Workbench installed on your development environment, you can experiment with the following queries. Each can be downloaded on the PHPMyAdmin website at http://www.phpmyadmin.net/ and the MySQL Workbench website at http://www.mysql.com/products/workbench/.

The first table that we need to use is the catalog_product_entity table. We canconsider this our main product EAV table since it contains the main entity records for our products:

Magento PHP Developer's Guide - Second Edition

Let’s query the table by running the following SQL query:

SELECT FROM `catalog_product_entity`;

The table contains the following fields:

  • entity_id: This is our product unique identifier that is used internally by Magento.
  • entity_type_id: Magento has several different types of EAV models. Products, customers, and orders are just some of them. Identifying each of these by type allows Magento to retrieve the attributes and values from the appropriate tables.
  • attribute_set_id: Product attributes can be grouped locally into attribute sets. Attribute sets allow even further flexibility on the product structure as products are not forced to use all available attributes.
  • type_id: There are several different types of products in Magento: simple, configurable, bundled, downloadable, and grouped products; each with unique settings and functionality.
  • sku: This stands for Stock Keeping Unit and is a number or code used to identify each unique product or item for sale in a store. This is a user-defined value.
  • has_options: This is used to identify if a product has custom options.
  • required_options: This is used to identify if any of the custom options that are required.
  • created_at: This is the row creation date.
  • updated_at: This is the last time the row was modified.

Now we have a basic understanding of the product entity table. Each record represents a single product in our Magento store, but we don’t have much information about that product beyond the SKU and the product type.

So, where are the attributes stored? And how does Magento know the difference between a product attribute and a customer attribute?

For this, we need to take a look into the eav_attribute table by running the following SQL query:

SELECT FROM `eav_attribute`;

As a result, we will not only see the product attributes, but also the attributes corresponding to the customer model, order model, and so on. Fortunately, we already have a key to filter the attributes from this table. Let’s run the following query:

SELECT FROM `eav_attribute`
WHERE entity_type_id = 4;

This query tells the database to only retrieve the attributes where the entity_type_id column is equal to the product entity_type_id(4). Before moving, let’s analyze the most important fields inside the eav_attribute table:

  • attribute_id: This is the unique identifier for each attribute and primary key of the table.
  • entity_type_id: This relates each attribute to a specific eav model type.
  • attribute_code: This is the name or key of our attribute and is used to generate the getters and setters for our magic methods.
  • backend_model: These manage loading and storing data into the database.
  • backend_type: This specifies the type of value stored in the backend (database).
  • backend_table: This is used to specify if the attribute should be stored on a special table instead of the default EAV table.
  • frontend_model: These handle the rendering of the attribute element into a web browser.
  • frontend_input: Similar to the frontend model, the frontend input specifies the type of input field the web browser should render.
  • frontend_label: This is the label/name of the attribute as it should be rendered by the browser.
  • source_model: These are used to populate an attribute with possible values. Magento comes with several predefined source models for countries, yes or no values, regions, and so on.

Retrieving the data

At this point, we have successfully retrieved a product entity and the specific attributes that apply to that entity. Now it’s time to start retrieving the actual values. In order to simplify the example (and the query) a little, we will only try to retrieve the name attribute of our products.

How do we know which table our attribute values are stored on? Well, thankfully, Magento follows a naming convention to name the tables. If we inspect our database structure, we will notice that there are several tables using the catalog_product_entity prefix:

  • catalog_product_entity
  • catalog_product_entity_datetime
  • catalog_product_entity_decimal
  • catalog_product_entity_int
  • catalog_product_entity_text
  • catalog_product_entity_varchar
  • catalog_product_entity_gallery
  • catalog_product_entity_media_gallery
  • catalog_product_entity_tier_price

Wait! How do we know which is the right table to query for our name attribute values? If you were paying attention, I already gave you the answer. Remember that the eav_attribute table had a column called backend_type?

Magento EAV stores each attribute on a different table based on the backend type of that attribute. If we want to confirm the backend type of our name attribute, we can do so by running the following code:

SELECT FROM `eav_attribute`
WHERE `entity_type_id` =4 AND `attribute_code` = 'name';

As a result, we should see that the backend type is varchar and that the values for this attribute are stored in the catalog_product_entity_varchar table. Let’s inspect this table:

Magento PHP Developer's Guide - Second Edition

The catalog_product_entity_varchar table is formed by only 6 columns:

  • value_id: This is the attribute value unique identifier and primary key
  • entity_type_id: This is the entity type ID to which this value belongs
  • attribute_id: This is the foreign key that relates the value to our eav_entity table
  • store_id: This is the foreign key matching an attribute value with a storeview
  • entity_id: This is the foreign key relating to the corresponding entity table, in this case, catalog_product_entity
  • value: This is the actual value that we want to retrieve

Depending on the attribute configuration, we can have it as a global value, meaning, it applies across all store views or a value per storeview.

Now that we finally have all the tables that we need to retrieve the product information, we can build our query:

SELECT p.entity_id AS product_id, var.value AS product_name, p.sku AS product_sku
FROM catalog_product_entity p, eav_attribute eav, catalog_product_entity_varchar var
WHERE p.entity_type_id = eav.entity_type_id
AND var.entity_id = p.entity_id
   AND eav.attribute_code = 'name'
   AND eav.attribute_id = var.attribute_id

Magento PHP Developer's Guide - Second Edition

From our query, we should see a result set with three columns, product_id, product_name, and product_sku. So let’s step back for a second in order to get product names with SKUs with raw SQL. We had to write a five-line SQL query, and we only retrieved two values from our products, from one single EAV value table if we want to retrieve a numeric field such as price or a text-value-like product.

If we didn’t have an ORM in place, maintaining Magento would be almost impossible. Fortunately, we do have an ORM in place, and most likely, you will never need to deal with raw SQL to work with Magento.

That said, let’s see how we can retrieve the same product information by using the Magento ORM:

  1. Our first step is going to be to instantiate a product collection:
    $collection = Mage::getModel('catalog/product')->getCollection();
  2. Then we will specifically tell Magento to select the name attribute:
    $collection->addAttributeToSelect('name');
  3. Then, we will ask it to sort the collection by name:
    $collection->setOrder('name', 'asc');
  4. Finally, we will tell Magento to load the collection:
    $collection->load();
  5. The end result is a collection of all products in the store sorted by name. We can inspect the actual SQL query by running the following code:
    echo $collection->getSelect()->__toString();

In just three lines of code, we are telling Magento to grab all the products in the store, to specifically select the name, and finally order the products by name.

The last line $collection->getSelect()->__toString(); allows to see the actual query that Magento is executing in our behalf.

The actual query being generated by Magento is as follows:

SELECT `e`.. IF( at_name.value_id >0, at_name.value, at_name_default.value ) AS `name`
FROM `catalog_product_entity` AS `e`
LEFT JOIN `catalog_product_entity_varchar` AS `at_name_default` ON (`at_name_default`.`entity_id` = `e`.`entity_id`)
AND (`at_name_default`.`attribute_id` = '65')
AND `at_name_default`.`store_id` =0
LEFT JOIN `catalog_product_entity_varchar` AS `at_name` ON ( `at_name`.`entity_id` = `e`.`entity_id` )
AND (`at_name`.`attribute_id` = '65')
AND (`at_name`.`store_id` =1)
ORDER BY `name` ASC

As we can see, the ORM and the EAV models are wonderful tools that not only put a lot of power and flexibility in the hands of the developers, but they also do it in a way that is comprehensive and easy to use.

Summary

In this article, we learned about EAV models and how they are structured to provide Magento with data flexibility and extensibility that both merchants and developers can take advantage of.

Resources for Article:


Further resources on this subject:


LEAVE A REPLY

Please enter your comment!
Please enter your name here