back home

How effectively use compound indexes in RDBMS?

We recently played around indexes in JIRA and one thing that was brought up is that compound indexes can be tricky. You don’t use them often and it’s easy to forget how to create them correctly. So here’s my friendly reminder. Although examples show a custom entity engine that’s used in JIRA I bet you’ll understand the meaning easily.

In entity engine you define a compound index by listing multiple index-fields like:

<index unique="true" name="uk_group_attr_name_lval">
        <index-field name="groupId"/>
        <index-field name="name"/>
        <index-field name="lowerValue"/>
</index>

The thing it’s worth to have in mind that the first index-field should be the most selective, preferably unique. Thanks to that RDBMS can do a pretty good job traversing the index tree.

What’s great about compound indexes is that they are used also when your query uses only a few constraints, but the trick is to use the left most ones only.

So in case of the above index following queries will probably use the index:

SELECT * FROM cwd_group_attributes WHERE group_id = 34;
SELECT * FROM cwd_group_attributes WHERE group_id = 34 AND name = 'jira-users';

But queries like that will not:

SELECT * FROM cwd_group_attributes WHERE name = 'jira-users';
SELECT * FROM cwd_group_attributes WHERE group_id = 34 AND lowerValue = 'value';

So knowing that you don’t need to create two indexes indexing the same column as long as it’s mentioned in compound index in first place.

What’s great about indexes is that they can speed up queries using another technique, imagine query like:

SELECT name FROM users WHERE user_id = 23;

If you create an index for user_id only RDBMS will have to read the row’s position then read row from the table and read name from it. But if you create an index for both of them:

<index unique="true" name="uk_user_id_name">
        <index-field name="userId"/>
        <index-field name="name"/>
</index>

Then the above query will read index only, because the needed value is already there!

But remember that if you want to enforce constraints on each column separately you still need to have separate index for each of the columns.

These are general notes, as far as I know they relate to all popular engines but some RDBMS systems offer you more fine grained control for indexes (and cool features).