Drupal views join on multiple columns

drupal
Published

April 24, 2014

Drupal views UI is nice from time to time, once you’ve spent lots of time poking at it and know it’s quirks. The rest of the time, what you need can’t be done via the UI, and you are plunged into the realms of programmatic customization whose APIs may or may not be properly documented. Today, I had to fix the query behind a view I’d inherited which had been constructed mostly through the UI and had a bad join, because the UI only lets you specify a single column from the left and right tables as defining the relationship. Determining and writing the correct SQL manually took 5 minutes, but getting views to duplicate it took reading the docs - and then the views code - for the rest of the afternoon. Here’s what I finally figured out is the most simple / direct way to specify joins against multiple columns with views:

1. Add a join_handler to your relationship in hook_views_data

Chances are if you need to be joining on multiple columns, you are using a custom data model and have described it to views with hook_views_data already. (If not, you’ve probably got something different going on and this post may not help you. see my addendum at the bottom.) A basic “relationship” aka SQL join on only one column is defined something like this in hook_views_data:

$data['tablea']['some_field_from_tableb'] = array(
  'title' => t('Some field from table B'),
  'relationship' => array(
    'base' => 'tableb',
    'base field' => 'a_id',
    'field' => 'id',
    'handler' => 'views_handler_relationship',
  ),
);

This’ll get you SQL along the lines of

... FROM tablea LEFT JOIN tableb ON tablea.id = tableb.a_id

But if you want SQL along the lines of

... FROM tablea LEFT JOIN tableb ON tablea.id = tableb.a_id AND tablea.instance = tableb.instance

then try the following php:

$data['tablea']['some_field_from_tableb'] = array(
  'title' => t('Some field from table B'),
  'relationship' => array(
    'base' => 'tableb',
    'base field' => 'a_id',
    'field' => 'id',
    'handler' => 'views_handler_relationship',
    'join_handler' => 'my_fixed_up_join',
  ),
);

2. Code my_fixed_up_join

Then, code up my_fixed_up_join, which should be a class extending views_join. You can get quite carried away with that if you want, but in the unlikely event that you’re of the mindset that we are simply trying to add an extra condition to a join in an SQL query here, and defining a new class at all in order to achieve this is already a bit much, here’s a quick and dirty version that’s “good enough” if you only have a few views that use it:

class my_fixed_up_join extends views_join
{
  public function build_join($select_query, $table, $view_query) {
    $select_query->addJoin('INNER', $this->table, $table['alias'], "table_alias_a.id = ${table['alias']}.a_id AND table_alias_a.instance = ${table['alias']}.instance");
  }
}

This class needs to live in a file that drupal’s auto-including tendrils have already sucked in; it won’t happen automatically by virtue of any file naming conventions. I suggest sticking it in the same file the related hook_views_data lives in.

You’ll also need to clear drupal’s caches so it reruns your hook_views_data.

Addendum - altering field joins

If your view is of content or other fieldable entities, and you need to tweak a join for one of the fields, this can also be done by writing an extension to the views_join class – you just have to tell Views about it in a different way.

In hook_views_query_alter:

function MYMODULE_views_query_alter(&$view, &$query) {
  // logic to filter down to just the view we want to be modifying, such as:
  if($view->name == 'whatever-my-view-is') {
    if(!empty($query->table_queue['table_alias_views_assigned_to_the_table_whose_join_needs_altering']['join'])) {
      $query->table_queue['table_alias_views_assigned_to_the_table_whose_join_needs_altering']['join'] = new my_fixed_up_join();
    }
  }
}

In my case, it was sufficient to pass no arguments to the my_fixed_up_join constructor, and have it just kick out a hardcoded string of SQL conditions:

public function build_join($select_query, $table, $view_query) {
    $join_sql = <<<JOINSQL
   (field_collection_item_field_data_field_sessions.item_id = field_collection_item_field_data_field_sessions__field_data_field_event_date.entity_id  AND field_collection_item_field_data_field_sessions__field_data_field_event_date.entity_type = 'field_collection_item')
OR (
    node.nid = field_collection_item_field_data_field_sessions__field_data_field_event_date.entity_id AND field_collection_item_field_data_field_sessions__field_data_field_event_date.entity_type = 'node'
  )
  AND (
    field_collection_item_field_data_field_sessions__field_data_field_event_date.deleted = '0'
  )
JOINSQL;
 
    $select_query->addJoin('LEFT', $table['table'], $table['alias'], $join_sql);
}

Why would you need to do this? In the example above, I was displaying several types of content in the view, and some of them attach the field directly while others are associated to the field through a “relationship” to another entity. The views UI will let you add the field as a member of the content, and will let you add a different field as a member of the related entity, but under the hood it joins to the field data table twice, creates two result columns, and you won’t be able to do things like sort on that field. What you really want is one join to the field data table with an OR condition, so the field comes up as a single column.