Thursday, 12 April 2012

Entities vs Nodes

For the last six months I've been working on a Drupal 6 site professionally while developing a couple of personal Drupal 7 sites. However, I haven't gone into D7 in any great detail, except for developing entities.

Now I've moved contracts and I'm developing a new Drupal 7 site, with lots of additional facilities, to replace an older non-Drupal site. And the Drupal implementation is entirely up to me.

Being an OOP person at heart I love the concept of entities but when working with a commercial website you have to make some serious decisions. My personal preferences have to give way to the reality of building a website that delivers the spec and can be maintained and extended by other developers in the future.

So how do you decide what should be a node and what should be an entity?

There's a line in this blog post which says:
"You can now actually create data structures specific to your application domain with their own database table (or any other storage mechanism really) plus a standardised way to add fields to them. No need to turn nodes into something they are not."
That is technically accurate but does not always provide clear guidance, so here's my step-by-step method for deciding whether a specific data structure should be a node or not:

1. Is it content? Is the item definitely stuff that gets turned into HTML and displayed for the user? If you were building a review site, a review would definitely be content, so that's a node.

2. Does it need to have revisions? The node module provides revisions and the associated modules make it easy and powerful. Building revisions into custom entities is hard. So if it has to have revisions then it has to be a node.

3. Is there additional "property" (as opposed to "field") information? I had a situation where I wanted to define a "proxy", and a proxy needs a web address, a port number, maybe a username and password. These are fundamental properties of a proxy. You could add these as fields for a node, of course, but it makes more sense for them to be properties of a proxy entity. So this should be considered as an entity. (Another way of looking at this is: is there a need for a new table containing information specific to this data structure? If so, think entity.)

4. Is the structure normally invisible to the user? That should be an entity.

5. Would using an entity instead of a node obscure the function? This is perhaps tricky to answer; after all, dividing functionality out into a new object should never make things more complex. But it's worth asking the question.

Any other ways to help make the decision?

Remember: there is virtually no overhead in creating a new entity. And there are huge advantages in additional functionality that the core, Entity API, Views, Features and other entity-related modules can give you.
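To make point 3 concrete, here is a minimal sketch of how the "proxy" example might be declared in Drupal 7. The module name proxysketch, the table name and the column names are all made up for illustration, and the controller and entity classes assume the contrib Entity API module is installed. In a real module hook_schema() would live in the .install file and the label would go through t().

```php
<?php
/**
 * Implements hook_entity_info() for a hypothetical module "proxysketch".
 */
function proxysketch_entity_info() {
  return array(
    'proxy' => array(
      'label' => 'Proxy',
      // The entity gets its own table; its properties are columns in it.
      'base table' => 'proxysketch_proxy',
      // Both classes are provided by the contrib Entity API module.
      'controller class' => 'EntityAPIController',
      'entity class' => 'Entity',
      'entity keys' => array('id' => 'pid'),
      // Fields can still be attached on top of the properties.
      'fieldable' => TRUE,
    ),
  );
}

/**
 * Implements hook_schema(): the fundamental properties of a proxy.
 */
function proxysketch_schema() {
  $schema = array();
  $schema['proxysketch_proxy'] = array(
    'description' => 'Stores proxy entities (illustrative only).',
    'fields' => array(
      'pid' => array('type' => 'serial', 'unsigned' => TRUE, 'not null' => TRUE),
      'address' => array('type' => 'varchar', 'length' => 255, 'not null' => TRUE, 'default' => ''),
      'port' => array('type' => 'int', 'unsigned' => TRUE, 'not null' => TRUE, 'default' => 0),
      'username' => array('type' => 'varchar', 'length' => 64, 'not null' => FALSE),
      'password' => array('type' => 'varchar', 'length' => 128, 'not null' => FALSE),
    ),
    'primary key' => array('pid'),
  );
  return $schema;
}
```

The point of the sketch is only that address, port, username and password sit in the entity's own table rather than being bolted onto a node as fields.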

Thursday, 2 February 2012

Cleaning the watchdog


As a Drupal developer - do you do this?

Do you go through watchdog, find every error and fix it? You probably should - and here's why:

If there's even just a Notice, it means something's not right - checking it out might reveal something far more important.

Every error that goes to Watchdog is consuming valuable processing time. I had an instance quite recently where an error in a third party module was generating 10,000 notices per page. And the original author never noticed. That's ten thousand database writes. As you might imagine fixing it speeded things up quite dramatically.

When you can load any page on your website and have no reports added to the watchdog - you know your site is good.
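As a rough illustration of how one sloppy line inside a loop multiplies into thousands of log entries, here is a small stand-alone sketch. It is not the third party module's code: the $rows data and the count_php_messages() helper are invented for the example, and it simply counts the messages PHP raises instead of writing them to the watchdog.

```php
<?php
/**
 * Run $callback and count every PHP notice or warning it raises.
 */
function count_php_messages($callback) {
  $count = 0;
  set_error_handler(function () use (&$count) {
    $count++;
    // Returning TRUE tells PHP the message has been handled.
    return TRUE;
  });
  $callback();
  restore_error_handler();
  return $count;
}

// 1,000 rows, none of which has a 'summary' key.
$rows = array_fill(0, 1000, array('title' => 'Example'));

// Reads a key that does not exist: one message per row.
$sloppy = count_php_messages(function () use ($rows) {
  foreach ($rows as $row) {
    $summary = $row['summary'];
  }
});

// Checks first: no messages at all.
$careful = count_php_messages(function () use ($rows) {
  foreach ($rows as $row) {
    $summary = isset($row['summary']) ? $row['summary'] : '';
  }
});

print "sloppy: $sloppy, careful: $careful\n";
```

On a Drupal site each of those messages would be a watchdog entry, which is why a single isset() can make such a difference.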

Optimising JavaScript
On a similar note, I inherited a site which had a lot of JavaScript (all jQuery) which had to run on start-up. It wasn't a public-facing site, so the amount of JavaScript wasn't a problem. What was a problem was that start-up time on IE7 was sloooooooooow - at least 10 seconds, maybe more.

I spent a lot of time justifying this by saying "Well IE7 is slow and its JavaScript runs like a pig" which might be true but the client wasn't buying it and wanted it fixed. Fair enough.

Eventually I came to look at the code and it was possibly the most abysmally written code I have ever seen.

Imagine this code was baking 100 cookies. Here's what it did:

  • Fetch the ingredients from the cupboard;
  • Get out enough for 1 cookie;
  • Put the ingredients back in the cupboard;
  • Mix the ingredients;
  • Cut the cookie;
  • Cook the cookie until done;

Repeat 100 times.

It worked, but it was stupid and used the most inefficient jQuery selectors imaginable. Even when it knew what object it wanted to act on (because it had already found it) it would make jQuery find it again, using exactly the same, useless, selector. This was the same coder, I later discovered, who in another script had written $('#x').parent().parent().parent().parent().parent().parent().parent().parent();

I re-coded it to make 100 cookies in a single batch. Start-up time <1sec.

This is a good site for jQuery optimisation: http://hungred.com/useful-information/jquery-optimization-tips-and-tricks/

Monday, 30 January 2012

Little debugging aid

I had been having a real problem tracking down a PHP error in Drupal which was passing an array to htmlspecialchars() in check_plain() instead of a string. I needed to use debug_backtrace() but the issue was related to Views which meant that the arguments being passed at higher levels were catastrophically huge.

It was a real problem: using krumo just resulted in out-of-memory errors. And the problem was also happening inside AJAX calls, so any attempt to simply print the output was doomed.

I needed a way to get a function backtrace which was not too verbose, didn't crash the machine and would work in an AJAX call.

The solution is the following routine, which takes a debug_backtrace(), extracts only the information we need and outputs it to a file. It's not Drupal-specific but you may need to set up the directory:


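/**
 * Backtrace to a file for when nothing else works.
 *
 * Each line written shows a function on the call stack and the file and line
 * it had reached when it called the next function down.
 */
function debug_backtrace_file($fname = 'backtrace') {
  // Change this directory to suit your machine; it must already exist.
  $file = fopen("c:\\tmp\\{$fname}.log", 'ab');
  if (!$file) {
    return;
  }
  fputs($file, '=============== ' . date('Y-m-d H:i:s') . " ===============\n");
  // Each backtrace entry records where its function was called from, which is
  // a position inside the next function up the stack, so both the line and
  // the file are carried forward by one entry.
  $line = __LINE__;
  $path = __FILE__;
  foreach (debug_backtrace() as $entry) {
    $function = $entry['function'];
    if ($function != __FUNCTION__) {
      if (!empty($entry['class'])) {
        $function = "{$entry['class']}::$function";
      }
      fputs($file, sprintf("function %s at line %d in file %s\n", $function, $line, $path));
    }
    // Functions invoked internally by PHP have no file or line.
    $line = isset($entry['line']) ? $entry['line'] : 0;
    $path = isset($entry['file']) ? $entry['file'] : 'unknown';
  }
  fclose($file);
}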
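
Since PHP 5.3.6 there is also a built-in way to keep the huge arguments out of the picture: passing DEBUG_BACKTRACE_IGNORE_ARGS to debug_backtrace(). The following is only a sketch of that variation, not a replacement for the routine above; compact_backtrace() and the two example functions are names made up for the illustration, and it returns the lines rather than writing them to a file.

```php
<?php
/**
 * Return the call stack as short strings, without capturing any arguments.
 */
function compact_backtrace() {
  $lines = array();
  foreach (debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS) as $entry) {
    $function = isset($entry['class'])
      ? $entry['class'] . '::' . $entry['function']
      : $entry['function'];
    // Functions invoked internally by PHP have no file or line.
    $where = isset($entry['file']) ? $entry['file'] . ':' . $entry['line'] : 'unknown';
    $lines[] = "$function called from $where";
  }
  return $lines;
}

// Two throwaway functions so there is a stack to look at.
function outer_example() {
  return inner_example();
}

function inner_example() {
  return compact_backtrace();
}

$trace = outer_example();
print implode("\n", $trace) . "\n";
```

The first three lines of output name compact_backtrace, inner_example and outer_example in that order, each with the place it was called from.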


Happy bug hunting.

Thursday, 26 January 2012

Where the **** is it?

You know how it is? You have this complex and massive PHP structure with recursive elements and somewhere in all that is a value you're looking for. Views object, I'm looking at you!

Here's a PHP routine, which is not Drupal-specific, that will dig down into an object to find the property or array item you're looking for - and it avoids infinite recursion by keeping a list of structures it's already searched, identified by MD5ing the serialized version of each structure.

The parameters are: $struct, the initial structure to search; $seek, the property you're looking for (it can be partial if you're not sure what the property is called); $id, which would usually be the identifier of the structure you're supplying; and $depth, which should be ignored. It returns an array of strings that show the way into the structure to find the property. It's quite common to get multiple results.

One thing to watch out for: if your structure is recursive (like a Views object) you might be tempted to start the search deeper into the structure with the idea that you'll shorten the search. Well, you might, but you might also end up going into one of the recursed objects because the routine hasn't seen it before.

Anyway, with that in mind, here it is:


function common_locate($struct, $seek, $id = '', $depth = 0) {
  static $structs = array(), $results = array();
  // Start afresh on every top-level call so earlier searches don't leak in.
  if ($depth == 0) {
    $structs = array();
    $results = array();
  }
  if (is_array($struct) || is_object($struct)) {
    $struct = (array) $struct;
    foreach ($struct as $key => $value) {
      if (strpos((string) $key, $seek) !== FALSE) {
        // Arrays and objects can't be dropped into a string, so show their type.
        $shown = is_scalar($value) ? $value : gettype($value);
        $results[] = "$id => $key = $shown";
      }
      if (is_array($value) || is_object($value)) {
        // Fingerprint the structure so that it is only ever searched once.
        $idx = str_replace('0', 'g', md5(serialize($value)));
        if (!isset($structs[$idx])) {
          $structs[$idx] = (is_object($value) ? 'object:' : '') . "$key:$depth";
          $path = is_object($value) ? "{$id}->$key" : "{$id}[$key]";
          common_locate($value, $seek, $path, $depth + 1);
        }
      }
    }
  }
  return $results;
}
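
One consequence of identifying structures by the MD5 of their serialized form is that two separate structures with identical content count as the same structure, so only the first one met is searched. The little sketch below shows the fingerprinting on its own; struct_fingerprint() is a name made up for the illustration and the arrays are invented data.

```php
<?php
/**
 * Fingerprint a structure the same way the search routine does.
 */
function struct_fingerprint($value) {
  return str_replace('0', 'g', md5(serialize($value)));
}

// Two separate arrays with identical content, and one that differs.
$a = array('name' => 'title', 'weight' => 1);
$b = array('name' => 'title', 'weight' => 1);
$c = array('name' => 'title', 'weight' => 2);

// Identical content gives an identical fingerprint...
$same = struct_fingerprint($a) === struct_fingerprint($b);
// ...while any difference in content gives a different one.
$different = struct_fingerprint($a) !== struct_fingerprint($c);

var_dump($same, $different);
```

So if the property you want sits in both $a and $b, expect only one path in the results.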

Wednesday, 25 January 2012

Wiki filter for Drupal


One of the known problems with Drupal is the lack of a Wiki module. It is possible to put together a Wiki using various resources but the biggest stumbling block is the text filter.


There have been attempts and some successes. The current received wisdom is to use FlexiFilter - sorry but it's just too cumbersome. In fact, nightmarish.


And I needed a Wiki filter for a project so this evening I spent 5 hours writing a Wiki filter for Drupal 7.


Does pretty much everything you could want including nested OL/UL lists which can be mixed - I only mention that particularly because it was a bitch. Otherwise it's got italics, bold, underline, strikethrough, h2-h6, blockquotes, code (pre), superscript and subscript.


So to set up a Wiki Text format on Drupal 7 you use these in this order:

  • Limit allowed HTML tags - to none, to clear out any tags in the text.
  • Then my filter
  • Convert line breaks
  • Assign IDs to anchors (From the TOC module)
  • Table of contents filter module
  • Freelinking module to deal with URL links

Then have WikiTools hijack Freelinking (it's an option). And configure the rest as desired.
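
For anyone who hasn't written a Drupal 7 text filter, this is roughly the shape of one. It is emphatically not my filter, just a minimal sketch: the module name wikisketch is made up, and the two-apostrophe italic and three-apostrophe bold markup is an assumed MediaWiki-style syntax chosen for the example. In a real module the title would go through t().

```php
<?php
/**
 * Implements hook_filter_info() for a hypothetical module "wikisketch".
 */
function wikisketch_filter_info() {
  return array(
    'wikisketch' => array(
      'title' => 'Wiki markup (sketch)',
      'process callback' => 'wikisketch_filter_process',
    ),
  );
}

/**
 * Process callback: turns the assumed wiki markup into HTML.
 *
 * Drupal passes all six arguments; the defaults only make it easy to call
 * the function directly when trying it out.
 */
function wikisketch_filter_process($text, $filter = NULL, $format = NULL, $langcode = NULL, $cache = NULL, $cache_id = NULL) {
  // Bold has to be handled first because ''' also contains ''.
  $text = preg_replace("/'''(.+?)'''/", '<strong>$1</strong>', $text);
  $text = preg_replace("/''(.+?)''/", '<em>$1</em>', $text);
  return $text;
}

print wikisketch_filter_process("'''bold''' and ''italic''") . "\n";
```

Nested lists are where it stops being a couple of regular expressions, which is why they get no mention in the sketch.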


Voila! A Drupal Wiki.


I will be putting it onto drupal.org presently.

Saturday, 21 January 2012

Stable Field Value Extraction module released

Okay, after weeks of nobody complaining about any aspect of my field_extract module I have finally got around to issuing the code as the full stable version. Hurray.

Now because I'm lazy I use Eclipse for development purposes - I know, I know, how can I be a proper developer if I don't use Linux and Vim? Well, I don't. I was using the command line before you were born (unless you were born before 1983). Been there, done that.

However Eclipse and Drupal Git are strange bedfellows, and it can take a bit of work learning how they can be made to work together.

Getting Drupal repos cloned locally isn't too much of an issue once you've got your SSH keys sorted out, and for cloning you can happily use http.

Pushing upstream is another matter entirely (and you will need to use just Push... instead of Push upstream... until you get it sorted out and configured). If you try using http you may well hit a brick wall, just as I did. The trick is to use git+ssh for your protocol, and it'll work nicely. One thing, which is obvious unless you forget it, is to include the Add all tags spec if you are uploading tags as well as branches. Ahem.

Hopefully it won't be so long before my next posting - I have a fun new website specifically for developers coming up. I think you're going to like it, it provides a service that all developers need from time to time, and there is only one website I've found that fulfils the need, and not as well as my version. It's written in D7 of course, leveraging it as a development framework rather than a CMS.

And with that enigma, I'll leave you.

Thursday, 15 December 2011

MD5s as IDs

While there is the chance of duplication it can be handy to use MD5s for creating a "unique" ID for strings. I had this in my current project but I've done it before and there is a potential problem which is more likely than duplicate IDs.

There is a chance that you'll get an MD5 that begins with a zero and, potentially even worse, one that begins with a zero and is all decimal numeric (does not contain any of the hex digits a-f).

In this instance, with the loose typing of PHP, you might find your MD5 gets converted to a number and loses its leading zero (or zeroes). In which case it's useless as an ID and it will take you a very long time to track it down. I know the first time it happened to me it took me a couple of days.

But there is a very simple solution: when you create your MD5, do an immediate search and replace to change all zeroes into 'g'.

$id = str_replace('0', 'g', md5($source));

Now you can be sure you will never lose your leading zero, because there isn't one.
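
Here's a small sketch of the failure and the fix. The ID in it is made up to keep the example readable (it is not real MD5 output); what matters is that it is all decimal digits with leading zeroes, which is exactly the case that bites.

```php
<?php
// A made-up 32-character "MD5": all decimal digits, with leading zeroes.
$raw = str_pad('123', 32, '0', STR_PAD_LEFT);
$other = '123';

// Loose comparison treats both strings as numbers, so they "match".
$collides = ($raw == $other);

// Any arithmetic on the ID throws the leading zeroes away.
$as_number = $raw + 0;

// After the replacement the ID is no longer numeric, so it stays a string.
$safe = str_replace('0', 'g', $raw);
$still_collides = ($safe == $other);

var_dump($collides, $as_number, $still_collides);
```

Because 'g' never appears in an MD5, the replacement can't make two different MD5s look the same.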

(By the way, after 12 weeks my field_extract module has received no complaints or bug reports so I shall be promoting it to a full version.)