Tuesday, January 15, 2013

CodeIgniter tutorial: how to extend a helper


CodeIgniter helpers are collections of functions meant to make a developer's life easier. They live in the system/helpers/ directory and include, to name just a few, helpers for arrays, CAPTCHAs, cookies, emails, and forms.

The inflector helper (inflector_helper.php) defines the functions singular (which returns the singular form of the word passed as a parameter) and plural (which returns its plural form), but only for English. This tutorial shows how to add support for another language (here, French) to pluralization.

To extend a helper


Since helpers are plain functions rather than classes, we cannot inherit from them in the strict sense, but CodeIgniter offers a way to extend them, i.e. to add new functions or replace existing ones.

To do so, create a file in the application/helpers/ directory with the same name as the helper file, prefixed with MY_: for instance, MY_inflector_helper.php.

That prefix can be changed in the configuration file application/config/config.php, through the variable $config['subclass_prefix'].
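
As a minimal sketch of such an extension file, the following adds one function to the inflector helper. The function name plural_count and its behaviour are hypothetical, chosen only for illustration; the function_exists guard mirrors the convention used by the stock helpers.

```php
<?php
// application/helpers/MY_inflector_helper.php (illustrative sketch)

// Hypothetical addition: plural_count() is not part of CodeIgniter.
// The function_exists() guard avoids a fatal "cannot redeclare" error
// if a function with the same name has already been defined.
if ( ! function_exists('plural_count'))
{
    /**
     * Returns "<count> <word>", choosing between the two given forms.
     */
    function plural_count($count, $singular, $plural)
    {
        return $count . ' ' . ((int) $count === 1 ? $singular : $plural);
    }
}

// Usage (would normally live in a controller or a view).
echo plural_count(1, 'man', 'men'), ', ', plural_count(2, 'man', 'men'), "\n";
// prints "1 man, 2 men"
```

Once the file is in place, loading the helper by its usual name ('inflector') should be enough to make the extra function available.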

The call for the helper is made in the controller:
$this->load->helper('inflector');
$data['man_singular'] = singular('men');
$data['man_plural'] = plural('man');
And the view contains the following code:
1 <?= $man_singular ?>, 2 <?= $man_plural ?>
That will display: 1 man, 2 men

Pluralization rules


Let's now create the following file, application/helpers/MY_inflector_helper.php:
function plural_en($str, $force = FALSE)
{
 $result = strval($str);

 $plural_rules = array(
  // always singular
  '/^(benshi|otaku|samurai)$/' => '\1',
  '/^(bison|deer|fish|moose|pike|plankton|salmon|sheep|swine|trout)$/' => '\1',
  '/^(blackfoot|cherokee|chinese|comanchee|cree|delaware|hopi|kiowa|navajo|ojibwa|sioux|swiss|zuni)$/' => '\1',
  // -um => -a (addendum)
  '/^(addend|corrigend|dat|for|medi|millenni|ov|spectr)um$/' => '\1a',
  // -a => -ae (formula)
  '/^(alumn|formul)a$/' => '\1ae',
  // -us => -i (alumnus)
  '/^(alumn|foc|fung|incub|radi|styl|succub)us$/' => '\1i',
  // -on => -a (automaton)
  '/^(automat|criteri|phenomen|polyhedr)on$/' => '\1a',
  // - => -en (ox)
  '/^(ox)$/' => '\1en',
  // -ouse => -ice (mouse)
  '/([ml])ouse$/' => '\1ice',
  // -ix/-ex => -ices (matrix)
  '/(matr|vert|ind)(?:ix|ex)$/' => '\1ices',
  // - => -es (search)
  '/(x|ch|ss|sh)$/' => '\1es',
  // irregulars ending with -y
  '/^penny$/' => 'pence',
  '/^passerby$/' => 'passersby',
  // -y => -ies (query)
  '/([^aeiouy]|qu)y$/' => '\1ies',
  // -hive => -hives (archive)
  '/(hive)$/' => '\1s',
  // -f => -ves (half, wife)
  '/(?:([^f])fe|([lr])f)$/' => '\1\2ves',
  // -sis => -ses (basis)
  '/sis$/' => 'ses',
  // -us => -era
  '/^viscus$/' => 'viscera',
  // -o => -oes (tomato)
  '/(buffal|tomat)o$/' => '\1oes',
  // -s => -ses
  '/(bu|campu)s$/' => '\1ses', // bus, campus
  '/(alias|census|octopus|platypus|prospectus|status|virus)$/' => '\1es', // alias
  // -is => -es (axis)
  '/(ax|cris|test)is$/' => '\1es',
  // -uk => -uit
  '/^(in|inuksh)uk$/' => '\1uit',
  // person => people
  '/(p)erson$/' => '\1eople',
  '/^corpus$/' => 'corpora',
  '/^genus$/' => 'genera',
  '/^foot$/' => 'feet',
  '/^goose$/' => 'geese',
  '/^hoof$/' => 'hooves',
  '/^leaf$/' => 'leaves',
  '/^tooth$/' => 'teeth',
  // compound
  '/^aide-de-camp$/' => 'aides-de-camp',
  '/^director general$/' => 'directors general',
  '/^man-/' => 'men-',
  '/^manservant$/' => 'menservants',
  '/^minister-president$/' => 'ministers-president',
  '/^(daughter|father|mother|son)-in-law$/' => '\1s-in-law',
  // man => men
  '/(m)an$/' => '\1en',
  // child => children
  '/(c)hild$/' => '\1hildren',
  // no change (compatibility)
  '/s$/' => 's',
  // regular plural
  '/$/' => 's',
 );

 foreach ($plural_rules as $rule => $replacement)
 {
  if (preg_match($rule, $result))
  {
   $result = preg_replace($rule, $replacement, $result);
   break;
  }
 }

 return $result;
}

if ( ! function_exists('plural_fr'))
{
 function plural_fr($str, $force = FALSE)
 {
  $result = strval($str);

  $plural_rules = array(
   // misc exceptions: bonshommes, mesdames, mesdemoiselles, messieurs, yeux
   '/^bonhomme$/u' => 'bonshommes',
   '/^madame$/u' => 'mesdames',
   '/^mademoiselle$/u' => 'mesdemoiselles',
   '/^monsieur$/u' => 'messieurs',
   '/^œil$/u' => 'yeux',
   // exceptions: bleus, landaus, sarraus, pneus
   '/^(bleu|landau|sarrau|pneu)$/u' => '\1s',
   // tuyaux, ruisseaux, feux
   '/(au|eau|eu)$/u' => '\1x',
   // exceptions: bijoux, cailloux, choux, genoux, hiboux, joujoux, poux
   '/^(bij|caill|ch|gen|hib|jouj|p)ou$/u' => '\1oux',
   // -ou => -ous (fous)
   '/ou$/u' => 'ous',
   // exceptions: baux, coraux, émaux, fermaux, soupiraux, travaux, vantaux, ventaux, vitraux
   '/^(b|cor|ém|ferm|soupir|trav|vant|vent|vitr)ail$/u' => '\1aux',
   // exceptions: avals, bals, cals, carnavals, chacals, chorals, cérémonials, festivals, nopals, pals, régals, narvals, récitals
   '/^(av|b|c|carnav|chac|chor|cérémoni|festiv|nop|p|rég|narv|récit)al$/u' => '\1als',
   // -al => -aux (chevaux)
   '/al$/u' => 'aux',
   '/s$/u' => 's',          // no change (compatibility)
   '/x$/u' => 'x',          // no change (compatibility)
   '/$/u' => 's',    // regular plural
  );

  foreach ($plural_rules as $rule => $replacement)
  {
   if (preg_match($rule, $result))
   {
    $result = preg_replace($rule, $replacement, $result);
    break;
   }
  }

  return $result;
 }
}
Two remarks are needed. First, we do not replace the plural function of system/helpers/inflector_helper.php; instead, we complement it with functions whose names carry the language code (hence plural_en), so that calls look the same for every language. We could have stopped at a simple wrapper, but we also rewrote that function, as the default one does not cover enough cases. Second, we add the Unicode u modifier to the regular expressions (and save the file as UTF-8) where necessary, i.e. for the French pluralization rules in this example.
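
The effect of the u modifier can be seen with a pattern that counts characters. This is a small sketch, not part of the helper: without u, PCRE treats the subject as a sequence of bytes, so the two-byte UTF-8 letter œ does not match a single dot.

```php
<?php
// "œil" written with explicit UTF-8 bytes, so that the example does not
// depend on how this file itself is saved.
$word = "\xC5\x93il";

// Without /u the dot matches one byte, but "œ" is two bytes: no match.
$bytes_mode = preg_match('/^.il$/', $word);

// With /u the dot matches one UTF-8 character: match.
$utf8_mode = preg_match('/^.il$/u', $word);

var_dump($bytes_mode, $utf8_mode); // int(0), then int(1)
```

For patterns made only of literal characters, such as '/^œil$/', both modes should match the same strings as long as the file and the input are UTF-8, so the modifier matters mostly once character classes, dots, or quantifiers are applied to accented letters.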

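The call is made the same way in the controller: $this->load->helper('inflector') is enough. When a MY_ file exists, CodeIgniter includes it first and then the stock helper. Because the stock functions are wrapped in function_exists() checks, both sets of functions end up available, and a function defined in the MY_ file takes precedence over a stock function with the same name.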
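
To keep calls uniform across languages, a small dispatcher can pick the right function from a language code. The sketch below is an assumption about how one might do it, not part of the original helper: the name plural_lang is hypothetical, and the two stand-in functions only append an s so that the example runs on its own.

```php
<?php
// Stand-ins for the real plural_en()/plural_fr() defined in the helper;
// they exist only so that this sketch is runnable in isolation.
if ( ! function_exists('plural_en'))
{
    function plural_en($str, $force = FALSE) { return $str . 's'; }
}
if ( ! function_exists('plural_fr'))
{
    function plural_fr($str, $force = FALSE) { return $str . 's'; }
}

// Hypothetical dispatcher: calls plural_<lang>() when it exists,
// and returns the word unchanged otherwise.
if ( ! function_exists('plural_lang'))
{
    function plural_lang($str, $lang = 'en')
    {
        $function = 'plural_' . $lang;

        // Accept only two-letter codes to avoid calling arbitrary functions.
        if (preg_match('/^[a-z]{2}$/', $lang) && function_exists($function))
        {
            return $function($str);
        }

        return strval($str);
    }
}

echo plural_lang('chat', 'fr'), "\n";  // handled by plural_fr()
echo plural_lang('Katze', 'de'), "\n"; // no plural_de(): returned unchanged
```

Returning the word unchanged for an unknown language is only one possible design; raising an error would make missing rule sets easier to spot.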

Pluralization tests


To keep this example easy to follow, we do not use a language file. Instead, we add a test method named inflector to the controller, which renders a dedicated view (application/views/inflector.php). This method checks the pluralization: if a specific case has been forgotten, adding it to this method helps validate plural_en and plural_fr, and any later change to these functions remains testable against the same set of cases.
    public function inflector()
    {
  $test_data = array(
   'en' => array(
    'ability'=>'abilities', 'addendum'=>'addenda', 'agency'=>'agencies', 'aide-de-camp'=>'aides-de-camp', 'alias'=>'aliases', 'alumna'=>'alumnae', 'alumnus'=>'alumni', 'archive'=>'archives', 'automaton'=>'automata', 'axis'=>'axes', 'basis'=>'bases', 'benshi'=>'benshi', 'bison'=>'bison', 'blackfoot'=>'blackfoot', 'buffalo'=>'buffaloes', 'bus'=>'buses', 'calf'=>'calves', 'campus'=>'campuses', 'census'=>'censuses', 'cherokee'=>'cherokee', 'child'=>'children', 'chinese'=>'chinese', 'comanchee'=>'comanchee', 'corpus'=>'corpora', 'corrigendum'=>'corrigenda', 'cree'=>'cree', 'crisis'=>'crises', 'criterion'=>'criteria', 'datum'=>'data', 'deer'=>'deer', 'delaware'=>'delaware', 'diagnosis'=>'diagnoses', 'director general'=>'directors general', 'dwarf'=>'dwarves', 'elf'=>'elves', 'fish'=>'fish', 'focus'=>'foci', 'foot'=>'feet', 'formula'=>'formulae', 'forum'=>'fora', 'fungus'=>'fungi', 'genus'=>'genera', 'goose'=>'geese', 'half'=>'halves', 'hive'=>'hives', 'hoof'=>'hooves', 'hopi'=>'hopi', 'incubus'=>'incubi', 'index'=>'indices', 'inuk'=>'inuit', 'inukshuk'=>'inukshuit', 'iroquois'=>'iroquois', 'kiowa'=>'kiowa', 'knife'=>'knives', 'leaf'=>'leaves', 'life'=>'lives', 'louse'=>'lice', 'man'=>'men', 'man-about-town'=>'men-about-town', 'man-of-war'=>'men-of-war', 'manservant'=>'menservants', 'matrix'=>'matrices', 'medium'=>'media', 'millennium'=>'millennia', 'minister-president'=>'ministers-president', 'moose'=>'moose', 'mouse'=>'mice', 'navajo'=>'navajo', 'octopus'=>'octopuses', 'ojibwa'=>'ojibwa', 'orange'=>'oranges', 'otaku'=>'otaku', 'ox'=>'oxen', 'ovum'=>'ova', 'passerby'=>'passersby', 'penny'=>'pence', 'person'=>'people', 'phenomenon'=>'phenomena', 'pike'=>'pike', 'plankton'=>'plankton', 'platypus'=>'platypuses', 'policeman'=>'policemen', 'policewoman'=>'policewomen', 'polyhedron'=>'polyhedra', 'postman'=>'postmen', 'prospectus'=>'prospectuses', 'québécois'=>'québécois', 'query'=>'queries', 'radius'=>'radii', 'sabertooth'=>'sabertooths', 'safe'=>'saves', 
'salesperson'=>'salespeople', 'salmon'=>'salmon', 'samurai'=>'samurai', 'seaman'=>'seamen', 'series'=>'series', 'sheep'=>'sheep', 'sioux'=>'sioux', 'son-in-law'=>'sons-in-law', 'species'=>'species', 'spectrum'=>'spectra', 'spokesman'=>'spokesmen', 'status'=>'statuses', 'succubus'=>'succubi', 'stylus'=>'styli', 'swine'=>'swine', 'swiss'=>'swiss', 'tenderfoot'=>'tenderfoots', 'testis'=>'testes', 'tête-à-tête'=>'tête-à-têtes', 'tomato'=>'tomatoes', 'tooth'=>'teeth', 'trout'=>'trout', 'vertex'=>'vertices', 'virus'=>'viruses', 'viscus'=>'viscera', 'wife'=>'wives', 'woman'=>'women', 'zuni'=>'zuni'
    ),
   'fr' => array('aval'=>'avals', 'bail'=>'baux', 'bal'=>'bals', 'bonhomme'=>'bonshommes', 'caillou'=>'cailloux', 'cal'=>'cals', 'carnaval'=>'carnavals', 'cérémonial'=>'cérémonials', 'chacal'=>'chacals', 'cheval'=>'chevaux', 'choral'=>'chorals', 'chou'=>'choux', 'corail'=>'coraux', 'bijou'=>'bijoux', 'bleu'=>'bleus', 'émail'=>'émaux', 'épouvantail'=>'épouvantails', 'fermail'=>'fermaux', 'festival'=>'festivals', 'feu'=>'feux', 'fou'=>'fous', 'genou'=>'genoux', 'hibou'=>'hiboux', 'joujou'=>'joujoux', 'landau'=>'landaus', 'madame'=>'mesdames', 'mademoiselle'=>'mesdemoiselles', 'monsieur'=>'messieurs', 'narval'=>'narvals', 'nopal'=>'nopals', 'œil'=>'yeux', 'pal'=>'pals', 'pneu'=>'pneus', 'pou'=>'poux', 'récital'=>'récitals', 'régal'=>'régals', 'ruisseau'=>'ruisseaux', 'sarrau'=>'sarraus', 'soupirail'=>'soupiraux', 'travail'=>'travaux', 'tuyau'=>'tuyaux', 'vantail'=>'vantaux', 'ventail'=>'ventaux', 'vitrail'=>'vitraux'
    )
  );

  $this->load->helper('inflector');
  $results  = array();
  $all_passed = array();
  foreach ($test_data as $lang => $test_lang_data)
  {
   $all_passed[$lang] = true;
   foreach ($test_lang_data as $singular => $plural)
   {
    // Call plural_<lang>() by name; no eval() needed.
    $pluralized = call_user_func('plural_' . $lang, $singular);
    $passed = ($pluralized === $plural);
    $results[$lang][] = array(
     'singular'  => $singular,
     'plural' => $pluralized,
     'expected' => $plural,
     'result' => $passed
     );
    $all_passed[$lang] = $all_passed[$lang] && $passed;
   }
  }

  $data['results']  = $results;
  $data['all_passed'] = $all_passed;
  $data['languages'] = array('en'=>'English', 'fr'=>'French');
  $this->load->view('inflector', $data);
 }
With the following view: application/views/inflector.php
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Inflector test</title>
</head>
<body>

<?php foreach ($results as $lang => $lang_results) { ?>
<h1>Test inflector in <?= $languages[$lang] ?></h1>
<?php if ($all_passed[$lang]) { ?>
<p>All tests passed.</p>
<?php } else { ?>
<table>
    <tr><th>Singular</th><th>Plural</th><th>Expected</th></tr>
    <?php foreach ($lang_results as $k => $result) { ?>
    <?php if ($result['result'] !== true) { ?>
    <tr><td><?= $result['singular'] ?></td><td><?= $result['plural'] ?></td><td><?= $result['expected'] ?></td></tr>
    <?php } ?>
    <?php } ?>
</table>
<?php } ?>
<?php } ?>

</body>
</html>
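
Outside a controller, the same check can be sketched as a plain function that returns only the failing cases. The names failed_plurals and naive_plural are hypothetical, and the stand-in pluralizer is deliberately naive so that the example runs without the helper.

```php
<?php
/**
 * Runs $pluralizer on each singular form and collects the mismatches.
 * $cases maps singular => expected plural.
 */
function failed_plurals($pluralizer, array $cases)
{
    $failures = array();

    foreach ($cases as $singular => $expected)
    {
        $actual = call_user_func($pluralizer, $singular);

        if ($actual !== $expected)
        {
            $failures[$singular] = array('plural' => $actual, 'expected' => $expected);
        }
    }

    return $failures;
}

// Naive stand-in that appends "s". With the helper loaded, one would
// pass 'plural_fr' or 'plural_en' instead.
function naive_plural($str)
{
    return $str . 's';
}

$failures = failed_plurals('naive_plural', array('fou' => 'fous', 'cheval' => 'chevaux'));
print_r($failures); // only "cheval" is reported: got "chevals", expected "chevaux"
```

An empty array means every case passed, which maps directly onto the "All tests passed." branch of the view.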

Let's sum it up


We have thus extended CodeIgniter's inflector helper to support other languages: more specifically here, French pluralization, along with improved English pluralization. The other helpers can be extended the same way, adding the functions specific to your application to these generic toolkits.


Tutoriel CodeIgniter : étendre les helpers (in French)
Tutorial CodeIgniter: los helpers (in Spanish)
Tutorial CodeIgniter: os helpers (in Portuguese)
