Creating custom matching engines

Matching engines are small applications that identify duplicate contacts in your AbleOrganizer site. The matching engine system is designed to be extended with custom matching engines. AbleOrganizer comes with a single matching engine, called Default Matching Engine, which is appropriate for most, but not all, websites.

When you need a custom matching engine

Custom matching engines can significantly expand the usefulness of AbleOrganizer. You might want to create a custom matching engine in the following situations:

  • You have special rules under which duplicates can be identified. For instance, you have a database of everyone who has ever come to your organization for service, and you want to create a new contact record, rather than match an existing one, for anyone who has not contacted you in over two years.
  • You are operating a high-performance website. For instance, you are running a site that receives millions of new contacts daily. You can significantly improve deduplication performance by using an alternate matching engine and a different backend for indexing the contacts (such as Apache Solr).
  • You need to identify a duplicate contact from another system. For instance, let's say you have a separate membership database. You might want a custom matching engine to check contacts in AbleOrganizer against that membership database.
  • You need to modify contact data as it is being created in your system. For instance, let's say you assign new contacts to caseworkers. You might want a custom matching engine to check where someone lives and assign the contact to the right caseworker.
  • You need to limit the conditions under which a contact can be created in your system. For instance, let's say a contact should only be allowed to join your site when they are a practicing, licensed physician. You might want to have a custom matching engine to disallow the creation of the contact if a valid certification number is not presented with other contact data.

How to create a custom matching engine

Complete documentation on creating a custom matching engine is available from the website. It includes sample code that can run as part of a custom module. AbleOrganizer automatically identifies the module and makes it available as a matching engine.

When creating a custom matching engine, the most important thing to remember is that the order in which matching engines execute affects the results returned to AbleOrganizer. Each matching engine receives and passes back an associative array listing potential duplicates. Each matching engine can modify this list, and AbleOrganizer always uses the top value in the list after all matching engines have run.
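The effect of execution order can be illustrated with a small, language-neutral sketch. This is not AbleOrganizer's API: the function names, the EXISTING data, and the idea that the list maps contact IDs to scores (with the highest score treated as the "top value") are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical shapes: a contact is a dict of field values, and the candidate
# list maps an existing contact ID to a match score (higher is better).
Contact = Dict[str, str]
Candidates = Dict[int, float]
Engine = Callable[[Contact, Candidates], Candidates]

# Made-up existing contacts, standing in for the site's contact database.
EXISTING: Dict[int, Contact] = {
    101: {"email": "pat@example.org", "last_name": "Lee"},
    102: {"email": "sam@example.org", "last_name": "Kim"},
}


def email_engine(contact: Contact, candidates: Candidates) -> Candidates:
    """Add a strong score for every existing contact with the same email."""
    result = dict(candidates)
    for contact_id, record in EXISTING.items():
        if record["email"] == contact.get("email"):
            result[contact_id] = result.get(contact_id, 0.0) + 1.0
    return result


def last_name_engine(contact: Contact, candidates: Candidates) -> Candidates:
    """Add a weak score for every existing contact with the same last name."""
    result = dict(candidates)
    for contact_id, record in EXISTING.items():
        if record["last_name"] == contact.get("last_name"):
            result[contact_id] = result.get(contact_id, 0.0) + 0.25
    return result


def prune_engine(contact: Contact, candidates: Candidates) -> Candidates:
    """Drop every candidate whose score so far is below 0.5."""
    return {cid: score for cid, score in candidates.items() if score >= 0.5}


def run_engines(contact: Contact, engines: List[Engine]) -> Optional[int]:
    """Pass the candidate list through each engine in order, then pick the top."""
    candidates: Candidates = {}
    for engine in engines:
        candidates = engine(contact, candidates)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


newcomer = {"email": "new@example.org", "last_name": "Lee"}
pruned_last = run_engines(newcomer, [last_name_engine, prune_engine])
pruned_first = run_engines(newcomer, [prune_engine, last_name_engine])
```

In this sketch, pruned_last is None because the weak last-name match is discarded before the result is chosen, while pruned_first is 101 because the pruning step ran before any candidates existed. The same engines produce different results purely because of their order.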

The second most important thing to remember when creating a matching engine is that matching engines can inject data into contact records and otherwise modify the values being passed. This makes it possible to create federated matches that span more than one system. You could take contact data from one system, use it to identify a match in an external system via web services, and inject the ID from the external system into the AbleOrganizer contact record.
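A federated match of this kind might look roughly like the following sketch. Everything named here is hypothetical: the external_member_id field, the lookup function, and the FAKE_MEMBERSHIP_DB dictionary (which stands in for a real web service call) are assumptions for illustration and not part of AbleOrganizer.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical shapes, as assumptions: a contact is a dict of field values and
# the candidate list maps an existing contact ID to a match score.
Contact = Dict[str, str]
Candidates = Dict[int, float]

# Stand-in for an external membership system reached via web services.
FAKE_MEMBERSHIP_DB: Dict[str, str] = {"pat@example.org": "M-0042"}


def fake_lookup(email: str) -> Optional[str]:
    """Pretend web service call: return the external ID for an email, if any."""
    return FAKE_MEMBERSHIP_DB.get(email)


def make_federated_engine(
    lookup: Callable[[str], Optional[str]],
) -> Callable[[Contact, Candidates], Tuple[Contact, Candidates]]:
    """Build an engine that injects the external system's ID into the contact."""

    def engine(contact: Contact, candidates: Candidates) -> Tuple[Contact, Candidates]:
        enriched = dict(contact)  # copy so the caller's record is not mutated
        external_id = lookup(contact.get("email", ""))
        if external_id is not None:
            enriched["external_member_id"] = external_id
        # The candidate list is passed through unchanged in this sketch.
        return enriched, candidates

    return engine


engine = make_federated_engine(fake_lookup)
enriched_contact, _ = engine({"email": "pat@example.org"}, {})
```

Passing the lookup in as a parameter keeps the engine independent of any particular external system, so a real web service client could be swapped in for fake_lookup.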

The third most important thing to remember when creating a custom matching engine is that it can be used to dramatically boost the performance of your website. The Default Matching Engine provided with AbleOrganizer was designed to operate on the widest possible range of systems, not to be the most efficient tool possible. It is possible to author matching engines that leverage Apache Solr, memcached, and a variety of other technologies to provide deduplication over massive numbers of records.
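The general idea behind such backends is to replace a scan over every contact with a lookup in a prebuilt index. The sketch below shows that idea with an in-memory dictionary. It is only a stand-in: it does not use Apache Solr or memcached, and the IndexedMatcher class and its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, List


def normalize_email(email: str) -> str:
    """Normalize an email so trivially different spellings share one key."""
    return email.strip().lower()


class IndexedMatcher:
    """Toy index keyed by normalized email, standing in for a search backend."""

    def __init__(self) -> None:
        self._by_email: DefaultDict[str, List[int]] = defaultdict(list)

    def add(self, contact_id: int, email: str) -> None:
        """Index a contact once, when it is created or updated."""
        self._by_email[normalize_email(email)].append(contact_id)

    def find(self, email: str) -> List[int]:
        """Return the IDs of possible duplicates without scanning all contacts."""
        return list(self._by_email.get(normalize_email(email), []))


matcher = IndexedMatcher()
matcher.add(101, "Pat@Example.org")
matcher.add(102, "sam@example.org")
possible_duplicates = matcher.find("  pat@example.org ")
```

Each lookup touches only the entries stored under one key, so the cost of finding duplicates does not grow with the total number of contacts the way a full scan does.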