
Extending Annotator with a Custom Concept Recognizer

Michael Dorf edited this page Feb 24, 2015 · 6 revisions

The guidelines below explain how to implement your own concept recognizer module within the NCBO Annotator.

Create a custom class

Take a look at the lib directory of the ncbo_annotator project on GitHub:

https://github.com/ncbo/ncbo_annotator/tree/master/lib/ncbo_annotator

In the recognizers folder, you'll see the currently implemented concept recognizers. Create a new file there, dataone.rb (or any other name you choose). In that Ruby file, implement the following method:

def annotate_direct(text, options={})

This method must return a Hash in the following format, mapping each concept resource ID to its ontology resource ID:

allAnnotations = {
	concept_resource_id => ontology_resource_id
}

Take a look at ncbo_annotator/lib/ncbo_annotator/recognizers/mallet.rb for sample code. Mallet is a recognizer that we implemented as an alternative to Mgrep.

https://github.com/ncbo/ncbo_annotator/blob/master/lib/ncbo_annotator/recognizers/mallet.rb
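To make the expected return shape concrete, here is a minimal, standalone sketch of what a dataone.rb recognizer might look like. The dictionary, the example.org IDs, and the word-matching logic are all hypothetical and exist only for illustration. The real recognizers may inherit from a base class or require other project files, so check mallet.rb before copying this structure.

```ruby
# Hypothetical sketch of lib/ncbo_annotator/recognizers/dataone.rb.
# The dictionary and matching logic are illustrative assumptions,
# not part of ncbo_annotator.
module Annotator
  module Models
    module Recognizers
      class Dataone
        # Made-up lookup table: lowercase term => [concept id, ontology id].
        DICTIONARY = {
          "melanoma" => ["http://example.org/concepts/melanoma", "http://example.org/ontologies/DEMO"],
          "heart"    => ["http://example.org/concepts/heart",    "http://example.org/ontologies/DEMO"]
        }.freeze

        # Returns { concept_resource_id => ontology_resource_id } for every
        # dictionary term that occurs as a whole word in the text.
        def annotate_direct(text, options = {})
          all_annotations = {}
          words = text.to_s.downcase.scan(/[[:alnum:]]+/)
          words.each do |word|
            concept_id, ontology_id = DICTIONARY[word]
            all_annotations[concept_id] = ontology_id if concept_id
          end
          all_annotations
        end
      end
    end
  end
end
```

Calling Annotator::Models::Recognizers::Dataone.new.annotate_direct("Melanoma near the heart.") on this sketch returns a Hash with two entries, one per matched term. A real recognizer would replace the dictionary lookup with its own matching strategy.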

Enable and pass the "recognizer" parameter

Take a look at the REST service controller for Annotator:

https://github.com/ncbo/ontologies_api/blob/master/controllers/annotator_controller.rb

Specifically, this code:

recognizer = (Annotator.settings.enable_recognizer_param && params_copy["recognizer"]) || 'mgrep'

# see if a name of the recognizer has been passed in, use default if not or error
begin
	recognizer = recognizer.capitalize
	clazz = "Annotator::Models::Recognizers::#{recognizer}".split('::').inject(Object) {|o, c| o.const_get c}
	annotator = clazz.new
rescue
	annotator = Annotator::Models::Recognizers::Mgrep.new
end

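As you can see, the controller accepts a "recognizer" parameter containing the name of your custom recognizer class. The name is capitalized and resolved to a class under Annotator::Models::Recognizers; if that fails for any reason, the controller falls back to Mgrep. For example: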

/annotator?text=blah&recognizer=mgrep    => Annotator::Models::Recognizers::Mgrep.new
/annotator?text=blah&recognizer=mallet   => Annotator::Models::Recognizers::Mallet.new
/annotator?text=blah&recognizer=dataone  => Annotator::Models::Recognizers::Dataone.new
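The lookup has two consequences worth noting. Because String#capitalize lowercases everything after the first letter, the class must be named Dataone rather than DataOne. Because of the bare rescue, a misspelled recognizer name silently falls back to Mgrep instead of raising an error. The sketch below reproduces the controller's logic outside the API so you can try it. The stand-in classes and the resolve_recognizer helper are hypothetical and are not part of ontologies_api.

```ruby
# Stand-in classes so the lookup can run on its own;
# the real ones live in ncbo_annotator.
module Annotator
  module Models
    module Recognizers
      class Mgrep; end
      class Dataone; end
    end
  end
end

# Mirrors the controller's lookup: capitalize the name, resolve the constant,
# and fall back to Mgrep if anything goes wrong.
def resolve_recognizer(name)
  class_name = name.to_s.capitalize
  clazz = "Annotator::Models::Recognizers::#{class_name}"
          .split('::')
          .inject(Object) { |o, c| o.const_get(c) }
  clazz.new
rescue StandardError
  Annotator::Models::Recognizers::Mgrep.new
end

puts resolve_recognizer("dataone").class  # resolves to the Dataone stand-in
puts resolve_recognizer("DataONE").class  # capitalize normalizes this to "Dataone"
puts resolve_recognizer("nosuch").class   # unknown name falls back to Mgrep
```

If your recognizer seems to return Mgrep results, check the class name and file loading first, since the fallback hides the underlying error.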

Annotator.settings.enable_recognizer_param is a config parameter that enables the custom recognizer functionality in the Annotator. It is set in ncbo_annotator/config/config.rb. You will need to set it to true in order to enable the custom recognizer feature; otherwise the "recognizer" parameter is ignored and Mgrep is always used.

Annotator.config do |config|
	config.enable_recognizer_param = true
end

See the sample configuration here:

https://github.com/ncbo/ontologies_api/blob/master/config/environments/config.rb.sample

Test your custom recognizer

Go to:

http://localhost:9393/annotator?text=my%20sample%20text%20that%20contains%20concepts&recognizer=dataone

and make sure you get back results consistent with your custom recognizer's output.
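If you prefer to script this check, the following sketch builds the same request with Ruby's standard library. The helper names annotator_url and fetch_annotations are made up for this example, and the base URL assumes the local server from the URL above. Depending on your configuration the API may expect additional parameters, which this sketch does not cover.

```ruby
require "uri"
require "net/http"

# Builds the Annotator URL with properly encoded query parameters.
# The default base URL matches the localhost example above.
def annotator_url(text, recognizer, base = "http://localhost:9393/annotator")
  uri = URI(base)
  uri.query = URI.encode_www_form(text: text, recognizer: recognizer)
  uri
end

# Performs the GET request and returns the raw response body.
# Requires the API to be running locally.
def fetch_annotations(text, recognizer)
  Net::HTTP.get(annotator_url(text, recognizer))
end

# Example call (uncomment once the server is running):
# puts fetch_annotations("my sample text that contains concepts", "dataone")
```

Note that URI.encode_www_form encodes spaces as "+" rather than "%20"; both forms are valid in a query string.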