Extending Annotator with a Custom Concept Recognizer
Below is a set of guidelines on how to implement your own concept recognizer module within NCBO Annotator.
Take a look at the lib directory of the ncbo_annotator project on GitHub:
https://github.com/ncbo/ncbo_annotator/tree/master/lib/ncbo_annotator
In the /recognizers folder, you'll see the currently implemented concept recognizers. Create a new file there, dataone.rb (or any other name you choose). In that Ruby file, implement the following method:
def annotate_direct(text, options={})
This method must return a Hash in the following format:
allAnnotations = {
concept_resource_id => ontology_resource_id
}
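As a rough illustration of that contract, here is a minimal, standalone sketch of a recognizer. The `TERMS` table, the example.org IDs, and the word-matching logic are all hypothetical and exist only to show the shape of the return value. The recognizers in the project may also inherit from a project base class, which this sketch omits; check mallet.rb for the actual structure.

```ruby
module Annotator
  module Models
    module Recognizers
      # Hypothetical recognizer; in the project it would live in recognizers/dataone.rb.
      class Dataone
        # Hypothetical lookup table: term => [concept_resource_id, ontology_resource_id].
        TERMS = {
          "melanoma" => ["http://example.org/onto/DEMO#Melanoma", "http://example.org/ontologies/DEMO"],
          "heart"    => ["http://example.org/onto/DEMO#Heart", "http://example.org/ontologies/DEMO"]
        }.freeze

        # Returns a Hash of concept_resource_id => ontology_resource_id
        # for every known term that appears as a word in the text.
        def annotate_direct(text, options = {})
          all_annotations = {}
          words = text.downcase.scan(/[a-z0-9]+/)
          TERMS.each do |term, (concept_id, ontology_id)|
            all_annotations[concept_id] = ontology_id if words.include?(term)
          end
          all_annotations
        end
      end
    end
  end
end

recognizer = Annotator::Models::Recognizers::Dataone.new
p recognizer.annotate_direct("Melanoma is a skin cancer")
```

A real recognizer would replace the lookup table with whatever matching service or model you are integrating; only the method signature and the returned Hash format come from the guidelines above.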
See ncbo_annotator/lib/ncbo_annotator/recognizers/mallet.rb for sample code. Mallet is a recognizer that we implemented as an alternative to Mgrep.
https://github.com/ncbo/ncbo_annotator/blob/master/lib/ncbo_annotator/recognizers/mallet.rb
Take a look at the REST service controller for Annotator:
https://github.com/ncbo/ontologies_api/blob/master/controllers/annotator_controller.rb
Specifically, this code:
recognizer = (Annotator.settings.enable_recognizer_param && params_copy["recognizer"]) || 'mgrep'
# see if a name of the recognizer has been passed in, use default if not or error
begin
recognizer = recognizer.capitalize
clazz = "Annotator::Models::Recognizers::#{recognizer}".split('::').inject(Object) {|o, c| o.const_get c}
annotator = clazz.new
rescue
annotator = Annotator::Models::Recognizers::Mgrep.new
end
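As you can see, the controller accepts a "recognizer" parameter containing the name of your custom recognizer. The value is capitalized and resolved to a class in Annotator::Models::Recognizers; if the parameter is missing, the feature is disabled, or the name does not resolve to a class, the controller falls back to Mgrep. For example: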
/annotator?text=blah&recognizer=mgrep => Annotator::Models::Recognizers::Mgrep.new
/annotator?text=blah&recognizer=mallet => Annotator::Models::Recognizers::Mallet.new
/annotator?text=blah&recognizer=dataone => Annotator::Models::Recognizers::Dataone.new
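The lookup can be tried outside the API with a small standalone sketch. The `resolve_recognizer` helper and the empty stand-in classes below are hypothetical; only the capitalize-then-`const_get` logic mirrors the controller code. One consequence worth noting: because of `capitalize`, everything after the first letter is lowercased, so a class named `DataOne` would never be found, while `Dataone` would.

```ruby
# Empty stand-ins for the real recognizer classes (assumption: for illustration only).
module Annotator
  module Models
    module Recognizers
      class Mgrep; end
      class Mallet; end
    end
  end
end

# Hypothetical helper mirroring the controller's lookup:
# capitalize the name, walk the constant path, fall back to Mgrep on any error.
def resolve_recognizer(name)
  class_name = name.to_s.capitalize
  clazz = "Annotator::Models::Recognizers::#{class_name}"
          .split('::')
          .inject(Object) { |o, c| o.const_get(c) }
  clazz.new
rescue StandardError
  # NameError (unknown constant) is a StandardError, so unknown names land here.
  Annotator::Models::Recognizers::Mgrep.new
end

puts resolve_recognizer("mallet").class
puts resolve_recognizer("nonexistent").class
puts "DataOne".capitalize
```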
Annotator.settings.enable_recognizer_param is a config parameter that enables the custom recognizer feature in the Annotator. It is set in ncbo_annotator/config/config.rb and must be set to true for the "recognizer" parameter to take effect:
Annotator.config do |config|
config.enable_recognizer_param = true
end
See a sample config file here:
https://github.com/ncbo/ontologies_api/blob/master/config/environments/config.rb.sample
Go to http://localhost:9393/annotator?text=my sample text that contains concepts&recognizer=dataone and make sure you get back results consistent with your custom recognizer's output.
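If you prefer to test from a script rather than a browser, the spaces in the text need to be URL-encoded. A minimal sketch using Ruby's standard `uri` library follows; the `annotator_url` helper is hypothetical, and the host and port are simply the ones from the URL above.

```ruby
require "uri"

# Hypothetical helper: builds the Annotator request URL with a properly
# encoded query string (spaces in the text become "+").
def annotator_url(text, recognizer, base = "http://localhost:9393/annotator")
  uri = URI(base)
  uri.query = URI.encode_www_form("text" => text, "recognizer" => recognizer)
  uri
end

puts annotator_url("my sample text that contains concepts", "dataone")
```

The resulting URI can be passed to `Net::HTTP.get` (after `require "net/http"`) to fetch the response body while your local API instance is running.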