Tokenizer Web Service

Example Usage

tokenizer-server start
curl -d "text=this is an english text&language=en" http://localhost:9292 -XPOST
outputs:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<KAF version="v1.opener" xml:lang="en">
  <kafHeader>
    <linguisticProcessors layer="text">
      <lp name="opener-sentence-splitter-en" timestamp="2013-06-11T13:29:21Z" version="0.0.1"/>
      <lp name="opener-tokenizer-en" timestamp="2013-06-11T13:29:22Z" version="1.0.1"/>
    </linguisticProcessors>
  </kafHeader>
  <text>
    <wf length="4" offset="0" para="1" sent="1" wid="w1">this</wf>
    <wf length="2" offset="5" para="1" sent="1" wid="w2">is</wf>
    <wf length="2" offset="8" para="1" sent="1" wid="w3">an</wf>
    <wf length="7" offset="11" para="1" sent="1" wid="w4">english</wf>
    <wf length="4" offset="19" para="1" sent="1" wid="w5">text</wf>
  </text>
</KAF>
curl -d 'text=<?xml version="1.0" encoding="UTF-8" standalone="yes"?><KAF xml:lang="en"><raw>this is an english text</raw></KAF>&kaf=true' http://localhost:9292 -XPOST
outputs:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<KAF version="v1.opener" xml:lang="en">
  <kafHeader>
    <linguisticProcessors layer="text">
      <lp name="opener-sentence-splitter-en" timestamp="2013-06-11T13:26:15Z" version="0.0.1"/>
      <lp name="opener-tokenizer-en" timestamp="2013-06-11T13:26:16Z" version="1.0.1"/>
    </linguisticProcessors>
  </kafHeader>
  <text>
    <wf length="4" offset="0" para="1" sent="1" wid="w1">this</wf>
    <wf length="2" offset="5" para="1" sent="1" wid="w2">is</wf>
    <wf length="2" offset="8" para="1" sent="1" wid="w3">an</wf>
    <wf length="7" offset="11" para="1" sent="1" wid="w4">english</wf>
    <wf length="4" offset="19" para="1" sent="1" wid="w5">text</wf>
  </text>
</KAF>
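
The KAF responses above can be consumed with any XML parser. The following is a minimal Python sketch, not part of the tokenizer itself, that reads the <wf> elements from a trimmed copy of the response shown above (the kafHeader is omitted). The function name extract_tokens is hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example response above (kafHeader omitted for brevity).
KAF_RESPONSE = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<KAF version="v1.opener" xml:lang="en">
  <text>
    <wf length="4" offset="0" para="1" sent="1" wid="w1">this</wf>
    <wf length="2" offset="5" para="1" sent="1" wid="w2">is</wf>
    <wf length="2" offset="8" para="1" sent="1" wid="w3">an</wf>
    <wf length="7" offset="11" para="1" sent="1" wid="w4">english</wf>
    <wf length="4" offset="19" para="1" sent="1" wid="w5">text</wf>
  </text>
</KAF>"""


def extract_tokens(kaf_xml):
    """Return one dict per <wf> element: word id, token text, offset, length."""
    # Parse from bytes so the XML declaration's encoding is honoured.
    root = ET.fromstring(kaf_xml.encode("utf-8"))
    tokens = []
    for wf in root.iter("wf"):
        tokens.append({
            "wid": wf.get("wid"),
            "token": wf.text,
            "offset": int(wf.get("offset")),
            "length": int(wf.get("length")),
        })
    return tokens


if __name__ == "__main__":
    for tok in extract_tokens(KAF_RESPONSE):
        print(tok["wid"], tok["token"], tok["offset"], tok["length"])
```

In this sample, offset and length index into the original input string, so slicing the input with them should give back each token.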

Actions

POST /
Tokenize the input text. See the arguments listing for more options.
GET /
Show this page
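
For clients that do not use curl, the POST / action can be sketched in Python with the standard library. This is an illustrative sketch only: the helper name build_tokenize_request is hypothetical, and the base URL http://localhost:9292 and the language value "en" are taken from the curl examples above and may differ in your setup.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_tokenize_request(text, language=None, kaf=False,
                           base_url="http://localhost:9292"):
    """Build (but do not send) a form-encoded POST / request.

    base_url is an assumption based on the curl example; adjust as needed.
    """
    params = {"text": text}
    if language is not None:
        params["language"] = language
    if kaf:
        # Mirrors the "kaf=true" parameter from the second curl example.
        params["kaf"] = "true"
    body = urlencode(params).encode("utf-8")
    return Request(base_url + "/", data=body, method="POST")


if __name__ == "__main__":
    req = build_tokenize_request("this is an english text", language="en")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually send the request, pass the returned object to urllib.request.urlopen while the tokenizer server is running, then read the KAF document from the response body.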

Arguments

The webservice takes the following arguments:

* required

text*
The input text
kaf [true | false]
Set to true when the input text is in KAF format.
language [English | German | Dutch | French | Spanish | Italian]
The language of the provided text. The curl example above passes it as a code (language=en).
callbacks
A list of callback URLs. If you provide callback URLs, the tokenizer runs as a background job and POSTs the results to the first URL in the list. The remaining URLs are passed along in the "callbacks" argument.

Using callbacks, you can chain several OpeNER webservices together in one call: the first calls the second, which calls the third, and so on. See the online webservice documentation for more information.
error_callback
URL to notify if an error occurs in the background process. The error callback is a POST with the error message in the 'error' field.
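
The callback hand-off described above can be modelled in a few lines of Python. This sketch only illustrates the rule "POST to the first URL, pass the rest along"; it is not the service's implementation, it performs no network calls, and the function names and example URLs are hypothetical.

```python
def split_callbacks(callbacks):
    """Return (next_url, remaining_urls) for a callback list.

    next_url is where the result would be POSTed; remaining_urls is what
    would be forwarded in the "callbacks" argument of that POST.
    """
    if not callbacks:
        return None, []
    return callbacks[0], list(callbacks[1:])


def simulate_chain(callbacks):
    """List each hop of the chain as (target_url, callbacks_forwarded)."""
    hops = []
    remaining = list(callbacks)
    while remaining:
        target, remaining = split_callbacks(remaining)
        hops.append((target, remaining))
    return hops


if __name__ == "__main__":
    # Hypothetical URLs standing in for further OpeNER webservices.
    chain = [
        "http://localhost:9293",
        "http://localhost:9294",
        "http://example.com/results",
    ]
    for target, forwarded in simulate_chain(chain):
        print("POST to", target, "with callbacks =", forwarded)
```

Each service in the chain sees a shorter list than the one before it, so the last URL receives an empty callback list and the chain ends there.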