nerf alternatives and similar packages
Based on the "Natural Language Processing" category.
- chatter (8.9, 0.0): A library of Natural Language Processing algorithms for Haskell.
- mecab (8.0, 0.0): A Haskell binding to MeCab
- numerals (7.8, 0.0): Convert numbers to number words
- punkt (6.9, 0.0): Unsupervised multilingual sentence segmentation.
- cndict (6.4, 0.0): Chinese/Mandarin <-> English dictionary, Chinese lexer.
- concraft-pl (5.9, 0.0): A morphosyntactic tagger for Polish based on conditional random fields
- hext (5.8, 0.7): a text classification library
- concraft (5.0, 0.0): A morphosyntactic disambiguation library based on constrained conditional random fields
- partage (4.4, 0.0): A* parser for tree adjoining grammars
- PTQ (4.4, 0.0): An implementation of Montague's PTQ (Proper Treatment of Quantification).
- minimorph (4.1, 0.0): English spelling functions with an emphasis on simplicity. Originally by https://github.com/kowey.
- numerals-base (3.9, 0.0): Convert numbers to number words
- tsuntsun (3.6, 0.0): Interacts with tesseract to ease reading of RAW Japanese manga.
- haskell-postal (3.1, 0.0): Haskell binding for the libpostal library
- corenlp-parser (2.8, 0.0): Launches CoreNLP and parses the JSON output
- hist-pl (2.5, 0.0): Programs and libraries related to the historical dictionary of Polish
- polh-lexicon (2.5, 0.0): Programs and libraries related to the historical dictionary of Polish
- sentiwordnet-parser (2.5, 0.0): Parser for the [SentiWordNet](http://sentiwordnet.isti.cnr.it/) tab-separated file
- data-named (2.1, 0.0): Named entity data layer
- crf-chain2-tiers (2.1, 0.0): Second-order, tiered, constrained, linear conditional random fields
- phonetic-code (1.8, 0.0): phonetic codes in Haskell
- adict (1.8, 0.0): Approximate dictionary searching Haskell library
- moan (1.3, 0.0): Language-agnostic analyzer for positional morphosyntactic tags
- ENIG (1.3, 0.0): Korean postposition particle selector
- concraft-hr (1.3, 0.0): A part-of-speech tagger for Croatian based on the concraft library.
- penntreebank-megaparsec (1.2, 0.0): Megaparsec parsers for trees in the Penn Treebank format
- arpa (-, -): Library for reading ARPA n-gram models
Nerf is a statistical named entity recognition (NER) tool based on linear-chain conditional random fields (CRFs). It has been adapted to recognize tree-like structures of NEs (i.e., with recursively embedded NEs) by using the joined label tagging method which -- for a particular sentence -- works as follows:
- A CRF model is used to determine the most probable sequence of labels,
- The extended IOB method is used to decode the sequence into a forest of NEs.
The extended IOB method also provides the inverse encoding function which is needed during the model training.
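The encoding and decoding steps can be illustrated with a small Python sketch. It is not Nerf's actual implementation: the forest representation (a token is a string, an NE is a `(type, children)` tuple), the `B-`/`I-`/`O` tag names and the `+` separator used to join per-level tags are all assumptions made for illustration.

```python
def encode(forest):
    """Flatten a forest of NEs into (token, joined label) pairs."""
    out = []

    def walk(node, path):
        # path holds [ne_type, first_token_seen] for each enclosing NE,
        # outermost first.
        if isinstance(node, str):
            tags = []
            for entry in path:
                tags.append(("I-" if entry[1] else "B-") + entry[0])
                entry[1] = True
            out.append((node, "+".join(tags) if tags else "O"))
        else:
            ne_type, children = node
            path.append([ne_type, False])
            for child in children:
                walk(child, path)
            path.pop()

    for node in forest:
        walk(node, [])
    return out


def decode(pairs):
    """Rebuild the forest from (token, joined label) pairs."""
    forest = []
    stack = []  # NEs that are currently open, outermost first
    for token, joined in pairs:
        tags = [] if joined == "O" else joined.split("+")
        # Count how many of the open NEs this token continues.
        keep = 0
        while (keep < len(stack) and keep < len(tags)
               and tags[keep] == "I-" + stack[keep][0]):
            keep += 1
        del stack[keep:]  # close NEs that are not continued
        for tag in tags[keep:]:  # open the NEs that start at this token
            node = (tag[2:], [])
            (stack[-1][1] if stack else forest).append(node)
            stack.append(node)
        (stack[-1][1] if stack else forest).append(token)
    return forest


forest = [("organization", ["Church", "of", "the",
                            ("deity", ["Flying", "Spaghetti", "Monster"])]),
          "."]
pairs = encode(forest)
```

In this sketch the token `Flying` receives the joined label `I-organization+B-deity`, and `decode(pairs)` returns the original forest, which is the round trip that training relies on.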
It is recommended to install nerf using the Haskell Tool Stack, which you will need to download and install on your machine beforehand. Then clone this repository into a local directory and use stack to install the library by running:
The only data encoding supported by Nerf is
The current version of Nerf works with a simple data format in which:
- Each sentence is kept on a separate line,
- Named entities are represented with embedded beginning and ending tags,
- Contents of individual tags represent named entity types.
<organization>Church of the <deity>Flying Spaghetti Monster</deity></organization> .
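To make the format concrete, here is a minimal Python sketch of a reader for one such line. It is an illustration only, not part of Nerf: the function name `parse_line` and the nested `(type, children)` representation are hypothetical, tokens are assumed to be separated by whitespace, and backslash escaping is deliberately not handled.

```python
import re

# A tag such as <deity> or </deity>, or a run of ordinary characters.
ITEM_RE = re.compile(r"</?[^<>\s]+>|[^<>\s]+")


def parse_line(line):
    """Parse one annotated sentence into a nested forest of NEs."""
    forest = []
    stack = []  # NEs that are currently open, outermost first
    for item in ITEM_RE.findall(line):
        siblings = stack[-1][1] if stack else forest
        if item.startswith("</"):
            # A closing tag must match the innermost open NE.
            if not stack or stack[-1][0] != item[2:-1]:
                raise ValueError(f"unexpected closing tag {item}")
            stack.pop()
        elif item.startswith("<"):
            node = (item[1:-1], [])
            siblings.append(node)
            stack.append(node)
        else:
            siblings.append(item)
    if stack:
        raise ValueError(f"unclosed tag <{stack[-1][0]}>")
    return forest


line = ("<organization>Church of the <deity>Flying Spaghetti Monster"
        "</deity></organization> .")
parsed = parse_line(line)
```

For the example sentence, `parsed` holds an `organization` entity whose last child is a `deity` entity, followed by the top-level token `.`.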
Text and label values should be escaped by prepending the \ character before special
Have a look in the example directory for an example of a file in the
NER input data
Below is a list of data formats supported within the NER mode.
Nerf can be used to annotate raw text with named entities. The annotated data will be presented in the same format as the one used for training, described above. Each sentence should be supplied on a separate line -- currently, Nerf doesn't perform any sentence-level segmentation.
It is also possible to annotate data stored in the XCES format.
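Because raw-text input must already contain one sentence per line, the text has to be segmented before it is passed to Nerf. The Python sketch below shows a deliberately naive way of doing that, assuming that a sentence ends at `.`, `!` or `?` followed by whitespace. The function name is hypothetical, and the rule will split wrongly after abbreviations such as "Dr.", so a dedicated sentence segmenter is a better choice for real data.

```python
import re


def one_sentence_per_line(text):
    """Naively split text into sentences and put each on its own line."""
    # Split at whitespace that follows sentence-final punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Collapse internal line breaks and repeated spaces within a sentence.
    return "\n".join(" ".join(s.split()) for s in sentences if s)


prepared = one_sentence_per_line("Hello world. How are\nyou? Fine!")
```

Here `prepared` contains three lines, one for each sentence of the input.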
Once you have an annotated data file train.nes (and, optionally, an evaluation file eval.nes) conformant with the format described above, you can train the Nerf model using the following command:
nerf train train.nes -e eval.nes -o model.bin
Run nerf train --help to learn more about the program arguments and possible options.
The nerf tool can also be supplied with additional runtime system options. For example, to train the model using four threads, use:
nerf train train.nes -e eval.nes -o model.bin +RTS -N4
WARNING: Currently, the -N runtime option sometimes leads to errors in the training process and therefore should be avoided for the time being.
Nerf supports a list of NE-related dictionaries:
To use a particular dictionary during NER, you have to supply it as a command-line argument during training, for example:
nerf train train.nes --polimorf PoliMorf-0.6.1.tab
Named entity recognition
To annotate the input.txt data file using the trained model.bin model, run:
nerf ner model.bin < input.txt
Annotated data will be printed to stdout. The data formats currently supported within the NER mode have been described above. Run nerf ner --help to learn more about the additional NER arguments.
Nerf also provides a client/server mode. It is handy when, for example, you need to annotate a large collection of small files. Loading a Nerf model from disk takes a considerable amount of time, which makes the tagging method described above very slow in such a setting.
To start the Nerf server, run:
nerf server model.bin
You can supply a custom port number using the --port option. For example, to run the server on port 10101, use the following command:
nerf server model.bin --port 10101
To use the server in a multi-threaded environment, you need to specify the -N RTS option. A set of options which usually yields good server performance is presented in the following example:
nerf server model.bin +RTS -N -A4M -qg1 -I0
Run nerf server --help to learn more about possible server-mode options.
The client mode works just like the tagging mode. The only difference is that, instead of supplying your client with a model, you need to specify the port number (in case you used a custom one when starting the server; otherwise, the default port number will be used).
nerf client --port 10101 < input.txt > output.nes
Run nerf client --help to learn more about the possible client-mode options.
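For the many-small-files scenario mentioned earlier, the client can be driven from a script. The following Python sketch assumes a Nerf server is already running and that nerf is on the PATH. It uses only the `nerf client` invocation and `--port` option shown above; the function names, the `*.txt` and `.nes` file naming and the directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def client_command(port=None):
    """Build the nerf client command; without a port the default is used."""
    cmd = ["nerf", "client"]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def annotate_files(in_dir, out_dir, port=None, run=subprocess.run):
    """Annotate every *.txt file in in_dir, writing *.nes files to out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).glob("*.txt")):
        dst = out_dir / (src.stem + ".nes")
        # Equivalent to: nerf client --port PORT < src > dst
        with src.open("rb") as fin, dst.open("wb") as fout:
            run(client_command(port), stdin=fin, stdout=fout, check=True)
        written.append(dst)
    return written
```

A call such as `annotate_files("texts", "annotated", port=10101)` would then send each file to the server in turn, so the model is loaded only once, by the server. The `run` parameter exists only so that the subprocess call can be replaced when trying the script without Nerf installed.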