semanticize/dumpparser


Wikipedia dump parser for semanticizest

This program parses Wikipedia database dumps for consumption by semanticizest.

Installing

Make sure you have a Go compiler (1.2 or newer) and Git. On Debian/Ubuntu/Mint, that's:

sudo apt-get install git golang-go

On CentOS:

sudo yum -y install git golang
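
To confirm the prerequisites before going further, a check along the following lines may help. This is only a sketch: the function names `version_ge` and `check_go` are made up for this example, it assumes bash, and it relies on `go version` printing a line of the form `go version go1.21.3 linux/amd64`.

```shell
#!/usr/bin/env bash
# Succeeds if version $1 (e.g. "1.21.3") is at least version $2 (e.g. "1.2").
# Only the major and minor components are compared, numerically.
version_ge() {
    local have_major have_minor want_major want_minor rest
    IFS=. read -r have_major have_minor rest <<< "$1"
    IFS=. read -r want_major want_minor rest <<< "$2"
    # Drop suffixes such as "rc1" or "beta2" and default missing parts to 0.
    have_major=${have_major%%[!0-9]*}; have_major=${have_major:-0}
    have_minor=${have_minor%%[!0-9]*}; have_minor=${have_minor:-0}
    want_major=${want_major%%[!0-9]*}; want_major=${want_major:-0}
    want_minor=${want_minor%%[!0-9]*}; want_minor=${want_minor:-0}
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
    else
        [ "$have_minor" -ge "$want_minor" ]
    fi
}

# Hypothetical helper: reports whether Git and a new enough Go are on PATH.
check_go() {
    command -v git >/dev/null 2>&1 || { echo "git not found" >&2; return 1; }
    command -v go >/dev/null 2>&1 || { echo "go not found" >&2; return 1; }
    local v
    v=$(go version | awk '{print $3}')   # third field, e.g. "go1.21.3"
    v=${v#go}
    version_ge "$v" 1.2 || { echo "Go $v is older than 1.2" >&2; return 1; }
    echo "Go $v and Git found"
}
```

After sourcing the file, running `check_go` prints one line on success and returns a non-zero status otherwise.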

Set up a Go workspace, if you haven't already. For example:

mkdir /some/where/go
cd /some/where/go
export GOPATH=$(pwd)

Fetch and compile:

go get github.com/semanticize/st
go install github.com/semanticize/st/dumpparser
go install github.com/semanticize/st/semanticizest

You now have a working parser at ${GOPATH}/bin/dumpparser. Issue:

${GOPATH}/bin/dumpparser --help

to see how to generate a semanticizer model, then serve that model over the REST API:

${GOPATH}/bin/semanticizest --http=:5002 your_model
curl http://localhost:5002/all -d 'Does the entity linking work?'
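
For repeated queries, the curl call above can be wrapped in a small helper. The sketch below assumes bash; `endpoint_url` and `link_text` are names invented for this example, and an address without a host part (such as `:5002`) is assumed to mean localhost. The response format is not documented here, so the helper prints whatever the server returns.

```shell
#!/usr/bin/env bash
# Turn a listen address as passed to --http (e.g. ":5002" or "example.org:8080")
# into the URL of the /all endpoint used in the curl call above.
# Assumption: an address with no host part refers to localhost.
endpoint_url() {
    local addr=$1
    case $addr in
        :*) addr="localhost${addr}" ;;
    esac
    printf 'http://%s/all\n' "$addr"
}

# POST a piece of text to a running semanticizest and print the raw response.
# --fail makes curl return a non-zero status on HTTP errors.
link_text() {
    local addr=$1 text=$2
    curl --silent --show-error --fail --data "$text" "$(endpoint_url "$addr")"
}
```

With the server from the example running, `link_text :5002 'Does the entity linking work?'` sends the same request as the curl command above.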
