GeoTxt is a scalable geoparsing system for the recognition and geolocation of place names in unstructured text. GeoTxt offers six named entity recognition (NER) algorithms for place name recognition, and utilizes Apache Solr for the indexing, ranking, and retrieval of toponyms, enabling scalable geoparsing for streaming text. GeoTxt offers a flexible application programming interface (API), generating a GeoJSON FeatureCollection as output.
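Because the API returns a GeoJSON FeatureCollection, any standard JSON tooling can consume the results. The following is a minimal sketch in Python, not GeoTxt's own client code: the sample response, the "name" property, and the helper extract_places are hypothetical, and the properties GeoTxt actually attaches to each feature may differ. The only thing assumed from the GeoJSON format itself is that Point coordinates are ordered longitude, then latitude.

```python
import json

# Hypothetical sample response; the real GeoTxt output may carry
# different or additional properties on each feature.
sample_response = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-77.86, 40.79]},
      "properties": {"name": "State College"}
    }
  ]
}
"""


def extract_places(geojson_text):
    """Return (name, longitude, latitude) tuples for each Point feature."""
    collection = json.loads(geojson_text)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")

    places = []
    for feature in collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # skip features without a point location
        # GeoJSON stores positions as [longitude, latitude].
        lon, lat = geometry["coordinates"][:2]
        # "name" is an assumed property key, used here for illustration only.
        name = (feature.get("properties") or {}).get("name")
        places.append((name, lon, lat))
    return places


print(extract_places(sample_response))
```

Features without a Point geometry are skipped rather than treated as errors, so the sketch tolerates place names that were recognized but not geolocated.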
GeoTxt is described in the following publication. Please use this citation to refer to the system:
Karimzadeh, M., Pezanowski, S., MacEachren, A. M., & Wallgrün, J. O. (2019). GeoTxt: A scalable geoparsing system for unstructured text geolocation. Transactions in GIS, 23(1), 118–136. https://doi.org/10.1111/tgis.12510
The code in this project includes GeoTxt as well as the GeoAnnotator Web API / Java Core code.
This file will be packaged with your application when using activator dist.
We plan to expand the instructions for building the system from source soon.