MDSWriter is a software tool for manually creating multi-document summarization corpora and a platform for developing complex annotation tasks that span multiple steps.
Please use the following citation:
@InProceedings{Meyer:2016:ACLdemo,
author = {Meyer, Christian M. and Benikova, Darina and Mieskes, Margot and Gurevych, Iryna},
title = {MDSWriter: Annotation tool for creating high-quality multi-document summarization corpora},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL): System Demonstrations},
month = {August},
year = {2016},
address = {Berlin, Germany},
publisher = {Association for Computational Linguistics},
pages = {97--102},
url = {http://www.aclweb.org/anthology/P/P16/P16-4017.pdf}
}
Abstract: In this paper, we present MDSWriter, a novel open-source annotation tool for creating multi-document summarization corpora. A major innovation of our tool is that we divide the complex summarization task into multiple steps which enables us to efficiently guide the annotators and to record all their intermediate results and user–system interaction data. This allows evaluating the individual components of a complex summarization system and learning from the human composition process. MDSWriter is highly flexible and can be adapted to multiple other tasks.
Contact person: Christian M. Meyer, http://www.ukp.tu-darmstadt.de/people/meyer
Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions.
For license information, see the LICENSE.txt and NOTICE.txt files.
- Video tutorial explaining our initial setup: https://www.youtube.com/channel/UC1-qTfTCnVBZklJwCj2kGDQ
- Screenshots of the proposed seven steps for multi-document summarization: doc/screenshots.pdf
- Corresponding annotation guidelines: doc/annotation_guidelines_en.pdf
- Installation guide: see below
- Java 7 or higher
- J2EE platform with JavaServer Pages (JSP) and WebSocket implementations (e.g., Apache Tomcat 7 or higher)
- Maven
- MySQL database (or other SQL database)
- Install Java, Maven, Tomcat, and MySQL.
- Download the source code from GitHub.
- Import the empty schema from doc/mdswriter_schema.sql into your database.
- Update src/main/webapp/META-INF/context.xml with your database settings.
- Update src/main/webapp/js/st.js: Set the SERVER_URL variable to the URL the software will be deployed to.
- Build the software using mvn package
- Deploy the war file from target/ to your application server.
- Open http://localhost:8080/mdswriter (or the corresponding URL of your server) and try to log in using admin1:admin2.
- Test if everything works and then import your own data into the schema.
Adapting MDSWriter to a new task works best if you first follow the installation guide and get the basic system to work. For developing your application, we recommend using a J2EE-ready IDE, such as Eclipse or IntelliJ. The following steps are necessary to make MDSWriter do what your application needs:
- Define the annotation steps you want to provide. For each step, add a corresponding JSP file with the user interface to the webapp folder. You can of course reuse the existing user interfaces, which should save you considerable development time. All JSP files refer to the common _header, _title, and _footer templates to ensure a similar appearance and menu. For the corresponding guidelines, you may want to add a help file to the webapp/help folder. If you care about internationalization, put all your strings into the property files at resources/i18n/ (currently we have English and German).
- The core link between the user interface (JSP) and the MDSWriter server is our WebSocket communication protocol. The Java class de.tudarmstadt.aiphes.mdswriter.Message contains an overview of all predefined messages. Change the messages according to your needs and implement or reuse the corresponding business logic in de.tudarmstadt.aiphes.mdswriter.MDSWriterEndpoint and its child and helper classes. Most likely, you will require authentication and storage of user-system interaction data, both of which MDSWriter provides without further adaptation. In case of a cross-document annotation task, you can also reuse the classes in the de.tudarmstadt.aiphes.mdswriter.doc package.
- If necessary, make sure that you also update the database schema for your particular task.
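To illustrate the general idea of a message-based protocol between user interface and server, here is a minimal sketch of a dispatcher that routes incoming text messages to handlers by message type. Everything in it is a hypothetical illustration: the class name MessageDispatchSketch, the "TYPE<TAB>payload" wire format, and the LOGIN and SAVE message types are assumptions made for this example and do not reproduce the actual Message or MDSWriterEndpoint classes. The sketch uses Java 8 lambdas; in a real endpoint, a method like dispatch would typically be called from the WebSocket message callback.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch of a text-message dispatcher. The wire format
 * ("TYPE<TAB>payload") and all names are illustrative assumptions,
 * not MDSWriter's actual protocol.
 */
class MessageDispatchSketch {

    // Maps a message type to the business logic that handles its payload.
    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    /** Registers the handler for one message type. */
    void register(String type, Function<String, String> handler) {
        handlers.put(type, handler);
    }

    /** Splits a raw message into type and payload and calls the matching handler. */
    String dispatch(String rawMessage) {
        int sep = rawMessage.indexOf('\t');
        String type = sep < 0 ? rawMessage : rawMessage.substring(0, sep);
        String payload = sep < 0 ? "" : rawMessage.substring(sep + 1);

        Function<String, String> handler = handlers.get(type);
        if (handler == null) {
            // Unknown types produce an error reply instead of an exception.
            return "ERROR\tunknown message type: " + type;
        }
        return handler.apply(payload);
    }

    public static void main(String[] args) {
        MessageDispatchSketch dispatcher = new MessageDispatchSketch();

        // Hypothetical message types for a two-step annotation task.
        dispatcher.register("LOGIN", user -> "LOGIN_OK\t" + user);
        dispatcher.register("SAVE", text -> "SAVED\t" + text.length());

        System.out.println(dispatcher.dispatch("LOGIN\talice"));
        System.out.println(dispatcher.dispatch("SAVE\tsome annotated text"));
        System.out.println(dispatcher.dispatch("UNKNOWN"));
    }
}
```

Keeping the handlers in a map means that adding a new annotation step only requires registering another message type, which mirrors the advice above to change the predefined messages according to your needs.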