DSO Project
Initial Commit
This version is based on the prototype branch of helloqa. For helloqa, we made the following changes:
- Fixed dependency errors
- Fixed SQL errors
- Fixed platform errors
For the DSO project, we converted the components developed in summer '10 into CSE phases. Specifically, we converted the following components:
- AnswerTypeExtractor (https://github.com/oaqa/helloqa/tree/prototype/src/main/java/edu/cmu/lti/oaqa/openqa/dso/phase/answertype)
- KeytermExtractor (https://github.com/oaqa/helloqa/tree/prototype/src/main/java/edu/cmu/lti/oaqa/openqa/dso/phase/keyterm)
- ICEventExtractor (https://github.com/oaqa/helloqa/tree/prototype/src/main/java/edu/cmu/lti/oaqa/openqa/dso/phase/icevent)
- PassageRetrieval (https://github.com/oaqa/helloqa/tree/prototype/src/main/java/edu/cmu/lti/oaqa/openqa/dso/phase/passage)
- InformationExtractor (https://github.com/oaqa/helloqa/tree/prototype/src/main/java/edu/cmu/lti/oaqa/openqa/dso/phase/ie)
- AnswerGenerator (https://github.com/oaqa/helloqa/tree/prototype/src/main/java/edu/cmu/lti/oaqa/openqa/dso/phase/answer)
These components extend the following abstract classes:
- edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractAnswerTypeExtractor
- edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractICEventExtractor
- edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractKeytermExtractor
- edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractPassageRetrieval
- edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractInformationExtractor
- edu.cmu.lti.oaqa.openqa.dso.framework.base.AbstractAnswerGenerator
The JCas objects are handled by the following classes:
- edu.cmu.lti.oaqa.openqa.dso.framework.jcas.AnswerTypeJCasManipulator
- edu.cmu.lti.oaqa.openqa.dso.framework.jcas.ICEventJCasManipulator
- edu.cmu.lti.oaqa.openqa.dso.framework.jcas.KeytermJCasManipulator
- edu.cmu.lti.oaqa.openqa.dso.framework.jcas.DocumentJCasManipulator
- edu.cmu.lti.oaqa.openqa.dso.framework.jcas.AnsJCasManipulator
Updated Aug-19-2013
To run the system, make sure the following prerequisites are in place under helloqa:
- conf folder with its property files (used by Ephyra)
- res folder with all its files
- Install Indri by following the instructions at http://lemur.sourceforge.net/indri/
- Copy the generated libindri_jni.so file to helloqa/lib/
- If libindri_jni.so cannot be found, rerun ./configure --prefix= --enable-java --with-javahome=$JAVA_HOME, then run make and make install again
- We also use the type system developed previously (summer '11) for DSO. It can be found here: https://github.com/oaqa/helloqa/blob/prototype/src/main/resources/edu/cmu/lti/oaqa/OAQATypes.xml
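The file prerequisites above can be verified before launching the system. The following is a minimal sketch, not part of helloqa: the class name PrerequisiteCheck and its method are hypothetical, and it assumes the paths are exactly conf, res, and lib/libindri_jni.so relative to the helloqa root.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of helloqa) that checks the prerequisites listed above. */
final class PrerequisiteCheck {

    /** Returns the prerequisite paths that are missing under the given helloqa root. */
    static List<String> missing(Path root) {
        List<String> problems = new ArrayList<>();
        // conf folder with the property files used by Ephyra
        if (!Files.isDirectory(root.resolve("conf"))) {
            problems.add("conf");
        }
        // res folder with all its files
        if (!Files.isDirectory(root.resolve("res"))) {
            problems.add("res");
        }
        // native Indri library copied into lib/
        if (!Files.isRegularFile(root.resolve("lib").resolve("libindri_jni.so"))) {
            problems.add("lib/libindri_jni.so");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumption: the helloqa root is passed as the first argument, else the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> problems = missing(root);
        if (problems.isEmpty()) {
            System.out.println("All prerequisites found under " + root.toAbsolutePath());
        } else {
            System.out.println("Missing under " + root.toAbsolutePath() + ": " + problems);
        }
    }
}
```

The check only confirms that the paths exist. It does not verify that the Indri library was built with Java support or that it can be loaded.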
Updated Aug-20-2013
Fixed all dependency issues by converting jar files to Maven dependencies.
The following steps are required to run the system:
- Get an account on the Nexus server: http://mu.lti.cs.cmu.edu:8081/nexus/index.html#welcome
- Ask Zi, Avner, or Rui to create one
- Configure the internal Maven repository. These steps assume Linux; other platforms (e.g. Windows) may have problems:
- You can find your maven files here:
cd ~/.m2
- If no .m2 folder exists:
mkdir -p ~/.m2
- Create a file settings.xml under .m2
- Ask Zi, Avner, or Rui to email you the file
- Open Eclipse, go to ‘Window->Preferences->Maven->User Settings’, click the ‘Update Settings’ button, then click ‘Apply’.
- It should be working now.
Updated Aug-29-2013
Evaluation is ready! Baseline System is ready!
Initial test results for the DSO framework:
- Question: Who is the mastermind of World Trade Center bombing?
- Answer Key: Ramzi Yousef
- Phase 1: AnswerTypeExtractor
- Input: Who is the mastermind of World Trade Center bombing?
- Output: NEproperName->NEperson->NEterrorist
- Phase 2: KeytermExtractor
- Input: Who is the mastermind of World Trade Center bombing?
- Output: [mastermind, World Trade Center, bombing]
- Phase 3: ICEventExtractor
- Input: Who is the mastermind of World Trade Center bombing?
- Output: combined-27409
- Phase 4: PassageRetrieval
- Input: Question, Keyterms and AnswerType
- Output: doc size = 152
- Phase 5: InformationExtractor
- Input: Question, Keyterms, AnswerType and Retrieved Docs
- Output: NER size = 307, [Ramzi Yousef, Jemaah Islamiyah, Abu Sayyaf, ... ]
- Phase 6: AnswerGenerator
- Input: Answer Candidates
- Output: Ranked Answers: [Ramzi Yousef and co-conspirators, Ramzi Yousef, Khalid Sheikh Mohammed, ... ]
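The data flow through the six phases above can be sketched as follows. This is an illustration only: State, Phase, and the stub phases are hypothetical and are not the DSO framework API, which works on JCas objects. String outputs are copied from the test run above, lists are cut where the run shows "...", and the retrieved documents are represented by a placeholder.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: every type here is hypothetical, not the DSO framework API. */
final class PhaseFlowSketch {

    /** Hypothetical holder for the intermediate results of one question. */
    static final class State {
        final String question;
        String answerType;          // written by phase 1
        List<String> keyterms;      // written by phase 2
        String icEvent;             // written by phase 3
        List<String> documents;     // written by phase 4
        List<String> candidates;    // written by phase 5
        List<String> rankedAnswers; // written by phase 6

        State(String question) {
            this.question = question;
        }
    }

    /** Hypothetical phase contract: read earlier results, write one new result. */
    interface Phase {
        void process(State state);
    }

    /** Fails fast when a phase runs before the phase that produces its input. */
    static void require(Object input, String name) {
        if (input == null) {
            throw new IllegalStateException("missing input: " + name);
        }
    }

    /** Stub phases in execution order; real phases would compute these values. */
    static List<Phase> stubPhases() {
        return Arrays.asList(
            s -> s.answerType = "NEproperName->NEperson->NEterrorist",
            s -> s.keyterms = Arrays.asList("mastermind", "World Trade Center", "bombing"),
            s -> s.icEvent = "combined-27409",
            s -> {
                // Phase 4 needs the keyterms and the answer type.
                require(s.keyterms, "keyterms");
                require(s.answerType, "answerType");
                s.documents = Arrays.asList("<retrieved document placeholder>");
            },
            s -> {
                // Phase 5 additionally needs the retrieved documents.
                require(s.documents, "documents");
                s.candidates = Arrays.asList("Ramzi Yousef", "Jemaah Islamiyah", "Abu Sayyaf");
            },
            s -> {
                // Phase 6 ranks the answer candidates.
                require(s.candidates, "candidates");
                s.rankedAnswers = Arrays.asList(
                    "Ramzi Yousef and co-conspirators", "Ramzi Yousef", "Khalid Sheikh Mohammed");
            });
    }

    /** Runs the phases in order over a fresh state for the question. */
    static State run(String question, List<Phase> phases) {
        State state = new State(question);
        for (Phase phase : phases) {
            phase.process(state);
        }
        return state;
    }

    public static void main(String[] args) {
        State result = run("Who is the mastermind of World Trade Center bombing?", stubPhases());
        System.out.println(result.rankedAnswers);
    }
}
```

Reordering the list so that a later phase runs first makes require throw, which shows why phases 4 to 6 depend on the earlier ones.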
Initial evaluation result for a single question:
- Who is the mastermind of World Trade Center bombing?
- Reciprocal rank: 0.5
- Accuracy: 0.0
- Binary recall: 1.0
- EVALUATION REPORT
Experiment: 8e9c8b9d-cd2e-43b2-ad0e-e84b741c4b48:1
Evaluator,Configuration,DocumentMAP,PassageMAP,AspectMAP,Count
PassageMAPMeasuresEvaluator, 1|AnswerTypeExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 2|KeytermExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 3|ICEventExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 4|PassageRetrieval[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 5|InformationExtractor[persistence-provider:inherit: ecd.default-log-persistence-provider ]> 6|AnswerGenerator[persistence-provider:inherit: ecd.default-log-persistence-provider ], 0.5000,0.0000,1.0000,1
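The three figures for this question are consistent with the usual definitions of these metrics applied to the ranked answers and the answer key. The sketch below is not the project's evaluator: the class AnswerMetrics is hypothetical, and it assumes case-insensitive exact matching against a single answer key, accuracy as "the top answer is correct", and binary recall as "a correct answer appears anywhere in the list". The actual evaluator may match answers differently.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not the project's evaluator) for single-question metrics. */
final class AnswerMetrics {

    /** Returns 1/rank of the first answer that matches the key, or 0 if none matches. */
    static double reciprocalRank(List<String> ranked, String key) {
        for (int i = 0; i < ranked.size(); i++) {
            if (ranked.get(i).equalsIgnoreCase(key)) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }

    /** Returns 1 if the top-ranked answer matches the key, else 0 (assumed definition). */
    static double accuracy(List<String> ranked, String key) {
        return !ranked.isEmpty() && ranked.get(0).equalsIgnoreCase(key) ? 1.0 : 0.0;
    }

    /** Returns 1 if any ranked answer matches the key, else 0 (assumed definition). */
    static double binaryRecall(List<String> ranked, String key) {
        return reciprocalRank(ranked, key) > 0.0 ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        // Ranked answers as listed for Phase 6; the list is truncated in the report.
        List<String> ranked = Arrays.asList(
            "Ramzi Yousef and co-conspirators", "Ramzi Yousef", "Khalid Sheikh Mohammed");
        String key = "Ramzi Yousef";

        System.out.println(reciprocalRank(ranked, key)); // 0.5
        System.out.println(accuracy(ranked, key));       // 0.0
        System.out.println(binaryRecall(ranked, key));   // 1.0
    }
}
```

Under exact matching, "Ramzi Yousef and co-conspirators" at rank 1 does not count as correct, so the first match is at rank 2. That gives a reciprocal rank of 0.5 and an accuracy of 0.0, while binary recall is 1.0.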