KWARC for students
The KWARC Group welcomes student involvement in research. If you are interested, please send an e-mail to m.kohlhase@jacobs-university.de, or come to our seminars and courses
Graduate Course: 320541 Computational Semantics of Natural Language
Project Work
This might lead to internships or RA positions in our group (just talk to Michael Kohlhase if you want to help). See below the list of potential topics:
- 1. ABox extraction from OMDoc
- We want to extract explicit knowledge (e.g. "Example 1 explains
the Pythagorean Theorem") from mathematical documents encoded in
OMDoc and store it as RDF
triples. This should preferably be done in a way that is
independent from OMDoc's XML schema and OMDoc's system ontology,
so that it can be applied to other data formats as well.
Secondly, the XHTML+MathML web pages generated from OMDoc should be annotated with this knowledge using RDFa, that is, RDF embedded into HTML. - 2. arXMLiv project
- We can use LaTeX gurus and Perl hackers (and people who want to become either)
- 3. MWS Editor UI
- Integrating the WIRIS Math Editor into the MathWebSearch engine UI
- 4. Theory Morphisms in GenCS
- A theory morphism is a symbol mapping between theories, such that truth is preserved (this actually has a precise definition). There must be lots of them in the GenCS material, they should be found and marked up.
Guided Research/Masters Theses
The Kwarc group offers the following thesis topics:
- 1. Transformation of the TPTP content to OMDoc
- TPTP is a library of theorems in standard syntax. They are used to benchmark automated provers. There are parsers for this, so you would only have to hack the emitter for OMDoc.
- 2. Transformation of the ACL2 library to OMDoc
- the same idea as above, only here the syntax is in LISP. (remember: ACL2 Book is $15 from the website; also talk to Matt Kaufmann)
- 3. PVS to OMDoc transformation (XSL; ambitious)
- PVS is a higher-order theorem prover wit a very substantial library. It can output XML system-near versions, so here we we only have to write an XSLT stylesheet here
- 4. Translate first-order logic with dependent types (DFOL) into first-order logic (ambitious)
- DFOL (see Florian's page for the paper) extends first-order logic with a restricted version of dependent types. If DFOL is translated to first-order logic, an automated prover can be used. While the translation is not too hard in principle, this topic is very challenging because it would require to get familiar with dependent types and possibly institutions (which in turn requires basic category theory). But if you like formal logic and abstract mathematics, this can be very interesting for you.
- 5. Develop type-checking for TPTP-HOL (ambitious)
- TPTP is a standardized language to represent first-order formulas. Recently an extension of the language to higher-order logic has been proposed. However, it is not clear yet, which formulas make sense. For this, the language has to be translated into Twelf. A parser for TPTP-HOL and the base signature for the Twelf translation already exist. What is needed now is to define the details of the translation (including the decision which formulas can be translated) and implement it. The project is challenging because it requires to get familiar with higher-order logic and Twelf.
- 6. Theorem Proving in Modal Logics
- The Modal Logic $100 challenge calls for a program that automatically determines the subset relationship between modal logics. The topic would be to improve last year's winner. The nice thing about this project is that it is an interesting and advanced open problem but is also relatively accessible to students. And since there is already a working implementation, only the interesting part of the code needs to be written (in the functional language SML/NJ). The amount of work is very flexible and can also be split among several students.
- 7. Automated Name Translation
- When translating between different languages, it is often non-triviaL to translate symbol and variable names because different languages allow different characters in names: The names should not change much, but it can be difficult to replace only a few symbols while still guaranteeing name uniqueness. The topic is to extend OMDoc with functionality that allows to specify name formation rules so that intuitive names can be generated automatically.
- 8. A Part-of-speech analyzer for mathematical texts
- This does some linguistic analysis for legacy scientific texts
- 9. Isabelle-style mixfix declarations for OMDoc
- Extending the specification of the OMDoc Presentation Module to include mixfix declarations and implementing the transformation in XSLT.