KWARC for students

The KWARC Group welcomes student involvement in research. If you are interested, please send an e-mail to m.kohlhase@jacobs-university.de, or come to our seminars and courses

Graduate Course: 320541 Computational Semantics of Natural Language

Project Work

This might lead to internships or RA positions in our group (just talk to Michael Kohlhase if you want to help). See below the list of potential topics:


1. ABox extraction from OMDoc
We want to extract explicit knowledge (e.g. "Example 1 explains the Pythagorean Theorem") from mathematical documents encoded in OMDoc and store it as RDF triples. This should preferably be done in a way that is independent from OMDoc's XML schema and OMDoc's system ontology, so that it can be applied to other data formats as well.
Secondly, the XHTML+MathML web pages generated from OMDoc should be annotated with this knowledge using RDFa, that is, RDF embedded into HTML.
2. arXMLiv project
We can use LaTeX gurus and Perl hackers (and people who want to become either)
3. MWS Editor UI
Integrating the WIRIS Math Editor into the MathWebSearch engine UI
4. Theory Morphisms in GenCS
A theory morphism is a symbol mapping between theories, such that truth is preserved (this actually has a precise definition). There must be lots of them in the GenCS material, they should be found and marked up.

Guided Research/Masters Theses

The Kwarc group offers the following thesis topics:

1. Transformation of the TPTP content to OMDoc
TPTP is a library of theorems in standard syntax. They are used to benchmark automated provers. There are parsers for this, so you would only have to hack the emitter for OMDoc.
2. Transformation of the ACL2 library to OMDoc
the same idea as above, only here the syntax is in LISP. (remember: ACL2 Book is $15 from the website; also talk to Matt Kaufmann)
3. PVS to OMDoc transformation (XSL; ambitious)
PVS is a higher-order theorem prover wit a very substantial library. It can output XML system-near versions, so here we we only have to write an XSLT stylesheet here
4. Translate first-order logic with dependent types (DFOL) into first-order logic (ambitious)
DFOL (see Florian's page for the paper) extends first-order logic with a restricted version of dependent types. If DFOL is translated to first-order logic, an automated prover can be used. While the translation is not too hard in principle, this topic is very challenging because it would require to get familiar with dependent types and possibly institutions (which in turn requires basic category theory). But if you like formal logic and abstract mathematics, this can be very interesting for you.
5. Develop type-checking for TPTP-HOL (ambitious)
TPTP is a standardized language to represent first-order formulas. Recently an extension of the language to higher-order logic has been proposed. However, it is not clear yet, which formulas make sense. For this, the language has to be translated into Twelf. A parser for TPTP-HOL and the base signature for the Twelf translation already exist. What is needed now is to define the details of the translation (including the decision which formulas can be translated) and implement it. The project is challenging because it requires to get familiar with higher-order logic and Twelf.
6. Theorem Proving in Modal Logics
The Modal Logic $100 challenge calls for a program that automatically determines the subset relationship between modal logics. The topic would be to improve last year's winner. The nice thing about this project is that it is an interesting and advanced open problem but is also relatively accessible to students. And since there is already a working implementation, only the interesting part of the code needs to be written (in the functional language SML/NJ). The amount of work is very flexible and can also be split among several students.
7. Automated Name Translation
When translating between different languages, it is often non-triviaL to translate symbol and variable names because different languages allow different characters in names: The names should not change much, but it can be difficult to replace only a few symbols while still guaranteeing name uniqueness. The topic is to extend OMDoc with functionality that allows to specify name formation rules so that intuitive names can be generated automatically.
8. A Part-of-speech analyzer for mathematical texts
This does some linguistic analysis for legacy scientific texts
9. Isabelle-style mixfix declarations for OMDoc
Extending the specification of the OMDoc Presentation Module to include mixfix declarations and implementing the transformation in XSLT.
© 2006 Copyrights KWARC. | XHTML 1.0 | CSS | Page generated from XML sources with the WSML package