LIDER Hackathon

  • The MLODE Hackathon will be on September 1st at 9am. You can check the full agenda on the registration page.

 

This hackathon, sponsored by the LIDER project (http://lider-project.eu), will start with a session about the current technological progress and available resources in CLARIN, META-SHARE, NLP2RDF, Linked Open Data, lemon and Marl. This session will be followed by 3 sessions of hacking and developing. Developers will be there to help you and give advice. At the beginning of each session, you will have the chance to describe your problem in a lightning talk and ask the audience for help.

The AKSW research group and the LIDER project will provide bleeding-edge linked data experts who will offer technical counselling and hands-on help.

So bring your laptop (or send your developer, if you are the CEO); we offer:

  • individual coaching
  • training in applying best practices
  • help in generating Linked Data from your assets
  • tips on open-source frameworks

We will split up in small teams on-site.

Confirmed Topics

How to create a NodeJS component for context analysis

ConTEXT is a platform for lightweight text analytics based on Linked Data. ConTEXT consists of several components for data collection, input processing, content visualization and exploration, as well as user feedback. The application is built with the MEAN stack (MongoDB, ExpressJS, AngularJS, NodeJS) and is available on GitHub.

In this project, we would like to write extensions for conTEXT that consume Linked Data services for the purpose of text analytics.
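As a minimal sketch of such an extension, the NodeJS component below consumes the JSON output of a Linked Data annotation service and normalises it into flat annotation records that a conTEXT view could display. Both the component interface and the response shape (modelled loosely on DBpedia Spotlight-style services) are assumptions for illustration, not conTEXT's actual API:

```javascript
// Normalise one annotation-service response into flat annotation records.
// The '@surfaceForm'/'@URI'/'@offset' keys are an assumed response shape,
// not conTEXT's or any specific service's actual format.
function toAnnotations(response) {
  return (response.Resources || []).map(function (r) {
    return {
      surfaceForm: r['@surfaceForm'],     // the annotated text span
      uri: r['@URI'],                     // the linked LOD resource
      offset: parseInt(r['@offset'], 10)  // character offset in the input
    };
  });
}

// Hypothetical component entry point: annotate free text via a remote
// Linked Data service. The endpoint is a placeholder; a real extension
// would read it from the component's configuration.
async function analyze(text, endpoint) {
  const res = await fetch(endpoint + '?text=' + encodeURIComponent(text));
  return toAnnotations(await res.json());
}

module.exports = { toAnnotations, analyze };
```

Keeping the response-mapping logic in a separate pure function makes the component easy to test without network access.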

Links:
http://context.aksw.org/

https://github.com/AKSW/context

Contact person:

Ali Khalili (khalili@informatik.uni-leipzig.de)

Enrichment of industry lexical data with LOD

The growing amount of LD dictionary resources in the Lemon format creates the need for sophisticated sense-mapping techniques that enable interoperability between dictionary resources and linking to existing LOD datasets. Furthermore, sense mapping can be used to enrich legacy industry data with data from the LD cloud, providing a tangible use case of LD for potential industry adopters.

K Dictionaries (http://kdictionaries.com/) is a commercial provider of rich, high-quality, multilingual dictionary data. They have shown much interest in the possibilities of 3LD but are unsure of its benefits for them.

As a preliminary step, an XSLT stylesheet was written to migrate their data to the Lemon format. However, this alone does not offer much of a benefit over their current infrastructure. Enriching the data with other data from the 3LD cloud, on the other hand, would be very beneficial to them and would also present an exemplary mapping, merging and linking task.

For this purpose, K Dictionaries provides a 120-entry extract of their German language database for testing. DBnary will be used as the resource to merge with and link to; WordNet-RDF, BabelNet and DBpedia are further candidates.

The task includes:

  • Migrating the original K Dictionaries XML format into Lemon RDF (already done)
  • Extracting LexicalSenses from the dictionary
  • Mapping them to equivalent LexicalSenses in the 3LD cloud
  • Adding links (for example via lemon:reference) and missing properties to the mapped senses of the dictionary
  • (Optionally) adding new LexicalSense resources to the LexicalEntries of the dictionary if they are missing
  • Presenting an enriched/linked Lemon RDF document as the result
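The mapping and linking steps above could be sketched as follows: candidate matches are found by comparing the written form and part of speech of each dictionary sense against senses from a target resource such as DBnary, and each match is emitted as a Turtle triple. The matching criterion is deliberately naive, the URIs are made up for illustration, and whether lemon:reference or a different linking property fits best depends on the modelling decisions taken:

```javascript
// Naive sense mapping: link dictionary senses to target-resource senses
// that share a written representation and part of speech, emitting one
// Turtle triple per match. All URIs here are illustrative placeholders.
function mapSenses(dictSenses, targetSenses) {
  const links = [];
  for (const d of dictSenses) {
    for (const t of targetSenses) {
      if (d.writtenRep === t.writtenRep && d.pos === t.pos) {
        links.push(
          '<' + d.uri + '> <http://lemon-model.net/lemon#reference> <' + t.uri + '> .'
        );
      }
    }
  }
  return links.join('\n');
}
```

A real mapper would of course use stronger evidence than the lemma alone (glosses, translations, domain labels) to disambiguate between candidate senses.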

A successful completion of this task would also provide an interoperability component for Lemon that could be used to better merge multiple Lemon dictionaries.

Contact: Martin Brümmer (bruemmer@informatik.uni-leipzig.de)

Discovering and declaring rights expressions

A basic rights expression language and a simple API will be described.
Using the API, participants will be able to automatically declare, publish and parse rights expressions along with the Linguistic Linked Data resource. From access control to monitoring uses that infringe copyright, participants will have a wide range of interesting applications to conceive!
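As a small sketch of declaring such a rights expression with the ODRL 2.0 Ontology [1], the helper below serialises a one-permission policy as Turtle. The policy and asset URIs are placeholders; only the ODRL terms (odrl:Set, odrl:permission, odrl:target, odrl:action) come from the ontology itself:

```javascript
// Serialise a minimal ODRL 2.0 policy as Turtle: one permission that
// allows the given action on the given asset. URIs are placeholders.
function rightsExpression(policyUri, assetUri, action) {
  return [
    '@prefix odrl: <http://www.w3.org/ns/odrl/2/> .',
    '',
    '<' + policyUri + '> a odrl:Set ;',
    '    odrl:permission [',
    '        odrl:target <' + assetUri + '> ;',
    '        odrl:action odrl:' + action,
    '    ] .'
  ].join('\n');
}
```

The resulting Turtle could be published alongside the dataset it governs, e.g. in a VoID description, so consumers can parse the declared rights automatically.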

Links:

[1] ODRL 2.0 Ontology http://www.w3.org/ns/odrl/2/
[2] RDFLicense Dataset with licenses in RDF: http://datahub.io/dataset/rdflicense
[3] ODRL API:  http://vroddon.github.io/odrlapi/

Contact person: Víctor Rodríguez-Doncel (vrodriguez@fi.upm.es)

Roundtrip conversion from TBX to RDF and back

The idea is to work on a roundtrip conversion from the TBX standard for representing terminology to RDF and back, building on the existing code at Bitbucket: https://bitbucket.org/vroddon/tbx2rdf
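The roundtrip idea can be illustrated on a simplified in-memory model: a term entry is mapped to RDF-style triples and reconstructed from them without loss. The data model and property names below are made up for illustration and are not the actual tbx2rdf vocabulary; the real task involves parsing TBX XML and emitting (and re-serialising) proper RDF:

```javascript
// Forward direction: one triple per term, with the entry id as subject
// and a per-language predicate. Illustrative model, not the tbx2rdf one.
function entryToTriples(entry) {
  return entry.terms.map(function (t) {
    return { s: entry.id, p: 'term-' + t.lang, o: t.value };
  });
}

// Reverse direction: recover the entry id and per-language terms.
// The roundtrip is lossless because the mapping is invertible.
function triplesToEntry(triples) {
  return {
    id: triples[0].s,
    terms: triples.map(function (tr) {
      return { lang: tr.p.replace('term-', ''), value: tr.o };
    })
  };
}
```

The hard part of the real conversion is keeping the mapping invertible for TBX's full metadata (term notes, administrative status, etc.), which this toy model sidesteps.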

Links:
Source code: https://bitbucket.org/vroddon/tbx2rdf

TBX Standard: http://www.ttt.org/oscarstandards/tbx/

Contact person: Philipp Cimiano (jmccrae@cit-ec.uni-bielefeld.de), John McCrae (jmccrae@cit-ec.uni-bielefeld.de), Victor Rodriguez-Doncel (vrodriguez@fi.upm.es) and Tatiana Gornostay (tatiana.gornostay@tilde.lv)

Converting multilingual dictionaries as LD on the Web

The experience of creating the Apertium RDF dictionaries will be presented. Taking as a starting point a bilingual dictionary represented in LMF/XML, a mapping into RDF was made using tools such as Open Refine. From each bilingual dictionary, three components (graphs) were created in RDF: two lexicons and a translation set. The vocabularies used were lemon, for representing lexical information, and the translation module, for representing translations. Once the dictionaries were published on the Web, some immediate benefits arose, such as automatic enrichment of the monolingual lexicons each time a new dictionary is published (due to the reuse of URIs), simple graph-based navigation across the lexical information and, more interestingly, simple querying across (initially) independent dictionaries.

The task could be either to reproduce part of the Apertium generation process, for those willing to learn about lemon and about techniques for representing translations in RDF, or to repeat the process with other input data (bilingual or multilingual lexica) provided by participants.
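The three-graph layout described above (two lexicons plus a translation set) can be sketched for a single bilingual entry. The namespace prefixes and property names here are illustrative approximations of lemon and the translation module, not the exact Apertium RDF vocabulary:

```javascript
// Emit illustrative Turtle for one bilingual entry: an entry+sense in
// each lexicon graph and a translation resource linking the two senses.
// Prefixes (lemon:, tr:) are assumed to be declared elsewhere.
function translationTriples(base, src, tgt) {
  const sSense = base + 'sense/' + src.lemma + '-' + src.lang;
  const tSense = base + 'sense/' + tgt.lemma + '-' + tgt.lang;
  return [
    // lexicon graph A: source-language entry with its sense
    '<' + base + src.lang + '/' + src.lemma + '> a lemon:LexicalEntry ; lemon:sense <' + sSense + '> .',
    // lexicon graph B: target-language entry with its sense
    '<' + base + tgt.lang + '/' + tgt.lemma + '> a lemon:LexicalEntry ; lemon:sense <' + tSense + '> .',
    // translation set graph: links the two senses
    '[] a tr:Translation ; tr:source <' + sSense + '> ; tr:target <' + tSense + '> .'
  ].join('\n');
}
```

Because translations link senses rather than entries, a new dictionary that reuses an existing lexicon's sense URIs immediately becomes queryable together with the dictionaries already published.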

Contact person: Jorge Gracia (jgracia@fi.upm.es)

Converting the output of Babelfy into RDF-NIF

Babelfy is a unified, multilingual, graph-based approach to Entity Linking and Word Sense Disambiguation. Based on a loose identification of candidate meanings, coupled with a densest-subgraph heuristic which selects high-coherence semantic interpretations, Babelfy is able to annotate free text with both concepts and named entities drawn from BabelNet’s sense inventory.

The task consists of converting text annotated by Babelfy into RDF format. In order to accomplish this, participants will start from free text, will annotate it with Babelfy and will eventually make use of the NLP2RDF NIF module.
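The conversion step could be sketched as follows: each annotation (a character span plus a BabelNet/DBpedia URI) becomes a NIF string resource with RFC 5147 character offsets, linked to its sense via itsrdf:taIdentRef. The input annotation shape is an assumption for illustration, not Babelfy's exact output format, and the prefixes (nif:, itsrdf:) are assumed to be declared elsewhere:

```javascript
// Convert character-offset annotations over a text into NIF-style Turtle:
// a nif:Context for the whole text plus one nif:String per annotation.
function toNif(docUri, text, annotations) {
  const ctx = docUri + '#char=0,' + text.length;
  const lines = [
    '<' + ctx + '> a nif:Context ;',
    '    nif:isString """' + text + '""" .'
  ];
  for (const a of annotations) {
    lines.push('<' + docUri + '#char=' + a.start + ',' + a.end + '> a nif:String ;');
    lines.push('    nif:beginIndex "' + a.start + '" ;');
    lines.push('    nif:endIndex "' + a.end + '" ;');
    lines.push('    nif:anchorOf """' + text.slice(a.start, a.end) + '""" ;');
    lines.push('    nif:referenceContext <' + ctx + '> ;');
    lines.push('    itsrdf:taIdentRef <' + a.sense + '> .');
  }
  return lines.join('\n');
}
```

The NLP2RDF NIF module provides the actual vocabulary and URI scheme, so participants mainly need to map Babelfy's spans and sense identifiers onto it.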

Contact person: Tiziano Flati (flati@di.uniroma1.it), Roberto Navigli (navigli@di.uniroma1.it)
