DBLab School of Computer and Electrical Engineering KDBSL NTUA
Wednesday, October 20, 2021

Internal Seminars 2006-2007

Event-Condition-Action Rule Languages on Semistructured Data

Γιώργος Παπαμάρκος

Ontology-based Conceptual Design of ETL Processes for both Structured and Semi-structured Data

One of the most important tasks performed in the early stages of a
data warehouse project is the analysis of the structure and content
of the existing data sources and their intentional mapping to a
common data model. Establishing the appropriate mappings between
the attributes of the data sources and the attributes of the
data warehouse tables is critical in specifying the required transformations
in an ETL workflow. The selected data model should
be suitable for facilitating the redefinition and revision efforts,
typically occurring during the early phases of a data warehouse
project, and serve as the means of communication between the
involved parties. In this paper, we argue that ontologies constitute
a very suitable model for this purpose and show how the usage of
ontologies can enable a high degree of automation regarding the
construction of an ETL design.

Distributed triggers for peer data management

Βηρένα Καντερέ

A combination of trie-trees and inverted files for the indexing

Μανώλης Τερροβίτης

Focusing on Streams

Applications in various domains are increasingly shifting their focus
from processing historical data to analyzing streaming data in an
online fashion. By continuously tracking internal and external
processes, they can react to significant situations in a timely and
efficient manner.

In this talk, I will present some application scenarios, other than
sensor networks, that are gaining popularity in the business world. I
will then focus on a particular problem in this space, namely
deviation detection in streaming data, and present our approach for
solving it in a distributed manner. Finally, I will talk about other
work we are currently doing in the context of streaming data
management.