1. Introduction

We study models and techniques for managing Web data, and we explore alternative ways for databases and Web pages to interact. XML has become a standard for information exchange on the Web, facilitating the automatic processing of Web data sources. Since XML can be seen at a logical level as a data model that structures data in a hierarchy, many interesting research issues arise concerning the management of XML Web data.

2. Research Issues

-Hierarchical Schema Management
T. Sellis, T. Dalamagas

Hierarchical schemas play an important role on the Web. They are used to semantically enrich the available information, e.g. in the form of tree-like structures with syntactic constraints and type information (e.g. DTDs, XML schemas), or in the form of hierarchies on a category/subcategory basis (e.g. portal catalogs). We have proposed a framework to manage lightweight hierarchical structures for the Web as first-class citizens. Our major tasks include:

  • application of clustering algorithms to discover groups of similar structures encoded as XML documents,
  • formal methods and operators to manipulate the structure of hierarchical schemas (e.g. in portal catalogs),
  • languages for pattern-based manipulation of navigational pathways in hierarchical schemas.
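
As an illustration of the first task only, the sketch below groups XML documents by structural similarity. It is not the method of the publications listed in Section 3: it assumes a deliberately simple structural summary (the set of root-to-node tag paths), Jaccard similarity between summaries, and a greedy threshold grouping. The names tag_paths, jaccard and cluster, the threshold value and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths, ignoring text and attributes."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def jaccard(a, b):
    """Jaccard similarity of two path sets (1.0 means identical path sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.5):
    """Greedy grouping (an assumption, not the authors' algorithm): a document
    joins the first cluster holding a member at least `threshold` similar."""
    summaries = [tag_paths(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, summary in enumerate(summaries):
        for members in clusters:
            if any(jaccard(summary, summaries[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical sample documents: two book records and one portal catalog entry.
docs = [
    "<book><title>A</title><author>X</author></book>",
    "<book><title>B</title><author>Y</author><year>2004</year></book>",
    "<portal><category><link>u</link></category></portal>",
]
groups = cluster(docs)
```

With these sample documents the two book records share three of four tag paths and fall into one group, while the portal entry forms its own. The greedy pass depends on document order; a hierarchical clustering algorithm would avoid that at the cost of more comparisons.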

-Context-dependent Information Management
T. Sellis, Y. Stavrakas

We explore the role of context in managing and accessing information. Context-dependent data is particularly relevant on the Web, and our goal is to investigate how to augment the capabilities of information sources so that they treat context as a first-class citizen. We are also interested in applying this work to the delivery of personalized information. We have developed prototype tools for:

  • designing and populating context-aware semistructured databases,
  • querying such databases using MQL, a query language that directly supports the notion of context,
  • using context information for tailoring XML documents to particular users, through a purpose-built Web server,
  • representing and querying the history of semistructured and XML data (valid time is seen as context).
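
To make the notion of context-dependent data more concrete, the sketch below models an entity whose value varies with context and selects the variants that hold under a given context. It is a simplified illustration under our own assumptions, not the data model of the prototype tools and not MQL: the representation (a list of specifier/value pairs), the function names holds and facets_for, and the dimensions lang and device are all hypothetical.

```python
def holds(specifier, context):
    """A specifier maps each dimension to the set of values under which a
    variant is valid. Dimensions the specifier does not mention are
    unconstrained; a dimension it mentions must be present in `context`."""
    return all(context.get(dim) in allowed for dim, allowed in specifier.items())


def facets_for(entity, context):
    """Return the variants of a context-dependent entity that hold under `context`."""
    return [value for specifier, value in entity if holds(specifier, context)]


# Hypothetical entity: a page title that depends on language and device.
title = [
    ({"lang": {"en"}, "device": {"desktop", "mobile"}}, "Welcome"),
    ({"lang": {"gr"}}, "Kalos irthate"),
]

english_mobile = facets_for(title, {"lang": "en", "device": "mobile"})
greek_desktop = facets_for(title, {"lang": "gr", "device": "desktop"})
```

Under the same assumptions, valid time could be treated as one more dimension, with each variant valid only for the time values its specifier lists.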

-Web Caching
T. Sellis, M. Veliskakis

Our work focuses on the development of algorithms for caching dynamic web objects, i.e. query results that a client on the Web might request from the origin server. Conventional Web caching treats dynamic web objects as non-cacheable, since they cannot be expected to stay up to date for a reasonable period of time. We are implementing a new kind of proxy that can serve requests for dynamic web objects, taking into account issues of proxy topology and query rewriting.
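
The following minimal sketch shows one way a proxy could cache query results and still keep them fresh: each cached result records the tables it was computed from and is dropped when one of them changes. This is an assumption made for illustration, not the algorithm under development, and it does not address proxy topology or query rewriting. The class QueryResultCache, its methods and the origin function are hypothetical.

```python
class QueryResultCache:
    """Caches query results and invalidates them when a base table changes."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that contacts the origin server
        self._results = {}           # normalized query -> cached result
        self._by_table = {}          # table name -> set of normalized queries

    @staticmethod
    def _normalize(query):
        # Collapse whitespace only; changing case could alter string literals.
        return " ".join(query.split())

    def get(self, query, tables):
        """Return the cached result, asking the origin server only on a miss."""
        key = self._normalize(query)
        if key not in self._results:
            self._results[key] = self._run_query(query)
            for table in tables:
                self._by_table.setdefault(table, set()).add(key)
        return self._results[key]

    def invalidate(self, table):
        """Drop every cached result that was computed from `table`."""
        for key in self._by_table.pop(table, set()):
            self._results.pop(key, None)


# Hypothetical origin server that records how often it is contacted.
calls = []


def origin(query):
    calls.append(query)
    return f"rows for: {query}"


cache = QueryResultCache(origin)
first = cache.get("SELECT * FROM papers", tables=["papers"])
second = cache.get("SELECT  *  FROM papers", tables=["papers"])  # cache hit
cache.invalidate("papers")                                       # e.g. after an update
third = cache.get("SELECT * FROM papers", tables=["papers"])    # fetched again
```

In this run the origin server is contacted twice rather than three times. The caller must state which tables a query reads; deriving that automatically would require parsing the query.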

3. Selected Publications

Y. Stavrakas, M. Gergatsoulis, C. Doulkeridis, V. Zafeiris. Representing and Querying Histories of Semistructured Databases Using Multidimensional OEM. In Information Systems Journal (IS), Vol. 29, Issue 6, pp 461-482, September 2004.

T. Dalamagas, T. Cheng, K. J. Winkel, T. Sellis, Clustering XML documents using structural summaries, EDBT Workshop on Clustering Information over the Web (ClustWeb'04), Heraklion, Greece, 2004.

T. Dalamagas, A. Meliou, T. Sellis, Modeling and Manipulating the Structure of Portal Catalogs, Tech. Report TR-2003-2, KDBS Lab, Dept of Electrical and Computer Eng, NTU Athens, 2003.

M. Gergatsoulis, Y. Stavrakas. Representing Changes in XML Documents Using Dimensions. In XML Database Symposium (XSym 2003), in conjunction with VLDB 2003, Berlin, Germany, September 2003.

Y. Stavrakas, M. Gergatsoulis. Multidimensional Semistructured Data: Representing Context-Dependent Information on the Web. In Proc. of the 14th International Conference on Advanced Information Systems Engineering (CAiSE 2002), Toronto, Canada, May 2002.

4. Funding

DELOS Network of Excellence on Digital Libraries, European Commission, IST programme, G038-507618, 2004-2006.

PYTHAGORAS EPEAEK II programme, EU and Greek Ministry of Education, 2004-2006.

5. Contacts

Prof. Timos Sellis
Phone: +30-1-772-1601, Fax: +30-1-772-1442
e-mail: timos@dblab.ece.ntua.gr

Theodore Dalamagas
Phone: +30-1-772-1402, Fax: +30-1-772-1442
e-mail: dalamag@dblab.ece.ntua.gr

School of Electr. and Comp. Engineering
Computer Science Division
National Techn. Univ. of Athens
Zographou, 157 73 Athens, Greece

