This is an advanced graduate course covering state-of-the-art developments in XML and World-Wide Web research and technologies.

Prerequisites: You must meet all of the following conditions to take this course:

  • You need to be a graduate student.
  • You should already be familiar with basic database topics.
  • You should be willing to read at least 2 research papers every week.

Since this is a seminar course, it is important for students to read the materials ahead of time and participate actively in the discussions.

Goal: The advent of powerful personal computers and the Internet has resulted in exponential growth in digital information. Anyone can create rich content on a personal computer and make it available on the Web. Many high-quality data sources are also available online with a single mouse click.

The growth of digital information, however, has brought tremendous challenges in managing, organizing, and accessing it. The inherent heterogeneity of the information, which spans unstructured text, semi-structured data, and structured data, makes it difficult to handle. In this course, we will go over recent research papers to study the cutting-edge methods that researchers have developed to manage Web and XML information on the Internet.

In particular, the class will be structured around two major themes:

  • The characteristics and understanding of the World-Wide Web, and
  • The XML model from a database perspective

In addition, this class aims to train students to write a (database/web) research paper:

  • Where to start
  • How to read and critique others' papers
  • How to find problems and solutions
  • How to package them in a publishable format

Introduction: The course consists of lectures by the instructor, paper readings, in-class discussions, selected presentations, a project, and a final exam. Most of the work consists of reading state-of-the-art journal and conference papers. We will cover about 2 papers each week (one paper per class). The class will be primarily discussion-based, although it will have some lecture components. Active discussion will (hopefully) give students a non-trivial understanding of the material. This approach can work only if students read the papers carefully in advance. Expect as much as 10 hours of reading per week.

Do NOT take this course unless you are willing to do a lot of reading.

  • Instructor: Dongwon Lee, dongwon (at), 4E Thomas Bldg.
  • Time: T R 09:45A - 11:00A
  • Office Hour: T R 11:00A - noon
  • Location: 174 WILLARD
  • NO Textbook: This course is based on research papers and hand-outs.


  • Reading Assignments (7-10): 30%
  • Class participation: 10%
  • Presentations (1): 10%
  • Project (1): 30%
  • Take-home Final (1): 20%
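
As a rough illustration of how the weights above combine, the following Python sketch computes a weighted course score. It assumes every component is scored on a 0-100 scale, which the syllabus does not specify; the names `WEIGHTS` and `course_score` are hypothetical.

```python
# Component weights taken from the grading list above (they sum to 100%).
WEIGHTS = {
    "reading_assignments": 0.30,
    "class_participation": 0.10,
    "presentation": 0.10,
    "project": 0.30,
    "final": 0.20,
}


def course_score(scores):
    """Return the weighted course score.

    `scores` maps each component name to a value on an ASSUMED 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

Under these assumptions, component scores of 90 (reading), 80 (participation), 70 (presentation), 85 (project), and 75 (final) give a course score of 82.5.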

Reading Assignments

There is no textbook for this course. The course is based on a collection of journal and conference papers. To encourage reading, each student must pick at least 2 papers from the week's list (or other related papers with the consent of the instructor) and write a review of each. Each review should be 3-5 paragraphs (or shorter), consisting of:

  1. Summary (1 paragraph),
  2. Comments and criticisms (2-4 paragraphs)
  3. One exam question (1-2 paragraphs)
The instructor will select some of the good exam questions to use for the final. Look at this example question.

Paper reviews are due at midnight every MONDAY. Please submit each review by emailing it to both (1) the instructor and (2) the assigned grader or graders for the week (explained below) with the subject "IST597 Paper Review: Title", where Title is the title of the paper. Please send plain ASCII text messages only (i.e., no PostScript, PDF, or MS-Word files). Please use separate emails for different papers, even if they are due at the same time.
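
The submission format above can be checked mechanically before sending. The sketch below is only an illustration: the helper names `review_subject` and `is_plain_ascii` are hypothetical, and the paper title in the usage note is a placeholder.

```python
def review_subject(title):
    """Build the email subject line in the format requested above."""
    return f"IST597 Paper Review: {title.strip()}"


def is_plain_ascii(text):
    """Return True if every character in `text` is 7-bit ASCII."""
    return text.isascii()
```

For instance, `review_subject("Some Paper Title")` yields `IST597 Paper Review: Some Paper Title`, and `is_plain_ascii` returns `False` for a review body that contains curly quotes or accented characters.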

To accommodate unexpected emergencies, students may skip the paper review FOR ONE WEEK without penalty.

Each week, a few students in the class will be pre-selected as assigned graders. The graders will be responsible for reading the submitted reviews carefully and grading them as Excellent, Good, or Fair. About 20% of the reviews should be graded Excellent, 20% Fair, and the remaining 60% Good.
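
To make the 20/60/20 split concrete, here is a small sketch of how a grader might turn it into counts for a given number of submitted reviews. The rounding rule (round half up, with Good taking whatever remains) is an assumption, not something the syllabus specifies, and `grade_quotas` is a hypothetical name.

```python
def grade_quotas(num_reviews):
    """Split `num_reviews` into Excellent/Good/Fair counts (20%/60%/20%).

    ASSUMPTION: the 20% shares are rounded half up using integer
    arithmetic, and Good receives the remainder.
    """
    if num_reviews < 0:
        raise ValueError("num_reviews must be non-negative")
    share = (2 * num_reviews + 5) // 10  # round(num_reviews / 5), half up
    return {
        "Excellent": share,
        "Good": num_reviews - 2 * share,
        "Fair": share,
    }
```

With 10 reviews this gives 2 Excellent, 6 Good, and 2 Fair.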

Here is an example review that is considered Excellent.


Students will complete a substantial research or implementation project on a given or self-defined topic related to the theme of the class. Students may work ALONE or in a team of up to 2 people.

In general, students will choose a set of interesting research papers discussed in class (or elsewhere) and attempt one or more of the following:

  • Survey,
  • Extension,
  • Application, and/or
  • Implementation

The eventual output of the project is (ideally) something like a research paper of publishable quality.

At the end of the semester, students present their projects in class, and submit a written document.


NOTE: Lists are subject to change as the semester progresses.

NOTE: We provide copies of the papers here only for your convenience. The copyrights remain with the original authors or publishers of the papers. Please do not use or distribute them for purposes other than study.

Week 9/2: Introduction

  • Slides: THUR 1 (Instructor), THUR 2 (Instructor)
  • Course and project description
  • How to write research papers (or how to graduate quickly?)

Week 9/9: NO CLASS

Midnight 9/15: Presentation Paper Selection Due: Choose three papers, in order of preference (1st, 2nd, and 3rd), that you want to present in class, and email the titles to the instructor. You will be assigned only one of the three. Pick the three from week 9/23 through the end (do not choose any from week 9/16).

Week 9/16: Web Characteristics

Midnight 9/22: Project Draft Due: Decide on your research project, write a ONE-page plan (e.g., what you will do and how) in plain text, and email it to the instructor.

Week 9/23: Web Search

Week 9/30: Web Communities

Week 10/7: XML Overview

Week 10/14: Querying XML

Week 10/21: XML to Relational Conversions

Midnight 10/27: Project Progress Report: Write a progress report of up to FIVE pages (e.g., what you did and how) in plain text, and email it to the instructor.

Week 10/28: XML Query Relaxation

Week 11/4: XML Indexing and Query Optimization

Week 11/11: XML and Security

Week 11/18: XML and Triggers

Week 11/26 -- 11/28: Thanksgiving Holidays

11/25, 12/2 and 12/4: Project Presentation

Tue. 12/9: Take-home Final Exam

12/10: Final Project Report: Submit a hardcopy of your final project report. You may prepare it with any word-processing program or format (e.g., Word, PDF). Submit it along with your take-home final exam.


University Policies

Academic Integrity: According to the Penn State Principles and University Code of Conduct: Academic integrity is a basic guiding principle for all academic activity at Penn State University, allowing the pursuit of scholarly activity in an open, honest, and responsible manner. In accordance with the University's Code of Conduct, you must not engage in or tolerate academic dishonesty. This includes, but is not limited to, cheating, plagiarism, fabrication of information or citations, facilitating acts of academic dishonesty by others, unauthorized possession of examinations, submitting work of another person, or work previously used without informing the instructor, or tampering with the academic work of other students. Any violation of academic integrity will be investigated, and where warranted, punitive action will be taken. For every incident when a penalty of any kind is assessed, a report must be filed. This form is used for both undergraduate and graduate courses. This report must be signed by both the instructor and the student, and then submitted to the Senior Associate Dean.

Affirmative Action & Sexual Harassment: The Pennsylvania State University is committed to a policy that all persons shall have equal access to programs, facilities, admission, and employment without regard to personal characteristics not related to ability, performance, or qualifications as determined by University policy or by Commonwealth or Federal authorities. Penn State does not discriminate against any person because of age, ancestry, color, disability or handicap, national origin, race, religious creed, gender, sexual orientation, or veteran status. Direct all inquiries to the Affirmative Action Office, 211 Willard Building.

Americans with Disabilities Act: IST welcomes persons with disabilities to all of its classes, programs, and events. If you need accommodations, or have questions about access to buildings where IST activities are held, please contact us in advance of your participation or visit. If you need assistance during a class, program, or event, please contact the member of our staff or faculty in charge. Access to IST courses should be arranged by contacting the Office of the Senior Associate Dean, 002D Thomas Building, (814) 865-4457.

An Invitation to Students with Learning Disabilities: It is Penn State's policy to not discriminate against qualified students with documented disabilities in its educational programs. If you have a disability-related need for modifications in your testing or learning situation, your instructor should be notified during the first week of classes so that your needs can be accommodated. You will be asked to present documentation from the Office of Disability Services (located in 116 Boucke Building, 863-1807) that describes the nature of your disability and the recommended remedy. You may refer to the Nondiscrimination Policy in the Student Guide to University Policies and Rules.

