Publications: Theses
  1. Refinement of Temporal Constraints in an Event Recognition System using Small Datasets. Ph.D. Thesis, Supervisor: Prof. D.S. Brée, Department of Computer Science, University of Manchester, UK, 1997.

    Abstract:

    This thesis presents a simple representation for event detection systems and an efficient method for refining the temporal constraints of such a system using a small set of training examples. This task is in many respects atypical of existing work in machine learning and knowledge refinement.

    Firstly, this task involves continuous, sequential input from the environment, with a high degree of interdependence between events. To deal with this, a graphical representation is proposed that allows events to be defined hierarchically, in terms of sequences of part-events. This representation is augmented with a set of simple sequential and temporal constraints that capture the relations between the events participating in a definition.
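
    Purely as an illustration of what such a hierarchical, constraint-augmented event definition could look like, a minimal sketch follows; the class and attribute names (EventDefinition, TemporalConstraint, min_gap, max_gap) are assumptions for this sketch, not the thesis's actual representation.

    ```python
    from dataclasses import dataclass, field
    from typing import List, Union

    @dataclass
    class TemporalConstraint:
        # Allowed gap (e.g. in seconds) between the end of one part-event
        # and the start of the next one in the sequence.
        min_gap: float
        max_gap: float

        def satisfied_by(self, end_of_first: float, start_of_next: float) -> bool:
            gap = start_of_next - end_of_first
            return self.min_gap <= gap <= self.max_gap

    @dataclass
    class EventDefinition:
        # An event defined as an ordered sequence of part-events; parts may
        # themselves be EventDefinitions, which gives the hierarchy.
        name: str
        parts: List[Union[str, "EventDefinition"]]
        # constraints[i] relates part i to part i + 1 in the sequence.
        constraints: List[TemporalConstraint] = field(default_factory=list)

    # Example: a composite event made of two primitive part-events that must
    # occur within 0.5 to 3.0 seconds of each other.
    phrase = EventDefinition(
        name="phrase",
        parts=["unit_a", "unit_b"],
        constraints=[TemporalConstraint(min_gap=0.5, max_gap=3.0)],
    )
    ```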

    An additional challenge is posed by the limited quantity of available training data. This problem is tackled with a bias towards minimal change to the original model, implemented through special cost functions.
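
    One way to realise such a bias is a cost function that trades off fit to the training examples against the amount by which the refined constraints depart from the original model. The sketch below is a hypothetical illustration of this idea; the function name, the L1-style change penalty and the change_weight parameter are assumptions, not the cost functions used in the thesis.

    ```python
    from typing import List, Tuple

    def refinement_cost(
        candidate_bounds: List[Tuple[float, float]],
        original_bounds: List[Tuple[float, float]],
        training_error: float,
        change_weight: float = 1.0,
    ) -> float:
        # Total cost = misfit on the (small) training set plus a penalty on
        # how far the candidate constraint bounds move away from the original,
        # hand-crafted model. The penalty realises the bias towards minimal
        # change: with few examples, large departures from the original model
        # must be strongly justified by the data.
        change_penalty = sum(
            abs(new_lo - old_lo) + abs(new_hi - old_hi)
            for (new_lo, new_hi), (old_lo, old_hi) in zip(candidate_bounds, original_bounds)
        )
        return training_error + change_weight * change_penalty
    ```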

    Finally, the problem of refinement under partial supervision is addressed, in which feedback from the training data is available for only a subset of the events in the detection system. This calls for a special method of distributing feedback, which is facilitated by the hierarchical definition of events.
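
    Purely to illustrate how a part-of hierarchy makes such distribution possible, the sketch below pushes a label received for a composite event down to the part-events that fired for it. The propagation rule shown here is an assumption for the sketch, not the method developed in the thesis; it reuses the hypothetical EventDefinition structure sketched above.

    ```python
    def distribute_feedback(event, label, detections, feedback):
        # Record the label for this event, then pass it down to those
        # part-events that contributed to the detection, so that events
        # without direct supervision still receive indirect feedback.
        # Composite parts expose .name and .parts; primitive parts are
        # plain strings and receive no feedback of their own.
        feedback.setdefault(event.name, []).append(label)
        for part in event.parts:
            if hasattr(part, "parts") and part.name in detections:
                distribute_feedback(part, label, detections, feedback)
    ```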

    The system is tested on real-world data encoding marine mammal sounds: humpback whale songs recorded in Hawaii in 1978. The task is the thematic analysis of the songs, which have a very rich structure.

  2. The Scalability of Machine Learning Algorithms. M.Sc. Thesis, Supervisor: Prof. D.S. Brée, Department of Computer Science, University of Manchester, UK, 1993.

    Abstract:

    During the last two decades there has been significant research activity in Machine Learning, mainly concentrated on the task of empirical concept learning. This type of learning involves the acquisition of knowledge from a set of examples, the training set, using generalisation techniques.

    Empirical concept learning can be thought of as equivalent to the classification task previously performed by statistical techniques. Despite the large number of problems that can be viewed as classification tasks, ML techniques have not been widely applied to real-world problems. One possible reason is that learning programs cannot handle the large-scale data used in real applications.

    Considering that possibility, this thesis examines the scalability of five concept-learning algorithms, defining scalability as the effect that an increase in the size of the training set has on the computational performance of the algorithm. The programs considered are NewID (Niblett, 1989), C4.5 (Quinlan, 1993), PLS1 (Rendell, 1983), CN2 (Clark, 1989) and AQ15 (Michalski, 1986).

    The first part of the project involved a theoretical analysis of the algorithms, concentrating on their worst-case computational complexity. The results obtained deviate substantially from those previously reported (e.g. O'Rorke, 1982; Rendell, 1989), yielding over-quadratic worst-case estimates.

    The second part of the work is an experimental examination using real and artificial data sets. Two large real data sets were selected for this purpose, one dealing with letter recognition and the other with chromosome classification. The experiments performed on these two sets provide an indication of the average-case performance of the programs, which differs significantly from the worst case. The artificial data set, on the other hand, provides a near-worst-case situation, which confirms the theoretical results.
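
    Under this definition, scalability can be probed empirically by timing a learner on progressively larger training sets and inspecting how the time grows. The sketch below illustrates the idea only; it uses scikit-learn's DecisionTreeClassifier as a modern stand-in for the decision-tree learners studied, and random data in place of the original letter-recognition and chromosome sets, so it does not reproduce the thesis's experimental setup.

    ```python
    import time

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier  # modern stand-in for NewID/C4.5-style learners

    def training_time(n_examples: int, n_features: int = 16, seed: int = 0) -> float:
        # Wall-clock time to train one decision tree on a random data set of the given size.
        rng = np.random.default_rng(seed)
        X = rng.normal(size=(n_examples, n_features))
        y = rng.integers(0, 26, size=n_examples)  # e.g. 26 classes, as in letter recognition
        start = time.perf_counter()
        DecisionTreeClassifier().fit(X, y)
        return time.perf_counter() - start

    # If the time roughly quadruples every time the training set doubles, the
    # observed scaling is close to quadratic in the number of examples.
    for n in (1_000, 2_000, 4_000, 8_000):
        print(f"{n:>6} examples: {training_time(n):.3f} s")
    ```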

    The results of the theoretical and experimental analyses show that, although their worst-case computational complexity is over-quadratic, most of the examined algorithms can handle large amounts of data. Those that had difficulties did so not because of their order of complexity, but because of their basic computational "unit cost", which significantly affects their performance. The size of the training set is only one of the parameters affecting scalability; the examination of other factors (e.g. the complexity of the learning task) is equally interesting.