gr.demokritos.iit.ducTools
Class extractSubDocs

java.lang.Object
  extended by gr.demokritos.iit.ducTools.extractSubDocs

public class extractSubDocs
extends java.lang.Object

A class to extract subdocuments from a DUC 2005 XML document.


Constructor Summary
extractSubDocs()
           
 
Method Summary
static void main(java.lang.String[] args)
           
static void removeLineTag(java.lang.String sFile)
          Removes the line tags from the lines of an input file.
static void splitTexts(java.lang.String sFile)
          Splits an input document into its subdocuments.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

extractSubDocs

public extractSubDocs()

Method Detail

main

public static void main(java.lang.String[] args)
Parameters:
args - the command-line arguments. The first argument is the name of the input file. If no input file is given, test.txt is used as the input file.
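
The fallback described above can be illustrated with a minimal sketch. The class InputFileDefault and the method resolveInputFile are hypothetical names introduced only for this example; they are not part of extractSubDocs, and the actual main method may handle its arguments differently.

```java
/** Hypothetical helper illustrating the documented default; not part of extractSubDocs. */
class InputFileDefault {
    // Default input file named in the documentation of main.
    static final String DEFAULT_INPUT = "test.txt";

    /** Returns the first command-line argument, or the default when none is given. */
    static String resolveInputFile(String[] args) {
        return (args != null && args.length > 0) ? args[0] : DEFAULT_INPUT;
    }

    public static void main(String[] args) {
        System.out.println("Input file: " + resolveInputFile(args));
    }
}
```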

splitTexts

public static void splitTexts(java.lang.String sFile)
Splits an input document into its subdocuments. The input document is expected to comply with the DUC 2005 XML format. The method creates one file for every subdocument.

Parameters:
sFile - The filename to use as input.
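
As a rough illustration of this kind of splitting, the sketch below extracts blocks from a string and writes each one to its own file. It rests on two assumptions that this page does not confirm: that every subdocument is wrapped in <DOC>...</DOC> tags, and that output files are named by appending an index to the input filename. The class SubDocSplitSketch and its methods are hypothetical and do not reproduce the actual splitTexts implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the extractSubDocs implementation. */
class SubDocSplitSketch {
    // Assumption: each subdocument is delimited by <DOC> and </DOC>.
    private static final Pattern DOC_BLOCK =
            Pattern.compile("<DOC>(.*?)</DOC>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    /** Returns the trimmed content of every <DOC> block, in document order. */
    static List<String> split(String content) {
        List<String> subDocs = new ArrayList<>();
        Matcher m = DOC_BLOCK.matcher(content);
        while (m.find()) {
            subDocs.add(m.group(1).trim());
        }
        return subDocs;
    }

    /** Reads sFile and writes one output file per subdocument. */
    static List<Path> splitToFiles(String sFile) throws IOException {
        Path input = Paths.get(sFile);
        String content = new String(Files.readAllBytes(input), StandardCharsets.UTF_8);
        List<String> subDocs = split(content);
        List<Path> written = new ArrayList<>();
        for (int i = 0; i < subDocs.size(); i++) {
            // Assumed naming scheme: <input filename>.<1-based index>
            Path out = Paths.get(sFile + "." + (i + 1));
            Files.write(out, subDocs.get(i).getBytes(StandardCharsets.UTF_8));
            written.add(out);
        }
        return written;
    }
}
```

Keeping the string-level split separate from the file I/O makes the splitting rule easy to check without touching the file system.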

removeLineTag

public static void removeLineTag(java.lang.String sFile)
Removes the line tags from the lines of an input file.

Parameters:
sFile - The filename of the input file.
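
The page does not say what a line tag looks like. The sketch below assumes an XML-style tag named line, with or without attributes, and strips it from a single line of text. The class LineTagSketch and the method stripLineTags are hypothetical names for this example only; the actual removeLineTag method operates on a file and may match a different tag.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch; not the extractSubDocs implementation. */
class LineTagSketch {
    // Assumption: line tags look like <line>, <line attr="...">, or </line>.
    private static final Pattern LINE_TAG =
            Pattern.compile("</?line(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    /** Removes opening and closing line tags, leaving the enclosed text intact. */
    static String stripLineTags(String line) {
        return LINE_TAG.matcher(line).replaceAll("");
    }
}
```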