gr.demokritos.iit.conceptualIndex.documentModel
Class DistributionWordDocument

java.lang.Object
  extended by gr.demokritos.iit.conceptualIndex.documentModel.DistributionDocument
      extended by gr.demokritos.iit.conceptualIndex.documentModel.DistributionWordDocument
All Implemented Interfaces:
java.io.Serializable

public class DistributionWordDocument
extends DistributionDocument

See Also:
Serialized Form

Field Summary
 
Fields inherited from class gr.demokritos.iit.conceptualIndex.documentModel.DistributionDocument
DataString, Graph, OnCompare
 
Constructor Summary
DistributionWordDocument(int iNeighbourhoodWindow)
          Creates a new instance of DistributionWordDocument.
DistributionWordDocument(int iNeighbourhoodWindow, int iSourceNGramSize)
          Creates a new instance of DistributionWordDocument.
 
Method Summary
static void main(java.lang.String[] sArgs)
           
 double normality(java.lang.String s)
          Calculates a degree of normality, indicating whether a given string appears in a form similar to text in the document.
 void setDataString(java.lang.String sDataString, int iNGramSize, boolean clearCurrentData)
          Creates and saves the graph representation of a string, using word n-grams of selected size as source nodes and word n-grams of size 1 (words) as destination nodes.
 
Methods inherited from class gr.demokritos.iit.conceptualIndex.documentModel.DistributionDocument
clearDocumentGraph, getDataString, length, loadDataStringFromFile, loadDataStringFromFile, mergeWith, prune, setDocumentGraph, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DistributionWordDocument

public DistributionWordDocument(int iNeighbourhoodWindow)
Creates a new instance of DistributionWordDocument. The source n-gram size is set to the default value of 1.

Parameters:
iNeighbourhoodWindow - The size of the window indicative of neighbourhood between a source n-gram and a given token.

DistributionWordDocument

public DistributionWordDocument(int iNeighbourhoodWindow,
                                int iSourceNGramSize)
Creates a new instance of DistributionWordDocument.

Parameters:
iNeighbourhoodWindow - The size of the window indicative of neighbourhood between a source n-gram and a given token.
iSourceNGramSize - The size of the source n-grams, measured in words (this class uses word n-grams as source nodes).

Method Detail

setDataString

public void setDataString(java.lang.String sDataString,
                          int iNGramSize,
                          boolean clearCurrentData)
Creates and saves the graph representation of a string, using word n-grams of selected size as source nodes and word n-grams of size 1 (words) as destination nodes.

Overrides:
setDataString in class DistributionDocument
Parameters:
sDataString - The data string to analyse and represent as a distribution graph.
iNGramSize - The size of the n-grams used as source nodes.
clearCurrentData - Indicates whether the new data replaces the existing data. If this parameter is set to false, the new data is appended to the existing data.
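
Example (illustrative sketch, not the library implementation):
The idea of mapping word n-grams (source nodes) to single words (destination nodes) can be sketched in a few lines of plain Java. The class WordDistributionSketch and its build method are hypothetical names introduced here. The sketch assumes that tokens are separated by whitespace and that the neighbourhood window covers only the words that follow the source n-gram; the actual tokenisation and neighbourhood rule used by DistributionWordDocument may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; it is NOT part of the documented package.
class WordDistributionSketch {

    // Maps each word n-gram (source) to the counts of the single words (destinations)
    // found in the `window` positions that follow it.
    // Assumption: whitespace tokenisation and a forward-only window.
    static Map<String, Map<String, Double>> build(String text, int nGramSize, int window) {
        Map<String, Map<String, Double>> graph = new LinkedHashMap<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty() || nGramSize < 1 || window < 1) {
            return graph;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start + nGramSize <= words.length; start++) {
            // Join nGramSize consecutive words into one source node label.
            String source = String.join(" ", Arrays.copyOfRange(words, start, start + nGramSize));
            int firstNeighbour = start + nGramSize;
            int endNeighbour = Math.min(words.length, firstNeighbour + window);
            for (int j = firstNeighbour; j < endNeighbour; j++) {
                // Increment the count on the edge source -> words[j].
                graph.computeIfAbsent(source, k -> new LinkedHashMap<>())
                     .merge(words[j], 1.0, Double::sum);
            }
        }
        return graph;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> graph = build("the cat sat on the mat", 1, 2);
        graph.forEach((source, neighbours) -> System.out.println(source + " -> " + neighbours));
    }
}
```

In this sketch, appending new data (the clearCurrentData = false case) would correspond to merging the counts of a second call into the same map rather than starting from an empty one.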

normality

public double normality(java.lang.String s)
Calculates a degree of normality, indicating whether a given string appears in a form similar to text in the document. The process compares distributions: a second DistributionDocument is created from the given string, and the distributions attached to edges that appear in both graph representations are compared. If the public field OnCompare has been set, it is used to compare the distributions.

Overrides:
normality in class DistributionDocument
See Also:
Distribution
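
Example (illustrative sketch, not the library implementation):
The comparison of distributions on shared edges can be sketched as follows. The class NormalitySketch, its methods, and the use of nested maps as a graph are hypothetical and introduced only for illustration. The similarity measure (overlap of normalised counts) and the averaging over the candidate's source nodes are assumptions; the documentation above does not state which measure the library applies by default, and a comparator set through OnCompare would replace it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; it is NOT part of the documented package.
class NormalitySketch {

    // Overlap of two count distributions after normalising each to sum 1.
    // Returns 1.0 for identical shapes and 0.0 for disjoint ones.
    // Assumption: this measure stands in for the library's default comparison.
    static double overlap(Map<String, Double> p, Map<String, Double> q) {
        double pSum = p.values().stream().mapToDouble(Double::doubleValue).sum();
        double qSum = q.values().stream().mapToDouble(Double::doubleValue).sum();
        if (pSum == 0.0 || qSum == 0.0) {
            return 0.0;
        }
        Set<String> keys = new HashSet<>(p.keySet());
        keys.addAll(q.keySet());
        double shared = 0.0;
        for (String key : keys) {
            shared += Math.min(p.getOrDefault(key, 0.0) / pSum, q.getOrDefault(key, 0.0) / qSum);
        }
        return shared;
    }

    // Averages the overlap over the candidate's source nodes; a source node that is
    // missing from the document graph contributes 0.
    static double normality(Map<String, Map<String, Double>> document,
                            Map<String, Map<String, Double>> candidate) {
        if (candidate.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (Map.Entry<String, Map<String, Double>> entry : candidate.entrySet()) {
            Map<String, Double> documentDistribution = document.get(entry.getKey());
            if (documentDistribution != null) {
                total += overlap(documentDistribution, entry.getValue());
            }
        }
        return total / candidate.size();
    }

    public static void main(String[] args) {
        Map<String, Double> documentEdges = new HashMap<>();
        documentEdges.put("cat", 1.0);
        documentEdges.put("mat", 1.0);
        Map<String, Map<String, Double>> document = new HashMap<>();
        document.put("the", documentEdges);

        Map<String, Double> candidateEdges = new HashMap<>();
        candidateEdges.put("cat", 1.0);
        Map<String, Map<String, Double>> candidate = new HashMap<>();
        candidate.put("the", candidateEdges);

        System.out.println(normality(document, candidate));
    }
}
```

Under these assumptions the score lies between 0 and 1, with higher values meaning that the string's word neighbourhoods look more like those of the document.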

main

public static void main(java.lang.String[] sArgs)