gr.demokritos.iit.conceptualIndex.documentModel
Class DistributionWordDocument
java.lang.Object
  
gr.demokritos.iit.conceptualIndex.documentModel.DistributionDocument
      
gr.demokritos.iit.conceptualIndex.documentModel.DistributionWordDocument
- All Implemented Interfaces: 
 - java.io.Serializable
 
public class DistributionWordDocument
- extends DistributionDocument
 
- See Also:
 - Serialized Form
 
 
 
Constructor Summary
DistributionWordDocument(int iNeighbourhoodWindow)
          Creates a new instance of DistributionWordDocument.
DistributionWordDocument(int iNeighbourhoodWindow,
                         int iSourceNGramSize)
          Creates a new instance of DistributionWordDocument.
 
Method Summary
static void main(java.lang.String[] sArgs)
double normality(java.lang.String s)
          Calculates a degree of normality, indicating whether a given string appears in a form
 similar to text in the document.
void setDataString(java.lang.String sDataString,
                   int iNGramSize,
                   boolean clearCurrentData)
          Creates and saves the graph representation of a string, using word n-grams of selected size
 as source nodes and word n-grams of size 1 (words) as destination nodes.
 
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
DistributionWordDocument
public DistributionWordDocument(int iNeighbourhoodWindow)
- Creates a new instance of DistributionWordDocument. The source n-gram size is set to the default
 value of 1.
- Parameters:
 iNeighbourhoodWindow - The size of the window within which a token is considered a
neighbour of a source n-gram.
 
DistributionWordDocument
public DistributionWordDocument(int iNeighbourhoodWindow,
                                int iSourceNGramSize)
- Creates a new instance of DistributionWordDocument.
- Parameters:
 iNeighbourhoodWindow - The size of the window indicative of neighbourhood between a
source n-gram and a given token.
 iSourceNGramSize - The size of the source n-grams in character length.
 
setDataString
public void setDataString(java.lang.String sDataString,
                          int iNGramSize,
                          boolean clearCurrentData)
- Creates and saves the graph representation of a string, using word n-grams of selected size 
 as source nodes and word n-grams of size 1 (words) as destination nodes.
- Overrides:
 setDataString in class DistributionDocument
 
- Parameters:
 sDataString - The data string to analyse and represent as a distribution graph.
 iNGramSize - The size of the n-grams used as source nodes.
 clearCurrentData - Indicates whether the new data replaces existing data. If this parameter
is set to false, the new data is appended to the existing data.
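
The following is a minimal, self-contained sketch of the idea described above, not the library's implementation. The class WordGraphSketch and its count method are hypothetical names introduced for illustration. It assumes that words are separated by whitespace, that the neighbourhood window covers only the words following a source n-gram, and that each edge stores a simple co-occurrence count.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the documented package.
public class WordGraphSketch {
    // source word n-gram -> (destination word -> co-occurrence count)
    private final Map<String, Map<String, Integer>> edges = new HashMap<>();
    private final int window;

    public WordGraphSketch(int neighbourhoodWindow) {
        this.window = neighbourhoodWindow;
    }

    public void setDataString(String data, int nGramSize, boolean clearCurrentData) {
        if (clearCurrentData) {
            edges.clear(); // replace existing data instead of appending to it
        }
        String trimmed = data.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        String[] words = trimmed.split("\\s+"); // assumption: whitespace tokenisation
        for (int i = 0; i + nGramSize <= words.length; i++) {
            // Source node: a word n-gram of the selected size.
            String source = String.join(" ", Arrays.copyOfRange(words, i, i + nGramSize));
            // Destination nodes: single words inside the window after the n-gram (assumption).
            int end = Math.min(words.length, i + nGramSize + window);
            for (int j = i + nGramSize; j < end; j++) {
                edges.computeIfAbsent(source, k -> new HashMap<>())
                     .merge(words[j], 1, Integer::sum);
            }
        }
    }

    // Number of times the word was seen in the window of the source n-gram.
    public int count(String source, String word) {
        return edges.getOrDefault(source, Collections.<String, Integer>emptyMap())
                    .getOrDefault(word, 0);
    }
}
```

Calling setDataString a second time with clearCurrentData set to false adds the new counts to the existing edges, which mirrors the append behaviour described for that parameter.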
 
 
normality
public double normality(java.lang.String s)
- Calculates a degree of normality, indicating whether a given string appears in a form
 similar to text in the document. The process compares the distributions found on edges
 that appear both in the graph representation of this DistributionDocument object and in
 that of a second DistributionDocument created from the given string.
 If the public variable OnCompare has been set, it is used to compare the distributions.
- Overrides:
 normality in class DistributionDocument
 
- See Also:
 Distribution
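
The comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the documented algorithm: the class NormalitySketch is hypothetical, the distribution of destination words per source n-gram stands in for the edge distributions, histogram intersection is an assumed similarity measure (the page does not specify the default, nor how OnCompare is invoked), and averaging over shared sources is likewise an assumption.

```java
import java.util.Map;

// Hypothetical illustration class; not part of the documented package.
public class NormalitySketch {

    // Histogram intersection of two count distributions, in [0, 1] (assumed measure).
    public static double similarity(Map<String, Integer> a, Map<String, Integer> b) {
        double totalA = a.values().stream().mapToInt(Integer::intValue).sum();
        double totalB = b.values().stream().mapToInt(Integer::intValue).sum();
        if (totalA == 0 || totalB == 0) {
            return 0.0;
        }
        double overlap = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                // Compare relative frequencies, so the size of each text does not matter.
                overlap += Math.min(e.getValue() / totalA, other / totalB);
            }
        }
        return overlap;
    }

    // Mean similarity over the sources present in both graphs; 0 if nothing is shared.
    public static double normality(Map<String, Map<String, Integer>> document,
                                   Map<String, Map<String, Integer>> candidate) {
        double sum = 0.0;
        int shared = 0;
        for (Map.Entry<String, Map<String, Integer>> e : candidate.entrySet()) {
            Map<String, Integer> reference = document.get(e.getKey());
            if (reference == null) {
                continue; // only parts that appear in both graphs are compared
            }
            sum += similarity(reference, e.getValue());
            shared++;
        }
        return shared == 0 ? 0.0 : sum / shared;
    }
}
```

In this sketch, a candidate whose distributions match the document's exactly scores 1, and a candidate that shares no source n-grams with the document scores 0.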
 
 
main
public static void main(java.lang.String[] sArgs)