gr.demokritos.iit.jinsect.documentModel.representations
Class DocumentGraph

java.lang.Object
  extended by gr.demokritos.iit.jinsect.documentModel.representations.DocumentGraph
All Implemented Interfaces:
java.io.Serializable

public class DocumentGraph
extends java.lang.Object
implements java.io.Serializable

See Also:
Serialized Form

Field Summary
 NormalizerListener Normalizer
           
 TextPreprocessorListener TextPreprocessor
           
 WordEvaluatorListener WordEvaluator
           
 
Constructor Summary
DocumentGraph()
          Creates a new instance of DocumentGraph
DocumentGraph(int iMinSize, int iMaxSize, int iCorrelationWindow)
          Creates a new instance of DocumentGraph
 
Method Summary
 double calcCoexistenceImportance(java.lang.String sNode)
          Returns a function of [element graph edges max] and [number of neighbours], where [element graph edges max] is the maximum weight of the edges that include [sNode], and [number of neighbours] is the number of neighbours of [sNode] in the graph.
 double calcCoexistenceImportance(salvo.jesus.graph.Vertex vNode)
           
 void createEdgesConnecting(Graph gGraph, java.lang.String sStartNode, java.util.List lOtherNodes, double dStartWeight, double dIncreaseWeight)
          Creates an edge in [gGraph] connecting [sStartNode] to each node in the [lOtherNodes] list of nodes.
 void createNGramGraphs()
          Creates the graph of n-grams, for all the levels specified in the MinSize, MaxSize range.
 void createWeightedEdgesConnecting(Graph gGraph, java.lang.String sStartNode, java.util.List lOtherNodes, double dStartWeight, double iNewWeight, double dDataImportance)
          Creates an edge in [gGraph] connecting [sStartNode] to each node in the [lOtherNodes] list of nodes.
 void deleteItem(java.lang.String sItem)
          Removes an item (node) from all graphs.
 java.util.HashSet getAllNodes()
           
 java.lang.String getDataString()
           
 Graph getGraphLevel(int iIndex)
          Returns the graph at the given level index, counted from MinSize.
 Graph getGraphLevelByNGramSize(int iNGramSize)
          Returns the graph for the given n-gram size.
 int getMaxSize()
           
 int getMinSize()
           
 boolean isEmpty()
           
 int length()
           
 void loadDataStringFromFile(java.lang.String sFilename)
           
 void mergeNGramGraph(DocumentGraph dgOtherGraph, double fWeightPercent)
          Merges the data of the [dgOtherGraph] document graph into the data of this graph, by adding all existing edges and moving the values of those existing in both graphs towards the new graph values, based on a tendency modifier.
 void nullify()
          Sets all weights in all graphs to zero
 void setDataString(java.lang.String sDataString)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

Normalizer

public NormalizerListener Normalizer

WordEvaluator

public WordEvaluatorListener WordEvaluator

TextPreprocessor

public TextPreprocessorListener TextPreprocessor
Constructor Detail

DocumentGraph

public DocumentGraph()
Creates a new instance of DocumentGraph


DocumentGraph

public DocumentGraph(int iMinSize,
                     int iMaxSize,
                     int iCorrelationWindow)
Creates a new instance of DocumentGraph

Parameters:
iMinSize - The minimum n-gram size
iMaxSize - The maximum n-gram size
iCorrelationWindow - The maximum distance of terms to be considered as correlated.
Method Detail

length

public int length()

isEmpty

public boolean isEmpty()

loadDataStringFromFile

public void loadDataStringFromFile(java.lang.String sFilename)
                            throws java.io.IOException,
                                   java.io.FileNotFoundException
Throws:
java.io.IOException
java.io.FileNotFoundException

getGraphLevel

public Graph getGraphLevel(int iIndex)
Returns the graph at the given level index, counted from MinSize.

Parameters:
iIndex - The index of the graph. Zero (0) corresponds to the graph for MinSize n-grams.

getGraphLevelByNGramSize

public Graph getGraphLevelByNGramSize(int iNGramSize)
Returns the graph for the given n-gram size.

Parameters:
iNGramSize - The n-gram size of the graph.
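
The relation between the two lookups can be sketched as follows. This is an illustration only, not code from the library: the class LevelIndexSketch and its methods are hypothetical, and it assumes one graph per consecutive n-gram size between MinSize and MaxSize. How DocumentGraph itself reacts to out-of-range arguments is not documented here, so the exception is also an assumption.

```java
// Hypothetical helper, not part of DocumentGraph: maps between a zero-based
// level index and an n-gram size, assuming one graph per consecutive size.
final class LevelIndexSketch {

    /** Returns the n-gram size stored at a zero-based level index. */
    static int sizeForIndex(int minSize, int maxSize, int index) {
        int size = minSize + index;
        if (index < 0 || size > maxSize) {
            // Assumption: out-of-range requests are rejected.
            throw new IndexOutOfBoundsException("No level for index " + index);
        }
        return size;
    }

    /** Returns the zero-based level index that holds a given n-gram size. */
    static int indexForSize(int minSize, int maxSize, int nGramSize) {
        if (nGramSize < minSize || nGramSize > maxSize) {
            throw new IndexOutOfBoundsException("No level for size " + nGramSize);
        }
        return nGramSize - minSize;
    }

    public static void main(String[] args) {
        // With MinSize = 3 and MaxSize = 5, index 0 holds the 3-gram graph.
        System.out.println(sizeForIndex(3, 5, 0)); // prints 3
        System.out.println(indexForSize(3, 5, 5)); // prints 2
    }
}
```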

getAllNodes

public java.util.HashSet getAllNodes()

createEdgesConnecting

public void createEdgesConnecting(Graph gGraph,
                                  java.lang.String sStartNode,
                                  java.util.List lOtherNodes,
                                  double dStartWeight,
                                  double dIncreaseWeight)
Creates an edge in [gGraph] connecting [sStartNode] to each node in the [lOtherNodes] list of nodes. If an edge already exists, its weight is increased by [dIncreaseWeight]; otherwise its weight is set to [dStartWeight].

Parameters:
gGraph - The graph to use
sStartNode - The node from which all edges begin
lOtherNodes - The list of nodes to which sStartNode is connected
dStartWeight - The initial weight for an edge that occurs for the first time
dIncreaseWeight - The amount added to the weight of an already existing edge on each further occurrence
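
A minimal sketch of the update rule described above, assuming a plain map of edge weights as a stand-in for the Graph type. The class EdgeWeightSketch, its connect method, and the "start->other" key format are hypothetical; whether the real edges are directed is not stated in this documentation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the edge update rule; not the library's code.
final class EdgeWeightSketch {

    // Edge weights keyed by "start->other" (assumed key format).
    final Map<String, Double> weights = new HashMap<>();

    /** Creates or strengthens one edge from start to every node in others. */
    void connect(String start, List<String> others,
                 double startWeight, double increaseWeight) {
        for (String other : others) {
            String key = start + "->" + other;
            Double current = weights.get(key);
            // New edge: use the start weight. Existing edge: add the increase.
            weights.put(key, current == null ? startWeight : current + increaseWeight);
        }
    }

    public static void main(String[] args) {
        EdgeWeightSketch g = new EdgeWeightSketch();
        g.connect("abc", Arrays.asList("bcd", "cde"), 1.0, 0.5);
        g.connect("abc", Arrays.asList("bcd"), 1.0, 0.5);
        System.out.println(g.weights.get("abc->bcd")); // prints 1.5
        System.out.println(g.weights.get("abc->cde")); // prints 1.0
    }
}
```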

createWeightedEdgesConnecting

public void createWeightedEdgesConnecting(Graph gGraph,
                                          java.lang.String sStartNode,
                                          java.util.List lOtherNodes,
                                          double dStartWeight,
                                          double iNewWeight,
                                          double dDataImportance)
Creates an edge in [gGraph] connecting [sStartNode] to each node in the [lOtherNodes] list of nodes. If an edge already exists, its weight is moved towards [iNewWeight] according to [dDataImportance]; otherwise its weight is set to [dStartWeight].

Parameters:
gGraph - The graph to use
sStartNode - The node from which all edges begin
lOtherNodes - The list of nodes to which sStartNode is connected
dStartWeight - The initial weight for an edge that occurs for the first time
iNewWeight - The new weight
dDataImportance - The tendency towards the new value. 0.0 means no change to the current value. 1.0 means the old value is completely replaced by the new. 0.5 means the final value is the average of the old and the new.
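
The three reference points given for dDataImportance (0.0, 0.5 and 1.0) are all satisfied by a linear interpolation between the old and the new weight. The sketch below shows that interpolation; it is an assumption consistent with the parameter description, not a confirmed copy of the library's formula, and the class WeightBlendSketch is hypothetical.

```java
// Hypothetical illustration of the tendency parameter; not the library's code.
final class WeightBlendSketch {

    /**
     * Moves oldWeight towards newWeight.
     * dataImportance = 0.0 keeps the old value, 1.0 yields the new value,
     * and 0.5 yields the average of the two.
     */
    static double blend(double oldWeight, double newWeight, double dataImportance) {
        return oldWeight + (newWeight - oldWeight) * dataImportance;
    }

    public static void main(String[] args) {
        System.out.println(blend(2.0, 4.0, 0.0)); // prints 2.0
        System.out.println(blend(2.0, 4.0, 0.5)); // prints 3.0
        System.out.println(blend(2.0, 4.0, 1.0)); // prints 4.0
    }
}
```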

createNGramGraphs

public void createNGramGraphs()
Creates the graph of n-grams for all the levels specified in the MinSize, MaxSize range. Only printable non-space characters are taken into account.
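
To illustrate what "levels" means here, the following sketch collects the n-grams of a string for every size in a given range. It assumes character n-grams taken with a sliding window, and it does not model the character filtering mentioned above or the creation of edges. The class NGramLevelsSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of per-level n-gram extraction; not the library's code.
final class NGramLevelsSketch {

    /** Returns, for each size in [minSize, maxSize], the n-grams of text in order. */
    static Map<Integer, List<String>> nGramsBySize(String text, int minSize, int maxSize) {
        Map<Integer, List<String>> levels = new LinkedHashMap<>();
        for (int n = minSize; n <= maxSize; n++) {
            List<String> grams = new ArrayList<>();
            // Slide a window of length n over the text, one character at a time.
            for (int i = 0; i + n <= text.length(); i++) {
                grams.add(text.substring(i, i + n));
            }
            levels.put(n, grams);
        }
        return levels;
    }

    public static void main(String[] args) {
        // prints {2=[ab, bc, cd], 3=[abc, bcd]}
        System.out.println(nGramsBySize("abcd", 2, 3));
    }
}
```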


mergeNGramGraph

public void mergeNGramGraph(DocumentGraph dgOtherGraph,
                            double fWeightPercent)
Merges the data of the [dgOtherGraph] document graph into the data of this graph, by adding all existing edges and moving the values of those existing in both graphs towards the new graph values, based on a tendency modifier. [fWeightPercent] determines whether the result stays closer to the starting value or to the new value.

Parameters:
dgOtherGraph - The second graph used for the merging
fWeightPercent - The convergence tendency parameter. A value of 0.0 leaves the existing value unchanged, 1.0 replaces it with the value from the other graph, and 0.5 yields the average of the old and the new value.
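
A minimal sketch of such a merge over plain edge-weight maps, under two assumptions that this documentation does not confirm: edges found only in the other graph are copied with their weight unchanged, and shared edges are updated by linear interpolation. The class MergeSketch and the "a-b" key format are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the merge rule; not the library's code.
final class MergeSketch {

    /** Merges the edge weights of other into target, in place. */
    static void merge(Map<String, Double> target, Map<String, Double> other,
                      double weightPercent) {
        for (Map.Entry<String, Double> e : other.entrySet()) {
            Double current = target.get(e.getKey());
            if (current == null) {
                // Assumption: an edge missing from target is copied as-is.
                target.put(e.getKey(), e.getValue());
            } else {
                // Shared edge: move the old weight towards the other graph's weight.
                target.put(e.getKey(), current + (e.getValue() - current) * weightPercent);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Double> target = new HashMap<>();
        target.put("a-b", 2.0);
        target.put("b-c", 1.0);
        Map<String, Double> other = new HashMap<>();
        other.put("a-b", 4.0);
        other.put("c-d", 3.0);

        merge(target, other, 0.5);
        System.out.println(target.get("a-b")); // prints 3.0
        System.out.println(target.get("b-c")); // prints 1.0
        System.out.println(target.get("c-d")); // prints 3.0
    }
}
```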

getMinSize

public int getMinSize()

getMaxSize

public int getMaxSize()

calcCoexistenceImportance

public double calcCoexistenceImportance(java.lang.String sNode)
Returns a function of [element graph edges max] and [number of neighbours], where [element graph edges max] is the maximum weight of the edges that include [sNode], and [number of neighbours] is the number of neighbours of [sNode] in the graph.

Parameters:
sNode - The node whose coexistence importance is calculated
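
The documentation names the two inputs of this function but not the function itself. The sketch below therefore only computes the two inputs, using a nested map as a stand-in for the graph. The class CoexistenceInputsSketch is hypothetical, as is the choice to return 0.0 and 0 for a node without edges; how the library combines the two values is not specified here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the two inputs; the combining function is not shown
// because this documentation does not specify it.
final class CoexistenceInputsSketch {

    /** Returns the largest weight among the edges touching node, or 0.0 if none. */
    static double maxEdgeWeight(Map<String, Map<String, Double>> adjacency, String node) {
        Map<String, Double> edges =
                adjacency.getOrDefault(node, Collections.<String, Double>emptyMap());
        double max = 0.0;
        for (double weight : edges.values()) {
            max = Math.max(max, weight);
        }
        return max;
    }

    /** Returns the number of neighbours of node, or 0 if the node is unknown. */
    static int neighbourCount(Map<String, Map<String, Double>> adjacency, String node) {
        return adjacency.getOrDefault(node, Collections.<String, Double>emptyMap()).size();
    }

    public static void main(String[] args) {
        Map<String, Double> edgesOfAbc = new HashMap<>();
        edgesOfAbc.put("bcd", 2.0);
        edgesOfAbc.put("cde", 5.0);
        Map<String, Map<String, Double>> adjacency = new HashMap<>();
        adjacency.put("abc", edgesOfAbc);

        System.out.println(maxEdgeWeight(adjacency, "abc"));  // prints 5.0
        System.out.println(neighbourCount(adjacency, "abc")); // prints 2
    }
}
```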

calcCoexistenceImportance

public double calcCoexistenceImportance(salvo.jesus.graph.Vertex vNode)

deleteItem

public void deleteItem(java.lang.String sItem)
Removes an item (node) from all graphs.

Parameters:
sItem - The item to remove.

nullify

public void nullify()
Sets all weights in all graphs to zero


setDataString

public void setDataString(java.lang.String sDataString)

getDataString

public java.lang.String getDataString()