java.lang.Object
  gr.demokritos.iit.jinsect.documentModel.representations.DocumentNGramGraph
public class DocumentNGramGraph
Represents the graph of a document. The vertices are the n-grams of the document, and each edge records the number of co-occurrences of its two n-grams within a given window.
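The co-occurrence counting described above can be illustrated with a minimal, library-independent sketch. It does not use DocumentNGramGraph or UniqueVertexGraph. The class name NGramCooccurrenceSketch, the "from->to" string keys, and the choice to pair each n-gram with the n-grams that start up to `window` positions after it are all assumptions made for illustration; the library's own windowing and edge direction may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the documented API.
class NGramCooccurrenceSketch {

    /**
     * Counts how often one character n-gram is followed, within `window`
     * start positions, by another. Each key "from->to" stands for one edge
     * and each value for that edge's co-occurrence count.
     */
    static Map<String, Integer> cooccurrences(String text, int n, int window) {
        Map<String, Integer> edges = new HashMap<>();
        int last = text.length() - n; // start index of the last full n-gram
        for (int i = 0; i <= last; i++) {
            String from = text.substring(i, i + n);
            // Pair the n-gram at i with the n-grams starting at i+1 .. i+window.
            for (int j = i + 1; j <= Math.min(i + window, last); j++) {
                String to = text.substring(j, j + n);
                edges.merge(from + "->" + to, 1, Integer::sum);
            }
        }
        return edges;
    }
}
```

With the text "ababa", an n-gram size of 2 and a window of 1, this sketch yields two edges: "ab->ba" with count 2 and "ba->ab" with count 1.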
Field Summary | |
---|---|
protected int | CorrelationWindow - The co-occurrence window. |
protected java.lang.String | DataString |
protected java.util.HashMap | DegradedEdges |
protected int | MaxSize - The maximum n-gram size. |
protected int | MinSize - The minimum n-gram size. |
protected UniqueVertexGraph[] | NGramGraphArray |
NormalizerListener | Normalizer |
TextPreprocessorListener | TextPreprocessor |
WordEvaluatorListener | WordEvaluator |
Constructor Summary | |
---|---|
DocumentNGramGraph() | Creates a new instance of DocumentNGramGraph. |
DocumentNGramGraph(int iMinSize, int iMaxSize, int iCorrelationWindow) | Creates a new instance of DocumentNGramGraph. |
Method Summary | |
---|---|
DocumentNGramGraph | allNotIn(DocumentNGramGraph dgOtherGraph) - Returns all edges that do not exist in another graph. |
double | calcCoexistenceImportance(java.lang.String sNode) - Returns a function of [element graph edges max] and [number of neighbours], where [element graph edges max] is the maximum weight of the edges that include [sNode], and [number of neighbours] is the number of neighbours of [sNode] in the graph. |
double | calcCoexistenceImportance(salvo.jesus.graph.Vertex vNode) |
java.lang.Object | clone() |
void | createEdgesConnecting(UniqueVertexGraph gGraph, java.lang.String sStartNode, java.util.List lOtherNodes, java.util.HashMap hAppearenceHistogram) - Creates an edge in [gGraph] connecting [sStartNode] to each node in the [lOtherNodes] list of nodes. |
void | createGraphs() - Creates the graph of n-grams for every level in the MinSize to MaxSize range. |
void | createWeightedEdgesConnecting(UniqueVertexGraph gGraph, java.lang.String sStartNode, java.util.List lOtherNodes, double dStartWeight, double dNewWeight, double dDataImportance) - Creates an edge in [gGraph] connecting [sStartNode] to each node in the [lOtherNodes] list of nodes. |
void | degrade(DocumentNGramGraph dgOtherGraph) |
double | degredationDegree(salvo.jesus.graph.Edge e) |
void | deleteItem(java.lang.String sItem) - Removes an item (node) from all graphs. |
java.util.HashSet | getAllNodes() |
java.lang.String | getDataString() |
UniqueVertexGraph | getGraphLevel(int iIndex) - Returns the graph at the given zero-based index; index 0 is the graph for MinSize n-grams. |
UniqueVertexGraph | getGraphLevelByNGramSize(int iNGramSize) - Returns the graph for the given n-gram size. |
int | getMaxSize() |
int | getMinSize() |
int | getWindowSize() |
protected void | InitGraphs() |
DocumentNGramGraph | intersectGraph(DocumentNGramGraph dgOtherGraph) |
DocumentNGramGraph | inverseIntersectGraph(DocumentNGramGraph dgOtherGraph) - Returns the difference (inverse of the intersection) graph between the current graph and a given graph. |
boolean | isEmpty() |
int | length() - Returns an indication of the size of a document n-gram graph, based on the edge count of its contained graphs. |
void | loadDataStringFromFile(java.lang.String sFilename) - Creates the graph based on a data string loaded from a given file. |
static void | main(java.lang.String[] args) |
void | merge(DocumentNGramGraph dgOtherObject, double fWeightPercent) - See the mergeGraph member for details. |
void | mergeGraph(DocumentNGramGraph dgOtherGraph, double fWeightPercent) - Merges the data of the [dgOtherGraph] document graph into the data of this graph, by adding all existing edges and moving the weights of edges that exist in both graphs towards the new graph's values, based on a tendency modifier. |
void | nullify() - Sets all weights in all graphs to zero. |
void | prune(double dMinCoexistenceImportance) |
void | setDataString(java.lang.String sDataString) |
java.lang.String | toCooccurenceText(java.util.Map mCooccurenceMap) |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected int MinSize
protected int MaxSize
protected int CorrelationWindow
protected java.lang.String DataString
protected java.util.HashMap DegradedEdges
public NormalizerListener Normalizer
public WordEvaluatorListener WordEvaluator
public TextPreprocessorListener TextPreprocessor
protected UniqueVertexGraph[] NGramGraphArray
Constructor Detail |
---|
public DocumentNGramGraph()
public DocumentNGramGraph(int iMinSize, int iMaxSize, int iCorrelationWindow)
Parameters:
iMinSize - The minimum n-gram size
iMaxSize - The maximum n-gram size
iCorrelationWindow - The maximum distance of terms to be considered as correlated.
Method Detail |
---|
protected void InitGraphs()
public int length()
public boolean isEmpty()
public void loadDataStringFromFile(java.lang.String sFilename) throws java.io.IOException, java.io.FileNotFoundException
Parameters:
sFilename - The filename of the file containing the data string.
Throws:
java.io.IOException
java.io.FileNotFoundException
public UniqueVertexGraph getGraphLevel(int iIndex)
Parameters:
iIndex - The index of the graph. Zero (0) corresponds to the graph for level MinSize n-grams.
Returns:
The UniqueVertexGraph of the corresponding level.
public UniqueVertexGraph getGraphLevelByNGramSize(int iNGramSize)
Parameters:
iNGramSize - The n-gram size of the graph.
Returns:
The UniqueVertexGraph of the corresponding level.
public java.util.HashSet getAllNodes()
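The relation between the two level accessors, getGraphLevel and getGraphLevelByNGramSize, can be sketched as simple index arithmetic. The sketch below assumes that NGramGraphArray holds one graph per n-gram size from MinSize to MaxSize, in increasing order, which is consistent with index 0 being the MinSize level. The class GraphLevelIndexSketch and its methods are hypothetical, and throwing on an out-of-range size is a choice made here, not documented library behaviour.

```java
// Hypothetical helper, not part of the documented API.
class GraphLevelIndexSketch {

    /** Number of levels when there is one graph per n-gram size in [minSize, maxSize]. */
    static int levelCount(int minSize, int maxSize) {
        return maxSize - minSize + 1;
    }

    /** Maps an n-gram size to the zero-based level index (index 0 is the minSize level). */
    static int levelIndex(int minSize, int maxSize, int nGramSize) {
        if (nGramSize < minSize || nGramSize > maxSize) {
            // Assumption: the sketch rejects sizes outside the configured range.
            throw new IllegalArgumentException("n-gram size out of range: " + nGramSize);
        }
        return nGramSize - minSize;
    }
}
```

Under these assumptions, a graph built with MinSize 3 and MaxSize 5 has three levels, and the level for 5-grams sits at index 2.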
public void createEdgesConnecting(UniqueVertexGraph gGraph, java.lang.String sStartNode, java.util.List lOtherNodes, java.util.HashMap hAppearenceHistogram)
Parameters:
gGraph - The graph to use
sStartNode - The node from which all edges begin
lOtherNodes - The list of nodes to which sStartNode is connected
hAppearenceHistogram - The histogram of appearances of the terms
public void createWeightedEdgesConnecting(UniqueVertexGraph gGraph, java.lang.String sStartNode, java.util.List lOtherNodes, double dStartWeight, double dNewWeight, double dDataImportance)
Parameters:
gGraph - The graph to use
sStartNode - The node from which all edges begin
lOtherNodes - The list of nodes to which sStartNode is connected
dStartWeight - The initial weight for first-occurring nodes
dNewWeight - The new weight
dDataImportance - The tendency towards the new value. 0.0 means no change to the current value. 1.0 means the old value is completely replaced by the new. 0.5 means the final value is the average of the old and the new.
public void createGraphs()
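The tendency parameter (dDataImportance in createWeightedEdgesConnecting, fWeightPercent in mergeGraph and merge) is described by three reference points: 0.0 keeps the old value, 1.0 takes the new value, and 0.5 gives their average. Linear interpolation is one update rule that satisfies all three. The sketch below assumes that rule; the page does not state the exact formula the library uses, and the class WeightBlendSketch is hypothetical.

```java
// Hypothetical helper, not part of the documented API.
class WeightBlendSketch {

    /**
     * Moves oldWeight towards newWeight by the fraction `tendency`:
     * 0.0 returns oldWeight, 1.0 returns newWeight, 0.5 returns their average.
     */
    static double blend(double oldWeight, double newWeight, double tendency) {
        return oldWeight + (newWeight - oldWeight) * tendency;
    }
}
```

For an old weight of 2.0 and a new weight of 4.0, the sketch returns 2.0, 3.0 and 4.0 for tendencies of 0.0, 0.5 and 1.0 respectively.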
public void mergeGraph(DocumentNGramGraph dgOtherGraph, double fWeightPercent)
Parameters:
dgOtherGraph - The second graph used for the merging
fWeightPercent - The convergence tendency parameter. A value of 0.0 means no change to the existing value, 1.0 means the new value is the same as that of the new graph. A value of 0.5 means the new value is exactly between the old and new value (average).
public DocumentNGramGraph intersectGraph(DocumentNGramGraph dgOtherGraph)
public DocumentNGramGraph inverseIntersectGraph(DocumentNGramGraph dgOtherGraph)
Parameters:
dgOtherGraph - The graph to compare to.
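The graph operations intersectGraph, inverseIntersectGraph and allNotIn can be pictured as set operations on edges. The sketch below works on plain sets of edge labels and ignores edge weights entirely. It assumes that "inverse of the intersection" means the edges found in exactly one of the two graphs (a symmetric difference) and that allNotIn is one-sided; neither reading is confirmed by this page. The class EdgeSetOpsSketch is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the documented API.
class EdgeSetOpsSketch {

    /** Edges present in both sets. */
    static Set<String> intersect(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    /** Edges of `a` that are absent from `b` (one-sided difference). */
    static Set<String> allNotIn(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    /** Assumed reading of the inverse intersection: the union minus the intersection. */
    static Set<String> inverseIntersect(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.addAll(b);
        result.removeAll(intersect(a, b));
        return result;
    }
}
```

For edge sets {x, y} and {y, z}, the sketch gives {y} for the intersection, {x} for allNotIn and {x, z} for the inverse intersection.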
public int getMinSize()
public int getMaxSize()
public int getWindowSize()
public double calcCoexistenceImportance(java.lang.String sNode)
Parameters:
sNode - The node whose Coexistence Importance is calculated
public double calcCoexistenceImportance(salvo.jesus.graph.Vertex vNode)
public void prune(double dMinCoexistenceImportance)
public void deleteItem(java.lang.String sItem)
Parameters:
sItem - The item to remove.
public void nullify()
public void setDataString(java.lang.String sDataString)
public java.lang.String getDataString()
public void degrade(DocumentNGramGraph dgOtherGraph)
public double degredationDegree(salvo.jesus.graph.Edge e)
public java.lang.String toCooccurenceText(java.util.Map mCooccurenceMap)
public static void main(java.lang.String[] args)
public java.lang.Object clone()
Overrides:
clone in class java.lang.Object
public void merge(DocumentNGramGraph dgOtherObject, double fWeightPercent)
Specified by:
merge in interface IMergeable<DocumentNGramGraph>
Parameters:
dgOtherObject - The second object used for the merging.
fWeightPercent - The convergence tendency parameter. Typically, a value of 0.0 means no change to the existing object, and 1.0 means the updated object is the same as the new object. A value of 0.5 means the new object is equally similar to the two source objects (averaging effect).
public DocumentNGramGraph allNotIn(DocumentNGramGraph dgOtherGraph)
Parameters:
dgOtherGraph - The graph to use for intersection and difference.