java.lang.Object
  gr.demokritos.iit.conceptualIndex.documentModel.DistributionDocument
public class DistributionDocument
Represents a document, described as a graph of distributions. Each distribution gives the probability that a token (character) appears after a given n-gram, called the source. The class supports input and output operations and can act as a grammar indicator to determine the normality of other texts.
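To make the idea of per-n-gram distributions concrete, the following is a minimal standalone sketch. It does not use DistributionDocument or DistributionGraph; the class name `NGramDistributionSketch` and its methods are hypothetical. It also assumes a neighbourhood window of one character (only the token immediately after the source n-gram is counted), which may differ from how the library applies iNeighbourhoodWindow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the documented library.
public class NGramDistributionSketch {

    /** Counts, for every source n-gram of length n, which character follows it. */
    public static Map<String, Map<Character, Integer>> buildCounts(String text, int n) {
        Map<String, Map<Character, Integer>> counts = new HashMap<>();
        for (int i = 0; i + n < text.length(); i++) {
            String source = text.substring(i, i + n); // source node
            char next = text.charAt(i + n);           // destination node (one character)
            counts.computeIfAbsent(source, k -> new HashMap<>())
                  .merge(next, 1, Integer::sum);
        }
        return counts;
    }

    /** Relative frequency of next after source; 0.0 if the source was never seen. */
    public static double probability(Map<String, Map<Character, Integer>> counts,
                                     String source, char next) {
        Map<Character, Integer> dist = counts.get(source);
        if (dist == null) {
            return 0.0;
        }
        int total = 0;
        for (int c : dist.values()) {
            total += c;
        }
        return dist.getOrDefault(next, 0) / (double) total;
    }

    public static void main(String[] args) {
        // In "abcabd", the 2-gram "ab" is followed once by 'c' and once by 'd'.
        Map<String, Map<Character, Integer>> counts = buildCounts("abcabd", 2);
        System.out.println(probability(counts, "ab", 'c'));
    }
}
```

Each map entry plays the role of one distribution in the graph: the key is the source n-gram and the inner map holds the observed following characters with their counts.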
Field Summary | |
---|---|
protected java.lang.String |
DataString
The string corresponding to input texts directly. |
protected DistributionGraph |
Graph
The Graph representing the document |
IDistributionComparisonListener |
OnCompare
An event, used to attach a comparator of distributions to this class. |
Constructor Summary | |
---|---|
DistributionDocument(int iNeighbourhoodWindow)
Creates a new instance of DistributionDocument. |
|
DistributionDocument(int iNeighbourhoodWindow,
int iSourceNGramSize)
Creates a new instance of DistributionDocument. |
Method Summary | |
---|---|
void |
clearDocumentGraph()
Clears the document graph, resetting the representation. |
java.lang.String |
getDataString()
Returns the current data string (i.e. |
int |
length()
Calculates the size of the full document object, by getting the edge count of the corresponding graph and not the datastring (i.e. |
void |
loadDataStringFromFile(java.lang.String sFilename,
boolean clearCurrentData)
Loads the contents of a file as the datastring. |
void |
loadDataStringFromFile(java.lang.String sFilename,
boolean clearCurrentData,
java.lang.String sEncoding)
Loads the contents of a file as the datastring. |
static void |
main(java.lang.String[] sArgs)
|
void |
mergeWith(DistributionDocument tpData,
double fLearningRate)
TODO: Document |
double |
normality(java.lang.String s)
Calculates a degree of normality, indicating whether a given string appears in a form similar to text in the document. |
void |
prune(double dMinCoexistenceImportance)
TODO: Document |
void |
setDataString(java.lang.String sDataString,
int iNGramSize,
boolean clearCurrentData)
Creates and saves the graph representation of a string, using substrings of selected size as source nodes and substrings of size 1 (letters) as destination nodes. |
void |
setDocumentGraph(DistributionGraph dgNew)
Sets the document graph to a selected existing graph. |
java.lang.String |
toString()
Returns a string representation of the document graph. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected DistributionGraph Graph
DistributionGraph
protected java.lang.String DataString
public IDistributionComparisonListener OnCompare
normality function.
Constructor Detail |
---|
public DistributionDocument(int iNeighbourhoodWindow)
iNeighbourhoodWindow - The size of the window indicative of neighbourhood between a source n-gram and a given token.
public DistributionDocument(int iNeighbourhoodWindow, int iSourceNGramSize)
iNeighbourhoodWindow - The size of the window indicative of neighbourhood between a source n-gram and a given token.
iSourceNGramSize - The size of the source n-grams in character length.
Method Detail |
---|
public void clearDocumentGraph()
public void setDocumentGraph(DistributionGraph dgNew)
dgNew - The distribution graph to replace the existing one.
DistributionGraph
public int length()
public void loadDataStringFromFile(java.lang.String sFilename, boolean clearCurrentData)
sFilename - The filename of the input file.
clearCurrentData - Indicates whether the new file replaces existing text. If this parameter is set to false, then the new file is appended to existing text.
public void loadDataStringFromFile(java.lang.String sFilename, boolean clearCurrentData, java.lang.String sEncoding)
sFilename - The filename of the input file.
clearCurrentData - Indicates whether the new file replaces existing text. If this parameter is set to false, then the new file is appended to existing text.
sEncoding - The encoding of the input file.
public void setDataString(java.lang.String sDataString, int iNGramSize, boolean clearCurrentData)
sDataString - The data string to analyse and represent as a distribution graph.
iNGramSize - The size of the n-grams used as source nodes.
clearCurrentData - Indicates whether the new data replace existing data. If this parameter is set to false, then the new data is appended to existing data.
public java.lang.String getDataString()
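The replace-or-append behaviour of clearCurrentData, together with the sEncoding parameter of loadDataStringFromFile, can be sketched as below. This is an assumption-based illustration, not the library's implementation: the class `DataStringLoaderSketch` is hypothetical, it only maintains a plain string, and it does not rebuild any graph.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical illustration only; not part of the documented library.
public class DataStringLoaderSketch {
    private String dataString = "";

    /**
     * Reads the whole file using the named encoding, then either replaces
     * the stored text (clearCurrentData == true) or appends to it.
     */
    public void loadFromFile(String filename, boolean clearCurrentData, String encoding)
            throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get(filename));
        String contents = new String(raw, Charset.forName(encoding));
        dataString = clearCurrentData ? contents : dataString + contents;
    }

    public String getDataString() {
        return dataString;
    }
}
```

Calling `loadFromFile` twice on the same file with `clearCurrentData` set to false leaves the file's text stored twice in a row, while passing true keeps only the most recent load.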
public void mergeWith(DistributionDocument tpData, double fLearningRate)
public void prune(double dMinCoexistenceImportance)
public java.lang.String toString()
Overrides toString in class java.lang.Object
public double normality(java.lang.String s)
If OnCompare has been set, it is used to compare the distributions.
Distribution
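One plausible way to compute such a degree of normality is sketched below. The scoring rule is an assumption made for illustration (the average probability of each character given the n-gram that precedes it); this page does not state the formula the library uses, and the sketch does not model the OnCompare listener. The class `NormalitySketch` and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the scoring rule is an assumption.
public class NormalitySketch {
    private final int n; // source n-gram size
    private final Map<String, Map<Character, Integer>> counts = new HashMap<>();

    public NormalitySketch(int n) {
        this.n = n;
    }

    /** Records which character follows each n-gram in the training text. */
    public void learn(String text) {
        for (int i = 0; i + n < text.length(); i++) {
            counts.computeIfAbsent(text.substring(i, i + n), k -> new HashMap<>())
                  .merge(text.charAt(i + n), 1, Integer::sum);
        }
    }

    /** Average learned probability of each character of s given its preceding n-gram. */
    public double normality(String s) {
        double sum = 0.0;
        int positions = 0;
        for (int i = 0; i + n < s.length(); i++) {
            Map<Character, Integer> dist = counts.get(s.substring(i, i + n));
            if (dist != null) {
                int total = 0;
                for (int c : dist.values()) {
                    total += c;
                }
                sum += dist.getOrDefault(s.charAt(i + n), 0) / (double) total;
            }
            positions++; // unseen n-grams contribute probability 0
        }
        return positions == 0 ? 0.0 : sum / positions;
    }
}
```

Under this rule, a string whose transitions all occur in the learned text scores close to 1, and a string made of unseen transitions scores 0.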
public static void main(java.lang.String[] sArgs)