gr.demokritos.iit.summarization.generation
Class HLDATextGenerator

java.lang.Object
  extended by gr.demokritos.iit.summarization.generation.HLDATextGenerator

public class HLDATextGenerator
extends java.lang.Object

A generator of texts, based on the HLDA model of a text corpus and a grammar evaluator.


Field Summary
 java.util.Date dStart
           
 
Constructor Summary
HLDATextGenerator(HierLDAGibbs hlgModel, IGrammaticallityEvaluator igeEval, java.util.Map<java.lang.Integer,java.lang.String> mWordMap)
          Creates a new instance of HLDATextGenerator, given an HLDA model and a grammaticality evaluator.
 
Method Summary
 int generateNextWord(java.util.Vector vCurrentText)
          Randomly generates a word (index) based on the overall distribution of words over topics.
 java.util.Vector<java.lang.Integer> generateNormalText(int iMeanSize, int iGrammarVincinity)
          Generates a normal (string) text, based on the model, given the mean text length.
 java.lang.String getVectorToText(java.util.Vector<java.lang.Integer> vText)
          Gets a vector of indices representing a text and returns the actual text representation, based on the integer to string map of the text generator.
static void main(java.lang.String[] sArgs)
          Utility main method that creates a random text, based on a model corpus.
static void printSyntax()
          Utility method that outputs syntax information for calling the main class.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

dStart

public java.util.Date dStart

Constructor Detail

HLDATextGenerator

public HLDATextGenerator(HierLDAGibbs hlgModel,
                         IGrammaticallityEvaluator igeEval,
                         java.util.Map<java.lang.Integer,java.lang.String> mWordMap)
Creates a new instance of HLDATextGenerator, given an HLDA model and a grammaticality evaluator.

Method Detail

generateNextWord

public int generateNextWord(java.util.Vector vCurrentText)
Randomly generates a word (index) based on the overall distribution of words over topics. The probability of generating a candidate word is directly proportional to the grammaticality of the given preceding text followed by the candidate word, and also proportional to the probability of appearance of the candidate word within the whole corpus.

Parameters:
vCurrentText - The preceding text.
Returns:
The index of the generated word.
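
The selection rule described above can be illustrated with a small, self-contained sketch. It is not the code of HLDATextGenerator: the class name WeightedWordSampler, the method sampleWord and both input arrays are hypothetical, and the sketch assumes that a grammaticality score and a corpus probability are already available for every candidate word. It only shows one common way to draw an index with probability proportional to the product of the two values.

```java
import java.util.Random;

// Hypothetical illustration; not part of the documented package.
public class WeightedWordSampler {

    /**
     * Draws a candidate index with probability proportional to
     * grammaticality[i] * corpusProbability[i].
     */
    public static int sampleWord(double[] grammaticality,
                                 double[] corpusProbability,
                                 Random rng) {
        int n = grammaticality.length;
        if (n == 0 || n != corpusProbability.length) {
            throw new IllegalArgumentException("Arrays must be non-empty and of equal length");
        }

        // Unnormalised weight of each candidate word.
        double[] weights = new double[n];
        double total = 0.0;
        int lastPositive = -1;
        for (int i = 0; i < n; i++) {
            weights[i] = grammaticality[i] * corpusProbability[i];
            total += weights[i];
            if (weights[i] > 0.0) {
                lastPositive = i;
            }
        }
        if (lastPositive < 0) {
            throw new IllegalArgumentException("All candidate weights are zero");
        }

        // Roulette-wheel selection over the cumulative weights.
        double threshold = rng.nextDouble() * total;
        double cumulative = 0.0;
        for (int i = 0; i < n; i++) {
            cumulative += weights[i];
            if (threshold < cumulative) {
                return i;
            }
        }
        // Reached only through floating-point rounding.
        return lastPositive;
    }

    public static void main(String[] args) {
        double[] grammaticality = {0.9, 0.1, 0.5};     // assumed scores
        double[] corpusProbability = {0.2, 0.7, 0.1};  // assumed probabilities
        System.out.println(sampleWord(grammaticality, corpusProbability, new Random()));
    }
}
```

Multiplying the two factors means that a candidate with zero grammaticality or zero corpus probability is never selected, which matches the "proportional to both" wording of the description.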

generateNormalText

public java.util.Vector<java.lang.Integer> generateNormalText(int iMeanSize,
                                                              int iGrammarVincinity)
Generates a normal (string) text, based on the model, given the mean text length.

Parameters:
iMeanSize - The mean text length in terms.
iGrammarVincinity - The distance upon which to calculate grammaticality.
Returns:
A list of term indices.
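
The following sketch shows one possible reading of the two parameters; the documentation above does not confirm it. It assumes that "normal" means the text length is drawn from a normal distribution centred on iMeanSize, and that iGrammarVincinity is the number of most recent words passed on as context when the next word is chosen. The class NormalLengthSketch, the stdDev parameter and the nextWord callback are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical illustration; not part of the documented package.
public class NormalLengthSketch {

    /** Draws a length of at least 1 from N(meanSize, stdDev^2). stdDev is an assumed parameter. */
    public static int sampleLength(int meanSize, double stdDev, Random rng) {
        long length = Math.round(meanSize + stdDev * rng.nextGaussian());
        return (int) Math.max(1L, length);
    }

    /**
     * Builds a list of term indices. For each position, only the last
     * `vicinity` indices are handed to the word chooser as context
     * (an assumed interpretation of the grammar vicinity).
     */
    public static List<Integer> generate(int meanSize, double stdDev, int vicinity,
                                         Random rng,
                                         ToIntFunction<List<Integer>> nextWord) {
        int length = sampleLength(meanSize, stdDev, rng);
        List<Integer> text = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            int from = Math.max(0, text.size() - vicinity);
            // Copy the window so the chooser cannot modify the growing text.
            List<Integer> context = new ArrayList<>(text.subList(from, text.size()));
            text.add(nextWord.applyAsInt(context));
        }
        return text;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Placeholder chooser: picks a random index below 10, ignoring the context.
        List<Integer> text = generate(20, 5.0, 3, rng, context -> rng.nextInt(10));
        System.out.println(text);
    }
}
```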

getVectorToText

public java.lang.String getVectorToText(java.util.Vector<java.lang.Integer> vText)
Gets a vector of indices representing a text and returns the actual text representation, based on the integer to string map of the text generator.

Parameters:
vText - A Vector of integers, representing indices of strings in a given map.
Returns:
The text that the given indices represent.
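
As a minimal sketch of the index-to-text conversion described above, the example below looks up each index in an integer-to-string map and joins the resulting words. It is not the library's implementation: the class IndexTextDecoder and the method vectorToText are hypothetical, and both the single-space separator and the skipping of unknown indices are assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the documented package.
public class IndexTextDecoder {

    /** Maps each index to its word and joins the words with single spaces. */
    public static String vectorToText(List<Integer> indices, Map<Integer, String> wordMap) {
        StringBuilder sb = new StringBuilder();
        for (Integer index : indices) {
            String word = wordMap.get(index);
            if (word == null) {
                continue; // assumption: indices missing from the map are skipped
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(word);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> wordMap = new HashMap<>();
        wordMap.put(0, "the");
        wordMap.put(1, "cat");
        wordMap.put(2, "sat");
        // prints "the cat sat"
        System.out.println(vectorToText(Arrays.asList(0, 1, 2), wordMap));
    }
}
```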

printSyntax

public static void printSyntax()
Utility method that outputs syntax information for calling the main class.


main

public static void main(java.lang.String[] sArgs)
Utility main method that creates a random text, based on a model corpus.