probabilisticmodels
Class HierLDAGibbs

java.lang.Object
  extended by probabilisticmodels.HierLDAGibbs
All Implemented Interfaces:
java.io.Serializable

public class HierLDAGibbs
extends java.lang.Object
implements java.io.Serializable

Estimates word assignments in leaf topics and topic assignments in super topics from a corpus of documents. The method is based on Gibbs sampling.

See Also:
Serialized Form

Field Summary
protected  double alpha
           
protected  double beta
           
protected  probabilisticmodels.Matrix2D documentTermMatrix
           
protected  probabilisticmodels.Matrix2D[] documentTopicMatrixPerLevel
           
protected  probabilisticmodels.Matrix2D leafTopicTermMatrix
           
protected  int numOfLevels
           
protected  NotificationListener ProgressIndicator
          A NotificationListener that expects a double as its oParams object.
protected  probabilisticmodels.Matrix2D[] topicAboveTopicMatrixPerLevel
           
 
Constructor Summary
HierLDAGibbs(int iNumOfLevels, int[][] iaDocTermMatrix, double dAlpha, double dBeta)
          Constructor method: Creates a new instance of HierLDAGibbs
 
Method Summary
 Distribution calcTopicProbsUnderSuperTopic(int iTopicsLevel, int iSuperTopicIndex)
           
 int generateNextLeafTopic()
          Returns the index of the next leaf topic, following the generative process defined by the model.
 java.util.List generateText(int iMeanSize)
          Generates a text (i.e., a list of term indices) based on the model, given the mean text length.
 int getDocumentCount()
          Get the number of documents in the corpus
 int getNumOfLevels()
           
 Distribution getTopicTermDistro(int iLevel, int iTopicNumber)
           
 int getVocabularySize()
          Get the vocabulary size
static void main(java.lang.String[] args)
           
 void performGibbs(int iIterations, int iBurnIn, int iThreads)
           
 java.lang.String printoutNormalizedTopicTerms(int iLevel, int iTopicNumber, int iFirstNWords, java.util.Map<java.lang.Integer,java.lang.String> mTermNumToTerm)
           
 java.lang.String printoutTopicTerms(int iLevel, int iTopicNumber, int iFirstNWords, java.util.Map<java.lang.Integer,java.lang.String> mTermNumToTerm)
           
 void setProgressIndicator(NotificationListener ProgressIndicator)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

documentTermMatrix

protected probabilisticmodels.Matrix2D documentTermMatrix

documentTopicMatrixPerLevel

protected probabilisticmodels.Matrix2D[] documentTopicMatrixPerLevel

topicAboveTopicMatrixPerLevel

protected probabilisticmodels.Matrix2D[] topicAboveTopicMatrixPerLevel

leafTopicTermMatrix

protected probabilisticmodels.Matrix2D leafTopicTermMatrix

numOfLevels

protected int numOfLevels

alpha

protected double alpha

beta

protected double beta

ProgressIndicator

protected NotificationListener ProgressIndicator
A NotificationListener that expects a double as its oParams object.

Constructor Detail

HierLDAGibbs

public HierLDAGibbs(int iNumOfLevels,
                    int[][] iaDocTermMatrix,
                    double dAlpha,
                    double dBeta)
Constructor method: Creates a new instance of HierLDAGibbs

Parameters:
iNumOfLevels - number of levels of the hierarchy
iaDocTermMatrix - the document-term matrix (the input)
dAlpha - the Dirichlet parameter alpha
dBeta - the Dirichlet parameter beta
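
The constructor takes the corpus as an int[][] document-term matrix. As a minimal sketch of how such a matrix could be prepared, the following assumes that rows are documents, columns are term indices, and entries are occurrence counts; the page itself does not state the layout. DocTermMatrixSketch, buildVocabulary and buildCounts are hypothetical names and are not part of probabilisticmodels.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of probabilisticmodels.
class DocTermMatrixSketch {

    /** Assigns each distinct term an index, in order of first appearance. */
    static Map<String, Integer> buildVocabulary(List<List<String>> docs) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (List<String> doc : docs) {
            for (String term : doc) {
                vocab.putIfAbsent(term, vocab.size());
            }
        }
        return vocab;
    }

    /** Counts term occurrences. Assumed layout: counts[document][termIndex]. */
    static int[][] buildCounts(List<List<String>> docs, Map<String, Integer> vocab) {
        int[][] counts = new int[docs.size()][vocab.size()];
        for (int d = 0; d < docs.size(); d++) {
            for (String term : docs.get(d)) {
                counts[d][vocab.get(term)]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("topic", "model", "topic"),
                Arrays.asList("gibbs", "model"));
        Map<String, Integer> vocab = buildVocabulary(docs);
        int[][] counts = buildCounts(docs, vocab);
        System.out.println(Arrays.deepToString(counts)); // prints [[2, 1, 0], [0, 1, 1]]
    }
}
```

Under the same assumption, the resulting array could be passed as iaDocTermMatrix. Inverting the vocabulary map would give a Map<Integer,String> of the shape that printoutTopicTerms accepts as mTermNumToTerm.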
Method Detail

getVocabularySize

public final int getVocabularySize()
Get the vocabulary size

Returns:
the vocabulary size

getDocumentCount

public final int getDocumentCount()
Get the number of documents in the corpus

Returns:
the number of documents

performGibbs

public void performGibbs(int iIterations,
                         int iBurnIn,
                         int iThreads)
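
This page does not document the parameters of performGibbs. In Gibbs samplers generally, an iteration is one full resampling sweep over the data, and burn-in is the number of initial sweeps whose state is discarded before statistics are collected. The sketch below illustrates those two ideas with a collapsed Gibbs sampler for flat (single-level) LDA. It is a named substitute, not the hierarchical sampler of this class: it takes documents as token lists rather than a count matrix, runs single-threaded, and every name in it (FlatLdaGibbsSketch, sweep, run) is hypothetical.

```java
import java.util.Random;

// Hypothetical illustration of collapsed Gibbs sampling for flat LDA.
// Not the HierLDAGibbs implementation; single-threaded by design.
class FlatLdaGibbsSketch {
    final int[][] docs;       // docs[d][i] = term index of the i-th token of document d
    final int numTopics;
    final int vocabSize;
    final double alpha;
    final double beta;
    final int[][] assignment; // current topic of each token
    final int[][] docTopic;   // tokens of document d assigned to topic k
    final int[][] topicTerm;  // occurrences of term w assigned to topic k
    final int[] topicTotal;   // tokens assigned to topic k
    final Random rng;

    FlatLdaGibbsSketch(int[][] docs, int numTopics, int vocabSize,
                       double alpha, double beta, long seed) {
        this.docs = docs;
        this.numTopics = numTopics;
        this.vocabSize = vocabSize;
        this.alpha = alpha;
        this.beta = beta;
        this.rng = new Random(seed);
        assignment = new int[docs.length][];
        docTopic = new int[docs.length][numTopics];
        topicTerm = new int[numTopics][vocabSize];
        topicTotal = new int[numTopics];
        for (int d = 0; d < docs.length; d++) {
            assignment[d] = new int[docs[d].length];
            for (int i = 0; i < docs[d].length; i++) {
                int k = rng.nextInt(numTopics); // random initial assignment
                assignment[d][i] = k;
                update(d, docs[d][i], k, 1);
            }
        }
    }

    private void update(int d, int w, int k, int delta) {
        docTopic[d][k] += delta;
        topicTerm[k][w] += delta;
        topicTotal[k] += delta;
    }

    /** One sweep: resample the topic of every token given all other assignments. */
    void sweep() {
        double[] p = new double[numTopics];
        for (int d = 0; d < docs.length; d++) {
            for (int i = 0; i < docs[d].length; i++) {
                int w = docs[d][i];
                update(d, w, assignment[d][i], -1); // remove this token from the counts
                double total = 0.0;
                for (int t = 0; t < numTopics; t++) {
                    p[t] = (docTopic[d][t] + alpha) * (topicTerm[t][w] + beta)
                            / (topicTotal[t] + vocabSize * beta);
                    total += p[t];
                }
                double u = rng.nextDouble() * total;
                int k = 0;
                double cumulative = p[0];
                while (u >= cumulative && k < numTopics - 1) {
                    k++;
                    cumulative += p[k];
                }
                assignment[d][i] = k;
                update(d, w, k, 1);
            }
        }
    }

    /** Runs the sweeps and averages topic-term counts over the sweeps after burn-in. */
    double[][] run(int iterations, int burnIn) {
        double[][] mean = new double[numTopics][vocabSize];
        int kept = 0;
        for (int it = 0; it < iterations; it++) {
            sweep();
            if (it < burnIn) {
                continue; // discard burn-in sweeps
            }
            kept++;
            for (int k = 0; k < numTopics; k++) {
                for (int w = 0; w < vocabSize; w++) {
                    mean[k][w] += topicTerm[k][w];
                }
            }
        }
        for (int k = 0; k < numTopics; k++) {
            for (int w = 0; w < vocabSize; w++) {
                if (kept > 0) {
                    mean[k][w] /= kept;
                }
            }
        }
        return mean;
    }
}
```

Calling new FlatLdaGibbsSketch(docs, 2, 4, 0.5, 0.1, 42L).run(20, 5) would perform 20 sweeps and average over the last 15. Whether performGibbs averages, keeps only the final state, or does something else is not stated here.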

printoutTopicTerms

public java.lang.String printoutTopicTerms(int iLevel,
                                           int iTopicNumber,
                                           int iFirstNWords,
                                           java.util.Map<java.lang.Integer,java.lang.String> mTermNumToTerm)

getTopicTermDistro

public Distribution getTopicTermDistro(int iLevel,
                                       int iTopicNumber)

printoutNormalizedTopicTerms

public java.lang.String printoutNormalizedTopicTerms(int iLevel,
                                                     int iTopicNumber,
                                                     int iFirstNWords,
                                                     java.util.Map<java.lang.Integer,java.lang.String> mTermNumToTerm)

calcTopicProbsUnderSuperTopic

public Distribution calcTopicProbsUnderSuperTopic(int iTopicsLevel,
                                                  int iSuperTopicIndex)

generateNextLeafTopic

public final int generateNextLeafTopic()
Returns the index of the next leaf topic, following the generative process defined by the model.

Returns:
The index of the leaf topic.
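
One plausible reading of "following the generative process" for a topic hierarchy is a top-down walk: draw a topic at the top level, then at each lower level draw a child topic conditioned on the topic just chosen, until a leaf is reached. The sketch below illustrates that reading only. The weight arrays and the names LeafTopicSamplerSketch, sampleIndex and sampleLeafTopic are assumptions and do not describe how this class stores or uses its matrices.

```java
import java.util.Random;

// Hypothetical illustration of top-down sampling through a topic hierarchy.
class LeafTopicSamplerSketch {

    /** Draws an index with probability proportional to its non-negative weight. */
    static int sampleIndex(double[] weights, Random rng) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        double u = rng.nextDouble() * total;
        double cumulative = 0.0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (u < cumulative) {
                return i;
            }
        }
        return weights.length - 1; // guard against floating-point rounding
    }

    /**
     * topLevel[t] is the weight of top-level topic t.
     * childGivenParent[l][p][c] is the weight of child topic c at level l + 1
     * given parent topic p at level l (assumed layout).
     */
    static int sampleLeafTopic(double[] topLevel, double[][][] childGivenParent, Random rng) {
        int topic = sampleIndex(topLevel, rng);
        for (double[][] level : childGivenParent) {
            topic = sampleIndex(level[topic], rng);
        }
        return topic; // index within the lowest level
    }
}
```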

generateText

public java.util.List generateText(int iMeanSize)
Generates a text (i.e., a list of term indices) based on the model, given the mean text length.

Parameters:
iMeanSize - The mean text length in terms.
Returns:
A list of term indices.
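
As a rough sketch of what generating a text around a mean length could look like, the following draws the length from a Poisson distribution and then, for each position, picks a leaf topic and a term from that topic. The Poisson choice, the IntSupplier standing in for generateNextLeafTopic, the termGivenLeaf table, and the class name TextGenerationSketch are all assumptions; this page does not say how the length or the terms are actually drawn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.IntSupplier;

// Hypothetical illustration; not the generateText implementation of this class.
class TextGenerationSketch {

    /** Knuth's multiplication method; suitable for moderate means only. */
    static int samplePoisson(double mean, Random rng) {
        double limit = Math.exp(-mean);
        double product = rng.nextDouble();
        int count = 0;
        while (product > limit) {
            count++;
            product *= rng.nextDouble();
        }
        return count;
    }

    /** Draws a term index with probability proportional to its weight. */
    static int sampleTerm(double[] termWeights, Random rng) {
        double total = 0.0;
        for (double w : termWeights) {
            total += w;
        }
        double u = rng.nextDouble() * total;
        double cumulative = 0.0;
        for (int i = 0; i < termWeights.length; i++) {
            cumulative += termWeights[i];
            if (u < cumulative) {
                return i;
            }
        }
        return termWeights.length - 1; // guard against floating-point rounding
    }

    /**
     * nextLeaf supplies a leaf topic index per token (stand-in for a leaf-topic draw).
     * termGivenLeaf[leaf][term] holds the term weights of each leaf topic (assumed layout).
     */
    static List<Integer> generateText(int meanSize, IntSupplier nextLeaf,
                                      double[][] termGivenLeaf, Random rng) {
        int length = samplePoisson(meanSize, rng); // assumed length distribution
        List<Integer> text = new ArrayList<>(length);
        for (int i = 0; i < length; i++) {
            int leaf = nextLeaf.getAsInt();
            text.add(sampleTerm(termGivenLeaf[leaf], rng));
        }
        return text;
    }
}
```

The result is a list of term indices whose length varies from call to call around meanSize, which matches the documented return type in shape only.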

main

public static void main(java.lang.String[] args)

getNumOfLevels

public int getNumOfLevels()

setProgressIndicator

public void setProgressIndicator(NotificationListener ProgressIndicator)