Questions tagged [topic-models]

A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Intuitively, given that a document is about a particular topic, one would expect particular words to appear in the document more or less frequently: "dog" and "bone" will appear more often in documents about dogs, while "cat" and "meow" will appear more often in documents about cats (source: Wikipedia).
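
The intuition above can be made concrete with a small sketch. It assumes a document's word probabilities are a weighted mixture of per-topic word distributions, as in mixture-style topic models such as LDA. The topic names, the vocabulary, the probability values, and the helper `word_probs` are all made up for illustration.

```python
# Toy topic-word distributions P(word | topic); the numbers are illustrative only.
topics = {
    "dogs": {"dog": 0.5, "bone": 0.3, "cat": 0.1, "meow": 0.1},
    "cats": {"dog": 0.1, "bone": 0.1, "cat": 0.5, "meow": 0.3},
}

def word_probs(topic_mix):
    """Return P(word | document) = sum over topics of P(topic | document) * P(word | topic)."""
    vocab = next(iter(topics.values())).keys()
    return {
        word: sum(weight * topics[topic][word] for topic, weight in topic_mix.items())
        for word in vocab
    }

# A document mostly about dogs and a document mostly about cats.
dog_doc = word_probs({"dogs": 0.9, "cats": 0.1})
cat_doc = word_probs({"dogs": 0.1, "cats": 0.9})

print(dog_doc)
print(cat_doc)
```

Under these assumed numbers, "dog" is more probable in the dog-heavy document than in the cat-heavy one, and "meow" shows the opposite pattern.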

Software for topic modelling include

242 questions
7
votes
1 answer

Limitations of LDA (latent Dirichlet allocation)

I'd like a list of the limitations of LDA. I know that LDA does not work very well on collections of short documents, such as a set of tweets. Are there other known limitations of LDA? A reference that lists the limitations would be preferable.
4
votes
1 answer

Can LDA assign more than one topic for a word?

I have just started reading about Latent Dirichlet Allocation (LDA) and want to apply it to my project. May I know if LDA is able to assign more than one topic to a word? For example, Article A talks about "river banks" while Article B talks about…
4
votes
1 answer

Basic Question about the Latent Dirichlet Allocation Generative Model

Here is the LDA generative model. The $\alpha$ and $\beta$ nodes represent the parameters for two Dirichlet distributions. The $\theta$ and $\phi$ nodes represent the parameters for two multinomial distributions. My question is about the $Z$…
user1893354
  • 1,875
  • 4
  • 18
  • 27
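
For reference, the generative process that the nodes in the question above describe can be sketched as follows. This is a minimal sketch, not code from the question or its answer: it assumes symmetric Dirichlet priors, a fixed document length, and a toy vocabulary, and the function names `sample_dirichlet` and `generate_corpus` are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(n_docs, n_words, n_topics, vocab, alpha, beta, rng):
    # phi[k]: word distribution of topic k, drawn once per topic from Dirichlet(beta).
    phi = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    corpus = []
    for _ in range(n_docs):
        # theta: topic distribution of this document, drawn from Dirichlet(alpha).
        theta = sample_dirichlet([alpha] * n_topics, rng)
        doc = []
        for _ in range(n_words):
            z = rng.choices(range(n_topics), weights=theta)[0]  # topic assignment Z
            w = rng.choices(vocab, weights=phi[z])[0]           # observed word W
            doc.append((z, w))
        corpus.append((theta, doc))
    return phi, corpus

rng = random.Random(0)  # fixed seed so repeated runs give the same sample
vocab = ["dog", "bone", "cat", "meow"]
phi, corpus = generate_corpus(3, 8, 2, vocab, 0.5, 0.5, rng)
for theta, doc in corpus:
    print([round(t, 2) for t in theta], doc)
```

In this sketch each word position gets its own topic assignment `z`, drawn from the document's `theta`, and the word itself is then drawn from `phi[z]`.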
3
votes
0 answers

How does word co-occurrence link two words from the same topic together in an LDA topic model?

The essence of LDA is word co-occurrence, and I want to know why. Also, does word co-occurrence mean that two words appear together in a particular document, or just that they both appear somewhere in the document collection? Thank you all in advance.
Yinqing Xu
  • 31
  • 1
3
votes
1 answer

Topic modeling for grouped data

Is there a variation of Latent Dirichlet Allocation (LDA) for grouped data? As an example, consider the corpus of all Yahoo Q&A (where, for simplicity, we treat a question lumped together with all of its answers as a document). While…
abhinavkulkarni
  • 876
  • 1
  • 8
  • 15
3
votes
1 answer

Latent Dirichlet Allocation - basic question about generative process

I have tried to understand Blei's 2009 paper "Topic Models" and have read several websites; this blog entry is the simplest explanation I could find. I still don't understand how LDA works. According to the blog entry mentioned previously, LDA consists…
J. Bend
  • 131
  • 2
3
votes
0 answers

Calculation of word-word similarity in an LDA topic model

I'm using Graphlab's LDA functionality, and I get two matrices as the result: the document-topic matrix (i.e. P(Topic|Document) for each document) and the topic-word matrix (i.e. P(Word|Topic) for each topic). What I want to accomplish is…
2
votes
0 answers

Dynamic topic models, the proper structure for alpha

In the somewhat inscrutable papers by Blei et al., Dynamic Topic Models and Continuous-time Dynamic Topic Models, there's some ambiguity regarding the role of alpha. In Dynamic Topic Models, alpha is a logistic normal prior which evolves in discrete…
Set
  • 1,453
2
votes
2 answers

Evaluation of LDA

I am looking for a C++/Java implementation for computing the perplexity of held-out documents in latent Dirichlet allocation. Can anybody suggest useful links?
1
vote
2 answers

Iteration parameter in latent Dirichlet allocation model

I want to find 24 topics in 800,000 documents using an LDA model, but how many iterations should I use? It is extremely slow when the parameter is large, like 3000. Are there any strategies to ensure stability? Seems giving the iteration a…
hw.fu
  • 11
  • 2
1
vote
0 answers

Structural topic modeling - compare groups (issue ranking)

I am trying to compare two groups (liberal and conservative news articles) after running a structural topic model. I first fit a structural topic model to the whole set of news articles and got a document-topic matrix. Now, I am trying to see the…
yul15
  • 11
1
vote
1 answer

How to use a hierarchical Dirichlet process to predict a new word's probability

Assuming that I have already obtained a document's topic distribution and the topic-word distributions, how do I predict the probability that the document will contain some word $w_n$?
DuFei
  • 135
1
vote
0 answers

How can I assign topics to a new document using the same LDA model?

If I want to test a new document using an LDA model, what are the inputs to the model? Is there Java code that can help me?
suha
  • 11
1
vote
0 answers

Topic analysis with rarely occurring topic and small document corpora. Which technique should I use?

I need to perform a topic analysis on various corpora of documents and I need a procedure that can be applied to all of these corpora independently in a standard way. These are the characteristics of the corpora: the number of documents in each…
1
vote
0 answers

How do I deal with new words in held out documents when evaluating generative models?

I'm trying out some different generative models and I want to evaluate them by holding out some documents, training the models on the other documents, and then calculating the perplexity on the held-out documents. However, I'm not sure what to do when…