Blog of Sara Jakša

My MeiCogSci Presentation on the Topic Modeling of Cognitive Science Abstracts

I am a bit behind on my blog posts. The fact that I am only now writing about a presentation I gave about a month ago tells me that I have not written anything for at least a month. I guess it was a busy month. :) Last month I gave a presentation on topic modeling of the abstracts that were published at this very conference over the last decade or so. That can give us a pretty good indication of what people in our study program find interesting. It was a fun project that probably took about two months of my life.

I spent a lot of time playing around with different models. In the end, I used the model with 21 topics. Looking back now, I think I would have gotten better results with fewer topics. That is my intuition, because I also played a lot with models with different numbers of topics, and I think the ones with 10-15 topics were a bit clearer and more straightforward. But I ended up using the model that was best according to the numerical indicators. Also, for most of that time, I was really annoyed that three topics ended up together: constructivism, sense-making and empirical phenomenology. To me, these were separate topics. Or at least more separate than reinforcement learning and neural networks, which got divided.

Well, the model was right and I was wrong. The feedback that I got was that this is how it should be: the constructivism-phenomenology group belongs together. Apparently Varela, who is one of the most prominent empirical phenomenologists, worked on both constructivism and sense-making. At the presentation, I also found out that people have strong opinions about why neural networks and reinforcement learning should be separate. The first explanation was that reinforcement learning is just one of the methods used with neural networks, but otherwise they are separate. The second has to do with the explainability of the model? I don't think I completely understood that explanation.

Well, all this feedback came too late, so I ended up with a model with too many topics. Maybe what I can learn from this is that if something appears no matter the preprocessing and the number of topics selected, it is probably right, so don't mess up the model trying to correct it.

The visualization of the final model can be found on my page. I also included a simplified model as a web app, so everybody can check the topics of any text (but if it is used on texts other than cognitive science ones, I cannot guarantee any sensible results). Feel free to play with it.

Now for some interesting results that I got. They are probably a lot more interesting for people connected to the study program than for anybody else. The analysis can be found on my github.

The most popular topics are constructivism, society, learning, decision making, neuroscience, language, perception, modeling, movement, neural networks and reinforcement learning. I listed so many because of the differences described next.

There are differences in which topics are popular in which place. The study program currently includes four universities: Ljubljana, Vienna, Bratislava and Budapest. So I also analyzed one of the years (2015) in order to see which topics are popular in which places. There were not enough abstracts from Budapest in that year, so I only used the other three.

For Ljubljana, the topics of most interest were constructivism, learning and neuroscience. In Vienna, the topics were society, decision making, constructivism and perception. And in Bratislava they were reinforcement learning, learning, modeling and language. The general perception is that you go to Ljubljana if you are interested in first-person research and neuroscience (which shows in the data), to Bratislava if you are interested in computational modeling and maybe language (which also shows), and to Vienna if you want something else or have no idea what you want to do, since it is supposed to have the most variety.
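A ranking like this can be sketched by averaging the per-abstract topic weights within each university and taking the top few. This is only a minimal illustration, not the code from the analysis: the function name `top_topics_by_group`, the dictionary format for topic weights and the toy numbers are all made up for the example.

```python
from collections import defaultdict

def top_topics_by_group(doc_topics, groups, k=3):
    """Average topic weights per group and return the k highest-weighted topics.

    doc_topics: one dict per abstract, mapping topic name -> weight
                (hypothetical format, e.g. exported from a topic model).
    groups:     one group label (here: university) per abstract.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for weights, group in zip(doc_topics, groups):
        counts[group] += 1
        for topic, weight in weights.items():
            sums[group][topic] += weight
    result = {}
    for group, topic_sums in sums.items():
        # Topics missing from an abstract count as weight 0 in the mean.
        means = {t: s / counts[group] for t, s in topic_sums.items()}
        result[group] = sorted(means, key=means.get, reverse=True)[:k]
    return result

# Made-up toy data, not the real abstracts.
doc_topics = [
    {"constructivism": 0.7, "neuroscience": 0.2, "modeling": 0.1},
    {"constructivism": 0.3, "neuroscience": 0.6, "modeling": 0.1},
    {"modeling": 0.8, "language": 0.2},
]
universities = ["Ljubljana", "Ljubljana", "Bratislava"]
top = top_topics_by_group(doc_topics, universities, k=2)
print(top)
```

With only a handful of abstracts per place the averages get noisy, which is the same reason Budapest had to be left out.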

In Ljubljana, we hear a lot about the connection between first-person and third-person research. So it was interesting to see that neuroscience and constructivism have the least collaboration. What can also be seen is that the amount of interdisciplinarity has been increasing over the years.
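One simple way to put a number on collaboration between topics is to count how often two topics show up in the same abstract. The sketch below assumes that a topic counts as present when its weight passes a threshold; the function name `topic_cooccurrence`, the threshold of 0.2 and the toy data are assumptions for illustration, and the actual analysis may have measured this differently.

```python
from collections import Counter
from itertools import combinations

def topic_cooccurrence(doc_topics, threshold=0.2):
    """Count, for every pair of topics, the abstracts where both are present.

    A topic is treated as present when its weight is >= threshold
    (an assumed cut-off, not one taken from the original analysis).
    """
    pairs = Counter()
    for weights in doc_topics:
        present = sorted(t for t, w in weights.items() if w >= threshold)
        # Sorting makes each pair appear in one canonical (alphabetical) order.
        pairs.update(combinations(present, 2))
    return pairs

# Made-up toy data, not the real abstracts.
docs = [
    {"neuroscience": 0.5, "perception": 0.4, "constructivism": 0.1},
    {"neuroscience": 0.6, "perception": 0.3, "language": 0.1},
    {"constructivism": 0.7, "language": 0.3},
]
cooc = topic_cooccurrence(docs)
print(cooc.most_common())
```

Pairs that never appear together simply have a count of zero, and computing the counts per year would show whether mixed-topic abstracts are becoming more common.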

I also checked which topics humanize the participants and which do not (as one of the people in the audience described it). I only checked whether the abstracts used the word subject or participant. The topics with the most human participants were the study of perception, attention, non-typicality, categorization, neuroscience and decision making. The humanizing ones (using participants) were the study of language, decision making and attention. The others (using subjects) were neuroscience, neural networks and the study of pitch.
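The word check itself is easy to sketch. The version below assumes each abstract has already been assigned one dominant topic and just counts whole-word matches; the function name `count_terms_by_topic`, the input format and the example sentences are made up for illustration.

```python
import re
from collections import defaultdict

# Whole-word, case-insensitive matches for singular and plural forms.
SUBJECT = re.compile(r"\bsubjects?\b", re.IGNORECASE)
PARTICIPANT = re.compile(r"\bparticipants?\b", re.IGNORECASE)

def count_terms_by_topic(abstracts):
    """Count 'subject(s)' and 'participant(s)' per topic.

    abstracts: iterable of (dominant_topic, text) pairs (hypothetical format).
    Note: 'subject' can also mean 'theme', so these counts are only a rough signal.
    """
    counts = defaultdict(lambda: {"subject": 0, "participant": 0})
    for topic, text in abstracts:
        counts[topic]["subject"] += len(SUBJECT.findall(text))
        counts[topic]["participant"] += len(PARTICIPANT.findall(text))
    return counts

# Made-up example sentences, not real abstracts.
abstracts = [
    ("neuroscience", "Twelve subjects were scanned. Each subject saw 40 images."),
    ("language", "Participants read sentences; every participant was a native speaker."),
]
term_counts = count_terms_by_topic(abstracts)
print(dict(term_counts))
```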

I also checked the differences in personality. More agreeable were the people studying health, non-typicality and decision making. Less agreeable were the people studying reinforcement learning, neural networks and systems. More neurotic were the people studying reinforcement learning, systems and tasks. Less neurotic were the people studying non-typicality, health and pitch. The most extroverted researchers were the ones studying decision making, society and attention. The least extroverted researchers studied neuroscience, TMS and health.

So these are some of the results that I found in this data set. There are still many interesting questions to ask, but I think I will take a bit of a break and maybe return to these topic models later.