Blog of Sara Jakša

Different Social Understanding

I am right now going through my philosophy of mind notes (I can't believe that I am only doing it now - but it does give me perspective, to pick out only the most interesting things). Toward the end of the semester, we also dealt with different theories of how we act in social interactions. So here I am going to try to write down, from sparse notes and memories, what these different theories were.

The first theory is the folk theory of interaction. This one understands people based on their beliefs. Each person has beliefs and desires, and we can predict intentions from them. So, a person is seen walking quickly, so they must be in a hurry, that sort of thing.

In that way, it is a bit similar to the theory of mind, where we use systematic models and law-like knowledge of people in order to make predictions. So, we know a person's opinion, so we expect them to act accordingly, that sort of way.

The next one is simulation theory, where we simulate what is going to happen. This can also happen subconsciously. Emotions are used, and the main question is what, not how. So: what would I do?

The next one is a sort of embedded theory. Normal children learn interaction skills through responses, since understanding the situation is part of the interaction. That is how we can immediately recognize that a smile is joy, in a first-person way. Here, not understanding another person is a feeling, not a lack of knowledge. But there is a default assumption that we are similar and act in accordance with social norms.

The last one is from enactivism. It is the structure of the environment that makes people predictable. We know how people will act at a funeral or while waiting for the bus, and so on. Here misunderstanding means that there is a lack of a feeling of mutual reciprocity. It uses narrative building to create a story. We also create beliefs in a sense-making activity through interaction with other people.

I guess that, at least phenomenologically, we use all of them in some situations. Which makes it so much harder to understand.

Change LaTeX File to Word

I have recently tried to change my LaTeX file into a doc. I needed to send my economics thesis to somebody, and they did not know what to do with the LaTeX file. The first time, I had sent the PDF, but they prefer making comments in Word.

So I figured that I was just going to transform the PDF to Word, which I had already done once. This time, the results were not pretty, so I tried to find another way.

The next option was pandoc, which has the ability to transform LaTeX to docx, but the first time I tried, there were no citations (which is a big no-no for a master thesis). So I tried to include the citations.

When I was doing the transformation the first time, it just hung there, and nothing happened. When I checked my bash history just now, to copy the command that did not work, I figured out why it did not work. The following one did work this time:

pandoc texfile.tex --bibliography=bibfile.bib --csl=style.csl -o finalfile.docx

Which means that I sent the wrong version to my mentor again. That is embarrassing. Really embarrassing.

Well, while I was trying to figure out why it was just hanging (I had forgotten to include the LaTeX file), I checked the internet. One thing that people noted was that the bib file should be ASCII only. Well, mine certainly was not. So I had to find a way to find these non-ASCII characters. I found the following command somewhere, which prints every line with non-ASCII characters and highlights them:

grep --color='auto' -P -n "[^\x00-\x7f]" filename

The --color option tells grep when to highlight things (always, never or auto), -P means that the expression is a Perl regular expression, and -n also prints line numbers, so things are easier to find in the file.
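If your grep does not support -P (the BSD grep that ships with macOS does not, for example), a few lines of Python do the same job. This is just a sketch; the function name is mine for illustration:

```python
def non_ascii_lines(lines):
    """Return (line_number, line) pairs for lines containing non-ASCII characters."""
    return [(number, line.rstrip("\n"))
            for number, line in enumerate(lines, start=1)
            if any(ord(ch) > 127 for ch in line)]

# e.g.: with open("bibfile.bib") as f: print(non_ascii_lines(f.readlines()))
print(non_ascii_lines(["plain ASCII line", "Jakša"]))
```

It does not highlight the characters like grep does, but the line numbers are enough to find and fix them.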

So, if anybody wants to transform LaTeX to Word, this is a way to do it.

Analysis of My Citations for Economic Master Thesis

The Jupyter Notebook can also be found here: My_Citations_For_Economic_Master_Thesis

I have finally sent the final version of my economic master thesis to my mentor. While I was doing this, I decided to try to analyse what kind of citations I was using in my master thesis.

Importing the libraries

import os
import re
import pandas

Regex patterns

citations_re = r"cite{.+?}"
re_entry = r"@\w*{.+?timestamp.+?}"
re_type = r"@\w*{"
re_journal = r"journal[\s]+?=[\s]+?{.+?}"
re_name = r"@\w*{.+?,"
re_year = r"year.+?=.+?{.+?\d+?.+?}"
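A quick sanity check of the citation pattern (the citation keys below are made up, just for illustration): because the pattern does not anchor on the backslash, the same expression also catches the cite inside \parencite and \textcite:

```python
import re

citations_re = r"cite{.+?}"

# made-up citation keys, just to show what the pattern matches
sample = r"as \textcite{smith2018} showed \parencite{doe2017, roe2016}"
print(re.findall(citations_re, sample))
```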

Get all citations from tex files

In this stage, what I did was go over all my tex files and pull out all the citations (\parencite{}, \cite{}, \textcite{}).

all_citations_in_my_work = set()
for filename in os.listdir("files"):
    with open(os.path.join("files", filename)) as f:
        data = f.readlines()
        data = " ".join(data)
        all_citations = re.findall(citations_re, data)
        for s in all_citations:
            s = s.replace("parencite{", "")
            s = s.replace("textcite{", "")
            s = s.replace("cite{", "")
            s = s.replace(" ", "")
            s = s.replace("}", "")
            # one cite command can contain several comma-separated keys
            for c in s.split(","):
                all_citations_in_my_work.add(c)

I used 157 different citations in my work, which I think is not bad for a master thesis.


Preparing bib for parsing

In the next stage, I parsed the bib files, so that I could search them based on what I wanted to find.

lines = ""
for filename in os.listdir("bib"):
    with open(os.path.join("bib", filename)) as f:
        data = f.readlines()
        data = " ".join(data)
        lines = lines + data
lines = lines.replace("\n", " ")
lines = re.findall(re_entry, lines)

From what scientific journals were my articles

In the next step, I parsed the data to try to figure out what scientific journals I was using.

my_journuals = dict()
for line in lines:
    name = re.findall(re_name, line)
    try:
        name = name[0].split("{")[1].replace(",", "")
    except IndexError:
        continue
    if name in all_citations_in_my_work:
        t = re.findall(re_type, line)
        t = t[0][1:-1]
        if t.lower().strip() == "article":
            j = re.findall(re_journal, line)
            if j:
                j = j[0].split("{")[1].replace("}", "")
                if j not in my_journuals:
                    my_journuals[j] = 0
                my_journuals[j] += 1

Here I first counted the number of articles.

articles = 0
for j, n in my_journuals.items():
    articles += n

And then I counted the number of journals that I was using.
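The counting step itself did not make it into the excerpt above; it boils down to taking the number of distinct keys, something like this (the small dictionary is made up, standing in for the real my_journuals):

```python
def journal_stats(journal_counts):
    """Return (number of journals, total articles, articles per journal)."""
    total = sum(journal_counts.values())
    return len(journal_counts), total, total / len(journal_counts)

# made-up counts standing in for my_journuals
print(journal_stats({"Computers in Human Behavior": 13,
                     "Personality and Individual Differences": 6}))
```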


So I took about 1.5 articles from each journal.


I then tried to see if there were any journals that I used more. I used Computers in Human Behavior the most. You can see below which ones I used more than twice.

my_journuals = pandas.DataFrame.from_dict(my_journuals, orient="index", columns=["Count"])
my_journuals.sort_values("Count", ascending=False, inplace=True)
my_journuals.reset_index(level=0, inplace=True)
                                    index  Count
0             Computers in Human Behavior     13
1  Personality and Individual Differences      6
2             Annual Review of Psychology      5
3                  Social Media + Society      4
4           Information Systems Frontiers      3

What type were my sources

Next I wanted to see what the different types of my sources were. Here is the code.

types = dict()
for line in lines:
    name = re.findall(re_name, line)
    name = name[0].split("{")[1].replace(",", "")
    if name in all_citations_in_my_work:
        t = re.findall(re_type, line)
        t = t[0][1:-1]
        t = t.lower()
        if t not in types:
            types[t] = 0
        types[t] += 1

As you can see, articles were the most frequent (99). Books were less so, even combining whole books and chapters (18). The rest were used 5 times or fewer.

{'online': 2,
 'www': 1,
 'electronic': 1,
 'report': 3,
 'manual': 1,
 'inproceedings': 5,
 'incollection': 5,
 'book': 13,
 'article': 99,
 'thesis': 2}

From what year were my sources

Next I tried to see from which years the sources that I used were.

my_years = dict()
for line in lines:
    name = re.findall(re_name, line)
    name = name[0].split("{")[1].replace(",", "")
    if name in all_citations_in_my_work:
        t = re.findall(re_year, line)
        if t:
            t = t[0].split("{")[1][:-1]
            if t not in my_years:
                my_years[t] = 0
            my_years[t] += 1
my_years = pandas.DataFrame.from_dict(my_years, orient="index", columns=["Count"])
my_years.sort_values("Count", ascending=False, inplace=True)
my_years.reset_index(level=0, inplace=True)
my_years.sort_values("index", ascending=False, inplace=True)

I used 1 source from this year. It seems that most of my sources were recent. The most sources were from last year, then the year before, then from four years ago (I am not sure why there are not more sources from 2016).

Looking further into the past, the oldest reference was from 1970. I used 4 sources from the '70s, 1 from the '80s (so from before I was born), 3 from the '90s and an additional 33 from the '00s. All the rest are from the time when I was already attending university.

index Count
18 2019 1
0 2018 26
1 2017 15
7 2016 6
2 2015 12
9 2014 5
3 2013 8
6 2012 7
10 2011 5
8 2010 5
4 2009 8
13 2008 3
5 2007 7
22 2006 1
12 2005 3
11 2004 4
26 2003 1
14 2002 3
24 2001 1
16 2000 2
17 1999 1
23 1991 1
19 1990 1
25 1988 1
15 1977 2
21 1973 1
20 1970 1
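The per-decade counts above can be computed straight from the year dictionary; a minimal sketch, checked here against the three 1970s rows of the table:

```python
def decade_counts(year_counts):
    """Group a {year: count} mapping into counts per decade."""
    decades = {}
    for year, count in year_counts.items():
        decade = int(year) // 10 * 10
        decades[decade] = decades.get(decade, 0) + count
    return decades

# the three 1970s entries from the table above
print(decade_counts({"1970": 1, "1973": 1, "1977": 2}))
```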

Spending Based on Personality Can Increase Your Happiness

I have recently read a pretty interesting article. I think by now it is quite widespread knowledge that more money does not make one happy. It helps until about $60,000 (based on US standards), and after that it has minimal effect. So it was pretty interesting to see a scientific article that had a different thesis.

The article Money Buys Happiness When Spending Fits Our Personality talks about how people with different personalities spend, how well this predicts their happiness level, and whether this could be manipulated to increase people's happiness. The quick answers were: people with different personalities spend money on different things; the more a person spends in accordance with their personality, the happier they are (which was a stronger predictor than total income or total spending); and this can be manipulated by making people spend money on certain things.

Let's get first to the point of personal relevance: how can this help me spend money to make me happier? Well, I used the aggregate data from the article and created a small program that helps you figure this out. You just input the percentile scores for each Big Five trait, and it will generate a score for each activity in the dataset. Since there are a lot of them, you can set a threshold, so only activities with scores higher than a certain value are shown.
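The scoring idea can be sketched like this. The two weight rows below are the Books and Gambling rows from the correlation table later in the post; centering the percentiles around 50 is my own assumption about how such trait scores could be combined, not necessarily what the real program does:

```python
# trait order: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism
# (two example rows from the article's table; the real program uses all of them)
category_weights = {
    "Books":    [1.71, 1.92, -0.82, 1.53, -1.39],
    "Gambling": [1.55, -2.08, 2.33, -1.81, 1.98],
}

def activity_scores(percentiles, threshold=0.0):
    """Score each spending category against Big Five percentiles (0-100).

    Percentiles are centered around 50 (my assumption), so being above or
    below average pushes the score in the direction of the correlation.
    """
    scores = {}
    for category, weights in category_weights.items():
        score = sum((p - 50) / 50 * w for p, w in zip(percentiles, weights))
        if score > threshold:
            scores[category] = round(score, 2)
    return scores

# e.g. an open, conscientious, agreeable introvert low in neuroticism
print(activity_scores([90, 80, 20, 70, 10]))
```

For that hypothetical profile, books score well above zero while gambling falls below the threshold and is filtered out.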

If you don't have Big Five percentile scores, there are a couple of places where you can get them. I recommend the SAPA test, but this one can be quite long. If you have a long text written in English, or a Twitter account where you tweet in English, you can also just put it in the IBM Personality Tool, and it will generate a personality profile for you. But any test that you can find on the internet that gives you results from 0 to 100 will work. Just search for Big Five personality tests. Or any app that will give you the same type of results from some other data. I know that there exists at least one where it is calculated from a Facebook account.


Based on my results, spending money on books would make me happier. Which I agree with, as most of my money, after spending on necessities (rent, food, ...), goes to books, DVDs, courses and conferences. Books are one of the important parts.

Now that I got my programming self-plug out of the way (I still hope it was useful to somebody), I will return to the article. I will first copy the table of different categories and their correlations with different personality traits. Here you can see some of the connections.

Category Openness Conscientiousness Extraversion Agreeableness Neuroticism
Accountants’ fees −1.81 2.02 −1.40 −0.68 −0.62
Advertising services 1.98 0.70 2.04 −0.04 0.34
Airports and duty-free shops −0.50 0.96 0.34 −0.18 −0.02
Arts and crafts 2.51 0.20 1.05 1.71 −0.46
Bakers and confectioners 1.45 1.59 0.86 1.41 −0.80
Books 1.71 1.92 −0.82 1.53 −1.39
Cable and satellite TV 0.48 0.00 1.29 −0.17 0.14
Car rentals −0.53 1.39 −0.06 0.31 −0.96
Caravans and camping 1.65 0.60 1.51 1.00 −0.64
Catalogue and bargain stores −0.34 −0.27 0.35 0.54 −0.21
Charities −0.35 1.65 0.10 2.31 −1.39
Cinemas 2.30 0.22 1.75 0.71 −0.02
Clothes 0.83 0.44 0.96 0.89 −0.44
Coffee shops 0.89 1.24 0.45 1.79 −1.23
Computers and technology 1.36 2.05 0.28 0.19 −1.00
Confectioners and tobacconists 0.75 0.21 0.77 0.42 −0.06
Days out and tourism 2.19 0.57 2.25 1.10 −0.28
Dental care −1.25 1.79 −0.59 0.32 −0.59
Department stores −0.30 1.28 0.70 0.57 −0.62
Digital 1.55 1.05 0.77 0.02 −0.45
Discount stores −0.17 −0.42 0.32 0.28 0.19
DIY projects 2.22 1.37 1.20 0.98 −0.54
Eating out: pubs 1.35 −0.41 2.22 0.40 0.48
Eating out: restaurants 1.56 0.44 1.74 0.91 −0.39
Entertainment 2.67 −0.43 2.51 0.31 0.49
Family clothes −0.28 0.43 0.00 1.16 −0.96
Florists 1.69 1.38 1.13 1.87 −0.98
Foreign travel 2.54 0.65 2.15 0.85 −0.11
Gambling 1.55 −2.08 2.33 −1.81 1.98
Gardening 0.59 1.75 −0.73 1.94 −1.59
Gift shops 0.83 0.94 0.55 1.74 −0.94
Hair and beauty 1.91 0.31 1.49 0.85 0.22
Hardware −0.78 1.73 −0.61 0.04 −1.22
Health and fitness 0.32 2.22 1.29 1.00 −0.93
Health insurance −1.61 1.52 −1.11 −0.16 −0.50
Home furnishing 0.63 1.48 0.17 1.38 −1.22
Home insurance −2.05 2.40 −1.46 0.33 −1.48
Hotels −0.16 1.69 0.31 1.55 −1.63
Information technology 0.93 1.36 0.33 0.15 −0.80
Jewelry 1.60 0.73 1.43 0.96 −0.61
Life insurance −1.30 2.21 −1.02 1.11 −1.25
Mobile telephone 1.02 1.33 1.65 0.33 −0.13
Motor sports 1.34 0.09 2.32 −0.55 0.82
Music 2.61 0.12 2.33 0.94 0.15
Newsagents −0.22 0.76 1.06 −0.29 0.12
Pets 1.14 0.08 2.04 1.98 0.24
Photography 2.33 0.69 1.44 1.09 −0.33
Residential mortgages −2.10 1.98 −1.40 −0.48 −0.85
Shoe shops 0.40 1.19 0.43 0.58 −0.77
Sports 1.44 1.30 2.24 −0.41 0.77
Stationery −0.14 1.98 −0.78 1.51 −1.63
Subscriptions −0.43 1.42 −0.26 0.44 −0.86
Supermarkets −0.69 1.27 0.51 0.58 −0.73
Takeout food 0.84 −0.07 1.16 0.23 −0.19
Toys and hobbies 2.19 −0.90 1.94 0.78 −0.06
Traffic fines −2.25 0.91 −0.58 −2.33 1.34
Travel 2.51 0.24 2.37 1.18 −0.20
TV license −0.17 1.29 0.26 −0.33 −0.39
Unions and subscriptions −1.04 1.26 0.42 −0.58 0.25

One of the experiments was also quite interesting. They recruited introverts and extroverts. Then they gave half of each group a voucher for a book and half of each group a voucher for a drink in a bar. When they were making the purchase, they asked them about their happiness. The extroverts were satisfied with both, but there was a large difference with the introverts. They were a lot happier with the book.

How can this help us? Well, for one, if you have no idea what to give somebody as a gift, and you don't want to ask them, it can help you choose something that they might be happier with. It can help you see if you are spending more than average on something that makes you unhappy, and you can try spending less, to see if that makes you happier. You can increase the spending on things that make you happier. I don't know, I think there are a lot of ways this can be used, just like most self-knowledge tidbits.

I have also added the Jupyter Notebook with the analysis.

Profiling from Blogs and from Social Media

On the first of April, I presented my cognitive science master thesis topic in class. The slides can be found here (in Slovenian). A very short summary would be that I am researching individual differences in sharing opinions on social media.

After the presentation, I was talking to my classmates. One of them asked whether this means that I am more careful with what I post on the internet. My reply was that I am posting on my blog (the one that you are reading right here). And I don't mind if anybody tries to analyse me based on this.

But on the other hand, I don't really post things on social media (I really need to delete the last remaining accounts that I have). And for anything that I can get in stores here in Ljubljana, I am using cash. So in a way, I am more careful about what data I am leaving behind, just in a different way.

I also studied business informatics, and this gave me a glimpse of what people can do with data. Once the data is cleaned and sitting in databases/tables, there is a lot of information that can be gleaned from comparing the users. That is how basic recommendation systems work. You find people, or groups of people, that have similar evaluations of the same works. Then you check which other works these people also rated highly, and recommend them to the people that have not rated them yet.
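A toy version of that idea, with a hypothetical ratings table (all names and ratings made up):

```python
# hypothetical user -> {item: rating} table
ratings = {
    "ana":  {"Dune": 5, "Solaris": 4, "Hyperion": 5},
    "bor":  {"Dune": 5, "Solaris": 4},
    "cene": {"Dune": 1, "Solaris": 2, "Hyperion": 1},
}

def recommend(user, ratings, min_overlap=2):
    """Recommend items rated highly by the most similar other user."""
    me = ratings[user]
    best_user, best_score = None, None
    for other, theirs in ratings.items():
        if other == user:
            continue
        shared = me.keys() & theirs.keys()
        if len(shared) < min_overlap:
            continue
        # similarity: negative mean absolute rating difference on shared items
        score = -sum(abs(me[i] - theirs[i]) for i in shared) / len(shared)
        if best_score is None or score > best_score:
            best_user, best_score = other, score
    if best_user is None:
        return []
    # items the most similar user rated high that we have not rated yet
    return [i for i, r in ratings[best_user].items() if r >= 4 and i not in me]

print(recommend("bor", ratings))
```

Real systems work on much larger matrices and use better similarity measures (cosine, Pearson correlation), but the find-similar-users-then-borrow-their-ratings shape is the same.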

Another example is the information that can be gleaned from liking behaviour. There was a good article that showed how personality, gender and other attributes can be discovered through likes. People with different individual differences like different things, and this can be used to discover things about people.

But in order for this to happen, the data has to be in a sort of structured form. A blog is not exactly a structured form (unless it includes a lot of semantic web features). It is unstructured text, with maybe some videos and podcasts thrown in. Maybe a picture or more as well. This means that more work is needed in order to get these data into such a form (the search engines still do it). So not everybody will do it to everybody.

The other reason I will borrow is from Jaron Lanier's book titled Ten Arguments for Deleting Your Social Media Accounts Right Now, and this is the BUMMER principle. But BUMMER only makes sense if the companies can get some money out of it. On my blog, I have no advertisements and I doubt I ever will, so what would be the point of doing it?

Does this mean that there is no way people will abuse it? I mean, if a person is trying to target me directly, I am sure they will go through all the writings that I wrote, trying to find something about me. But I am not afraid of that. I just don't want to be just another entity in the database.

Which is why I don't mind sharing the info through the IndieWeb, which blogs are part of. And there is another plus from my side, and this is that the content is under my control. Nobody can delete the blogs but me. And even if the servers go down and the country blocks my webpage, I still have my backup. I don't have this on any website which I do not control. I had already lost some of my data, because a website stopped working. But here it depends only on whether I want to continue paying for the domain and hosting, and nothing else.