For a hobby programmer like me, Google is like the Bible for a fundamentalist. When they don't have an answer for something, they go and find it in the Bible. When I have a programming problem, I go and find the answer by googling it.
So I was wondering how many websites I usually have to visit to program something. I had a feeling that the number was quite high, but I did not know how high.
To figure that out, I decided to try programming a factor analysis in Python. I wanted to get the loadings of the variables on the different factors. I kept going until I had working code that could print the loadings in the terminal for analysis. I only counted the websites whose solutions ended up in the code. There were more, but they were dead ends, so I did not include them.
For the analysis I used the answers to the Big Five personality questionnaire that I found on http://personality-testing.info/_rawdata/.
This is the final code that I ended up with:
from sklearn.decomposition import FactorAnalysis
import numpy
from rpy2.robjects.packages import importr
from rpy2.robjects import r, numpy2ri

data = numpy.genfromtxt('data.csv', delimiter='\t')
for i in [6,5,4,3,2,1,0]:
    data = numpy.delete(data, i, 1)
data = numpy.delete(data, 0, 0)

numpy2ri.activate()
fit = r.factanal(data, 5, rotation="varimax")
results = fit.rx('loadings')
print(results)
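One detail worth noting in the deletion loop: it goes through the column indices from highest to lowest. That matters, because numpy.delete returns a new array in which every column after the deleted one shifts left by one. A minimal illustration (with a made-up toy array, not the questionnaire data):

```python
import numpy

# Toy 3x4 array standing in for the real data.
a = numpy.arange(12).reshape(3, 4)

# Delete columns 2 and 0. Going from the highest index to the lowest
# means deleting column 2 does not shift column 0's position.
for i in [2, 0]:
    a = numpy.delete(a, i, 1)

print(a.shape)  # (3, 2) - the original columns 1 and 3 remain
```

Looping in ascending order would instead delete the wrong columns, because each deletion renumbers everything to its right.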
I used 7 different websites to write these 12 lines of code, which means I needed to check one webpage for every 1.7 lines of code.
Here is the same code, with each website listed above the piece of code it was used for:
from sklearn.decomposition import FactorAnalysis
import numpy

#http://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy
data = numpy.genfromtxt('data.csv', delimiter='\t')

#http://stackoverflow.com/questions/24898754/delete-dimension-of-array
#http://docs.scipy.org/doc/numpy/reference/generated/numpy.delete.html
for i in [6,5,4,3,2,1,0]:
    data = numpy.delete(data, i, 1)
data = numpy.delete(data, 0, 0)

#http://stackoverflow.com/questions/25036588/extract-correlation-matrix-from-rs-factanal-via-rpy
from rpy2.robjects import r, numpy2ri
numpy2ri.activate()

#http://blog.yhat.com/posts/rpy2-combing-the-power-of-r-and-python.html
from rpy2.robjects.packages import importr

#http://www.statmethods.net/advstats/factor.html
fit = r.factanal(data, 5, rotation="varimax")

#http://stackoverflow.com/questions/27575848/how-to-convert-rpy2-listvector-rpy2-robjects-vectors-listvector-to-python
results = fit.rx('loadings')
print(results)
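Incidentally, the sklearn FactorAnalysis import never gets used: the actual fit is handed off to R's factanal through rpy2. A pure-Python sketch of the same idea with scikit-learn could look like the following. This is an assumption on my part, not the route the original code took, and it uses synthetic stand-in data rather than the questionnaire file (the rotation argument also needs scikit-learn 0.24 or newer):

```python
import numpy
from sklearn.decomposition import FactorAnalysis

# Synthetic stand-in for the questionnaire answers:
# 100 respondents, 10 items.
rng = numpy.random.RandomState(0)
data = rng.normal(size=(100, 10))

# Fit 5 factors with a varimax rotation, mirroring
# r.factanal(data, 5, rotation="varimax").
fa = FactorAnalysis(n_components=5, rotation="varimax")
fa.fit(data)

# components_ is (n_factors, n_variables); transpose so each row
# is one variable's loadings on the 5 factors.
loadings = fa.components_.T
print(loadings.shape)  # (10, 5)
```

Whether the loadings match R's exactly depends on the estimation method (factanal uses maximum likelihood), so this is a sketch of the approach rather than a drop-in replacement.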
I knew that I use the internet as a crutch a lot, but these results were still surprising. I don't really want to believe that I rely on it that much, but at least in this case the data says so. Maybe I should rethink my way of programming…