Processing text

Efficient String Concatenation in Python. An assessment of the performance of several methods. Introduction: Building long strings in the Python programming language can sometimes result in very slow running code. In this article I investigate the computational performance of various string concatenation methods. In Python the string object is immutable: a string cannot be changed in place, so each time a concatenation result is assigned to a variable, a new object is created in memory to represent the new value.
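As a rough illustration of the kind of comparison the article describes, here is a minimal sketch that builds one long string from a run of integers in two ways: naive appending with `+=`, and collecting the pieces and calling `str.join`. The function names and the problem size are assumptions made for this example, and `str.join` is simply a commonly used alternative, not necessarily one of the methods the author tested.

```python
import timeit


def concat_naive(n):
    """Build the string by repeated appending; each += may create a new string object."""
    out = ""
    for num in range(n):
        out += str(num)
    return out


def concat_join(n):
    """Convert every integer to a string first, then join the pieces in a single call."""
    return "".join(str(num) for num in range(n))


if __name__ == "__main__":
    n = 100_000  # assumed problem size, chosen only for illustration
    for fn in (concat_naive, concat_join):
        seconds = timeit.timeit(lambda: fn(n), number=3)
        print(f"{fn.__name__}: {seconds:.3f} s for 3 runs")
```

Timings depend heavily on the interpreter and version. CPython, for instance, can sometimes extend a string in place during `+=`, which narrows the gap, so it is worth measuring on the target system rather than assuming either approach wins.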

What other methods are available and how does their performance compare? For this comparison I required a test problem that calls for the construction of very long strings out of a large number of segments. The test case I used is to concatenate a series of integers from zero up to some large number. Although this particular test problem doesn't have any real world application that I can think of, it makes a good test case because it is easy to code and simple both conceptually and computationally. Six Methods: These are the methods that I tested. Method 1: Naive appending.

Deep learning with word2vec and gensim » RaRe Technologies. Radim • 17.9.2013 12:52 • 28 Comments. Neural networks have been a bit of a punching bag historically: neither particularly fast, nor robust or accurate, nor open to introspection by humans curious to gain insights from them.

But things have been changing lately, with deep learning becoming a hot topic in academia and producing spectacular results. I decided to check out one deep learning algorithm via gensim.

Word2vec: the good, the bad (and the fast). The kind folks at Google have recently published several new unsupervised, deep learning algorithms in this article. Selling point: “Our model can answer the query ‘give me a word like king, like woman, but unlike man’ with ‘queen’.” Pretty cool. Not only do these algorithms boast great performance, accuracy and a theoretically-not-so-well-founded-but-pragmatically-superior model (all three solid plusses in my book), but they were also devised by my fellow country- and county-man, Tomáš Mikolov from Brno!

Free, fast, pretty: pick any two.

Word2vec - Tool for computing continuous distributed representations of words.