
Development


Swift on Linux

GIS. Data Base. PostGIS. Geospatial. Tools, Services. Mobile Development. Apple. Programming.

Algorithms Every Data Scientist Should Know: Reservoir Sampling. Data scientists, that peculiar mix of software engineer and statistician, are notoriously difficult to interview. One approach that I’ve used over the years is to pose a problem that requires some mixture of algorithm design and probability theory in order to come up with an answer. Here’s an example of this type of question that has been popular in Silicon Valley for a number of years: Say you have a stream of items of large and unknown length that we can only iterate over once. Create an algorithm that randomly chooses an item from this stream such that each item is equally likely to be selected. The first thing to do when you find yourself confronted with such a question is to stay calm. Remember: Stay Calm. The second thing to do is to think deeply about the question. The third thing to do is to create a simple example problem that allows you to work through what should happen for several concrete instances of the problem.

A Primer on Reservoir Sampling.

Development.
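The stream question quoted above is usually answered with reservoir sampling using a reservoir of size one. The following is a minimal sketch of that technique, not the linked article's own code; the function name sample_one and the optional rng parameter are assumptions made for illustration. It keeps a single candidate and replaces it with the i-th item with probability 1/i, so the stream is read once and its length never needs to be known.

```python
import random
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def sample_one(stream: Iterable[T], rng: Optional[random.Random] = None) -> Optional[T]:
    """Return one item chosen uniformly at random from a single pass over stream.

    Returns None when the stream is empty. The name and signature are
    hypothetical, chosen for this sketch.
    """
    rng = rng or random.Random()
    chosen: Optional[T] = None
    for count, item in enumerate(stream, start=1):
        # randrange(count) is 0 with probability 1/count, so the item seen
        # at position `count` replaces the current choice with that probability.
        if rng.randrange(count) == 0:
            chosen = item
    return chosen


if __name__ == "__main__":
    # A generator stands in for a stream that can only be iterated once.
    print(sample_one(word for word in ["alpha", "beta", "gamma", "delta"]))
```

Working through a small concrete instance shows why this is uniform: with three items, the first is kept with probability 1 x 1/2 x 2/3 = 1/3, the second with 1/2 x 2/3 = 1/3, and the third is picked with probability 1/3. In general, item i is picked with probability 1/i and then survives every later step, which multiplies out to 1/n.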

Quality

Bug Tracking. System Dashboard. Version Control. Continuous Integration. Continuous Integration for Everybody — TeamCity. Hudson Continuous Integration. Welcome to Jenkins CI! | Jenkins CI. Code analysis. Data Science. Database. NoSQL. Graph. Key Value. Relational. Profiling. Build. Programming. Cross Platform. .NET. Java. Frameworks. Testing. Mock. IoC. iOS. IDE / Editors. Networking. Services.

Web. Web Servers. Security.