
Deep Learning Weekly


The Illustrated Word2vec – Jay Alammar – Visualizing machine learning one concept at a time. “There is in all things a pattern that is part of our universe. It has symmetry, elegance, and grace - those qualities you find always in that which the true artist captures. You can find it in the turning of the seasons, in the way sand trails along a ridge, in the branch clusters of the creosote bush or the pattern of its leaves. We try to copy these patterns in our lives and our society, seeking the rhythms, the dances, the forms that comfort.” I find the concept of embeddings to be one of the most fascinating ideas in machine learning. Word2vec is a method to efficiently create word embeddings and has been around since 2013. In this post, we’ll go over the concept of embedding, and the mechanics of generating embeddings with word2vec. On a scale of 0 to 100, how introverted/extraverted are you (where 0 is the most introverted, and 100 is the most extraverted)?
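That question sets up the post's running example: represent a person as a vector of trait scores, and compare people by comparing their vectors. A minimal numpy sketch of the idea (my own illustration; the trait values are made up):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical "personality embeddings": each person is a vector of trait
# scores scaled to [-1, 1] (introversion/extraversion, agreeableness, ...).
jay      = np.array([-0.4, 0.8, 0.5, -0.2, 0.3])
person_a = np.array([-0.3, 0.2, 0.3, -0.4, 0.9])
person_b = np.array([-0.5, 0.6, 0.6, -0.1, 0.4])

# The person whose vector has the higher cosine similarity is "more similar".
print(cosine_similarity(jay, person_a))  # ~0.66
print(cosine_similarity(jay, person_b))  # ~0.97
```

Word2vec applies the same trick to words: each word gets a learned vector, and nearby vectors mean related words.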

Towards Robust and Verified AI: Specification Testing, Robust Training, and Formal Verification. This is not an entirely new problem. Computer programs have always had bugs. Over decades, software engineers have assembled an impressive toolkit of techniques, ranging from unit testing to formal verification. These methods work well on traditional software, but adapting them to rigorously test machine learning models such as neural networks is extremely challenging because of the scale and lack of structure of these models, which may contain hundreds of millions of parameters. This creates a need for novel approaches to ensuring that machine learning systems are robust at deployment.

From a programmer’s perspective, a bug is any behaviour that is inconsistent with the specification, i.e. the intended functionality, of a system. Testing consistency with specifications efficiently: we explore efficient ways to test that machine learning systems are consistent with properties (such as invariance or robustness) desired by the designer and users of the system.
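As a toy version of such a specification test (my sketch, not DeepMind's method): check that a classifier's predicted class is invariant under small random input perturbations.

```python
import numpy as np

def test_perturbation_invariance(predict, x, epsilon=0.01, n_trials=100, seed=0):
    """Toy specification test: the predicted class should not change under
    random perturbations of size at most `epsilon`. `predict` maps an input
    array to a vector of class scores."""
    rng = np.random.default_rng(seed)
    baseline = np.argmax(predict(x))
    failures = 0
    for _ in range(n_trials):
        noise = rng.uniform(-epsilon, epsilon, size=x.shape)
        if np.argmax(predict(x + noise)) != baseline:
            failures += 1
    return failures  # 0 means the invariance property held on every trial
```

Random sampling is a weak test; searching for worst-case perturbations, as adversarial testing and formal verification do, is much stronger.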

Checklist for debugging neural networks.

Transformers.

Using deep learning to “read your thoughts” — with Keras and an EEG sensor. A typical technique such as this involves: a) converting to the frequency domain, then b) filtering and weighting based on some prior understanding of which features are important. But from math, we also know that filtering in the frequency domain is equivalent to convolution in the time domain. What I’m leading to is that, with the stacked convolutional layers of a CNN, we can a) perform the same feature-recognition operations directly on the time-series data without having to convert to the frequency domain, and b) have the network learn the filters that best identify these features itself.
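To make that concrete, here is a rough sketch (mine, not the post's actual architecture; window length and class count are assumptions) of stacked 1D convolutions applied directly to raw EEG samples:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical shapes: 512 raw EEG samples per window, 1 channel, 3 classes.
model = keras.Sequential([
    # Each Conv1D kernel acts as a learned time-domain filter.
    layers.Conv1D(16, kernel_size=7, activation="relu", input_shape=(512, 1)),
    layers.MaxPooling1D(4),
    layers.Conv1D(32, kernel_size=5, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```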

This does still require careful network architecture design and sufficient training data, or even pre-training, but by doing this we can take away the manual process of finding the most informative spectra and waveforms, letting the neural network learn and optimize these filters itself. To start, we can load some sessions and then shuffle and split them into our training and test data.
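A plausible version of that loading-and-splitting step (hypothetical shapes and split ratio; the post's actual pipeline may differ):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for real recording sessions: 100 EEG windows of 512 samples each.
X = np.random.randn(100, 512, 1)
y = np.random.randint(0, 3, size=100)  # hypothetical 3 thought classes

# Shuffle and split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42)
```

One caveat worth knowing: when windows from the same session overlap, splitting at the session level avoids leakage between the training and test sets.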

Launching TensorFlow Lite for Microcontrollers. I’ve been spending a lot of my time over the last year working on getting machine learning running on microcontrollers, and so it was great to finally start talking about it in public for the first time today at the TensorFlow Developer Summit. Even better, I was able to demonstrate TensorFlow Lite running on a Cortex M4 developer board, handling simple speech keyword recognition. I was nervous, especially with the noise of the auditorium to contend with, but I managed to get the little yellow LED to blink in response to my command! If you’re interested in trying it for yourself, the board is available for $15 from SparkFun with the sample code preloaded. For anyone who didn’t catch it, here are the notes from my talk. Hi, I’m Pete Warden on the TensorFlow Lite team, and I’m here to talk about a new project we’re pretty excited about. I was even more amazed when he told me why these models had to be so small.

“Yes.” So why is this useful?

Data Science foundations: Know your data. Really, really, know it.

Beyond Interactive: Notebook Innovation at Netflix – Netflix TechBlog. By Michelle Ufford, M Pacer, Matthew Seal, and Kyle Kelley. Notebooks have rapidly grown in popularity among data scientists to become the de facto standard for quick prototyping and exploratory analysis. At Netflix, we’re pushing the boundaries even further, reimagining what a notebook can be, who can use it, and what they can do with it.

And we’re making big investments to help make this vision a reality. In this post, we’ll share our motivations and why we find Jupyter notebooks so compelling. We’ll also introduce components of our notebook infrastructure and explore some of the novel ways we’re using notebooks at Netflix. If you’re short on time, we suggest jumping down to the Use Cases section. Motivations: Data powers Netflix. Making this possible is no small feat; it requires extensive engineering and infrastructure support. Generally, each role relies on a different set of tools and languages. To help our users scale, we want to make these tasks as effortless as possible.

R TensorFlow API.

Your AI skills are worth less than you think – Inside Inovo. We are in the middle of an AI boom. Machine Learning experts command extraordinary salaries, and investors are happy to open their hearts and checkbooks when meeting AI startups. And rightly so: this is one of those transformational technologies that occur once per generation.

The tech is here to stay, and it will change our lives. That doesn’t mean that making your AI startup succeed is easy. I think there are some important pitfalls ahead of anyone trying to build their business around AI. The value of your AI skills is declining: in 2015 I was still at Google and started playing with DistBelief (which would later be renamed TensorFlow). In late 2016 I was working on a proof of concept to detect breast cancer in histopathological images. By early 2018 the task from above wasn’t suitable even as an intern’s first project, because it had become too simple.

I hope that you see the pattern here. Data is more important than fancy AI architectures; my money would be squarely on Bob.

Looking Back at Google’s Research Efforts in 2018. Posted by Jeff Dean, Senior Fellow and Google AI Lead, on behalf of the entire Google Research Community. 2018 was an exciting year for Google’s research teams, with our work advancing technology in many ways: fundamental computer science research results and publications, the application of our research to emerging areas new to Google (such as healthcare and robotics), open-source software contributions, and strong collaborations with Google product teams, all aimed at providing useful tools and services.

Below, we highlight just some of our efforts from 2018, and we look forward to what will come in the new year. For a more comprehensive look, please see our publications in 2018. Ethical Principles and AI: Over the past few years, we have observed major advances in AI and the positive impact it can have on our products and the everyday lives of our billions of users. Looking Forward to 2019: This blog post summarizes just a small fraction of the research performed in 2018.

How big data has created a big crisis in science. There’s an increasing concern among scholars that, in many areas of science, famous published results tend to be impossible to reproduce.

This crisis can be severe. For example, in 2011, Bayer HealthCare reviewed 67 in-house projects and found that they could replicate less than 25 percent. Furthermore, over two-thirds of the projects had major inconsistencies. More recently, in November, an investigation of 28 major psychology papers found that only half could be replicated. Similar findings are reported across other fields, including medicine and economics.

What is causing this big problem? Scientific method: In a classical experiment, the statistician and scientist first frame a hypothesis together. A famous example of this process is the “lady tasting tea” story, in which a lady claimed she could tell whether the milk had been added to a cup before or after the tea. Such an experiment was done with eight cups of tea sent to the lady in a random order – and, according to legend, she categorized all eight correctly. Data problems: Why can this reversal – collecting data first and framing hypotheses afterwards – cause a big problem?
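The classical design makes the arithmetic behind that legend easy to check (Fisher's standard setup, a detail the excerpt omits, uses four milk-first and four tea-first cups):

```python
from math import comb

# Fisher's "lady tasting tea": 8 cups, 4 with milk poured first and 4 with
# tea poured first, presented in random order; the lady must pick out the
# 4 milk-first cups. Under pure guessing, all C(8, 4) choices are equally likely.
p_all_correct = 1 / comb(8, 4)
print(p_all_correct)  # 1/70, about 0.014: unlikely enough to credit her claim
```

The point of the hypothesis-first design is that this probability is fixed before any data are seen.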

Introduction to Machine Learning for Coders: Launch. Written: 26 Sep 2018 by Jeremy Howard. Today we’re launching our newest (and biggest!) course, Introduction to Machine Learning for Coders. The course, recorded at the University of San Francisco as part of the Masters of Science in Data Science curriculum, covers the most important practical foundations for modern machine learning. There are 12 lessons, each of which is around two hours long; a list of all the lessons, along with a screenshot from each, is at the end of this post. They are all taught by me (Jeremy Howard); I’ve been studying and using machine learning for over 25 years, from when I started my career as an Analytical Specialist at McKinsey & Company, through to my time as President and Chief Scientist of Kaggle and founding CEO of Enlitic. There are some excellent machine learning courses already, most notably the wonderful Coursera course from Andrew Ng. Among the lessons:

Lesson 1 - Introduction to Random Forests
Lesson 2 - Random Forest Deep Dive
Lesson 5 - Extrapolation and RF from Scratch

How to Develop a Deep Learning Caption Generation Model in Python from Scratch. Last Updated on December 23, 2020. Develop a deep learning model to automatically describe photographs in Python with Keras, step by step. Caption generation is a challenging artificial intelligence problem where a textual description must be generated for a given photograph.

It requires both methods from computer vision to understand the content of the image and a language model from the field of natural language processing to turn that understanding into words in the right order. Recently, deep learning methods have achieved state-of-the-art results on this problem. What is most impressive about these methods is that a single end-to-end model can be defined to predict a caption, given a photo, instead of requiring sophisticated data preparation or a pipeline of specifically designed models.
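To make "single end-to-end model" concrete, here is a rough Keras sketch of the merge-style architecture such tutorials use (the sizes are illustrative assumptions, not necessarily the tutorial's exact model):

```python
from tensorflow.keras import layers, models

vocab_size, max_length = 7579, 34  # hypothetical vocabulary and caption lengths

# Photo branch: pre-extracted CNN features (e.g. a 4096-d VGG16 vector).
inputs1 = layers.Input(shape=(4096,))
fe = layers.Dropout(0.5)(inputs1)
fe = layers.Dense(256, activation="relu")(fe)

# Text branch: the caption generated so far, as a padded word-index sequence.
inputs2 = layers.Input(shape=(max_length,))
se = layers.Embedding(vocab_size, 256, mask_zero=True)(inputs2)
se = layers.Dropout(0.5)(se)
se = layers.LSTM(256)(se)

# Merge both representations and predict the next word of the caption.
decoder = layers.add([fe, se])
decoder = layers.Dense(256, activation="relu")(decoder)
outputs = layers.Dense(vocab_size, activation="softmax")(decoder)

model = models.Model(inputs=[inputs1, inputs2], outputs=outputs)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

At inference time the model is applied repeatedly, feeding each predicted word back into the text branch until an end-of-sequence token appears.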

Let’s get started.

Interactive Machine Learning List.

Troubleshooting Convolutional Neural Nets.

A Project Based Introduction to TensorFlow.js – Knowledge-Exploration Systems. In this post we introduce how to use TensorFlow.js by demonstrating how it was used in the simple project Neural Titanic. In this project, we show how to visualize the evolution of the predictions of a single-layer neural network as it is being trained on the tabular Titanic dataset for the task of binary classification of passenger survival. Here’s a look at what we will produce and a link to the live demo: Live Demo. This article assumes that you have a basic understanding of modern frontend JavaScript development and a general awareness of basic machine learning topics. Please let me know in the comments if you have any questions.
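The project itself is written in JavaScript, but the model at its heart is tiny. An equivalent single-layer binary classifier in Python/Keras (the feature count is my assumption, not the project's exact encoding) would be:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Titanic-style tabular input: say 6 numeric features after encoding
# (pclass, sex, age, fare, siblings/spouses, parents/children).
model = keras.Sequential([
    layers.Dense(1, activation="sigmoid", input_shape=(6,)),  # the single layer
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

The post's visualization simply re-plots this model's per-passenger survival probabilities after every training step.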

Outline: Neural Networks; Dataset and Modeling Overview; Project Setup; The Code (Index.js).

Neural networks are finally getting their rightfully deserved day in the sun after many years of development by a rather small research community spearheaded by the likes of Geoffrey Hinton, Yoshua Bengio, Andrew Ng, and Yann LeCun.

Applications of Reinforcement Learning in Real World.

Deep Learning Tips and Tricks.

1802.08195.

How to build your own AlphaZero AI using Python and Keras. Connect4: The game that our algorithm will learn to play is Connect4 (or Four In A Row). Not quite as complex as Go… but there are still 4,531,985,219,092 game positions in total. The game rules are straightforward. Players take it in turns to enter a piece of their colour in the top of any available column.

Here’s a summary of the key files that make up the codebase: game.py – this file contains the game rules for Connect4. Each square is allocated a number from 0 to 41. The game.py file gives the logic behind moving from one game state to another, given a chosen action. You can replace the game.py file with any game file that conforms to the same API, and the algorithm will, in principle, learn strategy through self-play based on the rules you have given it. run.ipynb – this contains the code that starts the learning process.
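The original post shows the 0–41 numbering as a board diagram that didn't survive here. A sketch of the layout and a move helper (the row-major numbering and the logic are my assumptions, not the repo's actual game.py):

```python
import numpy as np

ROWS, COLS = 6, 7  # a standard Connect4 board: 42 squares, numbered 0..41

# Assumed row-major numbering: square 0 at the top-left, 41 at the bottom-right.
print(np.arange(ROWS * COLS).reshape(ROWS, COLS))

def drop_piece(board, column, player):
    """Place player's piece (+1 or -1) in the lowest empty square of a column.
    `board` is a flat length-42 array of {-1, 0, +1}; returns the square index.
    Illustrative logic only."""
    for square in range((ROWS - 1) * COLS + column, -1, -COLS):
        if board[square] == 0:
            board[square] = player
            return square
    raise ValueError("column is full")

board = np.zeros(ROWS * COLS, dtype=int)
print(drop_piece(board, column=3, player=1))  # 38: bottom square of column 3
```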

The loop involves self-play, retraining the neural network, and evaluating the neural network. There are two agents involved in this loop, the best_player and the current_player.

Embrace Randomness in Machine Learning. Why Do You Get Different Results On Different Runs Of An Algorithm With The Same Data? Applied machine learning is a tapestry of breakthroughs and mindset shifts. Understanding the role of randomness in machine learning algorithms is one of those breakthroughs. Once you get it, you will see things differently, in a whole new light: things like choosing between one algorithm and another, hyperparameter tuning, and reporting results. You will also start to see the abuses everywhere. In this post, I want to gently open your eyes to the role of random numbers in machine learning.
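One concrete habit this mindset leads to (my illustration, not the post's code): treat a stochastic model's score as a distribution over random seeds rather than a single number.

```python
import numpy as np

def evaluate_over_seeds(train_and_score, n_runs=30):
    """Run a stochastic training pipeline under several seeds and summarize.
    `train_and_score(seed) -> float` is a hypothetical stand-in for your own
    training-plus-evaluation function."""
    scores = np.array([train_and_score(seed) for seed in range(n_runs)])
    return scores.mean(), scores.std()

# Reporting "0.87 +/- 0.02 over 30 runs" says far more than one lucky run.
```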

Let’s dive in. (Special thanks to Xu Zhang and Nil Fero, who promoted this post.) Why Are Results Different With The Same Data? A lot of people ask this question, or variants of it. You are not alone! I get an email along these lines once per week.

Mltest: Automatically test neural network models in one function call. So I got a lot of positive feedback on my last post on how to unit test machine learning code. A few people actually messaged me directly saying they caught a bug in their own code with the recommended tests, which is awesome!

But these issues are still too common, and it is just as easy to forget to write a test as it is to write the bug in the first place. We need a better, more automated solution. That is why we are introducing mltest: automated ML testing in one function call. Check it out! With incredibly little setup, we are now testing against several different common machine learning issues.
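For flavour, here is a sketch of what a test using the library might look like (TensorFlow 1.x-era graph code; the exact mltest signatures are assumptions from memory of the project's README, so check it before copying):

```python
import numpy as np
import tensorflow as tf  # mltest targets TF1-style graphs
import mltest

def test_my_model():
    mltest.setup()  # assumed helper: resets the default graph and seeds RNGs
    x = tf.placeholder(tf.float32, (None, 4))
    labels = tf.placeholder(tf.int64, (None,))
    logits = tf.layers.dense(x, 2)
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels,
                                                       logits=logits))
    train_op = tf.train.AdamOptimizer().minimize(loss)
    feed = {x: np.random.randn(8, 4), labels: np.zeros(8, dtype=np.int64)}
    # One call bundles the common checks: variables change after a train step,
    # the loss is never NaN/Inf, inputs actually affect outputs, and so on.
    mltest.test_suite(logits, train_op, feed_dict=feed)
```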

To install it, just run:

The function call mltest.test_suite(…) is the main powerhouse of this library.

1. The test from my first post that helped people the most was the variables-change test.
2. It is also possible to make sure that only the variables within a scope or a list are the ones that change, and that the rest do not change.