Why Tensorflow is Awesome for Machine Learning
Machine learning and deep learning have exploded in both growth and workflows in the past year. When I first started out with machine learning, the process was still somewhat limited, as were the frameworks. Data scientists would configure and tune models on a local machine, only to have to recreate the work when pushing to production. This process was extremely time-consuming. Google and the Google Brain team released Tensorflow in 2015. Find out why Tensorflow is awesome for Machine Learning in the video below.
Transcript – Why Tensorflow is Awesome for Machine Learning
Hi folks, welcome back to another episode of Big Data Big Questions. Today, I’m going to tackle a question around Tensorflow. I wanted to give you my feedback. I’ve been diving into Tensorflow and looking into how you set it up, and then actually playing around with it, and seeing how it differs from some of the other machine learning libraries and things that I’ve used in the past like Mahout and MADlib, and some other things.
I wanted to give you my take on Tensorflow, tell you why I think it’s great. Tell you how you can get hands-on with it, and just give you some background on it. Find out more, right after this.
Welcome back. Before we jump into my thoughts on Tensorflow, I did want to encourage you to make sure you subscribe to the YouTube channel here, and then also, if you have any questions, go ahead. Send them in. You can go to my website, thomashenson.com/big-questions.
I will answer any of your questions there. You can put them in the YouTube comments section here below, or you can use the hashtag #bigdatabigquestions on Twitter. I will answer those as quickly as I can. Thank you everybody for subscribing. Now, let’s talk a little bit about Tensorflow.
I’ve been going down more of the deep learning path. I’ve been doing some research and some learning on my own. One of the first things that I’ve started really diving into is Tensorflow. I wanted to look at Tensorflow because of my background. When I first started out in the Hadoop ecosystem, we weren’t really doing streaming, so probably, you’ve heard me talk a little bit about the Kappa and the Lambda architecture here. Make sure you check those videos out. One of the things that we did use back when we were just using batch, more of a Lambda architecture workflow, is Mahout. I used Mahout a good bit.
We used Mahout, and I used SVD. I wanted to see how Tensorflow differed, because a lot of people are talking about Tensorflow, like, “Hey, you know, a lot of training.” There’s a lot of training out there. There’s a lot of YouTube videos out there, and there’s just a lot of excitement for Tensorflow. Me, wanting to dig in, I looked in, and I started playing around with it.
One of the first things that I really noticed, and one of the things that I really liked about Tensorflow, was the fact that when we think about using Mahout or using some of the other, older algorithms, one of the problems that I had was, we had our data scientists, and they would look at the data, and they would play around, and figure out what they wanted from their data model, exactly what algorithms they were going to use.
A lot of times, they were coming, and we were still new to this, but they were coming in from using things on their machine. They were using MATLAB, or Octave, or some of you have been using Excel. Then you go, and you say, “Hey, man, well, you know, I had this, and we were looking at this little sample of data. Now, let’s scale it out to, you know, terabytes and terabytes of data. I want to see how this is going to work.”
Those algorithms are totally different. What you can run on your local machine, and the way that those are processed, is totally different than the way Mahout does it, or the way MADlib, or MLlib, or any of the distributed machine learning algorithms do it. You didn’t lose all the work that you did there, but there were a lot of new steps that you had to go through. With Tensorflow, the thing is, you can run it on your local machine. You don’t have to have a distributed environment, but those are the same processes and the same way the algorithm works when it runs on your huge cluster.
Just think about it like this. To do Tensorflow, you don’t have to set up a distributed network. It’s not going to time out. It’s not going to go fast on your single machine. If you’re trying to turn over a terabyte of data that you’ve got on your laptop, have at it, but it’s not going to be as efficient as what you’d set up in your data center.
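To make that idea a little more concrete, here is a minimal sketch in plain Python. To be clear, this is not Tensorflow code and it is not how Tensorflow is implemented. It only illustrates one common way to spread training across machines (splitting the data into equal shards and averaging the gradients), and every name in it (`gradient`, `local_step`, `sharded_step`, the toy data) is made up for the example. The point is that the training logic is identical whether one machine sees all the data or several pretend workers each see a slice.

```python
# Plain-Python sketch (NOT Tensorflow API): fit y = w * x with gradient
# descent on mean squared error. All names here are hypothetical.

def gradient(w, xs, ys):
    """Gradient of mean squared error for y = w * x over one batch."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def local_step(w, xs, ys, lr):
    """Single-machine update: one gradient over all the data."""
    return w - lr * gradient(w, xs, ys)


def sharded_step(w, shards, lr):
    """'Cluster' update: each pretend worker computes a gradient on its
    own equal-sized shard, then the gradients are averaged."""
    grads = [gradient(w, xs, ys) for xs, ys in shards]
    return w - lr * sum(grads) / len(grads)


def train(step, w, steps):
    """Apply the same update function repeatedly, starting from w."""
    for _ in range(steps):
        w = step(w)
    return w


# Toy data generated from y = 3 * x (an assumption for the example).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]
shards = [(xs[:2], ys[:2]), (xs[2:], ys[2:])]  # two pretend workers

w_local = train(lambda w: local_step(w, xs, ys, lr=0.01), 0.0, 200)
w_sharded = train(lambda w: sharded_step(w, shards, lr=0.01), 0.0, 200)

print(w_local, w_sharded)
```

Both runs should land on essentially the same weight, close to 3. With equal-sized shards, the average of the per-shard gradients is the same as the gradient over the full data set, which is why the local version and the sharded version follow the same path.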
The cool thing is, when you’re doing sampling, and you’re doing testing, you can do that locally. You can do that on your local machine. Then, when it comes time to test it, you’re really just porting, because you can use Docker and some other cool tools on the back-end, to be able to just expand that into your data center.
I thought that was really cool. A little bit about Tensorflow. Tensorflow was incubated out of Google. If you’re interested in it, I would encourage you to… I’ll put this link in the show notes on my website, but I would check out the research paper, TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. It goes into some of the research behind it, and why Tensorflow, why now?
I’m really, really heavy into it, and I know sometimes these research papers, or most of the time for me, the research papers are kind of over your head. First time you read it, you might be like, “I don’t really understand it,” but then the second time, and as you see it more, it’s going to help you. That’s my little tip about research papers, just go ahead, read them, become familiar with them. It’s okay that you don’t understand it, because it means that you’re actually learning.
Also, a lot of resources out there for Tensorflow. There’s a website that you can go to, and you can start playing around with how these neural networks go to work in Tensorflow, and different parameters you can play with, and it just gives you a visualization for how it’s going to identify image data, and then, be able to use Tensorflow in your own environment. I would encourage you to use the website, to go ahead and play, and look for all the stuff in the show notes here.
Until then, that’s all I wanted to talk about today on Tensorflow, but until the next time, I will see you on Big Data Big Questions. Thank you.