Deep Learning Archives - Thomas Henson

5 Things Every Data Team Should Know About Transfer Learning

January 12, 2021 by Thomas Henson

Did you know there is a technique in Deep Learning (DL) that doesn’t require large data sets and extremely long training times? It’s called Transfer Learning, and in fact, if you have done any “Hello World” image detection examples or followed my TensorFlow TFLearn course, you have already used it. Data Teams with Data Engineers and Data Scientists should know Transfer Learning. Let us jump into understanding Transfer Learning.

What is Transfer Learning

Transfer Learning is a Machine Learning (ML) technique that focuses on storing knowledge gained from one problem and applying it to another related problem. Data Scientists start by building one model, then use that same model as the starting point for a new model. Typically the second model addresses a related problem, but not always. For example, let us take a model that was built by our friend Dwight to detect images of bears. Dwight has used that model to try to figure out how to identify the best bear. Part of that model does image detection: it identifies a bear.

Dwight can now share his model with his best friend Jim, who wants to build a model to detect dogs. Since Dwight’s model has already been pre-trained, Jim can reduce his training time.

Transfer Learning speeds up time to results (does not guarantee results😊)

The second thing you need to know about Transfer Learning is that it speeds up time to results. Think of Transfer Learning as a framework in programming languages. When I was a Web Developer in the .NET community, I could build features within my Web Application quicker using .NET functions already built in. For example, connecting to a SQL database could be done using the built-in ConnectionString property. The complicated details of building that connection to SQL Server were abstracted away from me.

Using Transfer Learning, Data Teams are not starting from scratch, which allows models to be built and trained faster. Just as frameworks abstract away complexity, Transfer Learning lets developers focus on solving higher-level problems. In our bear detector example, our friend Dwight has already done the hard work of building an image detector. Now Jim can change a few lines of code and build a new model.
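To make the "change a few lines" idea concrete, here is a minimal sketch in plain Python. It is not Dwight's or Jim's real model and it does not use TensorFlow; the `PRETRAINED_WEIGHTS`, the tiny data set, and all function names are made up for illustration. The point it tries to show is the usual Transfer Learning pattern: keep the already-trained layer frozen and train only a small new output layer on a handful of examples.

```python
import math

# Hypothetical "pretrained" layer: weights standing in for what an existing
# model already learned. They are frozen and never updated below.
PRETRAINED_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]


def extract_features(x):
    """Frozen feature extractor: a linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only the new output layer (the head) on top of frozen features."""
    head_w = [0.0, 0.0]
    head_b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = extract_features(x)
            pred = sigmoid(sum(w * f for w, f in zip(head_w, feats)) + head_b)
            error = pred - y  # gradient of the log loss with respect to the logit
            head_w = [w - lr * error * f for w, f in zip(head_w, feats)]
            head_b -= lr * error
    return head_w, head_b


def predict(x, head_w, head_b):
    feats = extract_features(x)
    return sigmoid(sum(w * f for w, f in zip(head_w, feats)) + head_b)


# A tiny made-up data set for the new task: label 1 when the first value is larger.
samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]
head_w, head_b = train_head(samples, labels)
```

Only `head_w` and `head_b` change during training, which is why four examples are enough here. In a real framework the frozen part would be a large pretrained network rather than a two-by-two weight list.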

Transfer Learning for Data Reduction

When we think of Deep Learning, large data sets are what come to mind. Transfer Learning allows Data Scientists to use smaller data sets to train models. By utilizing a model already built for one task, the model can be retooled to solve a different problem. Take our previous example of an image detector for bears: how much data would need to be applied to create a new model to identify dogs? How about the Jetson Nano thumbs up or down project?

One area being impacted by Transfer Learning is Healthcare. Pretrained models are a huge help in building Healthcare models. For example, let us say there is a specific lung image detection model that is trained 80% of the way; this is called a pretrained model. Data Teams can apply this model to their problem and train it the remaining 20% of the way. Imagine one model trained to detect scar tissue being used to detect other complex lung issues like Pneumothorax, Cancer, COPD, and more.

Most Computer Vision Already Incorporates Transfer Learning

For many of the reasons we have already discussed, object detection already incorporates Transfer Learning. Edge detection, for instance, is already built in. An edge is a sharp contrast in an image. For example, below is a photo of Jim from The Office; notice where his brown tie meets his yellow shirt? That would be an edge. TensorFlow and other Deep Learning frameworks come with functions ready to do object detection. Those functions already incorporate models that can detect edges in images.
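As a rough illustration of what "an edge is a sharp contrast" means, the sketch below scans a tiny grayscale grid and reports where neighboring pixels differ sharply. The pixel values, the threshold, and the function name are assumptions made for this example. Pretrained vision models learn their edge filters from data rather than using a hand-written rule like this one.

```python
def horizontal_edges(image, threshold=50):
    """Return (row, col) positions where brightness jumps between neighbors."""
    edges = []
    for r, row in enumerate(image):
        for c in range(len(row) - 1):
            # A large difference between adjacent pixels marks an edge.
            if abs(row[c + 1] - row[c]) >= threshold:
                edges.append((r, c))
    return edges


# Made-up grayscale values: a dark "tie" (60) against a bright "shirt" (220).
image = [
    [220, 220, 60, 60, 220, 220],
    [220, 220, 60, 60, 220, 220],
]
print(horizontal_edges(image))  # prints [(0, 1), (0, 3), (1, 1), (1, 3)]
```

The two reported columns per row are the left and right borders of the tie, the same kind of boundary the tie and shirt example describes.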

One example is the Jetson Nano Getting Started project, where you can build a model to detect Thumbs Up or Thumbs Down. Out of the box we just use the pretrained model and add our data. For this model we are adding our own images of thumbs up and thumbs down. Using Transfer Learning allows Jetson Nano users to quickly build an image detector with minimal coding and data.

NVIDIA has a Transfer Learning Toolkit in its 2nd Generation

We all know here at Big Data Big Questions we love the NVIDIA team. At NVIDIA’s GPU Cloud, or NGC, they have a catalog of Deep Learning frameworks like the ones we have just talked about, whether you are looking to train a model for healthcare with their Clara framework or for Natural Language Processing (NLP) with BERT. Many of these models come pretrained so you can apply your data to solve your problem. Here is the NVIDIA official statement on the NVIDIA Transfer Learning Toolkit:

To enable faster and accurate AI training, NVIDIA just released highly accurate, purpose-built, pretrained models with the NVIDIA Transfer Learning Toolkit (TLT) 2.0. You can use these custom models as the starting point to train with a smaller dataset and reduce training time significantly. These purpose-built AI models can either be used as-is, if the classes of objects match your requirements and the accuracy on your dataset is adequate, or easily adapted to similar domains or use cases.

By using NVIDIA’s TLT 2.0, data teams can reduce development time by up to 10X. Even cutting development time in half is a huge game changer for AI development.

Wrapping Up Transfer Learning

Transfer Learning is a powerful technique within Deep Learning for helping put models into production faster and with smaller data sets. The key application of Transfer Learning is building off previous training, just like we do as humans. The first time I learned to program with Java was hard! Object-Oriented programming was new to me. However, over time I got better, and when I switched to C# it was a lot easier to take in the concepts and learn. See, I was building off my previous training in Java to learn C#.

Want More Data Engineering Tips?

Sign up for the Big Data Big Questions newsletter to be sure you never miss a post or YouTube episode of Big Data Big Questions, where I answer questions from the community about Data Engineering.

O’Reilly AI Conference London 2019

October 9, 2019 by Thomas Henson

The Big Data Big Questions show is heading to London for the O’Reilly AI Conference October 15 – 17 2019. I’m excited to be a part of the O’Reilly AI Conference series. In fact, this will be my third O’Reilly AI conference in the past year. Let’s look back at those events and forward to London.

San Jose & New York

Late night packing my conference gear for my trip to O’Reilly AI Conference this week. Most important items: 1️⃣ Stickers 2️⃣ 🎧 3️⃣ 💻 4️⃣ Bandages? (I’ll explain later) 5️⃣ 📚 (this weeks its my Neural Networking) What’s your list of must have gear for tech conferences? #programming #coding #AI #conference #techconference

A post shared by Thomas Henson (@thomas_henson) on Sep 5, 2018 at 5:09am PDT

First, in 2018 I attended the San Jose conference, where I spent a good portion of the time in the Dell EMC booth talking with Data Engineers and Data Scientists. One of the major themes I heard from Data professionals was that they were attending to learn how to incorporate TensorFlow into their workflows. In my opinion TensorFlow was talked about in every aspect of the conference. We had a blast learning from attendees and discussing how to scale Deep Learning workloads. Also, this was my first time attending a conference with 14 stitches in my left hand (trouble on the pull up bar)!

Next was O’Reilly AI New York. Forever this conference will be known in my head as the Sophia the Robot trip. During this conference I worked with Sophia the Robot not only at the conference but also at a Dell EMC event at Times Square Studios (part of the Dell Technologies Magic of AI Series). Before the Magic of AI event, Sophia and I spent the day recording with O’Reilly TV about the current state of AI and what’s driving the widespread adoption. After a day of recording, I had a keynote for day two of the O’Reilly AI Conference where I discussed how AI is already impacting future generations. Then there was a whirlwind of activity as Sophia the Robot took questions at the Dell Technologies booth. The last thing of the day was the Magic of AI event at Times Square Studios, where we had 100 people taking part in a question and answer session with Sophia the Robot.

Keynote O’Reilly AI Conference New York

Coffee with Sophia the Robot

https://youtu.be/KbBvdoUOpmY

On To London

Next up is O’Reilly AI London. To say I’m excited is an understatement. During this trip I will accomplish many first-time moments.

To begin with, it’s my first international conference along with my first time in London. So many things to see and so little time to do it. Feel free to give me suggestions about places to visit in the comment section below.

Second, at O’Reilly AI London I will give my first breakout session at an O’Reilly Conference. While I’ve been on O’Reilly TV and given a keynote, I’ve yet to have a breakout session. My session is titled AI Growing Pains: Platform Considerations for Moving from POC to Large-Scale Deployments. The world is changing to innovate and incorporate Artificial Intelligence in many applications and services. However, with all this excitement many Data Engineers are still struggling with how to get projects past the Proof-of-Concept (POC) phase and into Production. Production environments present a list of challenges. The 3 biggest challenges I see when moving from POC to Production are the following:

The gravity of data is just as real as the gravity in the physical world. As Deep Learning workloads continue to grow, so does the amount of data stored to train these models. The data has gravity that will attract services and applications to the data. The trouble here is making sure you have the correct data pipeline strategy in place.
Once I had dinner with one of the Co-founders of Hortonworks, during which he said, “Everything at scale is exponentially harder.” Have you ever moved around photos on your desktop? For the most part this is an easy task, except when you accidentally move a large set of photos. Instantly after moving these large folders you are endlessly waiting for the hourglass to finish. Imagine doing this with 10 PBs of data. I think you get the picture here.
The talent pool today compared to the early days of “Big Data” is much larger. However, the demand for skills in Deep Learning, Machine Learning, and Data Engineering is stressing the system. That still leaves a skills gap for experienced engineers with Deep Learning and Machine Learning skills. The skills gap is one huge factor in why many projects get stuck in the POC phase instead of moving into production.

If you would like to know more about moving projects from POC to Production, make sure to check out my session if you are attending the O’Reilly AI Conference in London: AI Growing Pains: Platform Considerations for Moving from POC to Large-Scale Deployments @ 11:55 on October 16, 2019.

Deep Learning Python vs. Java

October 8, 2019 by Thomas Henson

What About Java in Deep Learning?

Years ago when I left Java in the rear view of my career, I never imagined someone would ask me if they could use Java over Python. Just kidding Java you know it’s only a joke and you will always have a special place in my heart. A place in my heart that probably won’t run because I have the wrong version of the JDK installed.

Python is king of the Machine Learning (ML) and Deep Learning (DL) workflow. Most of the popular ML libraries are in Python, but are there Java offerings? How about in Deep Learning, can you use Java? The answer is yes, you can! Find the differences between Machine Learning and Deep Learning libraries in Java and Python in the video.

Transcript

Hi, folks. Thomas Henson here, with thomashenson.com, and today is another episode of Big Data Big Questions. Today’s question comes in around deep learning frameworks in Java, not Python. So, find out about how you can use Java instead of Python for deep learning frameworks. We’ve talked about it here on this channel, around using neural networks and being able to train models, but let’s find out what we can do with Java in deep learning.

Today’s episode comes in and we’re talking about deep learning frameworks that use Java, not Python. So, today the question is, “Are there specific deep learning frameworks that use Java, not Python?” First off, let’s talk a little bit about deep learning, do a recap. Deep learning, if you remember, is the use of neural networks whenever we’re trying to solve a problem. We see it a lot in multimedia, right, like, we see image detection. Does this image contain a cat or not contain a cat?

The deep learning approach is to take those images [Inaudible 00:01:10] you know, if we’re talking about supervised, so take those labeled images, so of a cat, not of a cat, feed those into your neural network, and let it decide what those features are. At the end you get a model that’s going to tell you, is this a cat or is this not a cat? Within some confidence. Hopefully not 50%, maybe closer to 99 or 97. But, that’s the deep learning approach versus the machine learning approach that we’ve seen a good bit.

When we talk about Hadoop and traditional analytics, from that perspective, in machine learning we’re probably going to use some kind of algorithm like singular value decomposition, or PCA, and we’re going to take these images and we’re going to look at each one and we’re going to define each feature, from the cat’s ears to the cat’s nose, and we’re going to feed that through the model and it’s going to give us some kind of confidence. While with the deep learning approach we get to use a neural network, it defines some of those features, helps us out a lot. It’s not magic, but it is a little bit, so really, really innovative approach.

So, the popular languages, and what we’ve talked most about on this channel and probably other channels and most of the examples you’ve seen are all around Python, right? I did do a video before where I was wrong on C++. There was more C++ in deep learning than I really originally thought. You can check that video out, where we kind of go through and talk about that and I come in and say, “Hey, sorry. I missed the boat on that one.” But, the most popular language, one… I mean, I did a Pluralsight video on it, Take CTRL of Your Career, around TensorFlow and using TFLearn. TensorFlow is probably far and away the most popular one. You’ve seen it with stats that are out there. Also PyTorch, Caffe2, MXNet, and then some other, higher-level languages where Keras is able to use some of TensorFlow and be a higher-level abstraction, but most of those are going to use Python and then some of them have C++. Most examples that you’re going to see out there, just from my experience and just working in the community, is Python. Most people are looking for those Python examples.

But, on this channel, we’ve talked a lot about options and Hadoop for non-Java developers, but this is an opportunity where all you Java developers out there, you’re looking for, “Hey, we want to get into the deep learning framework. We don’t want to have to code everything ourselves. Are there some things that we can attach onto?” And the answer is yes, there are. It’s not as popular as Python right now, or R and C++ in the deep learning frameworks, but there is a framework called Deeplearning4j that is a Java-based framework. The Java-based framework is going to allow for you to use Java. You could still use Python, though. Even with the framework, you can abstract away and do Python, but if you’re specifically a Java developer and looking to… I mean, maybe you want to get in and contribute to the Deeplearning4j community and be able to take it from that perspective, or you’re just wanting to be able to implement it in some projects. Maybe you’re like, “Hey, you know what? I’m a Java developer. I want to continue doing Java.” Java’s been around since ’95, right? So, you want to jump into that? Then Deeplearning4j is the one for you.

So, really, maybe think about why would you want to use a Java-based deep learning framework, for people that maybe aren’t familiar with Java or don’t have it. One of the things is it claims to be a little bit more efficient, so it’s going to be more efficient than using an abstraction layer from that perspective in Python. But also, there’s a ton of Java developers out there, you know, there’s a community. Talked about how it’s been around since ’95, so there’s an opportunity out there to tap into a lot of developers that have the skills to be able to use it and so, there’s a growing need, right? There’s communities all around the globe and different little subsets and little subareas. Java’s one of those.

I mean, if you look at what we did from a Hadoop perspective, so many people that were Java developers moved to that community, also a lot of people that didn’t really do Java. It’s a lot like, like I said, at the point I was at in my career, I was more of a .NET C# developer. Fast forward to getting into the Hadoop community, went back to my roots as a Java, so I’d done some Java in the past, and went through that phase. And so, for somebody like me, maybe I would want to go back out. I don’t know. I’ve kind of gone through more Python, but a lot of different options out there. Just being able to give Java developers a platform to be able to get involved in deep learning, like, deep learning is very popular.

So, those are some of the reasons that you might want to go, but the question is, when you think about it, so if I’m not a Java developer, or what would you recommend? Would you recommend maybe not learn TensorFlow and go into Deeplearning4j? You know, I think that one’s going to depend… I mean, we say it a lot in here. It’s going to depend on what you’re using in your organization and what your skill set is. If you’re mostly a Python person, my recommendation would be continue on or jump into the TensorFlow area. But if you’re working on a project that is using Deeplearning4j then by all means go down that path and learn more about it. If you’re a Java developer and you want to get into it, you don’t want to transition skills or you’re just looking to be able to test something out and play with it, and you don’t want to have to write it in Python, you want to be able to do it in Java, yeah, use that.

These are all just tools. We’re not going to get transfixed on any tool. We’re not going to go all in and say, “You know what? I’m only going to be a Java developer,” or, “I’m only going to be this.” We’re going to be able to transition our skills and there’s always going to be options out there to do it. And in these frameworks too, right? Deeplearning4j is awesome, but maybe there’s another one that’s coming up that people would want to jump into, so like I said, don’t get so transfixed with certain frameworks. Like, Hadoop was awesome. We broke it apart. A lot of people navigated to Spark and still use HDFS as a base. There’s always kind of skills that you can go to, but if you go in and say, “Hey, I’m only going to ever do MapReduce and it’s always going to be in Java,” then you’re going to have some challenges throughout your career. That’s not just in data engineering, that’s throughout all IT. Heck, probably throughout all careers. Just be able to be flexible for it.

So, if you’re a Java developer, if you’re looking to test some things out, definitely jump into it. If you don’t have any Java skills and it’s not something that you’re particularly wanting to do, then I don’t recommend you running in and trying to learn Java just for this. If you’re doing Python, steady on with TensorFlow, or PyTorch, or Caffe, whatever you’re using.

So, until next time. See you again on Big Data Big Questions. Make sure you subscribe and ring that bell so you never miss an episode. If you have any questions, put them in the comment section here below. Thanks again.

What Is A Generative Adversarial Network?

July 18, 2019 by Thomas Henson

Generative Adversarial Networks

What are deep fakes? How are they generated? On today’s episode of Big Data Big Questions we tackle how Generative Adversarial Networks work. Generative Adversarial Networks, or GANs, work with 2 neural networks: one a generator and the other a discriminator. Learn about my experience with GANs and how you can build one as well.

Transcript What Is A Generative Adversarial Network?

This is going to be a cool episode, Mr. Editor. We’re going to talk about a painting that was built by AI or designed by AI that went for over $400,000. Crazy.

Hi folks! Thomas Henson here with thomashenson.com. Today is another episode of Big Data Big Questions. Today, we’re going to talk about Generative Adversarial Neural Networks. We’re going to talk about a painting, so you’ve all probably heard about a painting that was sold for, like, $400,000. It was built, actually, by a Generative Adversarial Network. We’re going to talk about that, explain what that is, and maybe even look at a little bit of code, and tell you how you can learn more about it.

Before we jump in, I definitely want to say, if you have any questions about data engineering, data science, IT, anything, put them in the comment section here below. Reach out to me at thomashenson.com/big-questions. I’ll try my best to answer them and help you out. It’s all about the community and all about having fun. Today, we’re going to have a lot of fun. I’m excited. This is something that I’ve been researching and looking into since, maybe, at least since the first part of 2019, but for sure it’s been a theme for me for a while.

I want to talk about Generative Adversarial Network, what that is. We think about that from a deep learning perspective. We’ve done some videos. We talk about deep learning, but this is a specific kind, so kind of like [Inaudible 00:01:33] neural networks, this is a little bit different. It still uses the premise of, you have your input layer, you have your hidden layers, and you have your output layer, but there’s a little more complexity to it. It’s been around since 2014. Ian Goodfellow is branded as the creator of that. If you follow Andrew Ng on Twitter, I just saw where he took a role at Facebook. I think it was a competitive thing, and I think Andrew was saying, “Hey, great pickup for Facebook for picking him up,” but you might want to fact check that.

Like I said, that was breaking news here. Generative Adversarial Network. The way that I like to think about that and describe that is, think of it as having two different neural networks that are working. You have your discriminator and you have your generator. What’s going on is your generator is taking data. Think of, we’ve got, let’s say, a whole bunch of images of people. What’s going on is, our generator is going to take that data set and look at it, and it’s going to try to create fake data that looks like real data. Your discriminator is the one that’s sitting there saying, “Hey, wait a minute. That’s real data. This is fake data.” This is real data, that’s fake data. Just continuing on. You keep going through that iteration, until the generator gets so good, he’s able to pass fake data onto the discriminator. For our example, we’re looking at images of people. What you’re trying to do is, you’re trying to generate data of fake people and pass it through as real people. You’re probably like, “Man. How really good is that?”
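The tug-of-war between the two networks can be written down as two loss functions. The sketch below is an illustration, not a full training loop: it assumes the discriminator outputs a score between 0 and 1 (1 meaning "real"), uses the loss formulation from the original 2014 GAN paper with the commonly used non-saturating generator loss, and the function names and example scores are made up.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Low when the discriminator scores real data near 1 and fake data near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Low when the discriminator is fooled into scoring fake data near 1."""
    return -math.log(d_fake)


# Early in training: the discriminator easily spots the fake (score 0.1).
early = generator_loss(0.1)
# Later: the generator's fake is scored 0.9, so its loss has dropped.
late = generator_loss(0.9)
```

Training alternates between lowering `discriminator_loss` and lowering `generator_loss`, which is the "keep going through that iteration" part: each network improves against the other.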

Check out this website here. These are fake people. These are not real people. These are really good images, and a little bit creepy. I found this, actually, in the last week, and kind of looked at it. Been sharing it internally with some friends and some colleagues, but man. It’s really interesting when you think about it. These people do not exist. There’s no, these people don’t exist on the planet. These were all built by AI or deep learning. It’s pretty cool. Pretty creepy, too.

You’re probably wondering, “That’s pretty cool.” Been around since 2014. I’m researching it. Should I be researching it? I definitely think it’s something that’s going to be out there. There’s a lot of information around it, and a lot of use cases, kind of don’t know where it’s going to go. I can think of it being used for game development. Being able to create worlds. For somebody that’s creating a game that’s going to have multiple, multiple different levels, or even if GIS, you have to create all these landscapes and everything like that. If you can build AI to automate that, if you use a deep learning algorithm that’s going to automate, and build out those worlds, and make them lifelike, how much busy work is that going to save you? Same thing with GIS and in architecture, but also go back to the website we were just looking at, with the fake people. Oh, my gosh! You can use that in media and entertainment. Think about movies. Maybe we don’t even need actors anymore. That’s a little bit scary. For the actors, I don’t know. You still need Thomas Henson and thomashenson.com on YouTube, right?

Really cool. Something I just wanted to share with everybody, and back to what we were talking about in the first part of the show. The first art that was really sold for big ticket item around AI, over $400,000, and it was a generated image, too. I talk a little bit about it in my implementing TF Learn course, but here’s a code sample, really just showing what’s going on. If you’re looking at it, and all this is done in TensorFlow, here, using the abstraction layer of TF Learn. Look here, how we’re creating that generator, and how you’re creating a discriminator. It’s a good bit of code here, but really, this is an example from TF Learn examples, where you’re actually starting to generate data in here. It’s pretty cool. Pretty awesome to be able to play with if you have TensorFlow installed in your environment. You can actually do an import TF Learn and start running this code from the examples here, and start tweaking with it. Really cool.

If you want to learn more, I’d definitely love for you to check it out and tell me all about it. Go through my TF Learn course. Tell me all about it if you like it. You don’t have to, but I just thought sharing Generative Adversarial Networks, I thought that was pretty cool. I think it’s something that everybody should learn. At least know a little bit about it. Now, you know. Hey, important thing. I’ve got my generator. I’ve got my discriminator. My generator is making the data that’s trying to pass as real data to my discriminator.

Boom! You understand a lot. Thanks for tuning in. If you have any questions, put them in the comment section here below, and make sure you subscribe just so you never miss an episode, and get some great education around Big Data Big Questions.

Nobody can! Nobody can generate a fake image of me!

Challenge accepted?

Review Coursera’s Neural Networking & Deep Learning Course

July 17, 2019 by Thomas Henson

Another Machine Learning Course?

Yet another machine learning course has caught my attention here lately. Andrew Ng has a new course available on Coursera focused on Neural Networks and Deep Learning. How did I like the course and should you take the course? Find out my thoughts on Coursera’s Neural Network and Deep Learning course.

Transcript- Review Coursera’s Neural Networking & Deep Learning Course

Hi folks! Thomas Henson here with thomashenson.com. Today is another episode of Big Data Big Questions. Today’s question comes in around a new course that I am taking, myself. It’s not a course that I’m writing. I’ve talked about some of my Pluralsight courses. This is actually a deep learning course that I’m taking with Coursera. It’s the second course that I’ve taken with Coursera. I did another one from Andrew Ng called, I think, learning machine learning, and just went through that portion, and swore I’d never take another one, and here I am again. Find out my review on that course and how I’m doing on it here in just a second.

[Sound effects]

Today’s question is around what I’m doing from a course perspective. I’m taking a course called neural networks and deep learning. This is actually part one in a large certification series. If you go out to deeplearning.ai, it’s an Andrew Ng specific course. I did his machine learning course before, and went through it, and did some reviews with it on another channel with a group, the Big Data Beer Team. You can always check that out and find that.

I swore I’d never do another course, and here I am doing another one, because the math portion for me is a little more into the weeds than I like to be and really think, from a data engineering perspective, it probably is. Either way, my thing is to do this review and give you all the insights. You can decide if you want to take that course and find out where you are. I’m through part one. The neural networks and deep learning is part one in that course. It’s an Andrew Ng course, so he’s like, probably trained more people around machine learning and deep learning than anybody else on the planet. Worked at Baidu, at Google, Stanford, and has his own company, own startup where he’s walking through driverless cars. Huge authoritative figure who’s teaching this course. It’s amazing from that aspect of it.

Little bit overwhelming, I’ll tell you. We’ll get into it a little bit, but each part of these courses are broken into, I think, four weeks. This first one was four weeks. We’re going to go through how I felt through each of the four weeks, and give you my thoughts on that.

In the first week, week one was intro to deep learning, and really it was about the why for deep learning. Why is deep learning? What’s the history of it? Is this anything new? Is this going to solve all our problems in the future? Eh, maybe.

Maybe we don’t get into that as much, but this was a pretty good one, and I actually did, with each one of these courses, there is a heroes in AI interview session. If you like watching YouTube videos like you do now, this is similar to that, but it’s behind the paywall, or behind the course wall there in Coursera. I actually went through that, when I did not get through all of them, but I did go through this one. It was pretty good. Can’t really remember who it was. Maybe shame on me for that. Should’ve put that in my notes.

Week one was pretty easy to step through and everything like that. There might’ve been a quiz or something, but no programming aspects from that perspective. Week number two, logistic regression in neural networks. Probably my least favorite portion of the course so far. A lot of math-based and somewhat of a review. Actually, when I got to this portion, I was like, “Man, this is…” I was going through the course material and watching the videos. I was like, “This is kind of a review from what I did in the machine learning course.”

I’m going to ace everything here, and I did ace the quiz. It wasn’t too hard, but when we stepped into the programming, it was a little more complicated than I thought, and I have some reasons why I think that is, and I’m going to talk about those here at the end. For the most part, week two was really just a level set. Hey, remember, this is the cost function. This is how we use linear regression, and just walking through some of those portions, to be able to say hey, this is what’s going on behind the scenes.
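For readers who have not seen the week two material, the cost function for logistic regression is usually the average cross-entropy between the predicted probability and the true label. The sketch below is a plain-Python version of that standard formula. The function names and the two-example data set are assumptions for illustration and are not taken from the course's programming exercise.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_cost(weights, bias, inputs, labels):
    """Average cross-entropy loss over all examples."""
    total = 0.0
    for x, y in zip(inputs, labels):
        # Predicted probability that the label is 1.
        a = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        total += -(y * math.log(a) + (1 - y) * math.log(1 - a))
    return total / len(inputs)


# Made-up data: one positive and one negative example with a single feature.
inputs = [[1.0], [-1.0]]
labels = [1, 0]
untrained = logistic_cost([0.0], 0.0, inputs, labels)
better = logistic_cost([2.0], 0.0, inputs, labels)
```

With all-zero weights every prediction is 0.5, so `untrained` equals log 2. Weights that push the predictions toward the correct labels give a lower cost, and that lower cost is what gradient descent works toward.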

If you’ve gone through like I have, and implemented networks, and played around with Tensorflow or TF Learn, you already know some of the things that are going on, which maybe you don’t understand it fully. This was a good review to start off to that perspective. If you haven’t taken the machine learning course, no problem. You can jump right into it. Like I said, he takes it from a high level here and gets you going.

Week three. My favorite week. We talked about shallow neural networks. This is the basics of how to build a neural network. What I like the most about this was, we deep dived into why non-linear functions and why we use different activation functions. It was really cool, because I actually taught a portion of this in my course, and just it was cool to see how Andrew was able to explain it. Maybe not a whole lot better than me. I don’t want to undersell myself, but it was definitely awesome to see his background, and his thought process, and just him saying, “Hey, this is why we use [Inaudible 00:05:19], these are some of the things that you’re going to see with it.” Don’t worry about it, because of these reasons. Really, my favorite portion of this course was week three, so around the shallow neural networks. Still went through and took a little bit longer to do the programming exercise than I thought would take me.
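
The point about non-linear activation functions can be sketched in a few lines of plain Python. This is only an illustration and not code from the course: the weights and the input are made-up values, and matmul and relu are small helper functions written just for this sketch. It shows that two linear layers stacked without an activation collapse into a single linear layer, while putting ReLU between them changes the result.

```python
# Minimal sketch: why a network needs non-linear activations.
# All weights and inputs below are made-up illustrative values.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def relu(m):
    """Apply ReLU element-wise: negative values become 0."""
    return [[max(0.0, v) for v in row] for row in m]

x = [[1.0, -2.0]]                 # one input example with two features
w1 = [[1.0, -1.0], [0.5, 2.0]]    # first layer weights (2 x 2)
w2 = [[2.0], [-1.0]]              # second layer weights (2 x 1)

# Two linear layers with no activation in between...
two_linear_layers = matmul(matmul(x, w1), w2)
# ...give the same result as one linear layer whose weights are w1 times w2.
collapsed_layer = matmul(x, matmul(w1, w2))

# With ReLU between the layers, the result no longer collapses.
with_relu = matmul(relu(matmul(x, w1)), w2)

print(two_linear_layers, collapsed_layer, with_relu)
```

No matter how many purely linear layers get stacked, the network can only represent one linear function, which is why the activation between layers matters.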

Little bit of stress there, but quizzes were good. It was easy if you follow along, and just take good notes, and you’ll be able to pass the quizzes. There’s a new thing that they’re trying out, too, called notes. I’ve started playing around with that. I’ll probably, in my next video, talk a little bit about that as I’m using it more and more, and maybe that’ll be a quick tip that you guys can use whenever you’re going through a course on Coursera.

Week four, not my favorite week. It was pretty good. We started getting into deep learning and deep neural networks and how those are working. Some of the things that we really did was talked about the matrix dimensions and how some of that works. Didn’t get into it as much as they will in future courses. It’s easy for me to look at it now and say that, because I jumped ahead a little bit. From the perspective of this course, neural networks and deep learning part one, really talks through some of the matrix portions and then starts building out your deep networks. Also, talks about parameters and hyperparameters. I was familiar with hyperparameters and parameters before, just with having been hands-on before, but it was really helpful to do those.
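
To make the matrix dimensions and the parameters versus hyperparameters idea a little more concrete, here is a rough sketch in plain Python. The layer sizes are an assumed example (they are not taken from the course), and the shape convention used, a weight matrix of shape (units in this layer, units in the previous layer) with a bias of shape (units in this layer, 1), is one common convention rather than the only one. The layer sizes are hyperparameters that we choose, while the weights and biases counted here are the parameters the network learns.

```python
# Sketch: parameter shapes for a fully connected network.
# layer_sizes is an assumed example: input, two hidden layers, output.
layer_sizes = [784, 256, 256, 10]

def parameter_shapes(sizes):
    """Return (weight_shape, bias_shape) for each layer after the input."""
    shapes = []
    for n_prev, n_curr in zip(sizes[:-1], sizes[1:]):
        shapes.append(((n_curr, n_prev), (n_curr, 1)))
    return shapes

def count_parameters(sizes):
    """Total number of learned values across all weights and biases."""
    return sum(w[0] * w[1] + b[0] * b[1] for w, b in parameter_shapes(sizes))

for layer, (w_shape, b_shape) in enumerate(parameter_shapes(layer_sizes), start=1):
    print("layer", layer, "W", w_shape, "b", b_shape)
print("total parameters:", count_parameters(layer_sizes))
```

Writing the shapes down like this before coding a layer is a quick way to catch dimension mismatches early.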

The quiz in this one, once again, if you paid attention, you went through it. You have to work through some math and do some other portions of it, but the quizzes are pretty simple. Make sure you’re using your own notes and everything for that. When it came to the programming exercises, I think there was two in this week, and they were somewhat difficult. I think the second one was pretty long as far as building out. You get to get hands-on with Tensorflow. Still a little bit more challenging, I guess, I think, and there’s some ways that we can make it a little bit better. Let me talk about that here just next.

Overall, I thought the course was all right. It was good for me, just some of it was a little bit of a review. Some of it went a lot deeper than I’ve dove in before, so I thought that portion was good for me. I will say, on all the programming exercises, they’re all graded. One of the things that I find challenging, and maybe it’s just the way that I learn, but I feel like they’re a little harder just because you go through, and it’s like you’re being tested day one. Whenever you’re going through the videos and everything, you’re doing everything from a math perspective on paper, or if you’re taking digital notes, but you’re not really doing any of the programming functions. If you don’t have a solid basics in programming, or it’s not something that you do every day from that perspective, I think it’s going to be a little more challenging. One of the things that could help out, I think, and broaden for the students that are coming in would be to have more coding examples that aren’t graded. It doesn’t have to be verbatim. Hey, this is really, really close to what the examples are. I get that you want to test, and you want to make it so that you’re applying what you’re learning.

Also, I think a few more coding examples where you can go through and see, “These are some of the steps.” If you understand the math portion of it, doesn’t necessarily mean that you’re going to be able to go in and be able to program it right there, and when we talk about it from a real-world perspective, whenever I look at it, yeah, you need to understand those things, and know how to implement those at a base level, but there’s so many. There’s so many other things it can do from a high level. For example, one of the biggest challenges I had going through this was, I built a whole course around TF Learn, and being able to use that abstraction layer over Tensorflow. For me, having to go through step by step, and showing how you can do this, where you can write it in TF Learn or use one of those functions, I think that would’ve been… That would be a different approach to take it, and I think that would broaden the audience, and make it a little more enjoyable, too.

If you’re having to go through, and you know that writing these 60 lines of code is something that you can write in 4, it makes it a little bit harder, especially since I already just did all the math portion, and kind of went through all those activations and everything work, versus having to go through some of the minutia on the programming. That’s just my two cents. If you’ve taken this course, please tell me. Tell me your opinion. You’re listening to mine. Let’s make this a conversation. I’d love to hear what some of your thoughts are, where you think I’m wrong if you think I should be better at math. You’re probably right. I think I’m getting the math. We’ll see.

Fair enough, my programming skills in Python, like I said, they’re all right. They’re not to the level here. I think that’s another gap that I found going through this course. All in all, I guess I would recommend it if you’re looking into using deep learning, but I don’t think that, if you’re a data engineer, that you have to go through anything like this. Like I said, it’s a good aspect of it, but there’s some other things and other skills that you probably want to get. If you’re more looking to the data science, or deep learning, or machine learning engineer, then going through something, one of these, this course would probably be pretty good. In the next video, check out, I jumped way too ahead in the next course. You might see. I jumped to, I think, the fifth portion or fourth portion when I was supposed to go to the second portion. I’ll talk about that in the next video. If you have any questions, make sure you put them in the comments section here below, or reach out to me on thomashenson.com/big-questions. Find me on Twitter or Instagram. Ask any questions. I’ll try my best to answer them. Make sure you subscribe so that you never miss an episode, and ring that bell. Thanks again.

Will AI Replace Data Scientist?

July 12, 2019 by Thomas Henson Leave a Comment

Will AI Take My Job?

Artificial Intelligence is disrupting many different industries from transportation to healthcare. With any disruption, fear begins to pop up around how it will impact me! One question posed on Big Data Big Questions was whether “AI Will Replace Data Scientists”. We are truly in the early days of AI and Deep Learning, but let’s look forward to see if AI will be able to replace Data Scientists. Find out my thoughts on AI replacing Data Scientists by watching the video below.

Transcript – Will AI Replace Data Scientist?

Hi folks! Thomas Henson here with thomashenson.com. Today is another episode of Big Data Big Questions. Today’s question comes in from a viewer. If you have a question, put it in the comment section here below, or you can reach out to me on thomashenson.com/big-questions. I’ll do my best to answer your question. This one came in around, “Hey, you know, is software going to replace data science?”

Whenever I think about software, specifically we’re probably talking about artificial intelligence. Artificial intelligence, or machine learning, or deep learning, or any of those models, are we going to be able to build models that can replace the data scientist?

This is a common theme, if you go out and Google anything right now, you can see, “Will AI replace lawyers?” Will AI replace doctors? All kinds of different things. Unequivocally, I think the short answer is no, but I’m going to talk about what I think are some of the reasons that I don’t think that AI is going to replace data scientists. Also, at the end, I’m going to give you some industry experts on what they think and what they’ve said about that whole concept.

Let’s jump in. Let’s talk a little bit about what a data scientist is, and then, talk about how we would even begin to look at how AI would replace that. Remember before, when we talked about data scientists in the past. These are the types of people that are trying to work on finding data that can build a model that might be able to predict an outcome. If we can predict the outcome, then maybe we can do something prescriptive. Hey, this is what’s going to happen, so let’s do this portion here after something happens. Think of if you’re creating, building a model to detect insider threats. You want to be able to decide, “Okay, does this user, maybe they’re potentially an insider threat.” Once you’ve identified that, maybe you can drop their access. Be prescriptive [Inaudible 00:02:04] it. Drop access that they have to certain directories, certain folders, and then also alert security.

We’re wanting to be able to build applications or models like that, that can be able to help. Can artificial intelligence do all that, kind of take the data scientist out? I don’t believe that’s the case. That’s very, very hard. If we really look at AI, and what’s going on right now, any time you hear the word AI, replace that with automation, and you’re like, “Okay, now I understand what’s going on.” Really, we’re not at the point where we’re actually building these super intelligent systems, kind of like what you see in Hollywood. I’m going to give you three different reasons around why I think that AI is not going to replace or software is not going to replace data scientists.

The first thing is, when we think about it, artificial intelligence has been around for quite some time. The term has, we’re getting better with our models. If you listen and read some of the books that I’ve read, we’re in that implementation phase where we’re putting these things out there. If you really look at it, even in the past, when we talk about the world’s best chess player versus artificial intelligence, we got to a point in the late ’90s where the world’s best chess player could win, or I’m sorry, the machine would beat the world’s best chess player. However, if you took a medium machine or artificial intelligence that was pretty good at chess, you paired it with a pretty good or an advanced human chess player, they could beat the world’s best machine learning model, or deep learning, or AI chess player. Same thing. What we’re doing, I think, the tools and the skills that you’re seeing being implemented for data scientists are about how we can help, right? What are the types of tools that can help us identify quickly maybe some complex algorithms that would work. Should I use a Generative Adversarial Network here? Should I use a convolutional neural network, or different types of things there?

Same thing that we’re seeing in the medical industry. Doctors aren’t going to be taken out of the loop, but doctors are going to be given maybe a voice assistant that you can prescribe and give the different, these are some of the symptoms that we’re seeing. What are some of the latest journal articles, and giving a summary to that, versus your data scientist or your medical, somebody in the medical field, they’re having to go out, and there’s always research, and research papers that they could be reading, and could be intaking, same thing here. You’re going to have assistants as a data scientist, to be able to say, “Look, what are…?” Run some stats on this, and let’s see what models might be good indicators here. I’m still in the loop. I’m still deciding what we’re going to do from that model, but it’s going to help me streamline and get faster, what we’re doing.

Number two, really simple, just go out there and look at the talent gap. We’re still looking for data scientists. That’s, go do a Google search, and you’re finding that there’s a ton of different open job postings. If you go to any kind of symposium. There was a symposium over at Georgia Tech. One of the people from Google there was talking, and they were like, “Hey, man, I will take every PhD or even Master’s level candidate you have around data science and statistics,” and everything like that. There’s still a huge, huge talent gap there, and I don’t think it’s going to be cured by AI. Like I said, I think it’s going to be about automating, and then maybe AI can help us to train better humans that can fill those roles, but I think that’s another indication that, man, I don’t even know that we’re at our peak in data science. Just from a hype cycle perspective, either.

Number three, the industry experts. If you look at Andrew Ng, you look at Kai [Inaudible 00:05:41], you look at what their predictions are, data science is in one of those quadrants where it’s like, “Hey. It’s not a simple task that can be repetitive.” You’ve all seen the videos where it’s like, hey, robots, and AI can help on assembly lines. It’s a controlled environment. Data science is not controlled. It’s out there. It’s in the wild, and you’re having to, “This model,” or even ETL. We can’t even fix ETL. We’re still having to rely on human beings to help and automate, and make sure that we’re curating the right data sets, too. We’re still not at that point, and even if we do get to that point from an ETL perspective, still going to have to have data scientists. No, AI will not replace data scientists in the near future. All that’s subject to change. There could be advances in technology in 10 years that I don’t foresee. I’m not a futurist yet. Maybe, I don’t know. I don’t have enough education, I guess, or understanding to be that. If you have any questions, put them in the comment section below. Make sure you subscribe, so that you never miss an episode of Big Data Big Questions. Ring that bell. Until next time, see you again. Big Data Big Questions.

Learning Tensorflow with TFLearn

February 11, 2019 by Thomas Henson Leave a Comment

Recently we have been talking a lot about Deep Learning and Tensorflow. In the last post I walked through how to build neural networks with Tensorflow . Now I want to shift gears to talk about my newest venture into Tensorflow with TFLearn. The lines between deep learning and Hadoop are blurring and data engineers need to understand the basics of deep learning. TFLearn offers an easy way to learn Tensorflow.

What is TFLearn?

TFLearn is an abstraction framework for Tensorflow. An abstraction framework is basically a higher level language for implementing lower level programming. A simple way to think of abstraction layers is that they reduce code complexity. In the past we used Pig Latin to abstract away Java code; for Tensorflow we will use TFLearn.

TFLearn offers a quick way for Data Engineers or Data Scientists to start building Tensorflow neural networks without having to go deep into Tensorflow. Neural Networks with TFLearn are still written in Python, but the code is drastically reduced from Python Tensorflow. Using TFLearn provides Data Engineers new to Tensorflow an easy way to start learning and building their Deep Neural Networks (DNN).

Pluralsight Author

Since 2015 I’ve been creating Data Engineering courses through Pluralsight. My latest course on TFLearn, titled Implementing Multi-layer Neural Networks with TFLearn, is my sixth course on Pluralsight. While I’ve developed courses in the past, this course was new for me in two major areas. First, Implementing Multi-layer Neural Networks is my first course in the deep learning area. Second, this course is solely based on coding in Python. Until now I had never done a coding course per se.

Implementing Multi-layer Neural Networks with TFLearn

Implementing Multi-layer Neural Networks with TFLearn is broken into 7 modules. I wanted to follow closely with the TFLearn documentation for how the functions and layers are broken down. Here are the 7 modules I cover in Implementing Multi-layer Neural Networks with TFLearn:

TFLearn Course Overview – Breakdown of what is covered in this course around deep learning, Tensorflow, and TFLearn.
Why Deep Learning – Why do Data Engineers need to learn about deep learning? Deep dive into the basic terminology in deep learning and comparison of machine learning and deep learning.
What is TFLearn? – First we start off by defining TFLearn and abstraction layers in deep learning. Second we break down the differences between Tensorflow and TFLearn. Next we run through both the TFLearn and Tensorflow documentation. Finally we close out the module by building your TFLearn development environment on your machine or in the cloud.
Implementing Layers in TFLearn – In deep learning, layers are where the magic happens, so this is where we begin our Python TFLearn coding. In the first example we build out neural networks using the TFLearn core layers. The second neural network we build will be a Convolutional Neural Network (CNN) with our MNIST data source. After running our CNN it’s time to build our third neural network, a Recurrent Neural Network (RNN). Finally we close out the module by looking at the Estimators layers in TFLearn.
Building Activations in TFLearn – The activations module gives us time to examine what mathematical functions are being implemented at each layer. During this module we explore the different activations available in Tensorflow and TFLearn.
Managing Data with TFLearn – Deep learning is all about data sets and how we train our neural networks with those data sets. The Managing Data with TFLearn module is all about the tools available to handle our data sets. In the last topic area of the data module we cover the implications and tools for real-time processing with Tensorflow’s TFLearn.
Running Models with TFLearn – The last module in the Implementing Multi-layer Neural Networks with TFLearn Pluralsight course is all about how to run models. During the course we have focused mainly on how to implement Deep Neural Networks (DNN), but in this module we introduce Generative Neural Networks (GNN). Finally, after comparing DNNs and GNNs, we look to the future of deep learning.

Honest Feedback Time

I would love some honest feedback on this course:

How did you like it?
Would you like to see more deep learning courses?
What could be better?

Feel free to put these answers in the comment section below or send me an email.

Hello World Tensorflow – How This Data Engineer Got Started with Tensorflow

January 28, 2019 by Thomas Henson 2 Comments

My Tensorflow Journey

It all started last year when I accepted the challenge to take Andrew Ng’s Coursera Machine Learning Course with the Big Data Beard Team. Now here I am a year later with a new Pluralsight course diving into Tensorflow (Implementing Neural Networks with TFLearn) and writing a blog post about how to get started with Tensorflow. For years I have been involved on the Data Engineering side of Big Data Projects, but I thought it was time to take a journey to see what happens on the Data Science side of these projects. However, I will admit I didn’t start my Tensorflow journey just for the education; I also see an opportunity for those in the Hadoop ecosystem to start using Deep Learning frameworks like Tensorflow in the near future. With all that being said, let’s jump in and learn how to get started with Tensorflow using Python!

What is Tensorflow

Tensorflow is a Deep Learning framework and the most popular one at this moment. Right now there are about 1432 contributors to Tensorflow compared to 653 for Keras (which offers an abstraction layer for Tensorflow), its closest competitor. Deep learning is related to machine learning, but uses neural networks to analyze data. It is mostly used for analyzing unstructured data like audio, video, or images. My favorite example is trying to identify cats vs. dogs in a photo. The machine learning approach would be to identify the different features like ears, fur, color, nose width, etc. and then write the model to analyze all the features. While this works, it puts a lot of pressure on the developer to identify the correct features. Is the nose width really a good indicator for cats? The deep learning approach is to take the images (in this example labeled images) and allow the neural network to decide which features are important through simple trial and error. There is no guesswork for the developer, and the neural network decides which features are the most important.

Source – KDNuggets Top 16 DL Frameworks

Tensorflow is open source now, but has its roots at Google. The Google Brain team actually developed Tensorflow for its use of deep learning with neural networks. After releasing a paper on DistBelief (Tensorflow’s predecessor), Google released Tensorflow as open source in 2015. It seems eerily familiar to Hadoop, except Tensorflow is written in C++ not Java, but for our purposes it’s all Python. Enough background on Tensorflow, let’s start writing a Tensorflow Hello World model.

How To Get Started with Tensorflow

Now that we understand deep learning and Tensorflow, we need to get the Tensorflow framework installed. In production environments GPUs are preferred, but CPUs will work for our lab. There are a couple of different options for getting Tensorflow installed. My biggest suggestion for Windows users is to use a Docker image or an AWS deep learning AMI. However, if you are a Linux or Mac user it’s much easier to run a pip install. Below are the commands I used to install and run Tensorflow on my Mac.
$ bash commands for install tensorflow
using env

Always check out the official documentation at Tensorflow.

Tensorflow Hello World

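# Written for the TensorFlow 1.x API (tf.Session is not available at the top level in 2.x)
from __future__ import print_function
import tensorflow as tf

# Define a constant op holding our message
a = tf.constant('Hello Big Data Big Questions!')

# In TensorFlow 1.x nothing is evaluated until it runs inside a session, trust me 🙂
sess = tf.Session()

# Run the op and print the result
print(sess.run(a))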

Beyond Tensorflow Hello World with MNIST

After building out a Tensorflow Hello World, let’s build a model. Our Tensorflow journey will begin by using a neural network to recognize handwritten digits. In the deep learning and machine learning world the famous Hello World is to use the MNIST data set to test out training models to identify handwritten digits from 0 – 9. There are thousands of examples on Github, in text books, and in the official Tensorflow documentation. Let’s grab one from my favorite Github repo for Tensorflow, by Aymeric Damien.

Now as Data Engineers we need to focus on being able to run and execute this Hello World MNIST code. In a later post we can cover what happens behind the code. Also I’ll show you how to use a Tensorflow abstraction layer to reduce complexity.

First let’s save this code as mnist-example.py

""" Neural Network.
A 2-Hidden Layers Fully Connected Neural Network (a.k.a Multilayer Perceptron)
implementation with TensorFlow. This example is using the MNIST database
of handwritten digits (http://yann.lecun.com/exdb/mnist/).
Links:
    [MNIST Dataset](http://yann.lecun.com/exdb/mnist/).
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
"""

from __future__ import print_function

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf

# Parameters
learning_rate = 0.1
num_steps = 500
batch_size = 128
display_step = 100

# Network Parameters
n_hidden_1 = 256 # 1st layer number of neurons
n_hidden_2 = 256 # 2nd layer number of neurons
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)

# tf Graph input
X = tf.placeholder("float", [None, num_input])
Y = tf.placeholder("float", [None, num_classes])

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([num_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, num_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([num_classes]))
}

# Create model
def neural_net(x):
    # Hidden fully connected layer with 256 neurons
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    # Hidden fully connected layer with 256 neurons
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output fully connected layer with a neuron for each class
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Construct model
logits = neural_net(X)
prediction = tf.nn.softmax(logits)

# Define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# Evaluate model
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    for step in range(1, num_steps+1):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Run optimization op (backprop)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y})
        if step % display_step == 0 or step == 1:
            # Calculate batch loss and accuracy
            loss, acc = sess.run([loss_op, accuracy], feed_dict={X: batch_x,
                                                                 Y: batch_y})
            print("Step " + str(step) + ", Minibatch Loss= " + \
                  "{:.4f}".format(loss) + ", Training Accuracy= " + \
                  "{:.3f}".format(acc))

    print("Optimization Finished!")

    # Calculate accuracy for MNIST test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={X: mnist.test.images,
                                      Y: mnist.test.labels}))

Next let’s run our MNIST example

$ python mnist-example.py

…results will begin to appear here…

Finally we have our results. We get an 81% accuracy using the sample MNIST code. Now we could do better and get closer to 99% with some tuning or adding different layers, but for our first data model in Tensorflow this is great. In fact in my Implementing Neural Networks with TFLearn course we walk through how to use fewer lines of code and get better accuracy.
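
If you are curious what the loss and accuracy lines in the script are doing, here is a rough plain Python sketch of the same ideas: softmax turns raw scores (logits) into probabilities, cross entropy measures how far those probabilities are from the one-hot label, and accuracy compares the highest scoring class with the labeled class. This is not how Tensorflow implements it internally, and the logits and labels below are made-up values for a 3-class problem rather than MNIST data.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [v - max(logits) for v in logits]   # subtract max for numerical stability
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot_label):
    """Loss for one example: negative log-probability of the true class."""
    return -sum(y * math.log(p) for p, y in zip(probs, one_hot_label))

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical logits for a 3-class problem, with one-hot labels.
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
batch_labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0]]

probs = [softmax(row) for row in batch_logits]
# Average the per-example loss over the batch, like tf.reduce_mean does.
loss = sum(cross_entropy(p, y) for p, y in zip(probs, batch_labels)) / len(probs)
# Fraction of examples where the predicted class matches the label.
accuracy = sum(argmax(p) == argmax(y) for p, y in zip(probs, batch_labels)) / len(probs)

print("loss:", round(loss, 4), "accuracy:", round(accuracy, 3))
```

In this toy batch the third example is misclassified, so two of the three predictions match their labels.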

Learn More Data Engineering Tips

Sign up for my newsletter to be sure you never miss a post or YouTube episode of Big Data Big Questions, where I answer questions from the community about Data Engineering.

17 Deep Learning Terms Every Data Scientist Must Know

September 17, 2018 by Thomas Henson Leave a Comment

What does a Data Scientist do?

Data Scientists are changing the world, but what do they really do? Basically a Data Scientist’s job is to find correlations in data that might be able to predict outcomes. Most of their time is spent cleansing data and building models using their heavy math skills. The development, architecture, and cluster management are run by the Data Engineer. If you like math and love data then Data Scientist might be the right career path for you. Recently Deep Learning has emerged as a hot field within the Data Science community. Let’s explore some of the basic terms around Deep Learning.

What is Deep Learning

Deep Learning is a form of Artificial Intelligence where Data Scientists use Neural Networks to train models. Neural networks are made up of three types of layers (input, hidden, and output) that allow models to be trained by mimicking the way our brains learn. In Deep Learning the features and weights of the features are not explicitly programmed, but learned by the Neural Network. If you are looking to compare Machine Learning to Deep Learning just remember, Machine Learning is where we define the features, but Deep Learning is where the Neural Network will decide the features. For example a Machine Learning dog breed detection model would require us to program the features like ear length, nose size, color, height, weight, fur, etc. In Deep Learning we would allow the Neural Network to decide the features and weights for those features. Most Deep Learning environments will use GPUs to take advantage of their ability to run these computations much faster than CPUs.

Must Know Deep Learning Terms

So are you ready to take on the challenge of Deep Learning? Let’s start out with learning the basic Deep Learning terms before we build our first model.

#1 Training

The easiest way to understand training in Deep Learning is to think of it as testing. In software development we talk about test environments and how you never code in production, right? In Deep Learning we refer to test as our training environment, where we allow our models to learn. If you were creating a model to identify breeds of dogs, the training phase is where you would feed the input layer with millions and millions of images. During this phase both forward and backward propagation allow the model to be developed. Then it’s on to #2…

#2 Inference

If the training environment is basically test/development then inference is production. After building our model and throwing terabytes or petabytes to get as accurate as we can, it’s time to put it in production. Now typically in software development our production environment is larger than test/development. However, in Deep Learning it’s the inverse because these models are typically deployed on edge devices. One of the largest markets for Deep Learning has been in autonomous vehicles with the goal to deploy these models in vehicles around the planet. While most Data Engineers would love to ride around in a mobile data center it’s not going to be practical.

#3 Overfitting

Data Scientists and Machine Learning Engineers can get so involved in solving a particular problem that the model they create will only solve for that particular data set. When a model follows a particular data set too closely, the model is overfitted to the data and performs poorly on data it has not seen. The problem is common because as Engineers we know what we are looking for when we train the models. Typically overfitting can be attributed to making models more complex than necessary. One way to combat overfitting is to never train on the test set. No seriously, never train on the test set.
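
One simple way to respect that rule is to split the data once, up front, and keep the test portion away from training entirely. Here is a minimal sketch in plain Python. The train_test_split helper, the 80/20 split, and the list of example ids are all assumptions made for illustration, not part of any particular framework.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=42):
    """Shuffle once, then hold out a test set that training never sees."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data set: 100 example ids standing in for labeled images.
examples = list(range(100))
train_set, test_set = train_test_split(examples)

# Train only on train_set; touch test_set once, at the end, to measure accuracy.
overlap = set(train_set) & set(test_set)
print(len(train_set), "training examples,", len(test_set), "test examples,",
      len(overlap), "shared")
```

If accuracy on the training examples keeps climbing while accuracy on the held-out examples stalls or drops, that gap is a typical sign of overfitting.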

#4 Supervised Learning

Data is the most valuable resource, behind people, for building amazing Deep Learning models. We can train on the data in two ways in Deep Learning. The first way is Supervised Learning. In Supervised Learning we have labeled data sets where we understand the outcome. Back to our dog breed detector: we have millions of labeled images of different dog breeds to feed into our input layer. Most Deep Learning training is done by Supervised Learning. Labeled data sets are hard to gather and take a lot of time from the Data Science team. Right now Data Wrangling is something we still have to spend a majority of our time doing.

#5 Unsupervised Learning

The second form of learning in Deep Learning is Unsupervised Learning. In Unsupervised Learning we don’t have answers or labeled data sets. In our dog breed application we would feed in the images without labels identifying the breeds. If Supervised Learning is costly because labeled data is hard to find, then Unsupervised Learning is the easier form when it comes to data. So why not only use Unsupervised Learning? The reason is simple…we just aren’t quite there from a technology perspective. Back in July I spoke with Wayne Thompson, SAS Chief Data Scientist, about when we will achieve Unsupervised Learning. He believes we are still 5 years out from a significant breakthrough in Unsupervised Learning.

#6 Tensorflow

Tensorflow is the most popular Deep Learning framework right now. The Google Brain team released Tensorflow to the open source community in 2015. Tensorflow is a Deep Learning framework that packages together the execution engines and libraries required to run Neural Networks. Both CPU and GPU processing can be run with Tensorflow, but the GPU is the chip of choice in Deep Learning.

#7 Caffe

Caffe is an open source, highly scalable Deep Learning framework. Both Python and C++ are supported as first-class languages in Caffe. Caffe originated at UC Berkeley, and its successor Caffe2 was developed and heavily supported by Facebook. In fact, a huge announcement was released in May 2018 about the merging of Pytorch and Caffe2 into the same codebase. Although Caffe is widely popular in Deep Learning, it still lags behind Tensorflow in adoption, users, and documentation. Still, Machine Learning Engineers should follow this framework closely.

#8 Learning Rate

Learning Rate is a parameter that controls how large a step we take while searching for the minimum of the loss function. In Deep Learning the learning rate is one of the most important tools for calculating the weights for the features in your models. Using a lower learning rate in general provides more accurate results but takes a lot more time, because it takes smaller steps toward the minimal loss. If you were walking on a balance beam, you could take smaller steps to ensure foot placement, but that also increases your time on the balance beam. It’s the same concept with the learning rate, except we are just taking a longer time to find our minimal loss.
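The balance beam trade-off can be sketched with a toy example. This is only an illustration: the loss function (w - 3)^2, the starting point, and the tolerance are all assumptions chosen for simplicity, not anything from a real model. It counts how many steps plain gradient descent needs with a larger and a smaller learning rate.

```python
def gradient_descent(learning_rate, start=0.0, tolerance=1e-3, max_steps=10_000):
    """Minimize the toy loss (w - 3)^2 and report how many steps it took."""
    w = start
    for step in range(1, max_steps + 1):
        gradient = 2 * (w - 3)          # derivative of (w - 3)^2
        w -= learning_rate * gradient   # the learning rate scales each step
        if abs(w - 3) < tolerance:
            return w, step
    return w, max_steps


w_large, steps_large = gradient_descent(learning_rate=0.1)
w_small, steps_small = gradient_descent(learning_rate=0.01)

print("learning rate 0.1 :", steps_large, "steps")
print("learning rate 0.01:", steps_small, "steps")
```

Both runs end up close to the minimum at w = 3, but the smaller learning rate takes many more steps to get there.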

#9 Embarrassingly Parallel

Embarrassingly Parallel is a commonly used term in High Performance Computing for problems that can be easily parallelized. Basically, Embarrassingly Parallel means that a problem can be split into many, many parts that are computed independently. An example of Embarrassingly Parallel would be how each image in our dog breed application could be processed independently.
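Here is a rough sketch of that idea using Python's standard `concurrent.futures` module. The `classify_image` function is a hypothetical stand-in for running one image through a model; what matters is that no image depends on the result of any other, so the work can be handed out to several workers at once.

```python
from concurrent.futures import ThreadPoolExecutor


def classify_image(image_id):
    """Hypothetical stand-in for running a single image through a model."""
    label = "dog" if image_id % 2 == 0 else "not a dog"
    return image_id, label


image_ids = list(range(8))

# Each image is independent, so the work splits cleanly across workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(classify_image, image_ids))

for image_id, label in results:
    print(image_id, label)
```

For CPU-heavy work in Python a process pool or a GPU is the more likely choice; threads are used here only to keep the sketch short.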

#10 Neural Network

A Neural Network is a computing process that attempts to mimic the way our brains work. Neural Networks, often referred to as Artificial Neural Networks, are key to Deep Learning. When I first heard about Neural Networks I imagined multiple servers all connected together in the data center. I was wrong! Neural Networks live at the software and mathematical layer. It’s how the data is processed and guided through the layers in the Neural Network. See #17 Layers.

#11 Pytorch

Pytorch is an open source Machine Learning & Deep Learning framework (sound familiar?). Facebook’s AI group originally developed and released Pytorch for GPU accelerated workloads. Recently it was announced that the Pytorch and Caffe2 code bases would be merged together. It is still a popular framework to be followed closely. Both Caffe & Pytorch were heavily used at Facebook.

#12 CNN

Convolutional Neural Network (CNN) is a type of Neural Network typically used for visual tasks such as image recognition. CNNs use feed-forward processing that mimics the human brain, which makes them a good fit for analyzing images like in our dog breed application. The most popular Neural Network is the CNN because of the ease of use. Images are broken down pixel by pixel to be processed by a CNN.
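The pixel-by-pixel processing can be illustrated with the core operation of a CNN: sliding a small filter over an image. The sketch below is plain Python with a made-up 5x5 "image" and a hand-picked vertical-edge filter. In a real CNN the filter values are learned during training, and a framework would do this far more efficiently.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over a square image (no padding, stride 1)."""
    k = len(kernel)
    out_size = len(image) - k + 1
    output = []
    for row in range(out_size):
        out_row = []
        for col in range(out_size):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# Made-up image: dark on the left (0), bright on the right (9).
image = [[0, 0, 0, 9, 9] for _ in range(5)]

# Hand-picked filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)  # each row prints [0, 27, 27]
```

The filter outputs 0 over the flat dark region and a large value wherever its window crosses the edge, which is how a CNN picks out features in an image.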

#13 RNN

Recurrent Neural Networks (RNN) differ from Convolutional Neural Networks in that they contain a recurring loop. The key for RNNs is the feedback loop, which feeds the results of previous steps back into the network. During training the feedback loop helps train the model based on previous runs and the desired outcome. RNNs are primarily used with time series data because of the ability to loop through it.
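A minimal sketch of that loop, assuming a single hidden value and two fixed, made-up weights (`w_input` and `w_hidden` are illustrative, not trained): at every step the hidden state from the previous step is fed back in alongside the new input.

```python
import math


def run_rnn(sequence, w_input=0.5, w_hidden=0.8):
    """Process a sequence one value at a time, carrying a hidden state."""
    hidden = 0.0
    states = []
    for x in sequence:
        # The previous hidden state loops back in with the new input.
        hidden = math.tanh(w_input * x + w_hidden * hidden)
        states.append(hidden)
    return states


# A single spike followed by silence, like one event in a time series.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

Even though the second and third inputs are zero, the hidden state stays above zero because the network still carries a fading memory of the first input.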

#14 Forward Propagation

In Deep Learning forward propagation is the process of weighting each feature to test the output. Data moves through the neural network in the forward propagation phase. In our dog breed example, take the feature of tail length and assign it a certain value for how much it matters in determining dog breed. After assigning a weight to the feature, we then calculate whether the assumption was correct.
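As a rough sketch, forward propagation through a single neuron is a weighted sum of the features squashed into a score. The feature values and weights below are made up for illustration (a scaled tail length and ear length), and the sigmoid activation is one common choice among several.

```python
import math


def forward(features, weights, bias):
    """Weight each feature, add them up, and squash to a 0-1 score."""
    weighted_sum = sum(f * w for f, w in zip(features, weights)) + bias
    return 1 / (1 + math.exp(-weighted_sum))  # sigmoid activation


# Hypothetical features scaled to 0-1: tail length, ear length.
features = [0.9, 0.2]
# Hypothetical weights: how much each feature matters for this breed.
weights = [1.5, -0.5]

prediction = forward(features, weights, bias=0.0)
print(round(prediction, 3))
```

Comparing `prediction` against the known label tells us whether the chosen weights were a good assumption.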

#15 Back Propagation

Back propagation, or backward propagation, is moving backward through the neural network during training. This allows us to review how badly we missed our target. Essentially we used the wrong weight values and the output was wrong. Remember the forward propagation phase is about assigning weights and testing them, thus the back propagation phase tests why we missed.
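Here is a toy sketch of that correction step for a model with a single weight. The input, target, and learning rate are assumptions picked for illustration. After each forward pass we measure the miss, use the chain rule to see how the weight contributed to it, and nudge the weight in the opposite direction.

```python
def train_step(weight, x, target, learning_rate=0.1):
    """One forward pass followed by one backward correction of the weight."""
    prediction = weight * x        # forward pass
    error = prediction - target    # how badly we missed
    gradient = 2 * error * x       # d(error^2)/d(weight) via the chain rule
    new_weight = weight - learning_rate * gradient
    return new_weight, error ** 2


weight = 0.0
losses = []
for _ in range(20):
    weight, loss = train_step(weight, x=2.0, target=4.0)
    losses.append(loss)

print("final weight:", round(weight, 4))
print("first loss:", losses[0], "last loss:", losses[-1])
```

The loss shrinks with every pass as the weight settles near 2.0, the value that maps the input 2.0 to the target 4.0.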

#16 Convergence

Convergence is the process of moving closer to the correct target or output. Essentially convergence is where we find the best solution for our problem. As Neural Networks continue to run through multiple iterations, the results will begin to converge as they reach the target. However, when results take a long time to converge it’s oftentimes called poor convergence.

#17 Layers

Neural Networks are composed of three distinct layers: input, hidden, and output. The first layer is the input, which is our data. In our dog breed application all the images, both with and without a dog, are our input layer. Next is the hidden layer, where features of the data are given weights. For example, features of our dogs like ears, fur, color, etc. are experimented on with different weights in the hidden layer. The hidden layer is also where Deep Learning received its name, because the hidden layers can go deep. The final layer is the output layer, where we find out if the model was correct or not. Back to our dog breed application: did the model predict the dog breed or not? Understanding these 3 layers of a Neural Network is essential for Data Scientists using Neural Networks.
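The three layers can be sketched in a few lines of plain Python. Everything numeric here is made up for illustration: two input features, three hidden neurons, one output neuron, and untrained weights. A real network would learn the weights and would usually be built with a framework such as Tensorflow.

```python
import math


def sigmoid(value):
    return 1 / (1 + math.exp(-value))


def dense_layer(inputs, weights, biases):
    """Each neuron takes a weighted sum of all inputs, then applies sigmoid."""
    return [
        sigmoid(sum(i * w for i, w in zip(inputs, neuron_weights)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Input layer: two hypothetical features of one image.
inputs = [0.9, 0.2]

# Hidden layer: three neurons, each with one weight per input.
hidden_weights = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]
hidden_biases = [0.0, 0.0, 0.0]

# Output layer: one neuron that combines the three hidden values.
output_weights = [[1.0, -1.0, 0.5]]
output_biases = [0.0]

hidden = dense_layer(inputs, hidden_weights, hidden_biases)
output = dense_layer(hidden, output_weights, output_biases)
print("hidden:", hidden)
print("output:", output)
```

Going "deep" just means stacking more `dense_layer` calls between the input and the output.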

Want More Data Engineering Tips?

Sign up for my newsletter to be sure you never miss a post or YouTube episode of Big Data Big Questions, where I answer questions from the community about Data Engineering.

Rise of the Machine Learning Engineer

April 27, 2018 by Thomas Henson Leave a Comment

What is a Machine Learning Engineer?

Move over, Data Scientist: the Machine Learning Engineer is now the best role in Big Data Analytics. The Machine Learning Engineer is a hybrid, half Data Engineer and half Data Scientist, who can implement the data models and even make recommendations for new data sets. Find out why the Machine Learning Engineer is getting a lot of attention in 2018 by watching the video below.

Make sure to subscribe to my YouTube channel to never miss an episode of Big Data Big Questions.

Transcript – Rise of the Machine Learning Engineer

Hi, folks! Thomas Henson here, with thomashenson.com. Today is another episode of Big Data Big Questions. Today’s question comes in from a user, and this is all going to be about the machine learning engineer. What is a machine learning engineer? How does it differ from a data engineer or data scientist? We’re going to jump into all that right after this.

Welcome back. Today’s question comes in from a user, so before we jump into the question, make sure that you go and click on the subscribe, so that you never miss an episode. Also, if you have a question and you would like for me to answer it, about data engineering, about books, about business, anything around IT and specifically probably data analytics, make sure you put those in the comments section here below. Go to my website, thomashenson.com/bigquestions or use the hashtag #BigDataBigQuestions on Twitter. I will try my best to answer those as quickly as I can.

I’ve been getting a lot of questions in, and I’m really thankful for all the questions, and I am working through them as well. Today’s question comes in from a user. From the comments section on YouTube, Andrew Wiley [Phonetic]. He says, “Is it possible to learn both data science and data engineering?” This question stems off of the Cloudera certification. I’ve answered some questions around what is a data engineer, what is a data scientist, but this question is specifically, “Okay, is there a blend of the two?” Is there one position that’s a blend of the two?

I’ll say, for a while, there’s been a lot of confusion around, “Okay, if you’re a data scientist, you know how to stand up a Hadoop cluster, or if you know how to stand up a Hadoop cluster, you must be a data scientist. You’re a wizard, right?” This question is about, what about the blending of the two skills? Think about it from a web development perspective. For a long time, we had our web developers, and we had our back-end developers, and then we had the full-stack web developer. Now, we have a full-stack data engineer, and those are called machine learning engineers.

On a recent podcast out there, that O’Reilly did at Strata, they had a couple guests on talking about the rise of the machine learning engineer, and so I would say that if you’re looking to have skills with data science and data engineering, that position is going to be called a machine learning engineer. My view on how the machine learning engineer has come to fruition is in two parts. If you’re working in a small development or small analytics shop, most likely the data engineer, the person who’s putting together the code and running the system, there’s going to be one or two people on that. It’s going to be a really small team, who are going to be filling that role of a data scientist.

There’s a lot. There’s a big skills gap for data engineers and even more so with data scientists, too. You might be able to go through and look at some of the prescribed analytics and machine learning algorithms that you want to use, and you, as the data engineer, will understand how to use those. It’s not just willy-nilly, like, “Hey, I’m just going to pull this one down and have it.” You need to have a background in statistics, and probability, and heavy on math. One of the things, one of my gaps in skills that I’ve been working on is the math part.

You can follow along as, watch me learn how machine learning… The machine learning course, with Andrew Ng’s course, and you can see some of the things, especially if you’re a data engineer, that you need to shore up, so that you can fit into that machine learning engineer.

Think of the machine learning engineer in the small shop as, you’re the full-stack developer, you’re the full-stack engineer. It’s kind of doing everything. Then, in larger corporations, what you’re going to have is, like I said, we’ve got it on both sides of the spectrum. You’ve got your data engineers, that are really good at setting up, administrating an environment, maybe even doing the software development, running Hive, creating the MapReduce jobs or the Spark jobs, but then you have your data scientists who are, maybe have some SQL skills, really good at math, but not really good at the technical. The machine learning engineer is that person in the middle, to kind of bridge the gap. In bigger shops, you’re going to have your machine learning engineer who’s working with your data scientist, and then starts to be able to pick up on, “Okay, this is the way that we like to do some of the things here,” and you’re really owning that part of the stack, and so, you’re not so much worried about developing and doing what I would call the Hadoop administration, or even the Hadoop development.

When I say Hadoop, remember, we’re just talking about anything in that ecosystem. Your machine learning engineer is your specialization of that. I did a little research, too, just to look at it. Just pulling it up, just some preliminary research, just looking for jobs out there. A lot of times, we’ll say, “Yeah, this is, you’re an Excel guru, and you say, ‘Excel guru?'” You go look, and there’s nobody with a job title excel guru. You’re giving it to yourself.

Looking at machine learning engineer, quick search on Google for jobs, there are a lot of different postings from companies all the way from IBM to Facebook, Lyft, a lot of different postings out there, just in my quick search. Also, looking at Glassdoor, and some of the other places, the salary ranges are right there with what a data engineer is, so anywhere from the low 80s, which I wouldn’t think that, that’s probably not really a true machine learning engineer, or maybe it’s in a different part of the country, all the way up to the 160s. That’s salary range per year. I thought that was pretty good mix, there.

Really fit in line with what we see as the data engineer and the data scientist, so those roles are out there. If you’re excited to go out and learn those, remember what I was saying. Want to have a solid background as a data engineer with understanding how the Hadoop administration works. Also, the workflows, and some of the development skills. Want to be able to implement, if you’re using Mahout, if you’re using TensorFlow, any of those frameworks, you want to be able to implement those, but then you also want to have the math portion too, so make sure you understand the algorithms from a math level, and how to tweak, and how to tune those.

That’s all for today. Hope I answered your question. If you have any questions, anybody out there, make sure that you first go and subscribe, and then ask your question. I’ll try to answer them here. Have a good day.
