I read a post from a Linkedin connection about a week ago. It read: “The first step in becoming a data scientist: forget about Windows.” Even if you’re not a programmer, that’s pretty controversial. The first nerdy thought I had was, that’s not true. The first step to Data Science is not choosing an OS, it’s statistics! Anyway, I kept wondering what’s wrong with doing data science on Windows, exactly. Why is the legacy product (Windows), created by one of the leaders in Data Science and Artificial Intelligence, not suitable to support the very thing it is driving?
As a publishing professional and having worked with a lot of authors, one of the main issues I’ve faced while collaborating with them is the compatibility of platforms, especially when it comes to sharing documents, working with code, etc. At least 80 percent of the authors I’ve worked with have been using something other than Windows. They are extremely particular about the platform they’re working on, and have usually chosen Linux. I don’t know if they consider it a punishable offence, but I’ve been using Windows since I was 12, even though I have played around with Macs and machines running Linux/Unix. I’ve never been affectionately drawn towards those machines as much as my beloved laptop that is happily rolling on Windows 10 Pro.
Why is data science on Windows is a bad idea?
When Microsoft created Windows, its main idea was to make the platform as user friendly as possible, and it focused every ounce of energy on that and voila! They created one of the most simplest operating systems that one could ever use. Microsoft wanted to make computing easy for everyone – teachers, housewives, kids, business professionals. However, they did not consider catering to the developer community as much as its users.
Now that’s not to say that you can’t really use a Windows machine to code. Of course, you can run Python or R programs. But you’re likely to face issues with compatibility and speed. If you’re choosing to use the command line, and something goes wrong, it’s a real PITA to debug on Windows. Also, if you’re doing cluster computing with other Linux/Macs, it’s better to have one of them yourself. Many would agree that Windows is more likely to suffer a BSoD (Blue Screen of Death) than a Mac or a Unix machine, messing up your algorithm that’s been running for a long time.
[box type=”note” align=”” class=”” width=””]Check out our most read post 15 useful Python libraries to make your Data science tasks easier. [/box]
Is it all that bad?
Well, not really. In fact, if you need to pump in a couple more gigs of RAM, you can’t think of doing that on a Mac. Although you might still encounter some weird stuff like those mentioned above, on a Windows PC, you can always Google up a workaround. Don’t beat yourself up if you own a PC. You can always set up a dual boot, running a Linux distribution parallely. You might want to check out Vagrant for this. Also, you’ll be surprised if you’re a Mac owner and you plan some heavy duty Deep Learning on a GPU, you can’t really run CUDA without messing things up. CUDA will only work well with NVIDIAs GPUs on a PC.
In Joey Tribbiani’s words “This is a moo point.” To me, data science is really OS agnostic. For instance, now with Docker, you don’t really have to worry much about which OS you’re running – so from that perspective, data science on Windows may work for you.
Still feel for Windows?
Well, there are obviously drawbacks. You’ll still keep living with the fear of isolation that Microsoft tries to create in the minds of customers. Moreover, you’ll be faced with “slowdom” if that’s a word, what with all the background processes eating away your computing power! You’ll be defying everything that modern computing is defined by – KISS, Open Source, Agile, etc. Another important thing you need to keep in mind is that when you’re working with so much data, you really don’t wanna get hacked! Last but not the least, if you’re intending to dabble with AI and Blockchain, your best bet is not going to be Windows.
All said and done, if you’re a budding data scientist who’s looking to buy some new equipment, you might want to consider a few things before you invest in your machine. Think about what you’ll be working with, what tools you might want to use and if you want to play safe, it’s best to go with a Linux system. If you have the money and want to flaunt it, while still enjoying support from most tools, think about a Mac. And finally, if you’re brave and are not worried about having two OSes running on your system, go in for a Windows PC.
So the next time someone decides to gift you a Windows PC, don’t politely decline right away. Grab it and swiftly install a Linux distro! Happy coding! 🙂
*I will put an asterisk here, for the thoughts put in this article are completely my personal opinion and it might differ from person to person. Go ahead and share your thoughts in the comments section below.