
I read a post from a LinkedIn connection about a week ago. It read: “The first step in becoming a data scientist: forget about Windows.” Even if you’re not a programmer, that’s pretty controversial. My first nerdy thought was: that’s not true. The first step to data science is not choosing an OS, it’s statistics! Anyway, I kept wondering what exactly is wrong with doing data science on Windows. Why is Windows, the legacy product of one of the leaders in data science and artificial intelligence, not suitable to support the very thing its maker is driving?

As a publishing professional who has worked with a lot of authors, one of the main issues I’ve faced while collaborating with them is platform compatibility, especially when it comes to sharing documents, working with code, etc. At least 80 percent of the authors I’ve worked with have been using something other than Windows. They are extremely particular about the platform they work on, and have usually chosen Linux. I don’t know if they consider it a punishable offence, but I’ve been using Windows since I was 12, even though I have played around with Macs and machines running Linux/Unix. I’ve never been as drawn to those machines as to my beloved laptop, which is happily running Windows 10 Pro.

Why is data science on Windows a bad idea?

When Microsoft created Windows, its main idea was to make the platform as user friendly as possible. It focused every ounce of energy on that, and voila! It created one of the simplest operating systems one could ever use. Microsoft wanted to make computing easy for everyone: teachers, housewives, kids, business professionals. However, it did not cater to the developer community as much as it did to everyday users.

Now that’s not to say that you can’t use a Windows machine to code. Of course you can run Python or R programs. But you’re likely to face issues with compatibility and speed. If you choose to use the command line and something goes wrong, it’s a real PITA to debug on Windows. Also, if you’re doing cluster computing with other Linux or Mac machines, it’s better to have one of them yourself. Many would agree that Windows is more likely to suffer a BSoD (Blue Screen of Death) than a Mac or a Unix machine, messing up an algorithm that’s been running for a long time.


Is it all that bad?

Well, not really. In fact, if you need to pump in a couple more gigs of RAM, you can’t think of doing that on a Mac. Although you might still encounter some weird stuff like the issues mentioned above on a Windows PC, you can always Google a workaround. Don’t beat yourself up if you own a PC. You can always set up a dual boot with a Linux distribution, or run Linux in a virtual machine alongside Windows; you might want to check out Vagrant for the latter. Also, if you’re a Mac owner planning some heavy-duty deep learning on a GPU, you may be surprised to find that you can’t really run CUDA without messing things up. CUDA will only work well with NVIDIA GPUs on a PC.

In Joey Tribbiani’s words, “This is a moo point.” To me, data science is really OS agnostic. For instance, now with Docker, you don’t really have to worry much about which OS you’re running, so from that perspective, data science on Windows may work for you.

scene from Friends, TV show
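To make the “OS agnostic” point a little more concrete, here is a minimal Python sketch of a script that behaves the same on Windows, Linux, or a Mac. It is only an illustration under my own assumptions: the `data` folder, the `sales.csv` file name, and the two helper functions are hypothetical and not taken from any particular project.

```python
import platform
from pathlib import Path


def describe_environment() -> dict:
    """Report which OS and Python version the script is running on."""
    return {
        "os": platform.system(),  # e.g. "Windows", "Linux" or "Darwin"
        "python": platform.python_version(),
    }


def data_file(*parts: str) -> Path:
    """Build a path under a hypothetical 'data' folder.

    pathlib picks the right separator for the current OS, so the same
    call works unchanged on Windows, Linux and macOS.
    """
    return Path("data").joinpath(*parts)


if __name__ == "__main__":
    print(describe_environment())
    print(data_file("raw", "sales.csv"))
```

The design choice is simply to avoid hard-coding separators such as `\` or `/` and to ask the standard library about the platform instead, which is a large part of why the same analysis script can move between machines.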

Still feel for Windows?

Well, there are obviously drawbacks. You’ll keep living with the fear of isolation that Microsoft tries to create in the minds of customers. Moreover, you’ll be faced with “slowdom”, if that’s a word, what with all the background processes eating away at your computing power! You’ll be defying everything that modern computing is defined by: KISS, Open Source, Agile, etc. Another important thing to keep in mind is that when you’re working with so much data, you really don’t wanna get hacked! Last but not least, if you intend to dabble in AI and Blockchain, your best bet is not going to be Windows.

All said and done, if you’re a budding data scientist who’s looking to buy some new equipment, you might want to consider a few things before you invest in your machine. Think about what you’ll be working with and what tools you might want to use. If you want to play it safe, it’s best to go with a Linux system. If you have the money and want to flaunt it, while still enjoying support from most tools, think about a Mac. And finally, if you’re brave and are not worried about having two OSes running on your system, go in for a Windows PC.

So the next time someone decides to gift you a Windows PC, don’t politely decline right away. Grab it and swiftly install a Linux distro! Happy coding! 🙂

*I will put an asterisk here, for the thoughts in this article are completely my personal opinion, and they might differ from person to person. Go ahead and share your thoughts in the comments section below.

I'm a technology enthusiast who designs and creates learning content for IT professionals, in my role as a Category Manager at Packt. I also blog about what's trending in technology and IT. I'm a foodie, an adventure freak, a beard grower and a doggie lover.

4 COMMENTS

  1. Thanks for the article, but I am a data scientist and I must respectfully disagree that Linux or Macs are even needed. Microsoft has put together the best system for those who want to learn how to code and get involved in all types of computer technology.
    I have been a computer user since the Radio Shack (TRS-80!) days :). I learned to program on a Commodore 64, wrote machine language code for it, and was a devout supporter of the Atari Falcon and its predecessors. But since reluctantly moving to the Windows ecosystem, never once have I had the need to go back to arcane command-line work, and I refuse to do it now unless absolutely necessary. Certainly other OS’s have their good points, but I just don’t need them.

    • Thanks for taking the time to pen down your thoughts, Ron.

      I totally respect your views about not needing another OS apart from Windows to perform data science. In fact, I’m glad you shared your experience with Windows here. I guess it’s all about what kind of data you’re working with and what tasks you’re performing on your machine. I’ve spoken with quite a few people who work with data on a daily basis, and I can say 4 in 5 of them preferred using a Linux-based system for their work.

      Every system has its own pros and cons and I believe the fact is that everyone adopts what fits them the best, and in this case, I’m glad Windows has offered you a great platform to accomplish your tasks. 🙂

      I’m curious to know what you love about the Windows platform. Would you mind sharing a few points here, so that those who are planning to get into data science can get more insight into the platform from a user?
