I get asked a lot how I can find reading fun - I've gotten caught at social gatherings flipping through pages of a book (I know, I'm a party animal) and asked if I genuinely enjoy it. In all honesty, it's astounding to me that more people don't enjoy reading, since it's something that I rarely have to think about while I'm doing it. It surprises me that most people find it a chore. Something that's recently given me a fair amount of insight into others' opinions on my reading habits is developing my own skills as a programmer. One of the worst things I have to do every day is read other people's code. It's boring, I often get lost from one line to the next, I don't know what everything means, etc. In other words - it's probably exactly what the majority of people who remark that my reading habits are strange are thinking when they tackle lengthy online articles or novels. There are obviously some differences - code is semantically different from plain English - but making that comparison has led me to believe that there's a way to make reading code and rough technical papers more manageable.
What's that way? The same way that I grew to love reading every day in a way that other people haven't: practice. Reading's an under-appreciated skill. I think it would surprise folks to learn that widespread literacy is a relatively recent phenomenon. World illiteracy only began falling rapidly in the second half of the twentieth century. Literacy has historically been a mark of the elite and well-educated (i.e. those with access to resources), but in modern times it's relatively commonplace in the Western world. You'd be surprised to find someone who grew up speaking English fluently, with full access to decent public education and used to surfing the web, who wasn't able to read above a fifth-grade level. The age we live in is termed the 'Information Age', after all. The point I want to make is that people have begun to see reading as a given instead of a skill - which I worry is hindering a lot of people's success, not only in actually reading, but in jobs and skills that require it peripherally, like coding. Another example I can think of is giving up on sight-reading in my piano lessons when it got hard. I was so used to the ease of regular reading that the idea of sight-reading being a different skill entirely was foreign to me. I gave up on it because it wasn't easy. I don't want to do the same with code.
Before I explain how to improve your code reading, it's essential that I clarify why it's important to know how to read code well. As of August 2019, I'm the solo leader on a startup programming project - I built our system from scratch and I only had to clarify inputs and outputs for the other team members working on our system. Even so, I still have to read other people's code all the time. I have to be able to quickly and fluently understand the inner workings of a variety of different programming packages, GitHub open source hacks, and workarounds developed by other folks in the community every day just to get my stuff to function properly. If you're a professional software engineer working with a team and a deadline, this is even more essential. A typical software development team has a variety of programmers from all kinds of backgrounds. You're going to have to sort through their code, understand it just as well as something of your own, and actually produce results from that understanding. All probably on a deadline. It can't be something that you view as a chore, or something that's a pain to get done. It has to be something you're well versed in and can get excited to do. In other words, it has to be as exciting as reading a semi-decent book. I'm not saying get caught at social gatherings reading code, but try and get your skills to a point where reading code becomes more of a fun exercise than something you wouldn't touch with a ten-foot pole.
That brings me to my main premise: start viewing reading code as a skill that you should practice like you practiced reading in grade school. Here's how to start. One of the first things that most programmers complain about is the 'style' of other people's code. We're all trained in a different way - you might have learned it from an online boot camp, you might be self-taught, or you might have had a professor who was hell-bent on teaching algorithms in his self-designed programming language and gave the bird to everything else (that might just be a personal experience, idk). Trying to get an excellent sense of what's going on, and of what the programmer's mental model of the code was, just from looking at it is insufficient. Reading code is more like sight-reading for piano than actual reading because code exists in a different dimension than regular books. It's functional and takes in different objects (i.e. datatypes). Therefore, you need different tools.
First and foremost - RUN THE CODE! Never, ever just look at it blankly and assume you don't understand how it works. RUN IT! Download it to your local machine and see what it does! You'd be surprised how much information you can get out of something even when the initial build doesn't work. I've seen too many of my students and fellow researchers struggle to get something working that they downloaded from somewhere else without ever launching the original code in the first place. One of the main complaints from my Master's program was how all of our professor-designed programs took ages to just get up and running. It's easier than you think to get around this hurdle: just run the program and see what you're missing. It's usually as simple as installing the right packages.
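To make the "see what you're missing" step concrete, here's a minimal sketch that checks which packages a downloaded script needs but your machine doesn't have. The helper name `find_missing_modules` and the list of module names are made up for illustration; in practice you'd copy the top-level names from the import lines of the code you're trying to run.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the top-level module names that cannot be found on this machine."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Hypothetical list: replace it with the imports at the top of the code you downloaded.
required = ["json", "csv", "a_package_you_do_not_have"]
print(find_missing_modules(required))
```

Anything that shows up in the printed list is a candidate for installing before you try the program again. Simply running the script and reading the first error message tells you the same thing, one package at a time.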
One thing I'm always shocked at is how new programmers aren't taught how to debug their own programs. Like parallelization, Linux, working with packages, communicating results, and developing comprehensive tests, debugging is one of those concepts that CS faculty seem to believe students can pick up on their own. Here's how to actually go about understanding code by debugging. Step one - find an IDE (Integrated Development Environment) that you can use to hold the code. Nobody really covers this in introductory programming classes - a lot of online tutorials I've found insist on using the native Python shell and scripts (which is confusing if you don't understand how computer architectures work) or their own unique environment, which just compounds the problem by making people reliant on that system until they can break out on their own (yes, I view Udacity and Treehouse's in-house programming environments as jails, despite their excellent tutorials). If you're working with Python, I recommend using PyCharm. Its debugger is pretty easy to understand once you get everything working (I cover starting a project in PyCharm in another post, it can get a bit wonky - which is probably why so many beginners don't use IDEs for their projects). The main concept to understand when you're starting out is the breakpoint. You usually place breakpoints wherever you think the error is - but you can also just use them to step through the code and see what everything is. That's extremely helpful when you're given a piece of foreign code that you want to tool through to get a sense of what's going on. Please see my whole post on debugging with PyCharm here, because the whole debugging process is a bit beyond the scope of this post. The point of debugging is to get a sense of what all the various inputs and outputs of a piece of code are. Once you get that sense, you can do anything with the code.
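If you don't have an IDE set up yet, you can get a feel for breakpoints with Python's built-in `breakpoint()` function, which pauses the program and opens the standard `pdb` debugger in your terminal. The sketch below uses a made-up function, `normalize`, standing in for a piece of foreign code. The breakpoint is commented out so the script runs straight through as written.

```python
def normalize(scores):
    """Hypothetical 'foreign' function: scale a list of numbers into the range 0 to 1."""
    lowest = min(scores)
    spread = max(scores) - lowest
    # Uncomment the next line to pause here and look around.
    # At the pdb prompt, `p spread` prints a value, `n` runs the next line,
    # and `c` continues until the program ends or hits another breakpoint.
    # breakpoint()
    return [(s - lowest) / spread for s in scores]


result = normalize([2, 4, 6])
# Printing the type alongside the value shows what the function hands back.
print(type(result), result)
```

Pausing inside the function lets you check what came in (`scores`) and what is about to go out, which is exactly the inputs-and-outputs picture you're after. An IDE debugger does the same job with a nicer interface.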
Moving on, here are some other tips and tricks I've come up with in my last few years as a developer that might be useful for understanding others' code. First, read through the documentation. Documentation can be its own beast to work with - which is why I cover it in another post here - but a quick skim of what the author's talking about can be very useful for getting your mind where it needs to be to understand the code. Let this be a note to developers in general - we all find it so obnoxious when we hit code that's undocumented, yet we find documenting a pain. We can't have it both ways, so document your damn code!
Second, start with the main function. I know I mentioned that everybody gets taught differently, but one thing that seems fairly constant among most professionals is the use of a main function to launch the code when it's run from the command line (i.e. from a terminal or as part of a pipeline). This is one very essential way that code is different from books. In books, you'll read from left to right, top to bottom (unless you're reading and writing in a Middle Eastern language like Arabic or Hebrew, in which case it's right to left). Since the main function usually sits at the bottom of the file, you should be reading code from the bottom up (paying attention to which packages are imported at the top while you do it). All of the essential building blocks of the program will be described in the main function. You'll be able to divide the different parts of the program based on reading through the lines of main. Once those parts are divided, you can begin splitting the code into smaller and more manageable pieces. That's the whole theory of understanding code - it's all about breaking everything down into smaller and smaller pieces until you can manage it. The main function provides the best guide to the code - think of it like a table of contents.
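Here's a small sketch of what that 'table of contents' looks like in a Python script. The word-counting program and its function names are invented for illustration; the only standard convention shown is the `if __name__ == "__main__":` guard at the bottom, which is where I'd start reading.

```python
import sys


def load_words(text):
    """Helper: split raw text into lowercase words."""
    return text.lower().split()


def count_words(words):
    """Helper: map each word to how many times it appears."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def main(argv):
    # Reading main first gives the outline: get input, load it, count it, report it.
    text = " ".join(argv) if argv else "the cat saw the dog"
    words = load_words(text)
    counts = count_words(words)
    print(counts)
    return counts


if __name__ == "__main__":
    # Start reading here: this guard calls main() only when the file is run as a script.
    main(sys.argv[1:])
```

Reading from the bottom, the guard points you to `main`, and `main` names the two helpers in the order they're used. By the time you scroll up to `load_words` and `count_words`, you already know what role each one plays.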
Once you've identified this 'table of contents,' split up the code into helper functions and classes. Classes are the larger of the two, so I'll start with those. For each class, identify what classes it inherits from (if it even does inherit - a small enough codebase won't have a lot of overlap). Then identify the methods and attributes of the class. The attributes are the variables that every instance of the class gets when it's created. I find most initial attempts to describe classes a bit frustrating - they limit it to simple objects like characters in a video game (i.e. every character has X quality), which I haven't found all that useful to transfer over to advanced coding. I go into a better description of how to understand classes here, but for now, just understand that looking into classes and dividing up the attributes and methods is the best way to tackle them. Methods are just the helper functions that belong to the class and operate on its instances. As for the helper functions, they are the additional processes outside of any classes that are used by the main function to do something to whatever the inputs were (I've found they're usually used in preprocessing, but they're flexible enough to apply to anything).
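As a rough sketch of that class checklist (parents, methods, attributes), here's how you could ask Python itself using the standard `inspect` module. The `Reader` and `CsvReader` classes are hypothetical stand-ins for whatever class you're trying to understand.

```python
import inspect


class Reader:
    """Hypothetical base class."""

    def __init__(self, path):
        self.path = path  # attribute set on every instance

    def describe(self):
        return f"reads {self.path}"


class CsvReader(Reader):
    """Hypothetical subclass that adds one attribute and one method."""

    def __init__(self, path, delimiter=","):
        super().__init__(path)
        self.delimiter = delimiter

    def split_line(self, line):
        return line.split(self.delimiter)


# 1. What does it inherit from? The method resolution order lists the class and its parents.
parents = [cls.__name__ for cls in CsvReader.__mro__]

# 2. Which methods does it have, including inherited ones?
methods = [name for name, _ in inspect.getmembers(CsvReader, inspect.isfunction)]

# 3. Which attributes does an instance carry once it's created?
attributes = vars(CsvReader("data.csv"))

print(parents)
print(methods)
print(attributes)
```

The three printed lists line up with the three questions above: who the parents are, what the class can do, and what data each instance holds.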
Anytime you hit something you don't understand - what datatype a variable is, where a certain method came from, etc. - use the tools that your IDE gives you to clue you in. Find every instance of the method and see where it was defined. Use the IDE to strip the code bare. No part of the code is hidden from you - proper use of an IDE will reveal everything. This is a very short introduction to reading and understanding code. It gets difficult, especially when you start working with others' packages that run to thousands of lines of code. This is why programmers get paid a lot of money - it's a hard job and can get complicated. The good news is that if you regard reading code as a skill instead of a given (which is how most CS professors seem to treat it), you can stay positive and motivated to continue learning. This is a difficult thing, so don't let anybody convince you that you "need to have a certain mind for programming," or that "you're just not good at it." Anybody can be a programmer; they just need to practice.
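If you're stuck without an IDE, plain Python can answer the same questions. This is a minimal sketch using the standard `inspect` module, with the standard library's `json.loads` standing in for "a method you don't recognize"; swap in whatever function or variable is confusing you.

```python
import inspect
import json

value = json.loads('{"name": "Ada", "langs": ["Python", "C"]}')

# What datatype is this variable?
print(type(value).__name__)

# Which file was this function defined in?
print(inspect.getsourcefile(json.loads))

# What arguments does it take?
print(inspect.signature(json.loads))

# What does its definition look like? Only the first line is kept here.
first_line = inspect.getsource(json.loads).splitlines()[0]
print(first_line)
```

An IDE's "go to definition" feature does the same lookup with one click, but it's worth knowing the language can tell you this on its own.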
If you have your own strategies or input on how to read code - let me know in the comments below. I can always improve these posts.