Review: Johns Hopkins R Programming on Coursera

R Programming is the second course in the Johns Hopkins Data Science Specialization, following on from the much-maligned Data Scientist’s Toolbox.

Overall, though, I enjoyed this course. It was well paced for someone like me who is looking for something to complete in their spare time. Reading between the lines, you could take that to mean it is not particularly challenging.

I believe, however, that you have to think of this course as laying the foundation for the other modules in the Data Science Specialization. I’m not sure it really stands on its own as a course for learning R, as it is little more than an introduction. As a single resource, I have so far gotten more out of Machine Learning for Hackers while getting to grips with R, and I think those uninterested in the Specialization will likely find the same.

That said, within the realm of the Data Science Specialization it makes sense to look at the R language at a rudimentary level before getting into the more advanced concepts to come. The majority of the assessments are concerned with making sure students know how to load data from disk, construct data frames and loop over elements. Having taken Stanford University’s excellent Machine Learning offering, I can say this is entirely sensible. Although a fantastic course, it does throw you to the wolves with regard to the learning curve, making you work with Octave and construct linear regression models straight off the bat.
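
To give a sense of the level involved, here is a minimal sketch of that kind of task. It is not taken from the course; the file contents, column names and values are invented purely for illustration.

```r
# Hypothetical data, written to a temporary file so the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,reading", "1,2.5", "2,NA", "3,4.5"), csv_path)

# Load the data from disk into a data frame.
readings <- read.csv(csv_path)

# Loop over the rows, summing the readings that are not missing.
total <- 0
for (i in seq_len(nrow(readings))) {
  if (!is.na(readings$reading[i])) {
    total <- total + readings$reading[i]
  }
}

print(total)
```

In everyday R the loop would usually be replaced by sum(readings$reading, na.rm = TRUE), but the explicit version is closer to the looping being described here.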

Lectures

These are distinctly average, to be honest. That is not meant as a slight, but rather in the true sense that there is nothing particularly good or bad about them. The first week is heavily loaded with around an hour and a half’s worth of material, which winds down to half an hour by week four.

The content itself is largely talking over slides. Although this is fine for high-level concepts, I prefer a live or whiteboard demonstration for writing code constructs. I think watching someone write a function requires the student to pay attention, rather than just having the complete example presented to them on a slide.

Quizzes

These were fairly straightforward and, provided you have viewed the lectures, there shouldn’t be any great surprises here. I do like the way you can only take the quizzes three times, although I’d venture twice is sufficient. I had previously taken a course on Coursera that allowed unlimited attempts, which I think cheapens them significantly. There was a slight hiccup one week where the quiz featured material from the following week’s lectures. The faculty rectified this by giving students extended time and additional retries. Mistakes happen, and I think the course administrators remediated as best they could.

Assignments

I was quite surprised by the amount of work required for the assignments. They were fairly involved and went beyond the functions described in the lectures. This is a good thing, as it required you to solve the problems on your own with Google and other resources. This opinion was not universal, however; I could see from the course message board that a lot of people struggled with this. Certainly there was a lot of criticism that the course was “substandard” or “incomplete”.

Personally, I think the only criticism that could be levelled at the course is that the estimate of 3-5 hours a week should carry a caveat: expect it to be significantly larger if you are from a non-programming background.

I like the way you submit assignments through the IDE; this is tidy, and the live feedback you get on your submission via the test script suits this kind of MOOC offering. Another plus point for this course is that you are given a number of example outputs to test against first, which is not always the case with similar courses.

There are a few minor quality gripes, such as sample code referring to a function name from a previous question, which I assume is due to the content being copied and pasted. A little sloppy, but not disastrous.

Optional Assignment

You are given the option of completing an additional assignment over the four-week period. From this you can gain five extra points towards your grade; for me this helped to pad out the odd marks dropped in quizzes. The additional assignment itself is delivered by an R package called Swirl, which gives you many small problems to solve. Arguably this is a better way to get novices into R than throwing them in at the deep end, as the main assignments do.

Peer Assessment

Probably my biggest gripe lies here. It is not that I have a problem with peer assessment; in fact, with MOOCs this is the only opportunity you get for human feedback. Rather, it was the content chosen for assessment that I took issue with. A fair portion of the grade involves assessing whether or not scripts are “well documented”. The fact of the matter is that there is no “correct” way to document code. There are just a number of subjective opinions on how it should be done.

To have individuals marking each other on something so ambiguous is not sensible, in my opinion. Of the work I graded, some people commented to my taste and others didn’t. Some individuals felt the need to write a comment about every line of code. Personally I find this abhorrent, yet it is understandable how a novice may consider this to be “well documented”. Is documenting “return(n) #returns n” useful? I know what the return function does, but what is n? A guideline based on meaningful variable names would be better.
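
As a rough sketch of the difference, compare the two functions below. Both are hypothetical and not drawn from any graded submission; the names f, count_non_missing and non_missing_count are invented for illustration.

```r
# Hypothetical example: every line is commented, yet the meaning of n is unclear.
f <- function(x) {
  n <- sum(!is.na(x))  # sums not NA of x
  return(n)            # returns n
}

# Same behaviour: the names carry the meaning, so no line comments are needed.
count_non_missing <- function(values) {
  non_missing_count <- sum(!is.na(values))
  return(non_missing_count)
}
```

Both return the number of non-missing elements in a vector; only the second says so without the reader having to decode it.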

As a Java developer, I typically subscribe to the notion that if you have to write a line comment, your code probably smells; if it is not clear what is going on, you should rewrite it. Admittedly this relies on an understated benefit of Java: it is inherently descriptive, a trait that tends to get negatively branded as “verbose”. R, on the other hand, is not naturally descriptive, so I concede that a number of line comments are necessary to convey intent, even if that is not to my taste.

Onwards to Getting and Cleaning Data!
