Data Oriented Design and Programming – Truly understanding data analysis programming in R and pandas

Data Analysis Programming – it is not object oriented or procedural programming!

To understand a data analysis language or library such as R or pandas the right way, you need to shift your thought process. For example, programmers with a background in languages like C++ or Java may find R similar to the C family, but that is deceptive. The syntax does look familiar; in terms of language semantics, however, it is a different world altogether.
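
To make the gap between syntax and semantics a little more concrete, here is a minimal sketch in pandas. The `orders` table, its column names and its numbers are invented purely for illustration. It computes the same result twice: first with the row-by-row loop a C or Java programmer would reach for, then as a single operation on whole columns, which is closer to how R and pandas are meant to be used.

```python
import pandas as pd

# Hypothetical data, invented for this illustration only.
orders = pd.DataFrame({
    "price": [10.0, 20.0, 5.0],
    "quantity": [2, 1, 3],
})

# C-family habit: visit each row and accumulate the result by hand.
totals_loop = []
for i in range(len(orders)):
    totals_loop.append(orders["price"].iloc[i] * orders["quantity"].iloc[i])

# Data-oriented habit: state the operation once, on whole columns.
orders["total"] = orders["price"] * orders["quantity"]

# Both approaches produce the same values.
assert orders["total"].tolist() == totals_loop
```

Both versions run, and the loop even looks like perfectly reasonable code. The point is only that the second form treats the column, not the individual value, as the unit of work.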

Learning R as just another C-family language will backfire once you start using it to solve real-life, practical problems. The cookbook approach to learning R will not give you much confidence in designing complex programs either. It will not get you very far!

Data Oriented Design and Programming

Just as mastering Java or C++ requires knowing the object oriented programming paradigm, mastering data analysis systems like R and pandas requires familiarizing yourself with the data oriented design (DOD) and data oriented programming (DOP) paradigms.

Even a passing familiarity with the DOD and DOP paradigms will significantly reduce your thinking friction. You will start to see the right approaches for designing data engineering solutions.

In a series of posts on this site I will attempt to walk you through the DOD and DOP paradigms. We will primarily use the R programming language to illustrate the points, but things should not be much different for pandas, Python’s data analysis library. So stay tuned!