R, the open source statistical software, is rapidly gaining popularity in the world of analytics. Over the last few years, companies of all sizes have adopted R as their analytical engine. Because R is open source and analytics is a booming industry, many product companies have made their products able to integrate with R. For example, we can pass data from Tableau to R, run some analysis in R, and send the result back to Tableau for visualization.
Different R Products
With four R products on the market (including open source R, or CRAN R), of which three are free and one (Microsoft R Server) is licensed, one might be confused about the differences among them and end up not using the most suitable R product.
Because the Microsoft products are comparatively new, not much documentation is available on the net beyond what is on the Microsoft websites. Though the products are described very well there, I felt the need to summarize them in a comparative view of the four products.
Comparison between Different R Products
Microsoft R Open
Microsoft R Client
MRC is free software for Windows in which we can use the above proprietary functions. The names of these functions start with the prefix ‘rx’. For example, glm() is the CRAN R function to fit a generalized linear model, while rxGlm() does the same thing but uses parallelization. In MRC, however, parallelization can go only up to two threads.
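The glm() versus rxGlm() contrast can be sketched as follows. This is a minimal illustration, not a benchmark: the built-in mtcars data set and the model formula are arbitrary choices made here, and the rxGlm() call is guarded so that the script still runs on CRAN R or MRO, where the RevoScaleR package that provides the ‘rx’ functions is not installed.

```r
# Fit a logistic regression with the CRAN R function glm().
# mtcars and the formula am ~ wt are illustrative choices only.
fit_cran <- glm(am ~ wt, data = mtcars, family = binomial())
print(coef(fit_cran))

# rxGlm() lives in the proprietary RevoScaleR package, which ships with
# Microsoft R Client and Microsoft R Server but not with CRAN R or MRO.
# The guard below skips this part when the package is unavailable.
if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  fit_rx <- RevoScaleR::rxGlm(am ~ wt, data = mtcars, family = binomial())
  print(fit_rx$coefficients)
} else {
  message("RevoScaleR not found; skipping the rxGlm() comparison.")
}
```

On a data set this small the two fits should agree closely and any difference in speed will be negligible; the parallelization only starts to matter as the data grows.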
Microsoft R Server
This licensed product comes with support services, and we can also run R code as a standalone web service. It is possible to operationalize the MRS engine for multi-server topologies with clustered web nodes and compute nodes using the DeployR package.
A Diagrammatic Representation
Which Version of R Suits You
Below I summarize the above and try to identify the best R product for different scenarios. Hopefully this will help you decide.
|Scenario|The Suggested R Product|
|---|---|
|Your data is small enough to fit in your machine's memory. You will primarily need to run some ad hoc analysis, which includes more statistical operations.|There is no harm in going with MRO or MRC, but make your life simple by just using CRAN R.|
|Your data is small enough to fit in your machine's memory. You primarily need to do statistical modeling. Your process needs to be repeated over time and you need to ensure consistent results every time.|You are better off with MRO, as it handles version control for the packages you use. It is not necessary to install Intel MKL.|
|Your data is small enough to fit in your machine's memory. Your analysis includes some higher order matrix manipulations.|MRO is the best bet, with the Intel MKL library installed. There is no harm in using MRC, but you do not need it.|
|Your data is reasonably big, but can fit in your memory. You need to do some operations for which MS proprietary functions are available.|MRC is the product for you if you are working in a Windows environment. Otherwise you need MRS or, if you want a free version, CRAN R or MRO.|
|Your data is reasonably big, but can fit in your memory. No MS proprietary function is available for the operations you need to perform.|You are better off using MRO or CRAN R. The speed and capacity of MRC or MRS will be the same.|
|You are using huge data that cannot fit in memory (around 25% of the total RAM used). You need to do some operations for which MS proprietary functions are available.|MRS, in a clustered server environment, is the only option here. You can use MRC as a front end platform to connect to MRS.|
|Your data is huge and resides in a database (SQL or Teradata or Hadoop).|Running MRS in-database will be a good option.|
|You want to develop an analytical engine as a standalone service, where people will upload data and do various analyses.|MRS is the best option here.|
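Two of the questions in the table, whether the data fits in memory and whether the workload is heavy on matrix operations, can be checked with a few lines of base R. The sketch below is only a rough aid: the matrix size and all variable names are arbitrary choices made here for illustration. Running the same script under CRAN R and under MRO with Intel MKL installed should give a feel for whether the MKL library matters for your workload.

```r
# Illustrative data: a dense 500 x 500 numeric matrix (size chosen arbitrarily).
set.seed(42)
n <- 500
m <- matrix(rnorm(n * n), nrow = n)

# Memory footprint of the object in megabytes, to compare with available RAM.
size_mb <- as.numeric(object.size(m)) / 1024^2

# Time two dense linear algebra operations. These are the kind of operations
# that a multithreaded math library such as Intel MKL is meant to speed up.
t_mult  <- system.time(p <- crossprod(m))["elapsed"]
t_solve <- system.time(s <- solve(m))["elapsed"]

cat(sprintf("Object size: %.2f MB\n", size_mb))
cat(sprintf("crossprod:   %.3f s\n", t_mult))
cat(sprintf("solve:       %.3f s\n", t_solve))
```

Timings depend entirely on the machine and the R build, so treat them as relative numbers for comparing products on the same hardware rather than as absolute figures.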