I have a dataset containing 10 million records. My R code estimates the coefficient of a model using 3000 iterations. Running it on this data is very time-consuming, and sometimes my system hangs. I am using 64-bit Windows 8.1 with 4 GB of RAM. To reduce the running time, I want to integrate R with Python. I have moderate knowledge of R, but I am completely new to Python. I found that rpy2 can be used to call R from Python (I have Python 3.4.1). I did the following:
import rpy2
import rpy2.robjects as robjects
But it gives the following error:
Traceback (most recent call last):
  File "C:\Python34\lib\site-packages\rpy2\rinterface\__init__.py", line 29, in
    0, win32con.KEY_QUERY_VALUE )
pywintypes.error: (2, 'RegOpenKeyEx', 'The system cannot find the file specified.')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "", line 1, in
    import rpy2.robjects as robjects
  File "C:\Python34\lib\site-packages\rpy2\robjects\__init__.py", line 15, in
    import rpy2.rinterface as rinterface
  File "C:\Python34\lib\site-packages\rpy2\rinterface\__init__.py", line 32, in
    except ImportError(ie):
NameError: name 'ie' is not defined
I cannot understand why I am getting this error. How can I overcome it?
But if I do the following, it works:
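One thing that might be worth trying, as an untested sketch: the first traceback fails inside a Windows registry lookup (RegOpenKeyEx), which looks like rpy2 searching for the R installation. My assumption, not confirmed, is that this lookup is skipped when the R_HOME environment variable is already set before rpy2 is imported. The helper name ensure_r_home and the path in the comment are hypothetical; the real path depends on where R is installed.

```python
import os


def ensure_r_home(r_home, env=None):
    """Set R_HOME if it is not already set and return the value in effect.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    if env is None:
        env = os.environ
    # setdefault leaves an existing R_HOME untouched.
    return env.setdefault("R_HOME", r_home)


# Hypothetical usage (the path is an assumption, adjust to the local install):
# ensure_r_home(r"C:\Program Files\R\R-3.1.1")
# import rpy2.robjects as robjects
```

The call has to happen before the rpy2 import, since the lookup appears to run at import time.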
from rpy2 import *
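A possible reason why this line does not fail: a star-import of a package only runs the package's own __init__.py and does not load its submodules, so I suspect rpy2.rinterface (where the error occurs) is simply never imported here. The sketch below illustrates that behaviour with a throwaway package; demo_pkg and broken are hypothetical names, not part of rpy2.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: demo_pkg/__init__.py is empty,
# demo_pkg/broken.py raises as soon as it is imported.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "broken.py"), "w") as f:
    f.write("raise RuntimeError('submodule failed to load')\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# The star-import only executes demo_pkg/__init__.py, so it succeeds.
namespace = {}
exec("from demo_pkg import *", namespace)
star_import_ok = True

# Importing the submodule executes broken.py, which raises.
try:
    importlib.import_module("demo_pkg.broken")
    submodule_error = None
except RuntimeError as exc:
    submodule_error = str(exc)

print(star_import_ok, submodule_error)
```

If that is what happens with rpy2, the star-import working would not mean that rpy2 is usable, only that the failing submodule was not touched.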
It would be very helpful if someone could explain in detail how to call R from Python and suggest a solution to my problem. Any other suggestions for processing large data in R in less time would also be appreciated. Thanks in advance!