I'm running the code below. It reads pairs of stocks from a CSV file (row 1 holds stock 1 and stock 2, row 2 holds another stock 1 and stock 2, and so on; the two stocks are different in each row). I then pull data from Yahoo for each pair, calculate the returns of the stocks, and check whether the distance (difference in returns) between the two stocks breaches some threshold; if so, I return 1. However, I am running into the error below and I'm unable to figure out why, since I know that the key ADP_PAYX, which corresponds to the first stock pair in the CSV file, does in fact exist.

Distancefunc(self, tickers, begdate, enddate)
    111             data = Returns(ticker,begdate,enddate)
    112             key = ticker[0]+'_'+ticker[1]
--> 113             R1 = data[key]['Returns'][0]
    114             R2 = data[key]['Returns'][1]
    115             distance = sum[(R1-R2)^2]

KeyError: 'ADP_PAYX' 




from datetime import datetime
import pytz
#import zipline as zp
import csv
import pandas as pd
import pandas.io.data as web
import numpy as np
from matplotlib.pyplot import *
from matplotlib.finance import quotes_historical_yahoo


def Dataretriever():
        Pairs = []
        f1=open('C:\Users\Pairs_0420.csv') #Enter the location of the file
        csvdata= csv.reader(f1)
        for row in csvdata:          #reading tickers from the csv file
            Pairs.append(row)
        return Pairs

tickers = Dataretriever()
tickersasstrings = map(str, tickers)

def PricePort(tickers,begdate,enddate):
    """
        Returns historical adjusted prices of a portfolio of stocks.
        tickers=pairsd   """
    final=pd.read_csv('http://chart.yahoo.com/table.csv?s=^GSPC',usecols=[0,6],index_col=0)
    final.columns=['^GSPC']
    data = {}
    for ticker in tickers:
        #print ticker        
        key = ticker[0]+'_'+ticker[1]
        data1 = quotes_historical_yahoo(ticker[0], begdate, enddate,asobject=True, adjusted=True)
        data2 = quotes_historical_yahoo(ticker[1], begdate, enddate,asobject=True, adjusted=True)
        #url1 = 'http://chart.yahoo.com/table.csv?s=ttt'.replace('ttt',ticker[0])
        data[key] = {'Data': (data1,data2)}
    return data    

def Returns(tickers,begdate,enddate):
    begdate=(2014,1,1)
    enddate=(2014,6,1)    
    p = PricePort(tickers,begdate,enddate)
    for ticker in tickers:
        key = ticker[0]+'_'+ticker[1]
        data1 = p[key]['Data'][0]
        data2 = p[key]['Data'][1]
        ret1 = (data1.close[1:] - data1.close[:-1])/data1.close[1:]
        ret2 = (data2.close[1:] - data2.close[:-1])/data2.close[1:]
        p[key]['Returns'] = (ret1,ret2)

    return p

class ThresholdClass():    
    #constructor
    def __init__(self, Pairs,begdatae,enddate):
        self.Pairs = Pairs
        self.begdate = begdate
        self.enddate = enddate

    def Distancefunc(self, tickers, begdate, enddate):
        for ticker in tickers:
            data = Returns(ticker,begdate,enddate)
            key = ticker[0]+'_'+ticker[1]
            R1 = data[key]['Returns'][0]
            R2 = data[key]['Returns'][1]
            distance = sum[(R1-R2)^2]
            return distance

    def MeanofPairs(self, tickers, begdate, enddate):
        sum = self.Distancefunc(tickers, begdate, enddate)
        mean = np.mean(sum)
        return mean

    def StandardDeviation(self, tickers, begdate, enddate):
        sum = self.Distancefunc(tickers, begdate, enddate)
        standard_dev = np.std(sum)
        return standard_dev 

    def ThresholdandnewsChecker(self, tickers, begdate, enddate):
        threshold = self.MeanofPairs(tickers, begdate, enddate) + 2*self.StandardDeviation(tickers, begdate, enddate)
        if (self.Distancefunc(tickers, begdate, enddate) > threshold):
            news = self.newsfunc(binaryfromnews)
            return 1       

begdate=(2013,1,1)
enddate=(2013,12,31)      
Threshold_Class  = ThresholdClass(tickers[:1],begdate,enddate)   
Threshold_Class.ThresholdandnewsChecker(tickers[:1], begdate, enddate)

Edit: Adding print p just before the line data1 = p[key]['Data'][0] showed that the keys are 'A_D' and 'P_A' instead of the required 'ADP_PAYX'. This is what I am looking to resolve at this point. Thanks.

Jojo

1 Answer

In the method Distancefunc change this:

    for ticker in tickers:
        data = Returns(ticker,begdate,enddate)
        ...

to this:

    data = Returns(tickers,begdate,enddate)
    for ticker in tickers:
        ...

Note the change from ticker to tickers in the argument of Returns. The reason is that Returns is designed to loop over all pairs, but currently you are passing it a single pair (a list of two ticker strings) instead of a list of pairs. Its inner loop therefore iterates over the strings 'ADP' and 'PAYX' themselves, and ticker[0]+'_'+ticker[1] joins the first and second letters of each string. That's why the keys in the data dictionary are A_D and P_A rather than ADP_PAYX.
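To see the effect in isolation, here is a minimal sketch that reproduces only the key construction, without any Yahoo download. The helper build_keys is hypothetical (it is not part of the original code) and simply repeats the expression ticker[0]+'_'+ticker[1] used in PricePort and Returns; the pairs list assumes the first CSV row is ADP, PAYX, as stated in the question.

```python
def build_keys(tickers):
    """Hypothetical helper: repeats the key expression from PricePort/Returns."""
    return [ticker[0] + '_' + ticker[1] for ticker in tickers]


# Assumed content of the first CSV row: one pair of ticker strings.
pairs = [['ADP', 'PAYX']]

# What the original loop does: it passes a single pair, so the inner loop
# walks over the strings 'ADP' and 'PAYX' and indexes their characters.
wrong_keys = build_keys(pairs[0])

# What the fix does: it passes the list of pairs, so each element is a pair
# and the indices select whole ticker names.
right_keys = build_keys(pairs)

print(wrong_keys)   # ['A_D', 'P_A']
print(right_keys)   # ['ADP_PAYX']
```

Because a Python string can be indexed like a list, the wrong call does not fail where the mistake is made; it only surfaces later as the KeyError when the expected key is looked up.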

Olaf