I am working with tick data on some of the more liquid futures and options, which get millions of ticks every single day. Can I apply the usual statistical techniques (correlation, regression, cointegration, etc.) to tick data, given how noisy it is? If not, can someone please suggest other techniques that would let me do these things?

+-----+---------+---------+----------+------------+
| Seq |   Bid   |   Ask   |   Mid    |     IV     |
+-----+---------+---------+----------+------------+
|   1 |  25.25  |  25.40  |  25.325  |  0.325569  |
|   2 |  25.25  |  25.45  |  25.350  |  0.325769  |
|   3 |  25.25  |  25.30  |  25.275  |  0.325959  |
|   4 |  25.25  |  25.30  |  25.275  |  0.325909  |
|   5 |  25.25  |  25.45  |  25.350  |  0.325769  |
|   6 |  25.25  |  25.35  |  25.300  |  0.325779  |
|   7 |  25.20  |  25.35  |  25.275  |  0.325799  |
|   8 |  25.25  |  25.35  |  25.300  |  0.325639  |
|   9 |  25.25  |  25.45  |  25.350  |  0.325799  |
|  10 |  25.30  |  25.45  |  25.375  |  0.325639  |
|  11 |  25.30  |  25.50  |  25.400  |  0.325509  |
|  12 |  25.30  |  25.45  |  25.375  |  0.325619  |
|  13 |  25.30  |  25.45  |  25.375  |  0.325669  |
|  14 |  25.30  |  25.45  |  25.375  |  0.325699  |
|  15 |  25.30  |  25.35  |  25.325  |  0.325629  |
|  16 |  25.35  |  25.40  |  25.375  |  0.325679  |
|  17 |  25.35  |  25.40  |  25.375  |  0.325849  |
|  18 |  25.35  |  25.40  |  25.375  |  0.325799  |
|  19 |  25.35  |  25.40  |  25.375  |  0.325999  |
|  20 |  25.35  |  25.40  |  25.375  |  0.325989  |
|  21 |  25.40  |  25.45  |  25.425  |  0.325839  |
|  22 |  25.45  |  25.60  |  25.525  |  0.325969  |
|  23 |  25.45  |  25.55  |  25.500  |  0.325899  |
|  24 |  25.45  |  25.65  |  25.550  |  0.325789  |
|  25 |  25.45  |  25.50  |  25.475  |  0.325879  |
|  26 |  25.40  |  25.50  |  25.450  |  0.325849  |
|  27 |  25.40  |  25.50  |  25.450  |  0.325869  |
|  28 |  25.45  |  25.50  |  25.475  |  0.325689  |
|  29 |  25.45  |  25.60  |  25.525  |  0.325599  |
|  30 |  25.45  |  25.60  |  25.525  |  0.325649  |
+-----+---------+---------+----------+------------+

PS: Sorry if I posted the data in the wrong format. It took me a long time to find a good way to do this, and this is all I could find: https://meta.stackexchange.com/questions/156729/how-to-display-data-in-table-structure-in-stack-overflow

Regards

nimbus3000
  • I'm not sure what you want to do. What do you want to correlate? What do you want to regress? –  Jan 04 '17 at 08:16
  • I actually do all of these overall, but right now I'm looking to run a cross-correlation and a Granger causality test on the data set. – nimbus3000 Jan 04 '17 at 08:18
  • Perhaps you can edit your question to provide us with a sample of your data, specifically indicating what type of noise you wish to screen out or, alternatively, which type of signal to detect. For example, in my last project, I was detecting heart rate variability from monitors strapped to patients for 10 days. During that time, the patient's heart beat 1,080,000 times and I was isolating waveforms two beats in duration. –  Jan 04 '17 at 08:25
  • Unfortunately, I am not authorized to share actual data. I'm searching online to see if I can find a sample dataset in the public domain; otherwise I will create a hypothetical dataset that helps me get my point across. – nimbus3000 Jan 04 '17 at 08:55
  • Thanks. The more specific the question, the more meaningful the answer. –  Jan 04 '17 at 09:04
  • So, above is a sample of bid and ask quotes for a hypothetical option on a stock. As we can see, from row 3 to row 4 the mid doesn't change but the IV does. It is unlikely but possible that the mid changes while the IV doesn't. The first case is very common and can happen (hundreds of) thousands of times a day. Now I want to see whether there is a relationship between IV and mid: is there Granger causality between the two, or some lead-lag relationship? Can I use standard econometrics (as used on GDP and similar low-frequency data) here? – nimbus3000 Jan 04 '17 at 10:58
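
To make the lead-lag question in the last comment concrete, here is a minimal sketch of a lagged cross-correlation between changes in mid and changes in IV, using the 30 sample rows from the table. Several things here are assumptions and not part of the original post: the helper name `lagged_corr`, the choice of `max_lag = 3`, working in tick time (one observation per row, ignoring clock time), and differencing both series before correlating. It is not a Granger causality test, and 30 hypothetical rows are far too few to support any conclusion; the code only shows the mechanics.

```python
import numpy as np

# Mid and IV columns copied from the sample table in the question.
mid = np.array([
    25.325, 25.350, 25.275, 25.275, 25.350, 25.300, 25.275, 25.300, 25.350, 25.375,
    25.400, 25.375, 25.375, 25.375, 25.325, 25.375, 25.375, 25.375, 25.375, 25.375,
    25.425, 25.525, 25.500, 25.550, 25.475, 25.450, 25.450, 25.475, 25.525, 25.525,
])
iv = np.array([
    0.325569, 0.325769, 0.325959, 0.325909, 0.325769, 0.325779, 0.325799, 0.325639, 0.325799, 0.325639,
    0.325509, 0.325619, 0.325669, 0.325699, 0.325629, 0.325679, 0.325849, 0.325799, 0.325999, 0.325989,
    0.325839, 0.325969, 0.325899, 0.325789, 0.325879, 0.325849, 0.325869, 0.325689, 0.325599, 0.325649,
])


def lagged_corr(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] (hypothetical helper).

    lag > 0 pairs x with later values of y (x leads y);
    lag < 0 pairs x with earlier values of y (y leads x).
    """
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return float(np.corrcoef(x, y)[0, 1])


# Assumption: correlate tick-to-tick changes, not levels.
d_mid = np.diff(mid)
d_iv = np.diff(iv)

max_lag = 3  # assumed; a real study would choose this from the data
profile = {lag: lagged_corr(d_mid, d_iv, lag) for lag in range(-max_lag, max_lag + 1)}

for lag, rho in profile.items():
    print(f"lag {lag:+d}: corr(d_mid[t], d_iv[t{lag:+d}]) = {rho:.3f}")
```

A correlation that is clearly larger at positive lags than at negative ones would hint that mid changes tend to precede IV changes, and the reverse for negative lags. On real tick data the many zero changes in the mid (as in rows 3 to 4) would have to be handled before reading anything into such a profile.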

0 Answers