
Suppose I have a dataset containing mean temperatures for each month. In my region this means negative Celsius values in winter months and positive values in other months.

I need to invert this data: have no negative values at all, with high values for the winter months (which originally have low values) and low values for the summer months (which originally have high Celsius values).

So I need some sort of transformation. The reason is that I want to use these data to compute a Pearson correlation with other data.

Stephan Kolassa
  • Why do you need to transform your data in order to calculate a correlation? – Stephan Kolassa May 03 '16 at 15:02
  • It sounds like you have "data" which is all screwed up, and for which you have no idea what the relation, if any, is to what the data supposedly represents. Shouldn't you attempt to track down why your data is screwed up, or get correct data, rather than making up a transformation to make the data not look so obviously screwed up? – Mark L. Stone May 03 '16 at 16:45
  • @Mark I don't understand your comment. The OP is simply describing temperature data measured in C, which has negative values for winter and positive values otherwise. Nothing is screwed up with the data that I can discern (though the description is a little confusing). The problem is that the OP (mistakenly) seems to think that you need to have positive values to compute a correlation. The reason for wanting to flip the sign of the correlation is also not made clear but I expect stems from another misunderstanding. – Glen_b May 04 '16 at 00:04
  • @Glen_b , F vs. C was 1st thing I thought of. But that doesn't explain "have high values for winter months" and "low values for summer months". Maybe I am misinterpreting what he meant and perhaps the OP's English is not so good, but it didn't just sound like F vs. C to me. The low values for summer months was the clincher for me. Summer numbers in F would be higher than in C – Mark L. Stone May 04 '16 at 00:10
  • @Mark I'm sure the intent was that all values are in Celsius, but I can see how you could read it otherwise – Glen_b May 04 '16 at 00:19
  • I want to use google correlate http://www.google.com/trends/correlate/nnsearch.pdf – Rigolletto May 04 '16 at 06:05
  • and from my understanding it needs positive values. Also, the values I am looking for correlations with are the winter temperatures, not the summer temperatures that have higher values in Celsius – Rigolletto May 04 '16 at 06:06

1 Answer

  1. There's no need to have all-positive values to compute a correlation. The calculation itself subtracts the mean from each variable, so any variable that actually varies will have both positive and negative values at that point, whatever signs it had beforehand. For example, imagine I add 100 to all my points, shifting my mean up by 100 as well; the correlation calculation then subtracts the mean, leaving me with the same mean-corrected values I'd have started with if I'd done nothing. Indeed, correlation is not affected by linear transformation, aside from possibly its sign (leaving aside the degenerate case of multiplying by 0).

  2. You can flip the sign of a correlation by multiplying all the values in one of the two variables by a negative number (-1 will do). This gives the "high values for winter months" you mention, but it's not at all clear what you need to do this for. Why would a negative correlation on the original variable carry any less information than a positive correlation with a "flipped" temperature? The latter seems more likely to lead to confusion in a typical audience, rather than enlightenment.

    Is there some technical reason you need the temperature variable to be reversed? (Does something require the correlation to be positive?) It's not at all clear from your question.
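
Both points can be checked numerically. The following is only a minimal sketch: the monthly temperatures and the second series are made-up illustrative numbers, and `pearson` is a small helper written for this example, not part of any particular tool. It shows that adding a constant leaves the correlation unchanged, while multiplying by -1 only flips its sign.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Mean-corrected values: this is the step that makes shifts irrelevant.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical monthly mean temperatures in Celsius (Jan..Dec), made up for illustration.
temps_c = [-8.0, -6.5, -1.0, 6.0, 12.5, 17.0, 19.5, 18.5, 13.0, 7.0, 1.0, -5.0]
# Hypothetical second series that tends to be high in cold months (also made up).
other = [95, 90, 70, 50, 35, 20, 15, 18, 30, 52, 75, 92]

r_original = pearson(temps_c, other)
r_shifted = pearson([t + 100 for t in temps_c], other)  # all values now positive
r_negated = pearson([-t for t in temps_c], other)       # "high values for winter"

print("original:", r_original)
print("shifted by +100:", r_shifted)
print("multiplied by -1:", r_negated)
```

With these numbers the original and shifted correlations agree up to floating-point rounding, and the negated series gives the same magnitude with the opposite sign, so negating is the only transformation needed to turn a negative correlation into a positive one.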

Glen_b
  • yeah, there is a specific tool - google correlate – Rigolletto May 04 '16 at 06:06
  • The Google correlate tutorial shows an example which has both negative and positive numbers, so you don't need to worry about that - but yes you're correct that this tool looks for positive correlations and it explains that to find variables that will be negatively correlated with your data, simply multiply your data by -1 -- which is the same as described in the second paragraph of my answer. – Glen_b May 04 '16 at 13:48
  • What if I want to do the same for revenue data? I would like to search for correlations when our revenues are low – Rigolletto May 05 '16 at 06:19
  • I don't understand what you think the problem is. If you expect two variables to move the same direction, just use your numbers. If they move the opposite direction, negate your numbers. It won't matter at all if they go negative – Glen_b May 05 '16 at 06:20
  • It's really an additional question. Now I have company revenue data. In some months revenue is lower than in others. I now want to use the Google Correlate tool to search for correlations with months that have lower revenue. So I was thinking that I need to switch the values somehow. No revenues are negative, just some are lower. – Rigolletto May 05 '16 at 06:26
  • If it's a new question you should probably post a new question (which also explains the limitations of the tool you want to use). It's not quite clear to me what you mean by "correlation with months that have lower revenue" ... if you were to plot whatever the other variable is against revenue, what would the plot look like? – Glen_b May 05 '16 at 06:30