Tuesday, May 17, 2016

Seeing things


As I noted before, I'm trying to develop the VBA code to "lag" data on a graph in Excel. But after working for a week and a half on it, I got so bogged down that I decided to set the whole thing aside and start fresh. Maybe what I've learned so far will lead me down a better path this time.

Part of the trouble I'm having is that I have to see what I want to do before I actually do it. To help me see what I want, I decided to write about it. So here we are.

At the moment I have this graph:


which is based on this made-up data:


There's another column on the sheet, for "Lagged" data. When I put numbers there I get a blue line on the graph. Right now that column is blank, so there is no blue line. I want to take the gray line numbers, "lag" them, and put them in the blank column. The graph will then have a blue line showing the lagged data.

But it's easier said than done.

I also have a table of Lag Dates:


This is where I describe the lagging I want to do. For example, the high point of the red line on the graph is at 2007. The high point of the gray line is 2006. I want to lag the 2006 data one year to make the blue line peak in 2007 with the red. Likewise, I lag the low gray point from 2010 to 2011 for the blue line.

I did a couple other things to make it interesting. The table doesn't say anything about lagging after 2011, even though the data goes out to 2015. (And yes, the lag-able data stop at 2015 even though the graph continues on to 2016.)

And in the first row of the table, the lag-able data begins in 2002, but I lag it back to 2001. I can picture that. But what would happen if I did it, and there was already data for 2001? The original 2001 data would get pushed off the graph. That's how I see it, anyway. And like I said, I have to see it before I can make it happen.


According to my table of lag dates, I want to take the value for 2002 and move it back to 2001. And I want to take the value from 2006 and move it forward to 2007. So data from the five-year period 2002-2006 has to stretch out to cover the seven-year period 2001-2007. I have to figure seven equally-spaced values that give me a line that looks like the line I get from from the five "given" values I started with. It's a little messy, but I did have that working.

According to that table also, five values for 2006-2010 move as a group to the five years 2007-2011. There's no stretching this time. I can just copy the values from the "given" column to the "lagged" column.

Now we're at the end of the Lag Dates table. But we are still in the midst of the data. How do I handle the remaining data? Maybe I should shift all the remaining values, lagging them one year as 2010 is lagged to 2011.

Or maybe I should assume that the last given value does not move, because the lag table does not tell me to move it. Then I will have to squeeze the 2010-2015 data into the years 2011-2015. That means I have to calculate values again. Yeah, this is the way to go, even though it is more work.

I can't justify lagging data when a lag is not specified in the Lag Dates table. If I want to move the 2010-2015 data to 2011-2016, all I have to do is add a couple dates to the table. So I can do it that way if I want. But if I don't specify any lag, the code shouldn't create any lag. The code shouldn't take the initiative.

At least, that's how I see it.


The interesting thing doesn't happen until after I get the lag code working. At that point I'll start using actual econ data and I may end up with a dozen or more lines in my Lag Dates table.

The interesting thing will be to make a graph of the lag times, to see how the lag changes over time. I expect to find that some lags are related to the level of private debt. I expect to find that a changing lag describes the changing metabolism of the economy. I expect to find all kinds of fascinating things.

Time to go write code.

No comments: