Friday 24 June 2016

Getting The Factory Working

After a less than productive week, with two drives failing on me and a lot of Windows reinstalling as a result, I have finally started making some real progress.

Welcome TimeSeries Factory

With the factory class mostly there but not actually working, I needed to spend some time debugging.
The issue is that there is a lot of rather complex code in the factory, not least because I stole… erm… borrowed code from the Map Factory, and it is a rather complex process to follow.

Essentially the factory takes a series of arguments such as filepaths, data/header pairs, folders and/or the kitchen sink, and returns either a single time series or a list of time series.
To do this it has to evaluate each input, then call the relevant source class constructor.
In contrast to (image) maps, lightcurves and time series data are often stored in a diverse array of filetypes, some of which don’t make it possible to auto-detect the instrument source of the file.
So in this case we had to declare the source manually and change the Map Factory code a bit so it doesn’t try to read files it can’t.
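To make the dispatch idea concrete, here is a minimal sketch of a factory that walks its arguments and picks a source class from an explicitly declared source. It is not the actual SunPy factory: the names `GenericSeries`, `GOESSeries`, `SOURCES`, `read_file` and `series_factory` are all hypothetical, and the file reader is a placeholder that does no real parsing.

```python
import os


class GenericSeries:
    """Hypothetical fallback container holding data and a header."""

    def __init__(self, data, header):
        self.data = data
        self.header = header


class GOESSeries(GenericSeries):
    """Hypothetical instrument-specific source class."""


# Hypothetical registry mapping a declared source name to its class.
SOURCES = {"goes": GOESSeries}


def read_file(path):
    """Placeholder reader: a real one would parse the file contents."""
    return [], {"filepath": path}


def series_factory(*args, source=None):
    """Evaluate each input and build one series per file or data/header pair."""
    # The source is declared by the caller instead of being auto-detected.
    cls = SOURCES.get(source.lower(), GenericSeries) if source else GenericSeries
    results = []
    i = 0
    while i < len(args):
        arg = args[i]
        if isinstance(arg, str) and os.path.isdir(arg):
            # A folder: build one series for every file inside it.
            for name in sorted(os.listdir(arg)):
                full = os.path.join(arg, name)
                if os.path.isfile(full):
                    results.append(cls(*read_file(full)))
        elif isinstance(arg, str):
            # A single filepath.
            results.append(cls(*read_file(arg)))
        elif i + 1 < len(args) and isinstance(args[i + 1], dict):
            # A data/header pair consumes two arguments.
            results.append(cls(arg, args[i + 1]))
            i += 1
        else:
            raise ValueError(f"Unrecognised input: {arg!r}")
        i += 1
    # One input gives a single series, several give a list.
    return results[0] if len(results) == 1 else results
```

Passing `source="GOES"` with a data/header pair returns a single `GOESSeries`, while passing two filepaths without a source returns a list of two `GenericSeries` objects.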

With this done I still had issues, so I resorted to simple debugging: throwing in print statements at just about every other line of code to figure out what each variable holds and where my code stops.
It was messy, but it worked, and hopefully by the time you read this the code will be clean and free of print statements!


Making Source Classes Work

So once the factory was running correctly and able to select the appropriate source class, I needed to tweak the instrument source class files so that they read in the files correctly.
Again a few bugs emerged, but overall this was pretty simple, as most of the work was already done in the original LightCurve class.

The result is that we can now generate a TimeSeries from any of the listed sources in a tiny amount of code. For example, using the sample data:
  >>> import sunpy.timeseries
  >>> import sunpy.data.sample
  >>> ts_goes = sunpy.timeseries.TimeSeries(sunpy.data.sample.GOES_LIGHTCURVE, source='GOES')

And because I had the LightCurve code, I could also use the old peek() method with no additional changes:
  >>> ts_goes.peek()


Meeting


With this working, I showed my advisers and we went through some more of the general TimeSeries functionality, including how it can concatenate, truncate and resample data. Truncating and resampling look like this:
  >>> ts_goes_trunc = ts_goes.truncate('2012-06-01 05:00','2012-06-01 06:30')
  >>> ts_goes_trunc.peek()

  >>> downsampled = ts_goes_trunc.resample('10T', 'mean')
  >>> downsampled.peek()

  >>> upsampled = downsampled.resample('1T', 'ffill')
  >>> upsampled.peek()

And with (almost) no bugs along the way I’m left feeling pretty happy that the project is moving in the right direction.
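For readers without the sample data to hand, the truncate and resample steps above can be approximated in plain pandas. This is only a sketch under stated assumptions: the `flux` column and its values are synthetic, and it assumes the TimeSeries operations behave like the corresponding pandas calls on a time-indexed DataFrame.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute data standing in for a real lightcurve (assumption).
index = pd.date_range("2012-06-01 05:00", periods=120, freq="min")
df = pd.DataFrame({"flux": np.arange(120, dtype=float)}, index=index)

# Truncate: keep only the rows inside a time window (both ends inclusive).
truncated = df.loc["2012-06-01 05:00":"2012-06-01 05:59"]

# Downsample: average the data into 10-minute bins.
downsampled = truncated.resample("10min").mean()

# Upsample: return to a 1-minute cadence, forward-filling the binned values.
upsampled = downsampled.resample("1min").ffill()
```

Forward-filling does not recover the original detail. Each 10-minute mean is simply repeated until the next bin starts, which is why an upsampled plot looks like a staircase.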
