Friday 10 June 2016

First Week – First Meeting

Kicking the project off to a controversial start, in a meeting with my supervisors Stuart and Danny we decided to shock all by throwing away the idea of refactoring the LightCurve and simply creating a totally new datatype, called the TimeSeries.
OK, to clarify, we actually discussed this at the SunPy developers meeting a while back, where it was pretty unanimously agreed that using the term lightcurve as the name for all possible time series data (including non-light data) was confusing. Whether a LightCurve class will be defined as a child of the TimeSeries class is still an open debate, primarily because of its SEO and documentation advantages (lightcurve is a well-known scientific term), but that can easily be implemented at a later stage.

Beyond this, the meeting covered a lot of the project's details, potential directions for it to head in and my supervisors' general areas of expertise. Admittedly, nothing much more special came from it, as it was more about orientation.

Meet The GenericTimeSeries

But after the meeting the fun (or maybe just the) coding started.
I started with creating a basic class to hold 3 items:

  1. Data: as a Pandas DataFrame,
  2. Meta: as an ordered dictionary,
  3. Units: as another dictionary to store the AstroPy Quantity related to each column, with keys matching the corresponding column names.
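
As a rough illustration of that three-part container, here is a minimal sketch. It is not the actual SunPy code: the constructor signature and the key check are my own assumptions, and plain unit strings stand in for the AstroPy quantities so that the snippet only needs Pandas.

```python
from collections import OrderedDict

import pandas as pd


class GenericTimeSeries:
    """Hypothetical sketch of a container for data, metadata and units."""

    def __init__(self, data, meta=None, units=None):
        # Data: a Pandas DataFrame indexed by time.
        self.data = data
        # Meta: an ordered dictionary of metadata entries.
        self.meta = OrderedDict(meta or {})
        # Units: one entry per column, keyed by column name.
        # Assumption: plain strings stand in for AstroPy units here.
        self.units = dict(units or {})

        # Assumed sanity check: every units key must name a real column.
        unknown = set(self.units) - set(self.data.columns)
        if unknown:
            raise ValueError(f"units given for unknown columns: {sorted(unknown)}")


# Made-up example values, purely for illustration.
index = pd.date_range("2016-06-10", periods=4, freq="D")
data = pd.DataFrame({"flux": [1.0, 2.0, 3.0, 4.0]}, index=index)
ts = GenericTimeSeries(
    data,
    meta={"instrument": "example"},
    units={"flux": "W / m2"},
)
```

Keeping the units in a separate dictionary, rather than inside the DataFrame, leaves the DataFrame as plain numeric columns that Pandas can work on directly.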

I then went to work replicating the main functionality of the ideal Lightc… erm… TimeSeries API as specified here:
    github.com/sunpy/sunpy/issues/1520
using direct access to the data DataFrame.

Once this was complete, and I had some basic tests for the general functionality, I tried to incorporate the functionality within methods for the TimeSeries class, matching the old LightCurve class interface where possible.
Note: we expect to radically alter the interface away from that of the old LightCurve, but it’s a good place to start.

Generally the coding was pretty easy; a little bit of thought made truncation work with SunPy time ranges, Pandas date strings and basic integer slicing alike. I also spent some time on the basic object parameters: the date of the TimeSeries now defaults to the date of the first entry, for example.
So now we have a new, if rather rudimentary, class to start building upon.
Note: the GenericTimeSeries name is used to keep in line with the SunPy GenericMap; this allows us to call the factory using TimeSeries(args) and leads to a simple user interface.
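
Coming back to truncation for a moment, here is a minimal sketch of how one function might accept all three styles. It is only an illustration under stated assumptions: the `truncate` signature is hypothetical, and `SimpleTimeRange` is a made-up stand-in for a SunPy time range so that the snippet only needs Pandas.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for a SunPy time range: just a start and an end.
SimpleTimeRange = namedtuple("SimpleTimeRange", ["start", "end"])


def truncate(data, a, b=None):
    """Return the rows of `data` selected by a time range, dates or integers."""
    if isinstance(a, SimpleTimeRange):
        # Time range object: label-based slice, both ends inclusive.
        return data.loc[a.start:a.end]
    if isinstance(a, int):
        # Integers: positional slice, end exclusive like normal Python slicing.
        return data.iloc[a:b]
    # Anything else (e.g. date strings): label-based slice on the time index.
    return data.loc[a:b]


# Made-up example values, purely for illustration.
index = pd.date_range("2016-06-10", periods=5, freq="D")
data = pd.DataFrame({"flux": [0.0, 1.0, 2.0, 3.0, 4.0]}, index=index)

by_range = truncate(
    data,
    SimpleTimeRange(pd.Timestamp("2016-06-11"), pd.Timestamp("2016-06-12")),
)
by_string = truncate(data, "2016-06-11", "2016-06-12")
by_int = truncate(data, 1, 3)
```

All three calls pick out the rows for 11 and 12 June. Note that the label-based slices include the end point while the integer slice does not, which is one of the details a real implementation would need to document.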

Next step, making some sub-classes for the instruments.

Comments:

  1. Love it! I'm looking forward to seeing time series of light, particles and anti-particles :)

    The only thing I'm not clear on is the distinction in Pandas between a DataFrame and a Series. I thought the latter was more appropriate for time... but I've never found the time to get my head around them.

    Thanks for the update!!

  2. The Pandas Series, if I'm correct, generally only contains a single set of data (1D) as a column, whereas the DataFrame is able to hold as many as you want.
    I think functionally they are otherwise equivalent.
