Designing For Time

When it comes to modeling and storing data I find that the concept of 'time' is easy to overlook.  Here are a few tips I've gathered from 15 years of professional software development and personal data management.

Chronological Storage

Time is a rolling juggernaut.  It stops for no-one and is a primary factor in deciding what is relevant.  As such it is accurate and even necessary to reflect 'when' data is captured.  In fact, it is quite safe to physically store information in chronological order.

For example, my digital photo collection is over 10 years old and I use no photo organizing software.  The photos are grouped by event into folders that have a date prefix.  These are gathered into yearly folders.

Using the right date prefix for a file is the key.  Specifically, the 'yyyy-mm-dd' format is unambiguous and will sort into chronological order.  For a really long-term outlook group these folders into years and then into decades.  You can quickly look into the distant past without scrolling through folders with excessive numbers of files.  e.g.:

|- 200x
  |- 2000
    |- 2000-04-01 April Fool's Day Gag
    |- 2000-12-25 Christmas Day BBQ
  |- 2001
...
|- 201x
...
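
As a rough sketch of this naming scheme in Python: the function name `event_folder` is made up for illustration, and the decade label simply follows the '200x' pattern shown in the tree above.

```python
from datetime import date
from pathlib import PurePosixPath


def event_folder(when: date, title: str) -> PurePosixPath:
    """Return the relative folder 'decade/year/yyyy-mm-dd title' for an event.

    Hypothetical helper: the layout mirrors the example tree, nothing more.
    """
    decade = f"{when.year // 10}x"        # 2000 -> "200x", 2014 -> "201x"
    year = f"{when.year:04d}"
    name = f"{when.isoformat()} {title}"  # isoformat() gives yyyy-mm-dd
    return PurePosixPath(decade) / year / name


print(event_folder(date(2000, 12, 25), "Christmas Day BBQ"))
```

Because every component is zero-padded and ordered from largest unit to smallest, a plain alphabetical sort of these paths is also a chronological sort.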

Organise For Change

It is natural for data to be partitioned.  For example, I have scanned receipts in addition to my photos.  You might think it good to have a couple of top level folders called 'receipts' and 'photos'.  Unfortunately, this leads to debris or inaccuracy as time goes by and things change.

For example, nowadays I also shoot video.  So do I keep my videos under 'photos'?  Or do I rename it to 'photos & videos'?  As you can expect, there might come a time when photos or receipts are no longer stored here.  In which case I will forever be looking at an obsolete 'photos' or 'receipts' folder.

Instead, partition the data under a broader time period.  For me it's yearly.  That way every year I get a chance to re-jig my partitions, keeping them lean, accurate and useful.  e.g.:

|- 200x
  |- 2000
    |- Photos
      |- 2000-04-01 ....
    |- Receipts
  |- 2001
    |- Audio Diary
    |- Media
      |- 2001-01-01 ...
    |- Receipts

Cross Reference

'But I want to arrange photos by subject,' you say, or by some other organizing strategy.  This is a valid point and indeed most data collection is not for archiving purposes only.  This is where a cross-reference is useful.

Any number of cross-reference strategies can be employed.  They just need to reference the information in its physically stored location, which remains unchanged.

For example, your 'scenery' folder will only hold 'shortcuts' to the photos in the yearly folders.  That is, until you decide you have too many and need to organize the shortcuts.  Copying the original around is one option but this wastes space and introduces problems if the master is modified.
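
One way to create such 'shortcuts' from a script is with symbolic links.  The sketch below is only an illustration: `add_cross_reference` is a made-up name, and it assumes a file system that allows symbolic links (on Windows this can require extra privileges).

```python
from pathlib import Path


def add_cross_reference(original: Path, collection: Path) -> Path:
    """Create a symbolic link to `original` inside the `collection` folder.

    Hypothetical helper: the original file stays where it is stored and
    only the link lives in the cross-reference folder.
    """
    collection.mkdir(parents=True, exist_ok=True)
    link = collection / original.name
    if not link.is_symlink():
        # Link to the absolute location so the link still works
        # if the cross-reference folder itself is moved.
        link.symlink_to(original.resolve())
    return link
```

The link takes up almost no space, and edits to the master photo show up through every link that points at it.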

Data Life-cycle

Files do not make sense for most software applications.  These usually make use of a database to store all application 'state'.  But time still needs to be considered.

I like to think of the life-cycle of any and all items of data in the system.  Namely:
- How does it enter the system?
- How is it used and modified?
- How does it leave the system?

The last point is often overlooked and so systems carry along old, obsolete data that is never removed.  The effect is a bloated and under-performing database about two years into the life of the application.

A payroll application, for example, would quite obviously have a way to obsolesce former employees so they don't pollute lists and searches.  But obsolete work-type or premium codes might hang around indefinitely.

The best answer is NOT to engineer clean-out operations for every part of the system.  Instead, decide how long information should live for and have an archiving process.  Usually everything traces back to time and the system can periodically identify data beyond a certain age and systematically move it from the front-line system to an archive.

For example, in a payroll system one could start with old payslips, which would then free up worker lists and payroll codes to be archived without violating foreign key constraints.
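
The ordering described above can be sketched with SQLite.  Everything here is assumed for illustration: the table and column names (`worker`, `payslip`, `left_on`, `paid_on`), the separate archive tables, and the rule that a worker is archived only after leaving and once none of their payslips remain in the live table.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE worker (id INTEGER PRIMARY KEY, name TEXT NOT NULL, left_on TEXT);
CREATE TABLE payslip (
    id INTEGER PRIMARY KEY,
    worker_id INTEGER NOT NULL REFERENCES worker(id),
    paid_on TEXT NOT NULL  -- yyyy-mm-dd, so text comparison is chronological
);
CREATE TABLE worker_archive (id INTEGER PRIMARY KEY, name TEXT NOT NULL, left_on TEXT);
CREATE TABLE payslip_archive (id INTEGER PRIMARY KEY, worker_id INTEGER NOT NULL, paid_on TEXT NOT NULL);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    db.executescript(SCHEMA)
    return db


def archive_before(db, cutoff):
    """Move data older than `cutoff` (yyyy-mm-dd) to the archive tables.

    Payslips go first; that frees departed workers to be moved without
    violating the foreign key.  Returns the number of payslips moved.
    """
    departed = "left_on < ? AND id NOT IN (SELECT worker_id FROM payslip)"
    with db:  # one transaction: commit on success, roll back on error
        db.execute("INSERT INTO payslip_archive SELECT * FROM payslip WHERE paid_on < ?", (cutoff,))
        moved = db.execute("DELETE FROM payslip WHERE paid_on < ?", (cutoff,)).rowcount
        db.execute(f"INSERT INTO worker_archive SELECT * FROM worker WHERE {departed}", (cutoff,))
        db.execute(f"DELETE FROM worker WHERE {departed}", (cutoff,))
    return moved
```

Running the two steps in the opposite order would fail, because the old payslips would still reference the workers being removed.  A real system would more likely archive to a separate database, but the ordering concern is the same.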

If the archiving does not happen frequently enough then an 'is obsolete' flag can be added to certain tables.  A user may mark a record obsolete, which will exclude it from searches until the archive process runs.

General Event Logging

Sometimes it is necessary to capture noteworthy data although it may never actually be used.  This is the role of a log.

The issue tracker at my workplace is one such example.  I perform units of work organized into a 'ticket', and I write any and all potentially useful bits of information on this ticket.  If and when we decide to clean out the system, it will be the oldest tickets that will get purged.  Meanwhile, I can search old tickets for useful bits of information.  Or my co-workers can monitor my progress or take over if I get sick.

A general purpose solution for documenting noteworthy events in my life is a dedicated email address that I send messages to.  For example, let's say I troubleshoot a problem with the TV and I find a useful piece of information online.  I would send a message to this logging email address with the necessary information, a decent description and a hyperlink so that if I encounter the problem again I know what to do.



Naturally, every rule has its exceptions.  But in general it's difficult to go wrong if you consider 'time' when making fundamental design decisions.
