Later, Data

These days we live in an age where we never, ever, delete anything. Email, photos, blog comments, music, documents…it all follows us around, much of it living on the Internet, and moving from one service to another is an incredible pain. Jono Bacon and Stuart ‘Aq’ Langridge explore whether we need to redress the balance and develop a culture in which we delete things we don’t need anymore, and whether this is even possible.
Of course, we are at the very start of the discussion! What do you think? Can you see a world in which we get rid of data we don’t need? Do you like to have data lying around? Is it really that much of a problem to move between services? Would you be suspicious of someone who deletes data? Share your thoughts in the shot comments below…
30 Comments to “Later, Data”
Although I tend to keep my data it naturally gets lost in disk breakages, misformats, sometimes by stupid rm -rf *. Of course I almost never look back at it, except some personal emails or photos. I think this is a quite natural way of keeping things… (and yes, I have some 386 CPUs and a VESA Local Bus graphics card in my drawer)
I think it’s great that I can store/save/share my stuff all over the place. Got a document someone has to read quickly? Google Docs. Looking for a mail from 5 years ago? Google Mail. Pictures of the holidays or latest rugby match? Flickr/Picasa/Facebook. I love being able to get to my “stuff” from anywhere on any computer.
If you want to delete data, why did you put it up on the cloud in the first place?
I don’t think it’s necessary to completely delete old email. It does make sense to delete truly unnecessary stuff that’s stored elsewhere (mailing list digests, etc) and archive older email such that it doesn’t show up with the rest.
Whenever I open my GMail account’s Sent folder in Thunderbird, I see slightly embarrassing subject lines from 2004…I don’t necessarily want to delete them permanently, but I don’t want to see them mixed in with current emails either.
You should be able to write a pretty quick script (eg, with Python’s imaplib) to create monthly archives that you stash away and forget. You can still dig through them in a CYA emergency, but your actual email account would be squeaky clean.
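A rough sketch of what such a script might look like, using only calls from Python’s standard imaplib. The folder naming (“Archive/2004-03”), the “/” hierarchy separator and the function names are assumptions for illustration, not anything the server dictates; some providers (Gmail included) treat folders as labels, so copy-then-delete may behave a little differently there.

```python
import imaplib
from datetime import date

# IMAP wants English month abbreviations, so avoid locale-dependent strftime.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def imap_date(d):
    """Format a date the way IMAP SEARCH expects, e.g. 01-Mar-2004."""
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year}"


def search_criteria(year, month):
    """Build a SEARCH criterion matching every message from one month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return f"(SINCE {imap_date(start)} BEFORE {imap_date(end)})"


def archive_month(conn, year, month, source="INBOX", prefix="Archive"):
    """Move one month of mail from `source` into a monthly archive folder.

    `conn` is a logged-in imaplib.IMAP4 / IMAP4_SSL connection.
    Returns the number of messages moved.
    """
    target = f"{prefix}/{year}-{month:02d}"   # hypothetical naming scheme
    conn.create(target)                       # server just answers NO if it exists
    conn.select(source)
    typ, data = conn.search(None, search_criteria(year, month))
    ids = data[0].split() if typ == "OK" and data and data[0] else []
    if not ids:
        return 0
    message_set = b",".join(ids).decode()
    typ, _ = conn.copy(message_set, target)
    if typ != "OK":
        # Never delete anything that was not copied successfully.
        raise RuntimeError(f"copy to {target} failed")
    conn.store(message_set, "+FLAGS", r"(\Deleted)")
    conn.expunge()
    return len(ids)
```

To use it you would open a connection with imaplib.IMAP4_SSL(host), log in, and call archive_month(conn, 2004, 3) for each month you want stashed away. Try it on a test folder first.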
I’m a great one for hoarding old stuff ‘just in case’. I was gutted when I realised I’d offloaded some PC cables that I later needed.
I hate throwing stuff away if someone else could use it. I don’t always need to get money for it, so a lot goes on Freecycle.
As for emails and files, I’d like to see a way to tag them (preferably using automatic methods) according to how long I want to keep them. e.g. weekly newsletters I may want to keep for a month, but the weekly list of gigs from LemonRock is redundant after a week. I should be able to override this for any item.
I’ve thought of doing something similar for my paper files, so I can easily clear out the redundant stuff. I have a box for credit card receipts with 3 compartments for the last 3 months. When the next month starts I chuck the oldest batch. I’m not so methodical with other stuff.
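The keep-it-for-this-long tagging idea above could be sketched roughly like this. The tag names and retention periods are just the examples from the comment (a month for newsletters, a week for the gig list); the Item class and function names are made up for illustration, and anything without a rule is kept forever.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Default retention per tag (assumed values, taken from the examples above).
DEFAULT_RETENTION = {
    "newsletter": timedelta(days=30),
    "gig-list": timedelta(days=7),
}


@dataclass
class Item:
    name: str
    tag: str
    received: date
    keep_for: Optional[timedelta] = None  # per-item override of the tag rule


def expiry(item, rules=DEFAULT_RETENTION):
    """Return the date an item may be deleted, or None to keep it forever."""
    period = item.keep_for if item.keep_for is not None else rules.get(item.tag)
    return None if period is None else item.received + period


def due_for_deletion(items, today, rules=DEFAULT_RETENTION):
    """List the items whose retention period has run out by `today`."""
    due = []
    for item in items:
        expires = expiry(item, rules)
        if expires is not None and expires <= today:
            due.append(item)
    return due
```

The override is the important bit: one newsletter you care about can carry its own keep_for without changing the rule for the rest.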
Actually I find tagging really useful whether I intend to keep the data or not. Maybe the semantic web/desktop will see us have more of that in the near future.
Wearing my I-get-mail hat: I try to clear out really unnecessary emails (like ones from my last job, or the job before that) when I come across them. I don’t clear out personal email all that much, just because I like being able to look back at it occasionally, and I keep my work email around because it’s always useful to be able to look back at old threads, and searching in GMail is a lot easier than trying to find the relevant thread in mailman, say.
Wearing my photographer hat: I never delete images unless they’re actively bad (i.e. entirely black or white frames). My image archive is now ~150GB, mostly RAW files. I know that I’ll need to archive those away somewhere sooner or later and just keep the edited JPGs around, because after a couple of years the RAW files that you don’t use aren’t worth holding on to. However, I always forget to do this, so my archive will likely stay at its current size for some time.
I’m a paradox when it comes to this: I throw out physical stuff as soon as it’s not relevant anymore, because I hate clutter. I do clean up my file system every now and then, and the stuff that I do keep is all well organized.
Email on the other hand, I don’t delete at all. And I must agree with Jono: it’s just handy to find that 5yo conversation with x on y subject. Documents sent through gmail are handy to relocate as well, so I have a lot of that stored there as well, with no local backup of it. Certain other documents end up in a Dropbox directory, spread over several pc’s at home and at work.
Not all that well organized when you come to think of it, maybe a bit worrying. Maybe a good idea to try and store everything in one place. And then have several backups of that in different physical places and different physical formats.
You guys have it all wrong !!! … just joking. I’m really happy you made this shot. My favorite shot ever actually
I personally delete a lot of stuff. All the old emails that are not relevant anymore, I get rid of them. For example if I leave a mailing list I usually delete all the emails I got from it. Also whenever I lose interest in something I remove all the stuff I had on it. That goes hand in hand with having everything labeled in my Gmail account. Also all the usual spam and funny mails, I get rid of them regularly. That’s helped by the fact that I only have a few people sending me those. And everything that’s left after that and still really old goes into the “Old stuff” label which I have hidden so I don’t even remember it’s there. It would be the same as removing all labels and archiving the mail. I also moved from Yahoo to Gmail a while back and deleted everything I had on the old account in the process. I kept just a few things that I thought were important and removed everything else.
I don’t delete pictures or e-mails that are work related (current job). And I wouldn’t delete comments either but that’s a matter of censorship which doesn’t entirely apply to personal data. But one thing I would like to remove but really can’t is the stuff that’s out there in the cloud and I really don’t want to have anything to do with. For example on Facebook I regularly get invites or suggestions to add people I don’t want to see anymore, for one reason or another. Actually I deleted my Facebook account too because the virtual relationships started to bug me out. But with everything going social these days whenever you create a new account on anything you’re sucked right back into the past. Why can’t I just move on if I want to? And no I’m not talking about an ex girlfriend here.
Also from what you guys said I take it you keep lots of unread emails in your inbox. I used to do that a while back but then I saw this lecture by Randy Pausch on time management. I think it’s the best thing I’ve seen on the subject and I highly recommend it: http://www.youtube.com/watch?v=oTugjssqOT0
Oh and why do we assume things are always going to be there? You never know what might happen to Flickr in the future…
As someone applying to take a bar exam in July, I can say that old emails I thought I’d never look at again have been a lifesaver for determining the dates of events I never would otherwise have thought important enough to memorialize. This is especially helpful where the person on the other end of the interaction is someone I didn’t end on the best of terms with.
What I would like to see is an AI program that would “live” on my desktop and be trained to deal with my files the way I want them dealt with.
For example, I’ve got stacks of JPEGs covering photos of me, photos I’ve taken, stuff I’ve saved from the ‘net, scanned in stuff. I would like photos I’ve taken to be sorted into a timeline of sorts, allowing me to see when and if possible, where they were taken.
I’d want random junk from the ‘net collated into a group of folders by themes such as “amusing”, “impressive”, “WTF”, etc.
If I could have documents be treated that way also, I’d be very pleased.
By having an AI sort through it, context could be used as a determinant, or I could be prompted with questions such as “What do these photos have in common?” or “These photos were all taken within 24 hours of each other, what was the event?”.
Perhaps it’s a lofty dream, but I think that would be incredibly helpful.
This is a cool idea, something I know I would like. The problem is, do you really want something to pop up every time you save some random photo?
If yes, then fine, but most would rather not. And what happens if it puts something in the wrong place? Can you forgive the system and correct it?
I think the biggest problem with this is that it is all or nothing (at the moment).
As it would be running all the time and be contextually sensitive, it shouldn’t need to ask immediately. I would personally want it to index the new file/s and make predictions as to where the file is to go, but not bother me with it immediately.
In terms of bugging me about when to advise it with pop-ups of “how about now?” I would want the software to monitor when I have said no previously, and attempt to ask at times when I’ve previously said yes. Perhaps there’s some time on Wednesday afternoon when I normally have nothing on and do stuff like that – by monitoring when I do that and learning, it could ask me at a time that is likely to work for me.
As for mis-filing, there’s no real reason it couldn’t create temporary symlinks or shortcuts in multiple places, for a suitable period (perhaps two days, perhaps longer, depending on access and usage by the user). That way it could move the file into the structure but keep it being accessible for the time being, until I don’t need it any more.
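A minimal sketch of the temporary-symlink idea, assuming a fixed grace period rather than one that adapts to how the user accesses the file. The function names and folder layout are hypothetical, and creating symlinks on Windows may need extra privileges.

```python
import shutil
import time
from pathlib import Path


def file_away(src: Path, dest_dir: Path) -> Path:
    """Move `src` into `dest_dir`, leaving a symlink at the old location."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / src.name
    shutil.move(str(src), str(target))
    # The old path now points at the filed copy, so nothing appears to vanish.
    src.symlink_to(target.resolve())
    return target


def sweep_shortcuts(folder: Path, max_age_seconds: float, now=None) -> list:
    """Remove symlinks in `folder` older than the grace period.

    Only the links are deleted; the filed-away files are left alone.
    Returns the list of removed link paths.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in folder.iterdir():
        # lstat() looks at the link itself, not the file it points to.
        if entry.is_symlink() and now - entry.lstat().st_mtime >= max_age_seconds:
            entry.unlink()
            removed.append(entry)
    return removed
```

Running sweep_shortcuts(folder, 2 * 24 * 3600) from a daily job would give the two-day window mentioned above. If the AI guessed wrong, the file is still reachable from where you left it until the link expires.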
There’s plenty of stuff that would need to be worked out, but in general I think it could work quite well. I would imagine that something like ALICE could be used to provide a learning, talking AI to attempt to provide a more natural frontend.
I had copies of all the SoJ podcasts on my hard drive (over 650 MB). I took this podcast to heart and have now deleted them all.
I think the reason that a lot of people keep all their data is because it’s easier to keep it than to throw it away. The downside of this is that if you ever do need that data again, you will probably have forgotten that you ever had it in the first place. The stuff that matters gets lost in with all the stuff that doesn’t matter.
IMO keeping data around which is just stuff that you downloaded from random internet sites is completely pointless. If you need it, just download it again.
I am an aspiring minimalist and all of my data is periodically purged and reorganised, normally when I have nothing else to do. Old emails are deleted, inbox is almost always completely empty. Ditching a large amount of crap, be it digital or physical, is always extremely liberating. If anyone here is lugging around tons of junk, ditch it, you will feel much better.
All of my file-system data is version controlled, so I still have access to old files, without them junking up the file system. With things like photographs, I only keep the best ones. I don’t have a “desktop” so there’s no crap on it and my physical table is normally clean also.
I recommend taking a look at the following links:
http://zenhabits.net/2007/01/email-zen-clear-out-your-inbox/
http://becomingminimalist.wordpress.com/benefits-of-minimalism/
http://www.farbeyondthestars.com/?p=991
http://zenhabits.net/2009/05/how-to-create-a-minimalist-computer-experience/
Well that might be a bit extreme…for most people anyways. But thanks for the links.
You can’t “ctrl+f” your attic. The amount of data you can store and the ease with which you can manage it are both increasing, but not at a steady rate and not in step. Right now we are waiting for the manageability to catch up to our storability.
If people got in the habit of deleting things from the internet after some period of time, individuals would get in the habit of saving local copies of things that they thought they might need someday. This would make things worse.
If anything, the whole point is to make this gigantic pile of data easier to work with, and make the concept of deleting or partitioning data go away.
The quickest way to get rid of data is hard drive failure. I have piles of CD backups from even 5 years ago and I’ve only ever gone back to just one of them.
Maybe some good rules of thumb for keeping old data should be floated? I kinda think that if the data isn’t worth sharing, it’s going to head for the bin, no? All my own intentions of keeping old coursework or old projects around keep appearing moot: most of this data ages so quickly or is something that’s now better on Wikipedia, it’s hardly worth keeping. But if it’s worth sharing, then it proves value to someone else.
Out of sight, out of mind holds true for me and electronic interfaces make this easy. As long as you have a system of categorizing and archiving your documents you shouldn’t have a problem.
- Unlikely to refer to it again? Zip it all and slap it in your storage location.
- Probably refer to it in the future? Just label and tag it.
- Don’t keep drafts unless they have some unique value. Don’t duplicate documents. That’s what backups are for.
- Don’t be a tagging/labeling perfectionist. If you didn’t geotag your photos from 5 years ago don’t bother now or do it at a very high level.
- Only categorize/archive and back up your own data. Backing up old podcasts or lolcats images is pointless. Stick them in a folder if you want to but don’t try and maintain it for order or quality. Storage is cheap; your time isn’t.
I’ve found that in applying for residency in another country, having old emails to confirm flight dates, communications and other things has been invaluable. Without these I would have struggled to even complete the application form. Other things like payslips, tax forms, etc. are obvious must-keeps for up to 10 years depending on where you live.
For me physical items are far more burdensome than electronic so I scan everything that I don’t HAVE to keep the original of then bin it. The result is that I have one file containing all my paperwork ever. It was liberating trashing reams of paper. I’ve started doing the same with other ’stuff’ in my life and have found that it’s an invaluable process. As a result I’m less materialistic and don’t waste money on whims.
As far as wanting to forget a part of your life goes, I suppose some people would find it very liberating, knowing that no one else would ever find out about x or y. I’d rather hope, foolishly perhaps, that most people having this large life history would result in introspection, personal development and a less judgemental world.
Well I find deleting old e-mails and cleaning up my inbox extremely satisfying.
I’m the same with my PC. I’m always deleting old stuff and stuff I don’t need.
I’m not too bothered about deleting data that is personal, so my Gmail stores up all my e-mails. I just keep it down to 0 new e-mails in the inbox, but I never delete anything.
I think the main problem is public information that you later forget about. These sorts of trails I do try and clear, just out of paranoia.
At the end of the day, I’m constantly warned by lecturers how possible recruiters will search for me on Facebook and Flickr etc., so I make sure I cut off Facebook from everyone apart from close friends, and everything public I filter to make sure only what I want people to see will be found.
The only snag is that if I later want to filter this data differently, it becomes a bit of an issue.
Personally, it’s public vs private that decides whether I “tidy up” my digital stuff or not.
The problem is that there are two types of data (public and private). If you delete public data (say, comments), people do become suspicious, because it looks a lot like 1984’s document revising. Nobody regards you with suspicion if you delete old emails.
A few years ago I set up a Subversion store for all my data, and took quite some time finding and copying stuff off old backup CDs, and recreating the original state of my personal “docs” folder, to recreate the original history in the SVN repo. Perhaps I’ll never look at it – but maybe my biographer will :)
Gerv
Cool, someone else who stores their data in a VCS.
I keep all my email for 2 reasons: 1. my business emails, never know if I’ll need to refer to something in the past 2. personal emails, it’s like real paper letters – it’s memories that I wish to keep and remember.
It’s like one’s history. With disk space being so cheap and hard drives getting bigger and bigger, why delete anything?
Of course this is all my personal data kept on my computer. I don’t like leaving trails on the internet because I don’t have control over it….just as I don’t have control over the destiny of this comment
This is a problem for businesses: if they do what most do and keep everything forever, then this has consequences. For example, if you are served with a court order for records and you have the records, you have to produce them. Producing records from backup tapes 5 years old, buried in a data archive somewhere, and then proving to the court that you have provided all the relevant records can be very expensive and time consuming. If you have a data destruction policy then you can just waive this and say it’s been deleted.
I have been trying to implement this, but getting people to agree to delete data is very hard; the “I just might need that” argument is very strong.
Some data that isn’t linked to me or just isn’t all that important I would get rid of. But some data is tied to memories, and that I think should be kept around. Unfortunately, like most people, if there’s some file or folder that I have no idea what to do with, it just stays around.
Painless solution for me: all files are in folders on a server share (sysadmin makes his backups). I auto backup to an external hard drive which I can take home about four times a year to copy to my little home system NAS. Current projects I keep in Dropbox so I don’t have to carry them on a flash drive with me. If you are not using Dropbox you really should. Works great in Ubuntu and Windows.
I’m a person who deletes nearly all emails and gets rid of all old crap from my computer. Disk space isn’t really cheap for me, being a student and all.
I don’t get as many emails as Jono anyway, so that’s not so much of a problem. I sort everything very well and scrub lots of emails regularly, but they are still backed up on Gmail anyway. For my blog I’ll post almost any comment, but I review comments from new people just in case.
Well, I keep deleting old data from the web and from my disks, but old hardware? I have way too much of that old stuff; even if it doesn’t work I still seem to keep it, and I even gather more.