Guest post by David Haynes, author of Metadata for Information Management and Retrieval, 2nd edition: Understanding metadata and its use
Use of metadata by the security services
“Metadata tells you everything about somebody’s life. If you have enough metadata you don’t really need content” (Schneier 2015, p.23)
If anyone wondered about the importance of metadata, this quote by Stewart Baker of the US National Security Agency should leave no-one in any doubt. The Snowden revelations about the routine gathering of metadata about international telephone calls to or from the United States continue to have repercussions today (Greenwald 2013). Indeed, Privacy International (2017) has identified the following types of metadata that are gathered or could be gathered by security agencies:
- Device used
- Length of call
“Metadata in aggregate is content” as Jacob Appelbaum observed when the Wikileaks controversy first blew up (Democracy Now 2013). In other words when metadata from different sources is aggregated it can be used to reconstruct the information content of individual communications.
Invasion of privacy or personal benefit?
These concerns extend well beyond the use of metadata by Governments and the security services. The social media giants prosper by exploiting personal data and targeting digital advertising. Personal profiles of targeted individuals are based on metadata about online use and are the basis of online behavioural advertising. Cookies and other tracking technologies can monitor the online activity of an individual to predict future behaviour. Metadata about online sessions reveals a great deal about an individual and his or her life. This may extend to gathering information about friends, family, colleagues and other contacts.
The upside of this is that metadata is a powerful tool to facilitate use of online services, by remembering users’ preferences and delivering content that is more likely to be of interest or relevance to them. This has to be balanced against the risks associated with online disclosure of personal data.
Metadata describes an information object, whether that be raw data or more descriptive information about an individual. This is important because the treatment of metadata has become a political issue. Personal data, especially data that reveals opinions, attitudes and beliefs, is potentially very sensitive. Use of this personal data by service providers or by third parties can expose users to risks such as nuisance from unwanted ads, harassment from internet trolls or fraud through identity theft, if the data is not held or transmitted securely. Many digital advertisers would say that because the data is aggregated it is not possible to identify individuals, i.e. the data is anonymised. However, this is no protection against privacy breaches, as has been demonstrated by Narayanan and Shmatikov (2009) and others.
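To make the point about anonymisation concrete, here is a minimal sketch of how records with the names stripped out can still be linked back to people. It is not the method used by Narayanan and Shmatikov (2009), who worked with the structure of social networks; it is a much simpler linkage on shared attributes, and every record, field name and function in it is invented for illustration.

```python
# Hypothetical "anonymised" browsing records: names removed, but
# quasi-identifiers (postcode district, birth year, device) retained.
anonymised_sessions = [
    {"postcode": "AB10", "birth_year": 1975, "device": "tablet", "topic": "health"},
    {"postcode": "AB10", "birth_year": 1990, "device": "phone", "topic": "sport"},
    {"postcode": "EH1", "birth_year": 1975, "device": "tablet", "topic": "politics"},
]

# Hypothetical public register carrying names alongside the same attributes.
public_register = [
    {"name": "Person A", "postcode": "AB10", "birth_year": 1975, "device": "tablet"},
    {"name": "Person B", "postcode": "EH1", "birth_year": 1975, "device": "tablet"},
    {"name": "Person C", "postcode": "EH1", "birth_year": 1975, "device": "tablet"},
]

QUASI_IDENTIFIERS = ("postcode", "birth_year", "device")


def link_records(sessions, register, keys=QUASI_IDENTIFIERS):
    """Return {session index: name} for sessions that match exactly one person."""
    matches = {}
    for index, session in enumerate(sessions):
        fingerprint = tuple(session[k] for k in keys)
        candidates = [
            person["name"]
            for person in register
            if tuple(person[k] for k in keys) == fingerprint
        ]
        if len(candidates) == 1:  # a unique match re-identifies the session
            matches[index] = candidates[0]
    return matches


reidentified = link_records(anonymised_sessions, public_register)
print(reidentified)
```

In this toy data only the first session is re-identified. The second matches nobody in the register and the third matches two people, so it stays ambiguous. The more attributes a record carries, the more likely it is that a match will be unique.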
Daniel Rosenberg (2013) makes a nice distinction between data, facts and evidence. Data if true may be a fact, but if false ceases to be a fact. Samuel Arbesman (2012) in his book ‘The Half-Life of Facts’ introduced the idea that in a given period half the certainties that we had are shown to be false or are superseded by new understandings, and that they cease to be ‘facts’. Data, whether it is true or not, continues to be data, but is only factual if true. Perhaps there is some way of recording the reliability of information or data so that it can be exploited appropriately. Many of the arguments and counter-arguments on climate change, for instance, centre on the quality and veracity of the evidence used by each side of the debate. This idea is not new, as medical researchers have for some time evaluated the quality of research used to make clinical decisions. This information about the quality and reliability of data is metadata.
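One way of recording reliability alongside the data might look like the sketch below. It is only an illustration: the class names, the A to D grading scale and the example statements are all assumptions made for this post, not an established metadata standard.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical evidence scale: "A" is the strongest grade, "D" the weakest.
GRADE_ORDER = "ABCD"


@dataclass(frozen=True)
class ReliabilityMetadata:
    source: str
    evidence_grade: str
    last_reviewed: date


@dataclass(frozen=True)
class Statement:
    text: str
    reliability: ReliabilityMetadata


def meets_threshold(statement, minimum_grade):
    """True if the statement's grade is at least as strong as minimum_grade."""
    grade = statement.reliability.evidence_grade
    return GRADE_ORDER.index(grade) <= GRADE_ORDER.index(minimum_grade)


# Invented examples: the data is kept either way, the metadata says how far to trust it.
review = Statement(
    "Treatment X reduces symptoms",
    ReliabilityMetadata("systematic review (hypothetical)", "A", date(2017, 3, 1)),
)
anecdote = Statement(
    "Treatment Y cures everything",
    ReliabilityMetadata("forum post (hypothetical)", "D", date(2017, 3, 1)),
)

usable = [s.text for s in (review, anecdote) if meets_threshold(s, "B")]
print(usable)
```

Both statements remain data, but only the first passes a threshold of grade B, so a system could decide how to use each one according to its recorded reliability.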
Metadata is political
Metadata has become a political issue because of its use by security agencies and because of wider privacy issues in the commercial world. Anyone who had asked the question ‘What does metadata matter?’ prior to 2013 will realise just how important a bearing it has on current political issues. The Fourth Amendment to the U.S. Constitution protects ‘The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures’ (United States 1791). A lot hangs on the interpretation of privacy as Solove (2011) has so eloquently discussed in his book ‘Nothing to Hide’. ‘Fake news’ is not new, but the phenomenon has reared its head in recent elections and is unlikely to go away any time soon. Good governance also depends on a good understanding of metadata and accountability for past actions.
Metadata for information management and retrieval
In the new edition of Metadata for Information Management and Retrieval, published in January 2018, I consider the origins of metadata and look at the ways in which it is used for managing information resources. The ethical dimensions of metadata are explored and issues such as governance, privacy, security and human rights are considered. The book also discusses the digital divide and the potential that metadata has for making information accessible to wider audiences.
Metadata has an important role in politics and ethics. How then do we manage it to best effect?
Haynes, D (2018) Metadata for Information Management and Retrieval: understanding metadata and its use. ISBN 9781856048248. Facet Publishing. London, 2018, 267pp. http://www.facetpublishing.co.uk/title.php?id=048248
You can follow David on Twitter @JDavidHaynes
Arbesman, S., 2012. The half-life of facts : why everything we know has an expiration date,
Democracy Now, 2013. Court: Gov’t Can Secretly Obtain Email, Twitter Info from Ex-WikiLeaks Volunteer Jacob Appelbaum. Available at: https://www.democracynow.org/2013/2/5/court_govt_can_secretly_obtain_email [Accessed March 21, 2017].
Greenwald, G., 2013. NSA Collecting Phone Records of Millions of Verizon Customers Daily. The Guardian. Available at: http://www.theguardian.com/world/2013/jun/06/nsa-phone-records-verizon-court-order [Accessed July 7, 2014].
Narayanan, A. & Shmatikov, V., 2009. De-anonymizing Social Networks. In 2009 30th IEEE Symposium on Security and Privacy. IEEE, pp. 173–187.
Privacy International, 2017. Privacy 101. Metadata. Available at: https://www.privacyinternational.org/node/53 [Accessed March 23, 2017].
Rosenberg, D., 2013. Data before the Fact. In L. Gitelman, ed. “Raw Data” is an Oxymoron. Cambridge, MA: MIT Press, pp. 15–40.
Schneier, B., 2015. Data and Goliath: the hidden battles to collect your data and control your world, New York, NY: W.W.Norton.
Solove, D.J., 2011. Nothing to Hide: the false tradeoff between privacy and security, New Haven, CT: Yale University Press.
United States, 1791. U.S. Constitution Amendment IV, United States.
Guest blog by the co-authors of The Chief Data Officer’s Playbook, Caroline Carruthers (Group Director of Data Management, Lowell Group) and Peter Jackson (Head of Data, Southern Water).
Gartner predicted that by 2019, 90% of large organisations will have hired a CDO – but only 50% of these will be a success. Much of what determines your success or failure going forward will take place in the first 100 days. Essentially it is about getting the basics right now and building firm foundations for the future.
What do you expect when you start?
The first hundred days are important for setting expectations about the kind of CDO you are going to be. Now, from one CDO to another: expect a real rollercoaster of a ride. There will be amazing highs followed by moments where you sit with your head in your hands wondering what on earth you have done. Basically, it is a microcosm of the rest of your role as a CDO, just crammed into a shorter time period.
Case for change
The very first thing you need to do is understand your organisation’s case for change; if it’s not there, create it; if it needs help, redefine it. But whatever you do make sure you have a clear easy-to-describe case for change. In order to be an effective CDO you will be changing the organisation, and no change starts without a burning platform or an absolutely massive benefit at the end. If you can’t find the case for change then you might as well go home at this point.
What you are aiming for
The case for change helps you set the vision for what benefits you are aiming for, whether they are saving the organisation from repeating mistakes or gaining insight to derive more value. It’s the compelling argument that makes people want to help create the future you are selling. It also helps to set out your scope and start to set expectations about what you will and won’t be doing. People often forget about the ‘not doing’ part of a scope, but it is just as important as what you are doing, if not more so. Without it, people can overlay their own expectations and assume they are getting everything they’ve always wanted, simply because they misinterpreted what you meant. Whilst you need to create a compelling vision, it’s best to be realistic about where you can go, what it will feel like, and how long it is going to take to make a difference.
There is no point in starting a journey without having an idea of your destination. You don’t need a fixed point you are trying to drag the company to, rather an idea in mind of where you are leading them. It is a bit like giving them a treasure map where you might not have buried the treasure yet, but you know which island you are burying it on. They will get more maps the closer they get to the goal.
We are going to assume you have a team in place. Knowing how long this process can take, unless we make that assumption the whole story of your first 100 days will be taken up by fighting to get people to come and help you, against departments who practise the dark arts and refuse to let you see the playbook. You need people around you, as no one person will ever be able to change the company without a lot of support. Apart from the need for skills and experience that are varied and wide ranging, you also need support during your rollercoaster lows to help you get back on the upward track.
Then you need to look at which basics you are trying to get right: what materials are going to make up your foundation?
To keep it simple we’ve broken these down into three main areas
Let’s face it, you will be making changes to the organisation and you might not always get it right first time. Remember the old saying: ‘if you never make a mistake you aren’t trying hard enough!’ So what must be in place is a way of letting people know what is expected of them and what they are really accountable for, be that policies, standards, procedures or whatever your company uses to help everyone understand their responsibilities, as well as a control mechanism for managing those policies. How do you make decisions on how the organisation needs to treat its data and information? Who is involved in this process? If you are smart you get people involved who cover large parts of your company; the plot for ‘buy in’ starts here.
Next, let’s look at your information architecture: not the vast swathes of detail that sit in your data dictionary (at least not at this point) but the big headings. What are the top five to ten or so headings which describe all the information in your company and, most importantly, who is the one person who could make a decision on each one? This is not about playing the blame game; that just makes individuals hide from any kind of accountability and leads to a kind of company-wide whack-a-mole game. Remember the quote from above: ‘if you aren’t making mistakes…’ Your information domain owners are accountable experts in their fields who understand specific areas of information within your business and can give firm direction and decisions in their area. Once you have the highest conceptual level agreed, it’s time to move on to the next level, adding richer detail as you go.
Lastly and definitely not least, how are you going to engage with the company? Where is your network of evangelists coming from who will sell your message? It’s great that you know who can make decisions about the information and that you have clear instructions on how people should treat your company’s data, but it really is pointless unless you tell them. Naturally we are talking about mass company-wide emails that of course everyone reads in every detail, inwardly digests and then miraculously and immediately changes their behaviour... in our dreams! This is hearts and minds time: what is your compelling argument for change, how are you making their lives better, and what is in it for them that makes it worth changing their behaviour? At the very least, tell them what you expect from them.
Get all that right and at least you know you have covered off your basics while you start your journey.
The Chief Data Officer’s Playbook will be published in November by Facet Publishing.
This blogpost by Facet author Alan MacLennan was originally published on the CILIP website last year. We have re-published the post today as information security is back in the news following the cyber attack on TalkTalk last week.
There’s a lot of concern at the moment about the threat from GOZeus and Cryptolocker – the first of which is a piece of malware which steals banking details, whilst the second encrypts your data, after which you are held to ransom for its recovery.
The two threats appear to operate together, and have been scaring lots of people this month. They appear to be confined to Windows systems, which is no great consolation if that’s what you have, and there’s no guarantee that even paying the ransom will result in your data being recovered, so it’s a pretty bleak picture, if your system becomes infected.
Tips for individuals
- Back up your data
Just as well you can restore from your backups, then. You do have recent backups, don’t you? Oh, dear. Pity. Better kiss your system goodbye, then, until someone works out the decryption, if it’s possible.
It’s a good time to emphasise the importance of a good backup procedure for your data. Don’t worry about applications, you can re-install them from the installation media, but get a good backup procedure in place.
You might have to wipe and re-build the whole system. There are several ways to go about backing up (full, incremental, differential, mirroring) and you need to find which suits you best, but a good first step is to copy all of your data to a removable medium that you can keep separated from your system. That gives you a bit of breathing space, and you can then just back up what changes day-to-day, until you get a proper system in place. But start it copying right now.
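For the "copy everything once, then just back up what changes" approach, a minimal sketch might look like this. It is a simple one-way copy of new and changed files, not a full backup product: it keeps no history, never deletes anything from the backup, and decides what has changed purely from file modification times. The function name and the folder paths in the usage comment are made up for illustration.

```python
import shutil
from pathlib import Path


def incremental_backup(source, destination):
    """Copy files that are new or modified since the last run; return the copies.

    The destination must sit outside the source folder, ideally on a
    removable drive that is unplugged once the copy has finished.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        dest_file = destination / src_file.relative_to(source)
        # Copy when there is no backup yet, or the source is newer than the backup.
        if (not dest_file.exists()
                or src_file.stat().st_mtime > dest_file.stat().st_mtime):
            dest_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dest_file)  # copy2 also copies timestamps
            copied.append(dest_file)
    return copied


# Example call with hypothetical paths (adjust to your own system):
# incremental_backup("C:/Users/me/Documents", "E:/backup/Documents")
```

The first run copies everything, which gives you the breathing space; later runs only copy what has been added or edited since. Because this overwrites older copies, it will not save you if an infected file is backed up over a good one, which is one reason to move to a proper backup system with versions as soon as you can.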
- Look at passwords
It’s also a good time to look at passwords – the sort of target that GOZeus has in its sights. Do you let Windows, or your browser, remember passwords for you? That’s right – bad idea. Do you keep them, unencrypted, anywhere on your system? Another hostage to fortune.
Consider using a service like LastPass, which gives you access from anywhere to your passwords, which are stored in encrypted form on their server and in a “vault” on your machine. It will also provide hard-to-crack passwords, and remember them for you. Other, similar services are available.
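If you are curious what a "hard-to-crack" password looks like in practice, here is a minimal sketch using Python's standard `secrets` module. It is not how LastPass or any other service generates passwords; the function name, the character set and the minimum length of 12 are assumptions chosen for illustration.

```python
import secrets
import string

# Letters, digits and punctuation; some sites restrict which symbols are allowed.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20):
    """Return a random password drawn from ALPHABET using a secure random source."""
    if length < 12:
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

A password like this is hard to guess and equally hard to remember, which is exactly why it belongs in an encrypted password manager rather than in a text file on your desktop.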
- Make sure your system is patched and updated
Now, with some holes in the dam patched, temporarily, what can we do to avoid these nasties? If your system is connected to the network, you’re a target. Even if you’re not running Windows, there are other “exploits”, though not nearly as many in number, because Windows’ popularity makes it the most lucrative target.
So, first make sure you have your system patched and updated – that can be done automatically by Windows Update, or there are system update tools for Linux. If you’re still running Windows XP, you’re a hopeless optimist.
Keep the anti-virus and anti-malware programs updated. If you don’t have them, there are good free versions readily available, and Windows’ own Defender and Security Essentials come with the OS.
- Don’t open email attachments, unless you’re sure they’re safe
Don’t open email attachments, unless you’re absolutely sure that you know the source, and you’re expecting the attachment, and you can confirm that the source sent it.
That’s probably the main way these bad things get spread, but apply the same principles to hyperlinks in emails, even if it means you miss out on those millions of dollars waiting for you to look after them, or the promised revealing photos.
And speaking of revealing photos, web sites with “flesh-coloured images” (thanks to Bruce Royan for that term) aren’t the sort of thing you should be consulting at work, but are a really good source of more nasties.
Excuse me – I think my backup’s finished <ahem!>
Tips for organisations
Now, I’m not concerned about the machine I use at work because Robert Gordon University is a fairly big university with a wonderful IT Services department and infrastructure in place.
Lots of organisations aren’t that fortunate, and if you’re in the information profession, you might well be the most knowledgeable person around.
Maybe there’s a technician for the hardware, maybe even an applications supervisor for looking after the software, but it could be that you’re the “go to” person for anything more “information-y”, which is flattering, but comes with a burden of responsibility. Might be that paragraph in the job description that you airily glossed over at the interview?
Ad hoc advice is great, and will raise your profile as an all-round helpful type, but if you really want to be effective, and not to have to repeat yourself endlessly, and to work in a better environment, where the network isn’t at the mercy of the next cyber-hooligan, it’s time to think about policies.
- Create a policy
Policies are good, because they’re explicit, in the knowledge management sense – they’re the captured wisdom, the tablets of stone, the things you can point to and say, “That’s how it’s done” which is immediately more impressive than “Well, what I do is …” Policies can be encoded, made part of induction programs, produced as evidence of good practice – they tick another box, if you will, but you’ll rarely be criticised for having too many.
So, what goes on the shopping list? A backup policy would be good – either take responsibility for your data, or save it to a shared drive, which can be backed up centrally. Patches and updates, antivirus – it depends on your systems what will work best, but to write the policy, you have to think about that, which is what counts.
How else can our systems get infected by malware? What about a BYOD (Bring Your Own Device) policy? If people can connect their phones, tablets and Google glasses to the network, or bring in USB sticks, that’s another vector of infection, to adopt the medical metaphor which viruses so neatly match.
I’m not telling you what your policy should be, but those are at least some of the areas you should address.
- Educate people about email
Email behaviour is more a matter for education: “did you hear what happened to so-and-so? Clicked on a link in an email and … I’d be so embarrassed if that happened to me.”
And you may be dealing with customers, colleagues, your customers may be colleagues – there will be lots of possibilities to exercise your skills in user education. However, if you can be the unseen hero(ine) who saves the system from a fate worse than usual, well, it’s just another day as an information professional.
So, think about what you know, and about how you can best apply it to your organisational context. Critically evaluate the situation regarding this aspect of information security in your organisation. Think about your role as an individual or a department, and how that can be influential in shaping policy.
It’s not unlike a scenario exercise from an Information course, but it’s real, and you don’t have a long time until the submission date. Good luck.
About the author
Alan MacLennan MA, MSc, PhD has been a lecturer in Information Management at Robert Gordon University since 1993. His previous experience includes periods as an analyst/programmer and as an assistant librarian. In 2007, he was awarded a PhD for a piece of research into user preferences regarding virtual worlds for information retrieval. He is the author of Information Governance and Assurance.