Guest blog by the co-authors of The Chief Data Officer’s Playbook, Caroline Carruthers (Group Director of Data Management, Lowell Group) and Peter Jackson (Head of Data, Southern Water).
Compared to most of their C-suite colleagues, the CDO faces a unique set of problems. There are similarities: the CDO is a subject specialist, and in that respect resembles the Chief Finance Officer, Chief Investment Officer or Chief Risk Officer. The CDO also operates across the organisation, so has similarities to the Chief Operating Officer or Chief Accounting Officer. However, the CDO does have a unique set of challenges. More than anything, the role is still being defined, and in the absence of certainty comes the assumption that the role will solve all the problems the organisation is facing. In many organisations the Chief Data Officer is a new role (the number of people in CDO roles doubled from 2013 to 2014, and probably doubled again in 2015 – Karl Greenberg, MediaPost, 2015), whilst the other C-suite executives have roles and responsibilities which the organisation recognises and understands.
The Chief Data Officer is bringing a new dimension and focus to the organisation: 'data'. All organisations will have used and depended on data for a long time, but the arrival of the CDO will be the signal that the business intends to be data driven, that data will have a new importance in the business, and that it will be pivotal to the future of the business. Most organisations will be demonstrating poor practices and bad habits in their collection, use, storage and command of data. So the CDO will be bringing a new culture and regime, and any change brings with it a level of fear.
To achieve this difficult task of changing culture across an organisation, and changing the way individuals and the business use and view their data, the CDO needs some unique qualities.
The CDO has to be a skilled communicator, able to speak to all levels of the business from the board to the office floor. The real skill in this communication is two-fold: first, the ability to translate quite complex 'data' concepts and technology into the appropriate language for every level and facet of the business; and second, the ability to use communication to win hearts and minds.
The CDO needs to be a master of relationship building: they will need the support of fellow C-suite executives to deliver the data strategy vision. The CDO will rely on other parts of the business to deliver much of the data strategy: IT to deliver the technology, Customer Support to deliver improved data entry. At times the CDO will need to go toe-to-toe with colleagues, but the most effective results will be achieved through good relationships.
These good relationships will be built on credibility. The new CDO must be credible to the board, colleagues and the business. The business must trust and have confidence in the new CDO. The CDO will be leading big, new ideas, and therefore must be credible.
Much of the credibility is founded on specialist data knowledge. The new CDO must know ‘data’ and have a thorough understanding of data governance, data management, data quality, data science, advanced analytics, data strategy and data technology. Perhaps not the detail that the data team will bring, but enough to develop the data strategy and create the bridge between the specialists and the board.
The CDO must be the cheerleader for data and have a driving passion that convinces other people of the value of data and a good data strategy.
The new CDO must be able to shift gear between tactical delivery and strategic planning, for two reasons. First, to avoid the 'hype cycle' (more on that in another article), it is important that the CDO delivers incremental value to the business. Second, they will need to identify the quick wins and easy fixes that stabilise and rationalise the current data environment whilst the data strategy is being rolled out.
The CDO will also need a sprinkling of luck. They will be faced with unexpected situations, difficult people, organisational resistance and institutional muscle memory; the proportions of these will depend on their luck.
Finally, and this probably falls across all of the above qualities, is the ability to recruit good people.
The Chief Data Officer’s Playbook will be published in November by Facet Publishing.
Sign up to our mailing list to hear more about new and forthcoming books. Plus, receive an introductory 30% off a book of your choice – just fill in your details below and we’ll be in touch to help you redeem this special discount:*
*Offer not available to customers from USA, Canada, Australia, New Zealand, Asia-Pacific
As Love Your Data Week draws to a close we’ve got one final open access chapter to share. The chapter, A pathway to sustainable research data services by Angus Whyte, is part of Delivering Research Data Management Services. You can download the chapter here.
For one last chance to win one of our research data management books, share a tweet about why you (or your institution) are participating in Love Your Data Week 2017 using #WhyILYD17. More details about the prize draw are available here.
Guest post by Starr Hoffman, editor of Dynamic Research Support for Academic Libraries.
Similar to the confusion between open access and open source, the terms research data and secondary data are sometimes confused in the academic library context. A large source of confusion is that the simple term "data" is used interchangeably for both of these concepts.
What is Research Data?
As research data management (RDM) has become a hot topic in higher education due to grant funding requirements, libraries have become involved. Federal grants now require researchers to include data management plans (DMPs) detailing how they will responsibly 1) make taxpayer-funded research data available to the public via open access (for instance, by depositing it in a repository) and 2) preserve it for the future. Because there are often gaps in campus infrastructure around RDM and open access, many academic libraries have stepped in to provide guidance on writing data management plans, finding appropriate repositories, and other good data management practices.
This pertains to original research data – that is, data collected by the researcher during the course of their research. Research data may be observational (from sensors, etc.), experimental (gene sequences), or derived (data or text mining), among other types, and may take a variety of forms, including spreadsheets, codebooks, lab notebooks, diaries, artifacts, scripts, photos, and many others. Data takes many forms not only in different disciplines, but in different methodologies and studies.
Example: Dr. Emmett "Doc" Brown performs a series of experiments in which he notes the exact speed at which a DeLorean will perform a time jump (88 MPH). This set of data is original research data.
What is Secondary Data?
Secondary data is usually called simply "data" or "datasets." (For the sake of clarity, I prefer to refer to it as "secondary data.") Unlike research data, secondary data is data that the researcher did not personally gather or produce during the course of their research. It is pre-existing data on which the researcher will perform their own analysis. Secondary data may be used either to perform original analyses or for replication (studies which follow the exact methodology of a previous study, in order to test the reliability of the results; replication may also be performed by following the same methodology but gathering a new set of original research data). Secondary data can also be joined to additional datasets, whether datasets from different sources or the researcher's own original research data.
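To make the idea of joining secondary data to original data concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the county names, the "census" population figures and the survey counts are invented for illustration, and the join is simply a match on a shared key followed by deriving a new variable.

```python
# Hypothetical sketch: joining a secondary dataset (e.g. downloaded
# census population figures) to a researcher's own original survey
# counts, matched on a shared "county" key. All figures are invented.

secondary = {"Clark": 2266715, "Washoe": 486492}   # county -> population
original = {"Clark": 412, "Washoe": 87}            # county -> survey responses

# Join on the shared key and derive a new variable from both sources.
joined = {
    county: {
        "responses": responses,
        "population": secondary[county],
        "responses_per_100k": responses / secondary[county] * 100_000,
    }
    for county, responses in original.items()
    if county in secondary
}

for county, row in joined.items():
    print(county, row)
```

In practice this kind of merge is usually done with statistical software or a library such as pandas, but the logic is the same: a shared identifier links the researcher's original records to the pre-existing secondary dataset.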
Example: Let’s say that Marty McFly makes a copy of Doc Brown’s original data and performs a new analysis on it. The new analysis reveals that the DeLorean was only able to time-jump at the speed of 88 MPH due to additional variables (including a power input of 1.21 jigowatts). In this case, the dataset is secondary data.
Reuse of Research Data
Another potential point of confusion is that one researcher’s original research data can be another researcher’s secondary data. For instance, in the example above, the same dataset is considered original research data for Doc Brown, but is secondary data for Marty McFly.
Data Services: RDM or Secondary Data?
The phrase “data services” can also be confusing, because it may encompass a variety of services. A potential menu of data services could include:
- Assistance locating and/or accessing datasets.
  - This might pertain to vendor-provided data collections, consortial collections (such as ICPSR), locally-produced data (in an institutional repository), or publicly-accessible data (such as the U.S. census).
  - Because this service specifically focuses on accessing data, it by default pertains to secondary data.
- Data management plan (DMP) assistance.
  - Typically only applies to original research data.
- Data curation and/or RDM services.
  - These may include education on good RDM practices, assistance depositing data into an institutional repository (IR), assistance (or full service) creating descriptive or other metadata, and more.
  - Typically only provided for original research data. However, if transformative work has been done to a secondary dataset (such as merging it with additional datasets or transforming variables), data curation/RDM may be necessary.
- Assistance with data analysis.
  - This service is more often provided for students than for faculty, but may include both groups.
  - Services may include providing analysis software, software support, methodological support, and/or analytical support.
  - May include support for both original research data and secondary data.
You Say “Data Are,” I Say “Data Is” …Let’s Not Call the Whole Thing Off!
So in the end, why does all this matter? The primary takeaway is to be clear about specific types of data, particularly when communicating about services the library will or won't provide. In many cases this will be obvious – for instance, "RDM" contains within it the term "research data" and is thus clear. Less clear is when a library department decides to provide "assistance with data." What does this mean? What kind of assistance, and for what kind of data? Is the goal of the service to support good management of original research data? Or is the goal to support the finding and analysis of secondary data that the library has purchased? Or another goal altogether?
Clarity is key both to understanding each other and to clearly communicating emerging services to our researchers.
Starr Hoffman is Head of Planning and Assessment at the University of Nevada, Las Vegas, where she assesses many activities, including the library’s support for and impact on research. Previously she supported data-intensive research as the Journalism and Digital Resources Librarian at Columbia University in New York. Her research interests include the impact of academic libraries on students and faculty, the role of libraries in higher education and models of effective academic leadership. She is the editor of Dynamic Research Support for Academic Libraries. When she’s not researching, she’s taking photographs and travelling the world.
Useful as both a teaching text and day-to-day working guide, Digital Curation outlines the essential concepts and techniques that are crucial to preserving the longevity of digital resources.
In this revamped and expanded second edition, Gillian Oliver comprehensively revises Ross Harvey’s original text; widening the scope to address continuing developments in the strategies, technological approaches, and activities that are part of this rapidly changing field.
The key topics covered include:
- the scope and incentives of digital curation, detailing the Digital Curation Centre's (DCC) lifecycle model as well as the Data Curation Continuum
- key requirements for digital curation, from description and representation to planning and collaboration
- the value and utility of metadata
- considering the needs of producers and consumers when creating an appraisal and selection policy for digital objects
- the paradigm shift by institutions towards cloud computing and its impact on costs, storage, and other key aspects of digital curation
- the quality and security of data
- new and emerging data curation resources, including innovative digital repository software and digital forensics tools
- mechanisms for sharing and reusing data, with expanded sections on open access, open data, and open standards initiatives
- processes to ensure that data are preserved and remain usable over time.
The American Archivist said of the first edition: "…clearly written, useful, and fascinating. If you are new to this subject or even if you think you know a lot about it already, this book will provide you with new insights."

This book will be essential reading for any information professional, records manager or archivist who appraises, selects, organizes, or maintains digital resources and has responsibilities as a digital curator.