Repository
https://github.com/utopian-io/utopian.io
Introduction
As part of my role as a @utopian-io Community Manager, I have been set the task of researching how a company or organisation would evaluate a piece of data analysis. Ideally, this would involve finding out how these organisations review analysis work so that 'we' can collate the best practices used.
The goal of this is to provide input into the sets of guidelines for a @utopian-io contributor, moderator, and the community as a whole. Additionally, this research would also then feed into the moderation questionnaire which is used to evaluate and score an analysis contribution to @utopian-io.
An online approach
This is where I started my research into the subject, and due to not owning any relevant publications on the matter (not too surprising I hope!), it is also where the research for this blog ended. I tried a variety of search terms on Google, such as:
'how to evaluate data analysis'
'evaluating data analysis'
'how to write a good data analysis'
'how to analyse an analysis' 😃
It quickly became clear that finding out how companies and organisations evaluate this type of work was not going to be easy. By far the most popular links returned from my searches were articles produced by educational institutions such as colleges and universities.
Thinking about it, it stands to reason that a business is unlikely to share the ins and outs of how it quantifies and judges its internal analysis work, as this may well include tasks specific to the organisation. This, in turn, may give clues about how it does business, and reveal details that it would rather not share with competitors.
An educational establishment, however, will be happy to provide insight into best practices for its staff and students, in order to give them the best chance of producing a solid analysis. The drawback here is that these 'guides' are quite general and say more about how to produce the work than about how to evaluate the work done.
Despite not finding anything on how large corporations score analytical endeavors, I think there is still much value to be gained from detailing what is sought after in a good piece of analysis work. Presenting the general approach to producing a solid analysis should hopefully give insight into how it will then be evaluated. And I suppose the lead question a reviewer of the work will ask is: is this useful to the project it concerns, or not?
What is data analysis?
First though, a couple of definitions of data analysis:
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making source
Data Analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data. source
The process of evaluating data using analytical and logical reasoning to examine each component of the data provided. This form of analysis is just one of the many steps that must be completed when conducting a research experiment. Data from various sources is gathered, reviewed, and then analyzed to form some sort of finding or conclusion. There are a variety of specific data analysis method, some of which include data mining, text analytics, business intelligence, and data visualizations. source
Taking these three definitions of 'data analysis', a number of near-synonymous words pop up frequently: 'inspecting/examine', 'cleansing/condense', 'transforming', 'discovering/finding', and an overlapping phrase including the word 'conclusion'.
From this, we can deduce that a successful or 'useful' analysis will involve carrying out at least some of these techniques. Depending on the type of analysis, covering all of these criteria may not be possible, but as most data analyses tend to follow the same course, it makes sense that the majority could be involved - the more the merrier, one might suggest?
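To make the stages named in these definitions a little more concrete, here is a minimal Python sketch of inspecting, cleansing, transforming, and concluding. The records, field meanings, and function names are all hypothetical and chosen only for illustration; a real analysis contribution would work on real data and would likely use different tooling.

```python
from statistics import mean

# Hypothetical raw records: (author, votes). None marks a missing value.
raw = [("alice", 12), ("bob", None), ("carol", 7), ("alice", 12), ("dave", 30)]

def inspect(rows):
    """Inspect: count rows, missing values, and exact duplicate rows."""
    missing = sum(1 for _, votes in rows if votes is None)
    duplicates = len(rows) - len(set(rows))
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}

def cleanse(rows):
    """Cleanse: drop rows with missing values and repeated rows, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        if row[1] is None or row in seen:
            continue
        seen.add(row)
        cleaned.append(row)
    return cleaned

def transform(rows):
    """Transform: reduce the cleaned rows to the list of vote counts."""
    return [votes for _, votes in rows]

def conclude(values):
    """Conclude: summarise the values so a finding can be stated."""
    return {"count": len(values), "mean": mean(values), "max": max(values)}

report = inspect(raw)
summary = conclude(transform(cleanse(raw)))
print(report)
print(summary)
```

A reviewer could ask the same questions of any contribution: was the data inspected, was it cleaned, was it transformed sensibly, and does the conclusion follow from what remains?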
Considerations/potential issues in data analysis
In order to help form further ideas of what may constitute a successful analysis, it is worth considering some of the potential pitfalls which can lower the quality of the work undertaken. The following has been adapted from https://ori.hhs.gov:
Drawing biased inferences: Approaching a piece of analysis work without a bias towards any particular outcome should reduce the pull towards poorly formed conclusions.
Inappropriate subgroup analysis: Breaking down the analysis into smaller and smaller subgroups, in an attempt to find something deemed worthy of reporting, is unlikely to yield anything of great interest, and drifts further from the initial goals of the work.
Lack of clearly defined and objective outcome measurements: Poorly defined outcome objectives will undermine the work, even if the methods used in the analysis are of good quality.
Manner of presenting data: Full clarity on what is being presented, including clear headings, legends, and the reasoning for displaying each particular piece of information, will aid the work greatly.
Reliability and Validity: This relates to using the right 'stable' data sources and providing details to allow the analysis to be reproduced.
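The subgroup pitfall above can be put into numbers with a short Python sketch. It assumes each subgroup is checked with an independent test at the same significance level, which is a simplification and not something the adapted source spells out; the function name is hypothetical.

```python
def chance_of_false_positive(alpha: float, subgroups: int) -> float:
    """Probability that at least one of `subgroups` independent tests
    looks significant at level `alpha` when no real effect exists."""
    return 1 - (1 - alpha) ** subgroups

# Assumed significance level of 5% per subgroup test.
for k in (1, 5, 20):
    print(f"{k:>2} subgroups: {chance_of_false_positive(0.05, k):.0%}")
```

Under these assumptions, slicing the data into 20 subgroups gives roughly a 64% chance of at least one 'finding' that is pure noise, which is why a reviewer may reasonably ask whether the subgroups were chosen before or after looking at the data.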
Summary
Whilst I did not find exactly what I set out to, there is a wealth of information available on the World Wide Web with regards to producing a successful analysis contribution. This information, as it has to some extent in the past, should be of some use in helping build a solid set of guidelines, and enable the expansion of the review questionnaire.
Once the questionnaire/guidelines have been re-assessed, I can envision writing a couple more blog contributions on this specific topic with the #iamutopian tag, perhaps something like 'how to produce a successful analysis contribution to @utopian-io'.
Thanks for your time,
Asher [Analysis - Community Manager]
Resources
https://ori.hhs.gov/education/products/n_illinois_u/datamanagement/datopic.html
https://en.wikipedia.org/wiki/Data_analysis
http://www.businessdictionary.com/definition/data-analysis.html
https://github.com/utopian-io/moderation-guidelines/blob/master/categories/analysis.md