Some guidance from the #SDW13 London panel on how to get a social data project running successfully:
(1) Map the social data ecosystem: what are the data sources; who are the users; who will analyse the data; what data is important?
(2) Get an early client (internal or external) and iterate quickly.
(3) Build slowly and try not to over-think it; generate some early value.
Apparently some of these were learnt the hard way, so pay attention!
A compelling presentation showing how easy it is to use off-the-shelf software (COTS in the old jargon) to go all the way from extracting social data to sorting, querying and presenting it.
For those of us from an IT Architecture background, I’ve illustrated the ETL/Data Warehouse type steps that these three offerings bring together (they have already built the integrations). It really did look very straightforward.
What was missing, for me, was the ability to explore or analyse the social network. I spoke to DataSift a few weeks ago about Twitter data and they explained that they did not provide follower data, so at this stage those of us wanting to look into the networks are still going to have to write a bit of code.
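To give a flavour of the "bit of code" involved, here is a minimal sketch in Python. It assumes you have already collected follower relationships yourself as (follower, followed) pairs; the names and the `edges` list are made up for illustration and do not come from any particular provider's API.

```python
from collections import defaultdict

# Hypothetical (follower, followed) pairs; in practice these would come from
# whatever follower data you manage to collect yourself.
edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "dave"),
    ("alice", "dave"),
    ("erin", "bob"),
    ("bob", "alice"),
]


def follower_counts(edges):
    """Count how many followers each account has."""
    counts = defaultdict(int)
    for _follower, followed in edges:
        counts[followed] += 1
    return dict(counts)


def mutual_follows(edges):
    """Return the pairs of accounts that follow each other."""
    edge_set = set(edges)
    return {
        tuple(sorted((a, b)))
        for a, b in edge_set
        if (b, a) in edge_set
    }


counts = follower_counts(edges)
most_followed = max(counts, key=counts.get)
print(most_followed, counts[most_followed])
print(mutual_follows(edges))
```

Follower counts and mutual follows are about the simplest network measures there are; anything richer (communities, influence paths) would probably call for a proper graph library.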
A very interesting day. In the introduction the chairman asked for a show of hands to see who worked in IT and who worked in other parts of their organisations: the split was about 50/50.
My main observations are:
- Hadoop is the standard; think of it as ETL on steroids: you will probably still want to feed the results into traditional databases and analytical tools. Hive provides a SQL-like language over the top. You can use Hadoop to make your archives ‘active’.
- Organisations need to know what is being said about them: too often people find out what is happening in their own organisation on social media first.
- Think of the value in the data. For example, car manufacturers are increasing the number of sensors in cars and collecting the data: they understand how you drive, so maybe they could offer you insurance?
- Context is very important when looking at a piece of unstructured data.
- Decision makers need to be given a relevant subset of data.
- Organisations need to monitor global mega-trends. Take a look at http://www.news-spectrum.com/
- If you are analysing email content, the disclaimers often placed at the end of the message can lead to a lot of misleading conclusions.
- “See Lots – Know Little – Do Less” (David Ackroyd, Telefonica); in other words too much information is not useful
- When you have a lot of data you can start looking for hidden patterns
- Prediction: can you spot customers who are about to depart?
- A Big Data initiative needs to offer value. Look for the sweet spot: a conjunction of revenue, cost and risk.
- Make sure Big Data thinking includes an outside-in perspective
- Is Data Art the next big paradigm?
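On the point above about email disclaimers skewing analysis, one simple approach is to cut the disclaimer off before the text goes anywhere near your analytics. The sketch below is only an illustration: the `DISCLAIMER_START` pattern and the sample message are my own assumptions, and real disclaimers vary a lot between organisations, so the pattern would need tuning.

```python
import re

# Hypothetical pattern for the first line of a typical disclaimer.
# Real-world disclaimers vary, so treat this as a starting point only.
DISCLAIMER_START = re.compile(
    r"^(this (e-?mail|message)\b.*\b(confidential|intended)|disclaimer\b)",
    re.IGNORECASE,
)


def strip_disclaimer(body: str) -> str:
    """Keep everything before the first line that looks like a disclaimer."""
    kept = []
    for line in body.splitlines():
        if DISCLAIMER_START.match(line.strip()):
            break
        kept.append(line)
    return "\n".join(kept).rstrip()


message = (
    "Hi team,\n"
    "The launch went well and customers seem happy.\n"
    "\n"
    "This email and any attachments are confidential and intended solely "
    "for the addressee.\n"
    "If you have received it in error please delete it.\n"
)

cleaned = strip_disclaimer(message)
print(cleaned)
```

Without this step, words such as "confidential" or "error" in the footer would be counted against every message, which is exactly the kind of misleading conclusion mentioned in the list.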