Combining Data to Weight Social Connections in an Organisation

SNA Data Sources

In the above diagram, red nodes are from the division of the organisation under study; green and blue nodes are from two other divisions, and grey nodes are uncategorised or belong to central functions.

I’ve previously described determining a ‘score’ for social connections using data from email, meetings, the corporate directory and timesheets. The question is how to combine them to produce as complete a picture as possible from the data at hand. I’m fairly sure the best answer is not simply to add the scores together, but I’ve not found any guidance that would help me do anything else, so that’s exactly what I have done…add them up. To summarise, the score (or, more correctly, weight) given to each edge is made up of:

  • 1 point for each email exchanged (counting only emails with a maximum of 10 recipients)
  • 1 point for each minute in a one-to-one meeting, reducing rapidly as the number of attendees in the meeting increases
  • 300 points for being in a manages/managed by relationship
  • 1 point for every hour spent on a project divided by the number of people on the project
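
As a rough illustration of the "add them up" approach, the four rules above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code behind the diagram: the input shapes and the names `combine_weights`, `meeting_points` and `edge` are hypothetical, the meeting decay (minutes divided by the number of attendee pairs) is just one possible reading of "reducing rapidly", and using the smaller of two people's hours for a project edge is likewise an assumption.

```python
from collections import defaultdict
from itertools import combinations

MAX_RECIPIENTS = 10    # emails with more recipients than this are ignored
MANAGER_POINTS = 300   # flat score for a manages/managed-by relationship


def edge(a, b):
    """Undirected edge key: the same pair always maps to the same tuple."""
    return tuple(sorted((a, b)))


def meeting_points(minutes, attendees):
    """ASSUMPTION: the exact decay is not given in the post. Here the minutes
    are divided by the number of attendee pairs, so a one-to-one meeting
    scores 1 point per minute and larger meetings fall away quickly."""
    pairs = attendees * (attendees - 1) / 2
    return minutes / pairs


def combine_weights(emails, meetings, manages, projects):
    """Sum the four scores into a single weight per edge.

    emails:   iterable of (sender, [recipients])
    meetings: iterable of (minutes, [attendees])
    manages:  iterable of (manager, report)
    projects: iterable of {person: hours} dicts, one per project
    """
    weights = defaultdict(float)

    for sender, recipients in emails:
        if len(recipients) <= MAX_RECIPIENTS:
            for recipient in recipients:
                if recipient != sender:
                    weights[edge(sender, recipient)] += 1

    for minutes, attendees in meetings:
        people = sorted(set(attendees))
        if len(people) >= 2:
            points = meeting_points(minutes, len(people))
            for a, b in combinations(people, 2):
                weights[edge(a, b)] += points

    for manager, report in manages:
        weights[edge(manager, report)] += MANAGER_POINTS

    for hours_by_person in projects:
        team_size = len(hours_by_person)
        for a, b in combinations(sorted(hours_by_person), 2):
            # ASSUMPTION: the shared hours of a pair are taken to be the
            # smaller of the two people's hours on the project.
            shared = min(hours_by_person[a], hours_by_person[b])
            weights[edge(a, b)] += shared / team_size

    return dict(weights)


example = combine_weights(
    emails=[("ann", ["bob"]), ("bob", ["ann", "cat"])],
    meetings=[(30, ["ann", "bob"]), (60, ["ann", "bob", "cat"])],
    manages=[("ann", "bob")],
    projects=[{"ann": 40, "bob": 20}],
)
print(example[("ann", "bob")])  # 2 emails + 50 meeting + 300 manager + 10 project = 362.0
```

In this made-up example the manager relationship alone accounts for most of the ann/bob weight, which shows how sensitive a simple sum is to the relative size of each source's points.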

I’ve pulled this data together over the following periods:

  • Email: 6 months
  • Meetings: 2 years, 3 months
  • Corporate Directory: 6 months (but this is very slow to change so probably reflects the vast majority of the last 2 years)
  • Timesheets (projects): 1 year

The coverage of the data also varies:

  • Email: for the core of the organisation being studied this is excellent, as the data comes from Exchange Server logs; for the periphery coverage is limited, as only emails exchanged with the core are captured
  • Meetings: probably less than 50% as not all rooms are visible and not all meetings are booked in rooms; also teleconference information is not captured
  • Corporate Directory: very good, 90%+ but data is limited to corporate hierarchy
  • Timesheets: good but the system is not used universally as not everyone works on projects.

Some Observations using this approach:

  • Email dominates the structure of the network; the other sources add very little for those in the core. For those outside the core, however, the other sources provide additional insight into the structure.
  • There is overlap in these sources: for example, we expect that a manager will share emails with their reports and that people on a project will have meetings together. As the coverage of each source is not complete, though, this is a small price to pay for seeing the whole network.

Despite the rather simplistic approach, the results appear to work quite well, but I’d love to hear from anyone who has implemented, or read about, a smarter way to combine these types of SNA sources.
