Technical overview of e-mail network-based Insights

Article Series on Organizational Network Analysis and Communication Content Analysis


This is a follow-up to the Part 1 article, where we mentioned a few of the key pain points we want to address.

Here, we dig deeper into the two most interesting insights:

  • Internal Succession Planning
  • What-If Employee Leaves

Internal Succession Planning

Internal candidates are often promoted as successors. Such promotions are usually based on performance and on social relationships with the departing leader and surrounding influence groups, without data-driven analysis. Today's systems do a suboptimal job of succession planning because they are biased and do not take organizational network analysis (ONA), i.e. data-driven networks and relationships, into account.

In our approach, in addition to leader-successor communication, we look at the overall social state, which indirectly involves other factors and confounders. We have borrowed the concept of Social Dispersion, developed at Facebook. This article, among others, states that:

An individual’s network neighborhood — the set of people to whom he or she is linked — has been shown to have important consequences in a wide range of settings, including social support and professional opportunities.

The phrase "including social support and professional opportunities" is an important basis for exploring succession planning, in addition to the mutual-friends communication paradigm. The downside of this approach is that it is not easy to perform such an analysis for external candidates: the candidate and the departing leader are in different communication domains, which makes e-mail network-based analysis almost impossible.

Our implementation can recommend a few potential internal successors to departing leaders.

This is done by checking whether a successor's network also communicates with the leader's network. It can be deduced that high dispersion is a good basis for successful internal succession planning, because it means that the successor's network is highly interconnected with the leader's network. Obviously, the departing leader will take other factors into account when determining the successor, perhaps based on gut feeling rather than data. Our approach is effectively a recommender rather than an absolute match.

However, if a successor is promoted, she will most probably leave a role gap in the organization. Therefore, we go one step further and also recommend a successor for the role the promoted employee vacates. The diagram below explains the concept in more detail.

To implement this, we use only the 'from' and 'to' fields of the e-mail network, without any content (at a later stage we will use timestamps to address relationship decay). The dataset contains 8,000 employees and a quarter of a million e-mails. In terms of technologies and algorithms, we use Python packages such as NetworkX, pandas, and NumPy. The code snippet below is the key part of the Internal Succession Planning implementation:

import networkx as nx
import pandas as pd
# `data` holds one (sender, recipient) pair per e-mail
df = pd.DataFrame(data, columns=['Sender', 'Recipient'])
G = nx.from_pandas_edgelist(df, source='Sender', target='Recipient', create_using=nx.DiGraph())
dis = nx.dispersion(G, normalized=True)
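To make the recommendation step concrete, here is a minimal sketch of how dispersion scores could be turned into a short list of candidates for one leader. The toy e-mails, the names, and the helper `recommend_successors` are hypothetical and only for illustration; the production ranking may well combine further signals. Dispersion between a leader and a contact is high when their mutual contacts are not well connected among themselves, so the contact spans several otherwise separate parts of the leader's network.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy data: each row is one e-mail (sender, recipient).
emails = pd.DataFrame(
    [
        ("leader", "ann"), ("ann", "leader"),
        ("leader", "bob"), ("bob", "leader"),
        ("leader", "cat"), ("cat", "leader"),
        ("leader", "dan"), ("dan", "leader"),
        ("ann", "bob"), ("bob", "ann"),
        ("ann", "cat"), ("cat", "ann"),
        ("ann", "dan"), ("dan", "ann"),
    ],
    columns=["Sender", "Recipient"],
)

G = nx.from_pandas_edgelist(
    emails, source="Sender", target="Recipient", create_using=nx.DiGraph()
)


def recommend_successors(G, leader, top_n=3):
    """Rank the leader's direct contacts by their dispersion with the leader."""
    # With only `u` given, nx.dispersion returns {contact: score} for u's neighbors.
    scores = nx.dispersion(G, u=leader, normalized=True)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


top_candidates = recommend_successors(G, "leader", top_n=3)
print(top_candidates)
```

In this toy network "ann" ranks first: she is the only contact who talks to "bob", "cat", and "dan", who otherwise do not talk to each other. Calling the same helper with the chosen successor as `leader` gives a rough version of the further-down-the-line recommendation.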

For demo audiences we have implemented a 3D network visualization, shown below. The product implementation, however, is exposed as an API as a Service.

(Note that names do not represent our employees)

What-If Employee Leaves

Understanding the communication risk that arises when an employee is no longer in the same role is key to business operations. This feature is especially useful to line managers, who need to maintain normal communication and staffing throughout change. High betweenness centrality means a higher cost of employee replacement: if an employee is highly connected and serves as a communication bridge, introducing a new candidate into that role costs time and money.

Betweenness centrality is defined as the share of times that a node 'i' needs a node 'k' (whose centrality is being measured) in order to reach a node 'j' via the shortest path. Our implementation builds on the research paper Centrality and Network Flow.

A snippet of the core implementation of the algorithm is shown below:
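A small worked example may help here. The sketch below uses a hypothetical, undirected toy network (the real e-mail graph is directed) in which two teams can only reach each other through one person, "kim". All names are made up for illustration.

```python
import networkx as nx

# Hypothetical toy network: two teams that only communicate through "kim".
G = nx.Graph()
G.add_edges_from([
    ("a1", "a2"), ("a1", "kim"), ("a2", "kim"),  # team A
    ("b1", "b2"), ("b1", "kim"), ("b2", "kim"),  # team B
])

# Unnormalized betweenness: for each node k, the number of (i, j) pairs whose
# shortest path runs through k (split evenly when several shortest paths exist).
bc = nx.betweenness_centrality(G, normalized=False)
bridge = max(bc, key=bc.get)

print(bridge, bc[bridge])  # kim 4.0
```

"kim" sits on the only shortest path for each of the four cross-team pairs (a1-b1, a1-b2, a2-b1, a2-b2), while every other node has a score of 0. That is the bridge role the definition is meant to capture.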

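import itertools
from multiprocessing import Pool

import networkx as nx


def betweenness_centrality_chunks(l, n):
    """Yield successive tuples of at most n nodes from l."""
    l_c = iter(l)
    while True:
        x = tuple(itertools.islice(l_c, n))
        if not x:
            return
        yield x


def betweenness_centrality_parallel(G, processes=None):
    """Compute betweenness centrality by splitting the source nodes across processes."""
    with Pool(processes=processes) as p:
        node_divisor = len(p._pool)
        # max(1, ...) avoids a chunk size of 0 on graphs with fewer nodes than workers
        chunk_size = max(1, G.order() // node_divisor)
        node_chunks = list(betweenness_centrality_chunks(G.nodes(), chunk_size))
        num_chunks = len(node_chunks)
        # betweenness_centrality_subset replaces betweenness_centrality_source,
        # which was deprecated and later removed from NetworkX
        bt_sc = p.starmap(
            nx.betweenness_centrality_subset,
            zip([G] * num_chunks,        # graph
                node_chunks,             # sources
                [list(G)] * num_chunks,  # targets
                [True] * num_chunks,     # normalized
                [None] * num_chunks))    # weight
    # Reduce the partial results into a single dictionary
    bt_c = bt_sc[0]
    for bt in bt_sc[1:]:
        for n in bt:
            bt_c[n] += bt[n]
    return bt_c


if __name__ == '__main__':
    btwn_cent = betweenness_centrality_parallel(G)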
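Betweenness scores indicate who is a bridge, but the what-if question can also be explored directly by removing a node and comparing connectivity before and after. The following is a minimal sketch under that assumption, not the production logic; the helper `what_if_leaves` and the toy graph are hypothetical.

```python
import networkx as nx


def what_if_leaves(G, employee):
    """Compare the number of disconnected groups before and after one employee leaves."""
    H = G.copy()
    H.remove_node(employee)
    return {
        "components_before": nx.number_weakly_connected_components(G),
        "components_after": nx.number_weakly_connected_components(H),
    }


# Hypothetical directed e-mail graph: "kim" is the only link between two groups.
G = nx.DiGraph([("a1", "a2"), ("a2", "kim"), ("kim", "b1"), ("b1", "b2")])

print(what_if_leaves(G, "kim"))  # network splits into two groups
print(what_if_leaves(G, "a1"))   # network stays in one piece
```

A rise in the number of weakly connected components is a coarse signal that communication would be cut off. On a real graph one would probably also look at how many shortest paths become longer.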

Again, for demos we have implemented a 3D network visualization.

There are many implementations that estimate the probability of an employee leaving, known as flight risk, employee churn, etc. These approaches are mostly based on machine learning and the analysis of similar cases, e.g. k-means clustering. Our approach differs because it uses ONA metrics, so its conclusions are much more specific to the actual employee than conclusions learned from other cases. A better approach would be a weighted combination of ONA metrics and machine learning, which we are working on.
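One simple way such a weighted combination could look is sketched below. This is purely illustrative: the function name `combined_risk`, the default weight, and the assumption that both inputs are already scaled to the range 0 to 1 are ours, not a description of the work in progress.

```python
def combined_risk(ona_score, ml_score, ona_weight=0.5):
    """Blend an ONA metric and an ML probability, both assumed to lie in [0, 1]."""
    if not 0.0 <= ona_weight <= 1.0:
        raise ValueError("ona_weight must be between 0 and 1")
    return ona_weight * ona_score + (1.0 - ona_weight) * ml_score


# Hypothetical inputs: normalized betweenness of 0.5, ML churn probability of 0.25.
score = combined_risk(0.5, 0.25, ona_weight=0.5)
print(score)  # 0.375
```

The weight would have to be tuned against real outcomes; a fixed 50/50 split is only a placeholder.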

__________________________________________________________________________

Most companies address these issues through written official processes. This will not make them ready for the future, because such processes are static, rarely updated, not easily available, and unable to reflect ever-changing events. We want to change this by introducing a dynamic approach that depends not on written documents but on current information flows, context, trends, communication, and actual employee data. It is essentially a response to VUCA (volatility, uncertainty, complexity, ambiguity).

Our further development is heading toward machine learning, anomaly detection, and time-series analysis.

This article is part of the series on ONA@Haufe

__________________________________________________________________________

  1. How to use corporate e-mail analysis to reveal hidden stars and ensure equal opportunities (Part 1)
  2. Technical overview of e-mail network-based Insights (Part 2)
  3. Deep dive on e-mail network-based Recommendations (Part 3)
  4. How to use trends to find hidden stars and work on a perfect project? People analytics will make you a star (Part 4)
  5. How to implement e-mail content-based analysis (Part 5)

__________________________________________________________________________


More articles by Agron Fazliu
