Competitive intelligence automation offers value creation opportunities: Q&A with Partha Bhattacharjee of Cambridge Semantics

May 3, 2017

For this Q&A we spoke with Partha Bhattacharjee, who has written about automation in competitive intelligence in a series titled “Riding the Competitive Intelligence Automation Wave”. Partha is a Senior Solutions Engineer at Cambridge Semantics, Inc. (CSI), an enterprise analytics and big data management software company. He has been a CI professional for over five years and holds graduate degrees in Systems Engineering and Technology and Policy from MIT.

Emerging Strategy: Taking the title of your series into account, “Riding the Competitive Intelligence Automation Wave”, I will just get right to the point: should competitive intelligence professionals worry about their jobs being automated any time soon?

Partha Bhattacharjee: No, CI professionals will not lose their jobs. Instead, their jobs will evolve and it will be critical that they keep up. It’s true that parts of a CI professional’s job will be automated going forward. What’s important to realize is that this trend is not new. Parts of CI have continuously been automated over time. News crawling, for instance, underwent transformation with the foray of RSS feeds into the mainstream. To the best of my knowledge, not too many CI professionals lost their jobs to prior phases of automation.

Having said that, the current wave of automation is more sophisticated and encompasses tasks that require a higher degree of cognition. Analytics on unstructured text, and the harmonization with structured data, is a prime example in the context of CI. Advances in semantic technologies, natural language processing, and machine learning, often overlapping in nature, now enable tools to process diverse text ranging from analyst call transcripts to scientific publications in a manner that approximates human interaction with the content. Data extracted from the text can then be linked to and stored in spreadsheets and databases. If you come to think of it, these steps represent the bulk of the secondary research performed by CI analysts. But this type of secondary research performed by a computer is unlikely to put a competent CI professional out of a job – on the contrary, the analyst will have more bandwidth for higher value tasks such as primary research and data analysis.

ES: You write that the utility of CI is not in question, but that outdated methods of CI collection are putting pressure on departments to visibly demonstrate their value to senior managers. So how can senior managers who consume CI get the most out of their limited (or shrinking) CI budgets in today’s environment?

PB: Senior managers need to both envision and evangelize a CI project as a digital initiative because deep entrenchment of technology is critical for maximizing a project’s value. The project roadmap needs to include the tools used at each stage as well as a factual basis for selecting one tool over another. CI managers have typically been subject and/or practice area experts. By viewing a CI project through a technology lens, managers will be able to maximize the throughput at each stage of the project. Building technology in from the outset in line with that vision, instead of adding tools in an ad hoc manner, will only augment a manager’s effectiveness in leading a CI project.

CI managers also need to zealously scrutinize the value returned from every dollar invested using metrics such as Time To Value. The goal must be to allocate most of the investment to high-value tasks such as accessing sources of differentiating information and generating insights. For most companies, such scrutiny will in all likelihood reveal that an inordinate amount of analysts’ time is spent on collating data and beating it into a shape that is amenable to analysis. The majority of such tasks can be automated, thus enabling CI managers to deploy their scarce human expertise toward connecting dots between data, augmenting subject matter expertise, and generating insights.

Other cost sinks often emerge around primary research. The trend of minimizing costs using technology to bridge physical distance has been in effect for several years now. Tools for large scale surveys and reporting, text-to-speech applications, and mobile-based data collection tools are helping minimize costs and improve data quality. Interconnecting or harmonizing such data will generate significant value.

ES: Do you have any particular advice for analysts burdened with a great deal of tedious ‘legwork’ that takes time away from higher value activities such as analysis that managers expect of them?

PB: I am fortunate to have had the opportunity to work as a CI analyst early in my career before diving into the world of creating smart data-driven CI solutions. The more I see of the spectrum of tools that a CI analyst can potentially use, the more I am convinced that familiarity with a wide variety of analytical tools and techniques is probably the most important asset a CI analyst can possess apart from an inquisitive nature and subject matter expertise.

A CI analyst needs to view herself as the conductor of an orchestra. The musicians are the software tools that perform one or more of the required tasks in the CI pipeline. The quality of insights is a function of how well the analyst can strike harmony between an array of tools and techniques. Every step of the CI process can leverage automation to varying degrees.

I would particularly draw attention to unstructured text analytics given that secondary research, apart from cleaning messy data, tends to be the most time intensive exercise in most CI projects. To drastically reduce the time invested in secondary research and enhance productivity and scale of analysis, I strongly urge CI practitioners to consider using text analytics tools. A skilled CI analyst can identify and visualize crucial data from thousands of documents in a matter of hours through an intelligent combination of multiple annotators and dashboards.
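Editor's illustration: to make the idea of combining multiple annotators concrete, here is a minimal Python sketch. It is not drawn from any tool mentioned in the interview. The company watch list, the sample documents, and the function names are all hypothetical, and production text analytics tools rely on far richer language processing than simple string and pattern matching.

```python
import re
from collections import Counter

# Hypothetical watch list; a real project would load this from a maintained source.
COMPANIES = ["Acme Corp", "Globex"]

# Pattern for amounts such as "$12 million" or "$1.5 billion" (an illustrative rule only).
MONEY = re.compile(r"\$\d+(?:\.\d+)?\s?(?:million|billion)", re.IGNORECASE)


def annotate_companies(text):
    """Dictionary annotator: return the watch-list names that appear in the text."""
    lowered = text.lower()
    return [name for name in COMPANIES if name.lower() in lowered]


def annotate_money(text):
    """Pattern annotator: return the dollar amounts found in the text."""
    return MONEY.findall(text)


def summarize(documents):
    """Run both annotators over every document and combine their output."""
    mentions = Counter()  # number of documents mentioning each company
    amounts = {}          # dollar amounts that co-occur with each company
    for doc in documents:
        found = annotate_companies(doc)
        mentions.update(found)
        for name in found:
            amounts.setdefault(name, []).extend(annotate_money(doc))
    return mentions, amounts


# Made-up sample documents standing in for news items or call transcripts.
docs = [
    "Acme Corp announced a $12 million investment in a new plant.",
    "Analysts expect Globex to respond; Acme Corp declined to comment.",
]
mentions, amounts = summarize(docs)
```

The resulting `mentions` and `amounts` structures are the kind of tabular output that could then feed a spreadsheet or dashboard.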

ES: What are the best practices for structuring MI/CI projects that require regular updating and managing of complex information from disparate sources?

PB: There are a few core design decisions that need to be considered:

  1. Adopting smart data lakes: CI managers across enterprises are realizing that they are fighting a losing battle against the burgeoning diversity of data. Traditional data management systems are unable to cope with the variety of data, and analysts are consequently too bogged down in low-value repetitive data collation tasks to fully apply themselves to insight generation. Going forward, MI/CI projects will need to be anchored to what are colloquially known as ‘smart’ data lakes to be able to handle complex data from disparate sources. In such data lakes, all the data that a CI analyst has access to are interlinked and stored using a flexible data model so that depending on the project, the relevant data can be extracted and reused.
  2. Using advanced text analytics solutions: As previously discussed, current text analytics tools provide an array of features ranging from entity extraction to document translation. The use of such tools is a must to identify data of interest in unstructured text.
  3. Establishing robust data pipelines: Data lakes need to be connected to relevant data sources through data ‘pipelines’ that periodically update their content. The flexibility of Resource Description Framework (RDF), the semantic data model, is a critical differentiator in this context as it can seamlessly accommodate changes in data’s structure as well as context.
  4. Avoiding vendor lock-in: The key to successful management of diverse data is the ability to use best-of-breed tools or components that fit one’s workflow. Hence, it is important for practitioners to be able to work with platform(s) where different tools can be used concurrently.
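
Editor's illustration: the flexibility attributed to RDF in the list above comes from storing every fact as a subject-predicate-object triple. The Python sketch below is a plain stand-in for that idea, not a real RDF library or the API of any product mentioned here. The `ex:` identifiers, the sample facts, and the `add` and `query` helpers are hypothetical.

```python
# Each fact is a (subject, predicate, object) triple, the basic shape of RDF data.
triples = set()


def add(subject, predicate, obj):
    """Store one fact; no table definition or schema is declared up front."""
    triples.add((subject, predicate, obj))


def query(subject=None, predicate=None, obj=None):
    """Return matching triples in sorted order; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Facts loaded from an initial, hypothetical source.
add("ex:AcmeCorp", "ex:type", "ex:Competitor")
add("ex:AcmeCorp", "ex:headquarters", "Boston")

# A later pipeline run brings attributes that did not exist before.
# They are simply added as new triples; nothing has to be migrated.
add("ex:AcmeCorp", "ex:announcedProduct", "ex:WidgetX")
add("ex:WidgetX", "ex:launchYear", "2017")

# Everything known about one competitor, whichever source it came from.
facts = query(subject="ex:AcmeCorp")
```

Because the new product facts link to the existing `ex:AcmeCorp` node, data from different sources is interlinked without redesigning the store, which is the property the pipeline point relies on.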
