KDBI special issue: MapIntel: A Visual Analytics Platform for
Competitive Intelligence
Abstract
Competitive Intelligence allows an organization to keep up with market
trends and foresee business opportunities. This practice is mainly
performed by analysts scanning for any piece of valuable information in
a myriad of dispersed and unstructured sources. Here we present
MapIntel, a system for acquiring intelligence from vast collections of
text data by representing each document as a multidimensional vector
that captures its semantics. The system is designed to support
complex natural language queries and visual exploration of the corpus,
potentially aiding overburdened analysts in finding meaningful insights
to inform decision-making. The system's search module uses a
retriever and re-ranker engine that first finds the closest neighbors to
the query embedding and then sifts the results through a cross-encoder
model that identifies the most relevant documents. The browsing,
or visualization, module also leverages the embeddings by projecting them
onto two dimensions while preserving the multidimensional landscape,
resulting in a map where semantically related documents form topical
clusters, which we capture using topic modeling. This map aims to
provide a fast overview of the corpus while allowing more detailed
exploration and an interactive information-encountering process. We
evaluate the system and its components on the 20 newsgroups dataset,
using the semantic document labels provided, and demonstrate the
superiority of Transformer-based components. Finally, we present a
prototype of the system in Python and show how some of its features can
be used to acquire intelligence from a news article corpus we collected
over a period of eight months.
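
The two-stage search described above (nearest-neighbor retrieval over embeddings, followed by re-ranking of the candidates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the MapIntel implementation: the document vectors are hand-written toy values rather than real embeddings, and `toy_score` is a hypothetical token-overlap stand-in for a cross-encoder model. The function names `retrieve` and `rerank` are likewise illustrative.

```python
import numpy as np


def retrieve(query_vec, doc_vecs, k):
    """Return indices of the k documents closest to the query (cosine similarity)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q  # one cosine similarity per document
    return np.argsort(-sims)[:k].tolist()


def rerank(query, docs, candidate_ids, score_fn, top_n):
    """Re-score each (query, document) pair and keep the top_n candidates."""
    scored = [(score_fn(query, docs[i]), i) for i in candidate_ids]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:top_n]]


def toy_score(query, doc):
    """Hypothetical stand-in for a cross-encoder: fraction of query tokens in the document."""
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / max(len(q_tokens), 1)


docs = [
    "central bank raises interest rates",
    "new smartphone released by tech firm",
    "interest rates hit housing market",
    "football team wins cup final",
]
# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
doc_vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
])
query = "interest rates and the housing market"
query_vec = np.array([1.0, 0.1, 0.0])

candidates = retrieve(query_vec, doc_vecs, k=2)
ranked = rerank(query, docs, candidates, toy_score, top_n=2)
print(candidates, ranked)
```

The retriever is cheap because it compares precomputed vectors, so it can scan the whole corpus. The re-ranker scores each query-document pair jointly, which is more expensive and is therefore applied only to the short candidate list.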