SIB: Sorted-integers-based Index for Compact and Fast Caching in
Top-down Logic Rule Mining
- Ruoyu Wang,
- Raymond Wong,
- Daniel Sun,
- Rajiv Ranjan
Ruoyu Wang
University of New South Wales School of Computer Science and Engineering
Abstract
Mining logic rules from structured knowledge bases is the basis of
knowledge engineering. Due to the NP-hardness of the rule mining
problem, logic rules cannot be efficiently induced from knowledge bases,
especially large-scale ones, and most mining techniques employ
algorithmic and architectural optimizations to improve efficiency.
Data-oriented optimizations have also been explored to some extent, but
their data efficiency is relatively low, so memory consumption has
become a new challenge for state-of-the-art systems. In this
article, we propose a compact and efficient index structure for the
maintenance of the intermediate data during top-down rule mining. The
index is based on a mapping from constant symbols to integers and the
sorting of the mapped integers. We evaluate our method on six datasets
that contain up to 160K records and are frequently used as benchmarks
in knowledge-engineering tasks. The experimental results show
that the proposed technique speeds up the rule mining procedure by 5x on
average and reduces memory consumption by up to 70%. The space overhead
of the data structure is about twice that of the indexed records, which
is more than 80% lower than that of the state-of-the-art technique.
28 May 2024 Submitted to Software: Practice and Experience
12 Jun 2024 Reviewer(s) Assigned
24 Aug 2024 Review(s) Completed, Editorial Evaluation Pending
31 Aug 2024 Editorial Decision: Revise Major
30 Oct 2024 1st Revision Received
07 Nov 2024 Submission Checks Completed
07 Nov 2024 Assigned to Editor
07 Nov 2024 Review(s) Completed, Editorial Evaluation Pending
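
The abstract describes the index only at a high level: constant symbols are mapped to integers, and the mapped integers are sorted. The sketch below illustrates that general idea and is not the SIB structure from the article. The class name `SortedIntIndex`, its methods, and the one-sorted-array-per-argument layout are assumptions made purely for illustration.

```python
from bisect import bisect_left


class SortedIntIndex:
    """Hypothetical illustration of a sorted-integer index (not the paper's SIB)."""

    def __init__(self, records):
        # Assumption: all records share one arity (one relation).
        self.const_to_int = {}   # constant symbol -> integer code
        self.int_to_const = []   # integer code -> constant symbol
        self.rows = [tuple(self._encode(c) for c in rec) for rec in records]
        arity = len(self.rows[0]) if self.rows else 0
        # One sorted array of (code, row_id) pairs per argument position.
        self.columns = [
            sorted((row[pos], rid) for rid, row in enumerate(self.rows))
            for pos in range(arity)
        ]

    def _encode(self, constant):
        # Assign the next unused integer to a constant seen for the first time.
        if constant not in self.const_to_int:
            self.const_to_int[constant] = len(self.int_to_const)
            self.int_to_const.append(constant)
        return self.const_to_int[constant]

    def lookup(self, pos, constant):
        """Return all records whose argument at `pos` equals `constant`."""
        code = self.const_to_int.get(constant)
        if code is None:
            return []
        pairs = self.columns[pos]
        # Binary search for the contiguous run of pairs carrying this code.
        lo = bisect_left(pairs, (code, -1))
        hi = bisect_left(pairs, (code + 1, -1))
        return [
            tuple(self.int_to_const[c] for c in self.rows[rid])
            for _, rid in pairs[lo:hi]
        ]


records = [("alice", "bob"), ("alice", "carol"), ("dave", "bob")]
index = SortedIntIndex(records)
by_first = index.lookup(0, "alice")
by_second = index.lookup(1, "bob")
```

In this sketch, each lookup is a binary search over a flat sorted array rather than a probe into a hash table of sets, which is one plausible way a sorted-integer layout could trade pointer-heavy structures for compact storage. The actual design and its measured overhead are those reported in the article.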