The combination of the industrial Internet of Things (IIoT) and federated learning (FL) is deemed a promising solution for realizing Industry 4.0 and beyond. In this paper, we focus on a hierarchical collaborative FL architecture over IIoT systems, in which a three-layer design supports the training process. To effectively balance the learning speed, energy consumption, and packet error rate of edge aggregation across the participating IIoT devices, we develop a weighted learning utility function that fuses these performance metrics. An optimization problem is formulated to maximize the weighted learning utility by jointly optimizing the edge association and the allocation of resource blocks (RBs), computation capacity, and transmit power for each IIoT device, under the practical constraints of the FL training process. The resulting problem is a non-convex mixed-integer optimization problem and is therefore difficult to solve. By resorting to the block coordinate descent method, we propose an overall alternating optimization algorithm that solves this problem iteratively. Specifically, in each iteration, first, for given transmit power and computation capacity, the sub-problem of joint RB assignment and edge association is transformed into a three-uniform weighted hypergraph model, which is solved by a local search-based three-dimensional hypergraph matching algorithm. Second, for given RB assignment, edge association, and computation capacity, we employ the successive convex approximation method to optimize the transmit power by turning the sub-problem into a convex approximation problem. Once the proposed alternating optimization algorithm converges to within a tolerance threshold, a locally optimal solution of the original problem can be found.
Numerical results reveal that our proposed joint optimization scheme can increase the system-wide learning utility and achieve significant performance gains over four benchmark schemes.
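
As an illustration of the alternating structure described above, the following minimal sketch runs a block coordinate descent loop that updates one block of variables at a time and stops once the utility changes by less than a tolerance threshold. It is not the algorithm of this paper: the names `alternating_optimization`, `toy_utility`, `update_x`, and `update_y` are hypothetical, and the two-variable concave objective is an assumed stand-in for the weighted learning utility, with closed-form block updates replacing the hypergraph matching and successive convex approximation steps.

```python
from typing import Callable, List, Sequence, Tuple

Point = List[float]


def alternating_optimization(
    utility: Callable[[Point], float],
    block_updates: Sequence[Callable[[Point], Point]],
    init: Point,
    tol: float = 1e-6,
    max_iters: int = 100,
) -> Tuple[Point, List[float]]:
    """Maximize `utility` by cycling through the block updates until the
    utility gain over one full sweep is at most `tol`."""
    point = list(init)
    prev = utility(point)
    history = [prev]
    for _ in range(max_iters):
        # One sweep: each update optimizes its own block with the others fixed.
        for update in block_updates:
            point = update(point)
        cur = utility(point)
        history.append(cur)
        if abs(cur - prev) <= tol:
            break
        prev = cur
    return point, history


# Assumed toy objective (concave), standing in for the learning utility.
def toy_utility(p: Point) -> float:
    x, y = p
    return -(x - 1.0) ** 2 - (y - 2.0) ** 2 - x * y


# Exact maximizer over x for fixed y: solve -2(x - 1) - y = 0.
def update_x(p: Point) -> Point:
    return [1.0 - p[1] / 2.0, p[1]]


# Exact maximizer over y for fixed x: solve -2(y - 2) - x = 0.
def update_y(p: Point) -> Point:
    return [p[0], 2.0 - p[0] / 2.0]


point, history = alternating_optimization(
    toy_utility, [update_x, update_y], init=[5.0, 5.0]
)
print(point, len(history) - 1)
```

Because each block update in this toy case is an exact maximization, the recorded utility never decreases from one sweep to the next, which is the property that lets a tolerance-based stopping rule terminate the loop. In the non-convex mixed-integer setting of the paper, the same loop structure would reach only a locally optimal point.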