In contrast to methods that rely on centralized training, emerging Internet of Things (IoT) applications can employ federated learning (FL) to train a variety of models with better performance and stronger privacy preservation. FL calls for the distributed training of local models at end-devices, which requires substantial processing power (i.e., CPU cycles/sec). Most end-devices, such as IoT temperature sensors, have limited computing power. One solution to this problem is split FL. However, split FL has its own problems, including a single point of failure, fairness issues, and a poor convergence rate. To overcome these issues, we provide a novel framework called hierarchical split FL (HSFL). Our HSFL framework is built on grouping. Within each group, partial models are computed at the devices, with the remaining computation done at the edge servers. After the local models are computed, each group performs local aggregation at the edge. The resulting edge aggregated model is then sent to the end-devices so that they can update their models. After a set number of rounds, this procedure produces a unique edge aggregated HSFL model for each group. These edge aggregated HSFL models are then shared among the edge servers and aggregated to produce a global model. Additionally, to reduce the cost of HSFL, we formulate an optimization problem that takes into account the relative local accuracy (RLA) of devices, transmission latency, transmission energy, and the edge servers' compute latency. The formulated problem is a mixed-integer non-linear programming (MINLP) problem and is difficult to solve directly. To tackle this challenge, we decompose the formulated problem into two sub-problems: (i) an edge computing resource allocation sub-problem, and (ii) a joint RLA minimization, wireless resource allocation, task offloading, and transmit power allocation sub-problem.
Because the edge computing resource allocation sub-problem is convex, we solve it using a convex optimizer, whereas we use a block successive upper-bound minimization (BSUM) based approach for the joint relative local accuracy (RLA) minimization, wireless resource allocation, task offloading, and transmit power allocation sub-problem. Finally, we present performance evaluation results for the proposed HSFL scheme.
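The two-level aggregation described above (per-group aggregation at the edge, followed by aggregation across edge servers) can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: models are represented as flat lists of parameters, and both aggregation levels use a sample-count weighted average in the style of federated averaging. The names `weighted_average` and `hierarchical_aggregate` are hypothetical, and the sketch omits both the device/edge model split and the repeated edge rounds.

```python
from typing import List, Tuple

# Assumption: a model is a flat list of parameters (illustrative only).
Model = List[float]


def weighted_average(models: List[Model], weights: List[float]) -> Model:
    """Return the weighted average of equally sized parameter vectors."""
    total = sum(weights)
    if not models or total <= 0:
        raise ValueError("need at least one model and a positive total weight")
    dim = len(models[0])
    return [
        sum(w * m[i] for m, w in zip(models, weights)) / total
        for i in range(dim)
    ]


def hierarchical_aggregate(
    groups: List[List[Tuple[Model, int]]],
) -> Tuple[List[Model], Model]:
    """Aggregate per group at the edge, then across edge servers.

    Each group is a list of (local_model, num_samples) pairs.
    Assumption: both levels weight models by sample count.
    """
    edge_models: List[Model] = []
    edge_sizes: List[float] = []
    for group in groups:
        models = [model for model, _ in group]
        sizes = [float(n) for _, n in group]
        # Edge-level aggregation: one model per group.
        edge_models.append(weighted_average(models, sizes))
        edge_sizes.append(sum(sizes))
    # Global aggregation over the edge aggregated models.
    global_model = weighted_average(edge_models, edge_sizes)
    return edge_models, global_model


if __name__ == "__main__":
    groups = [
        [([1.0, 2.0], 1), ([3.0, 4.0], 3)],  # group served by edge server 0
        [([5.0, 6.0], 4)],                   # group served by edge server 1
    ]
    edge_models, global_model = hierarchical_aggregate(groups)
    print(edge_models, global_model)
```

In this sketch each edge server keeps its own group-level model, and only these edge models take part in the global step, mirroring the sharing of edge aggregated models among edge servers.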