Results
Design of CAR T-cell Activation Simulation
The objective of the simulation is to maximize the number of activated
CAR T-cells through dynamic control of bead addition and removal. At
each time interval, simulated culture-condition data in the form of
tabulated sensor measurements and/or microscopic images are provided
to the agent. The agent can add more beads, remove all beads,
or refrain from acting at that step (Figure 1b). After sufficient
training in a similar environment, the agent is expected to choose,
based on the input data, the action that best serves the end goal. Before
attempting such a control strategy in a physical environment, the
process of bead-based CAR T-cell activation is simulated as an RL
environment (Figure 1a). A 2D surface (Figure 2c) for cell growth is
simulated as a continuous \(n \times n\) grid with a spacing of 10
microns to match the approximate cell diameter. All simulations use a
50 × 50 grid, corresponding to a 500 × 500 micron area. For clarity
when observing the cells, Figure 2c instead shows a 20 × 20 grid for
demonstration purposes. The simulated expansion area is made
continuous (no boundary). All defined parameters for this simulation are
described in Table 2. Although attempts were made to associate these
parameters with literature values, some assumptions were made. It is
important to note that the modular simulation and RL training presented
here can be readily updated as more measured values are determined. A
fixed time step of 6 min is used, derived from
the approximate time a cell takes to travel one diameter, i.e. to the
next grid position (the cell velocity is ~2 microns per
minute). Other factors that affect cellular migration, such as
media viscosity, cell age and cell size, are neglected in
this simplified model. The total simulation lasts for a 7-day expansion
campaign, equivalent to 1600 simulation steps. Bead-to-cell contact,
bead-to-cell ratio and confluence are taken into account in the
simulation rules because of their role in the efficiency of
activation.
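The grid, timing and action conventions described above can be summarised in a minimal Python sketch. The numerical values are those stated in the text; the names `Action`, `wrap`, `grid_extent_um` and `elapsed_minutes` are hypothetical and introduced only for illustration. Reading "continuous (no boundary)" as periodic wrap-around is an assumption of this sketch, not a confirmed detail of the implementation.

```python
from enum import Enum

GRID_SIZE = 50          # 50 x 50 grid, as stated in the text
SPACING_UM = 10         # grid spacing in microns (approximate cell diameter)
MINUTES_PER_STEP = 6    # fixed time step, as stated in the text
TOTAL_STEPS = 1600      # campaign length in steps, as stated in the text


class Action(Enum):
    """The three choices available to the agent at each step (Figure 1b)."""
    ADD_BEADS = 0
    REMOVE_ALL_BEADS = 1
    DO_NOTHING = 2


def wrap(index, size=GRID_SIZE):
    """Map an index back onto the grid so that opposite edges connect.

    Assumption: 'no boundary' is modelled here as periodic wrap-around.
    """
    return index % size


def grid_extent_um(size=GRID_SIZE, spacing=SPACING_UM):
    """Side length of the simulated area in microns."""
    return size * spacing


def elapsed_minutes(step):
    """Simulated time in minutes after a given number of steps."""
    return step * MINUTES_PER_STEP
```

For example, `wrap(-1)` returns 49 and `wrap(50)` returns 0, so a cell leaving one edge of the grid re-enters at the opposite edge.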
At simulation start, the grid is randomly seeded (Figure 2c) with a
specified number of naïve T-cells, shown as red cells in the
simulation. The following steps are iterated for each cell in the
simulation. Step 1: The cell can move to any of the 8 adjacent
grid positions if it satisfies the movement conditions, namely that the
chosen position is vacant and that a stochastically determined
probability of making a move at that step is met (Figure 2a and
Supplement 2). Step 2: If a naïve
cell occupies a position where an activation bead (coupled to anti-CD3
and anti-CD28 antibodies) is present, and if certain conditions are met
(a stochastically determined probability of conversion at that step
exceeding a threshold, detailed in Supplement 2), the naïve cell is
activated and turns blue in the simulation (Figure 2a). Step 3: If an
already activated cell moves to a position where there is a bead,
it becomes exhausted depending on the value of the specified exhaustion
rate (Figure 2a). Step 4: At each time step, each activated cell
becomes exhausted according to a natural, transient exhaustion rate,
which is
\(\frac{\text{natural exhaustion}}{\text{total timesteps}}\)
times smaller than the accelerated exhaustion caused by overexposure
to, and stimulation by, beads (see Table 1 Units and Figure 2b).
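Steps 1 to 4 above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. All probability values are hypothetical placeholders (the actual parameters are those of the simulation's parameter table and Supplement 2), each stochastic condition is simplified to a single Bernoulli draw, and periodic wrap-around at the grid edges is assumed. The names `seed_cells`, `step_cell` and `step_all` are introduced here for illustration only.

```python
import random

NAIVE, ACTIVATED, EXHAUSTED = "naive", "activated", "exhausted"

# The 8 neighbouring offsets (row, column) around a grid position.
NEIGHBOUR_OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0)]


def seed_cells(n_cells, size, rng):
    """Place n_cells naive cells on distinct, randomly chosen positions."""
    positions = [(r, c) for r in range(size) for c in range(size)]
    return {pos: NAIVE for pos in rng.sample(positions, n_cells)}


def step_cell(pos, cells, beads, size, rng,
              p_move=0.5, p_activate=0.1,
              p_exhaust_bead=0.05, p_exhaust_natural=0.001):
    """Apply Steps 1-4 to the cell at `pos` and return its new position.

    cells: dict mapping (row, col) -> state; beads: set of (row, col).
    The default probabilities are placeholders, not values from the paper.
    """
    state = cells[pos]

    # Step 1: attempt a move to one of the 8 neighbours (periodic wrap assumed).
    if rng.random() < p_move:
        dr, dc = rng.choice(NEIGHBOUR_OFFSETS)
        target = ((pos[0] + dr) % size, (pos[1] + dc) % size)
        if target not in cells:          # vacancy condition
            del cells[pos]
            cells[target] = state
            pos = target

    on_bead = pos in beads
    if state == NAIVE:
        # Step 2: a naive cell sharing a position with a bead may be activated.
        if on_bead and rng.random() < p_activate:
            cells[pos] = ACTIVATED
    elif state == ACTIVATED:
        # Step 3: bead contact may exhaust the cell at the accelerated rate.
        exhausted = on_bead and rng.random() < p_exhaust_bead
        # Step 4: every activated cell is also subject to the natural rate.
        exhausted = exhausted or rng.random() < p_exhaust_natural
        if exhausted:
            cells[pos] = EXHAUSTED
    return pos


def step_all(cells, beads, size, rng, **probabilities):
    """Advance every cell by one time step, iterating over a snapshot."""
    for pos in list(cells):
        step_cell(pos, cells, beads, size, rng, **probabilities)
```

A run could start with `cells = seed_cells(100, 50, random.Random(0))`, followed by repeated calls to `step_all`. Iterating over a snapshot of the positions ensures each cell is updated once per step, because a cell can only move into a position that is vacant at that moment.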