The discontinuous Galerkin (DG) method is an efficient technique for packaging problems. It divides the original computational region into several subdomains, i.e., it splits a large linear system into several smaller and balanced matrices. Once the spatial discretization is defined, an appropriate time integration method is necessary. For explicit time stepping schemes, the smallest edge length in the entire discretized domain determines the maximal time step interval allowed by the stability criterion, so they require a large number of time steps for packaging problems. Implicit time stepping schemes are unconditionally stable, so domains with small structures can use a large time step interval. However, this approach requires inverting matrices that, unlike those in explicit schemes, are generally not positive definite for the first-order Maxwell's equations, and it therefore becomes costly for large problems. This work presents an algorithm that exploits the sequential way in which the subdomains are usually placed for layered structures in packaging problems. Specifically, a reordering of interface and volume unknowns combined with a block LDU (Lower-Diagonal-Upper) decomposition improves memory cost and execution time with respect to previous DGTD implementations.
INTRODUCTION

The discontinuous Galerkin time-domain (DGTD) methods are promising for the transient analysis of large and multiscale problems [1][2][3][4][5][6][7][8][9]. Based on the idea of domain decomposition, the DG methods can handle problems too large to be solved by conventional numerical techniques. Basically, the DG methods divide the original computational domain into several well-designed subdomains. Within each subdomain, the elements are of comparable size. In this way, a large system matrix is divided into several smaller and balanced matrices. Thus, once the spatial discretization is defined, an appropriate time integration method is crucial. Time integration in transient multiscale modeling can be very challenging with explicit schemes: the small cells needed to capture fine details (e.g., a multiscale package-to-chip structure in Figure 1) lead to an extremely small time step Δt, according to the Courant-Friedrichs-Lewy (CFL) condition [10]. Consequently, the number of time steps, and hence the computational cost of time integration, becomes prohibitive. On the other hand, implicit schemes are not bound by the CFL limit because they are unconditionally stable. Thus, subdomains with electrically small structures can use a larger time step interval. This advantage comes at the cost of matrix inversion. Unlike in explicit schemes, the matrices in an implicit scheme are not positive definite, and thus they are usually more costly to invert.

Several works have proposed algorithms to accelerate implicit time stepping, such as the Block-Thomas (BT) algorithm introduced in [8]. This work presents an algorithm, based on the Crank-Nicolson method [11][12][13][14][15], that exploits the sequential way in which the subdomains are usually placed for layered structures. The coupling between different subdomains leads to a linear block tridiagonal system. This system is solved through a block LDU (Lower-Diagonal-Upper) decomposition, resulting in a non-iterative and efficient algorithm. Numerical experiments show the benefits of this new method for sequential systems, in comparison to algorithms based on LU decomposition or the Block-Thomas algorithm [8].
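To make the block LDU idea concrete, the following is a minimal sketch of a generic block LDU factorization and solve for a block tridiagonal system, as arises when subdomains are coupled only to their immediate neighbours. It is a textbook construction, not the implementation evaluated in this work: it omits the reordering of interface and volume unknowns, uses dense NumPy blocks, and the names `block_ldu_factor`, `block_ldu_solve`, and `assemble_dense` are hypothetical, introduced only for illustration.

```python
import numpy as np


def block_ldu_factor(diag, lower, upper):
    """Factor a block tridiagonal matrix M as M = L D U (illustrative sketch).

    diag[i]  : square block at position (i, i)
    lower[i] : block at position (i+1, i)
    upper[i] : block at position (i, i+1)
    L is unit lower block bidiagonal, D is block diagonal,
    U is unit upper block bidiagonal. Assumes every pivot block D[i]
    is nonsingular (no block pivoting is attempted).
    """
    n = len(diag)
    D = [np.array(diag[0])]
    L, U = [], []
    for i in range(n - 1):
        # L_i = lower[i] @ inv(D[i]), computed via a transposed solve.
        L_i = np.linalg.solve(D[i].T, lower[i].T).T
        # U_i = inv(D[i]) @ upper[i].
        U_i = np.linalg.solve(D[i], upper[i])
        L.append(L_i)
        U.append(U_i)
        # Schur complement: diag[i+1] - lower[i] inv(D[i]) upper[i].
        D.append(diag[i + 1] - L_i @ upper[i])
    return L, D, U


def block_ldu_solve(L, D, U, rhs):
    """Solve L D U x = rhs, with rhs given as a list of per-block vectors."""
    n = len(D)
    # Forward sweep: L y = rhs.
    y = [rhs[0]]
    for i in range(1, n):
        y.append(rhs[i] - L[i - 1] @ y[i - 1])
    # Block-diagonal solve: D z = y.
    z = [np.linalg.solve(D[i], y[i]) for i in range(n)]
    # Backward sweep: U x = z.
    x = [None] * n
    x[-1] = z[-1]
    for i in range(n - 2, -1, -1):
        x[i] = z[i] - U[i] @ x[i + 1]
    return x


def assemble_dense(diag, lower, upper):
    """Assemble the dense matrix, used here only to check the result."""
    sizes = [d.shape[0] for d in diag]
    off = np.concatenate(([0], np.cumsum(sizes)))
    M = np.zeros((off[-1], off[-1]), dtype=np.result_type(*diag))
    for i, d in enumerate(diag):
        M[off[i]:off[i + 1], off[i]:off[i + 1]] = d
    for i in range(len(diag) - 1):
        M[off[i + 1]:off[i + 2], off[i]:off[i + 1]] = lower[i]
        M[off[i]:off[i + 1], off[i + 1]:off[i + 2]] = upper[i]
    return M


if __name__ == "__main__":
    # Three "subdomains" of different sizes with made-up random blocks.
    rng = np.random.default_rng(0)
    sizes = [3, 2, 4]
    diag = [rng.standard_normal((s, s)) + 10.0 * np.eye(s) for s in sizes]
    lower = [0.5 * rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(2)]
    upper = [0.5 * rng.standard_normal((sizes[i], sizes[i + 1])) for i in range(2)]
    rhs = [rng.standard_normal(s) for s in sizes]

    L, D, U = block_ldu_factor(diag, lower, upper)
    x = np.concatenate(block_ldu_solve(L, D, U, rhs))
    residual = assemble_dense(diag, lower, upper) @ x - np.concatenate(rhs)
    print("max residual:", np.abs(residual).max())
```

In an implicit time stepper with a fixed time step, the system matrix does not change between steps, so the factorization would typically be computed once and only the two sweeps repeated at each step. A practical code would also store a factorization of each pivot block `D[i]` rather than calling a dense solve every time, and would use sparse storage for the subdomain blocks.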