Invited Review Paper

Curr. Opt. Photon. 2024; 8(6): 531-544

Published online December 25, 2024 https://doi.org/10.3807/COPP.2024.8.6.531

Copyright © Optical Society of Korea.

A Tutorial on Inverse Design Methods for Metasurfaces

Jin-Young Jeong1, Sabiha Latif2, Sunae So1

1Department of Control and Instrumentation Engineering, Korea University, Sejong 30019, Korea
2Institute for Photonics and Materials, Korea University, Sejong 30019, Korea

Corresponding author: *sunaeso@korea.ac.kr, ORCID 0000-0001-8606-2234
These authors contributed equally to this paper.

Received: October 14, 2024; Revised: November 22, 2024; Accepted: November 22, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper provides a tutorial on inverse design approaches for metasurfaces with a systematic analysis of the fundamental methodologies and underlying principles for achieving targeted optical properties. Traditionally, metasurfaces have been designed with extensive trial-and-error methods using analytical modeling and numerical simulations. However, as metasurface complexity grows, these conventional techniques become increasingly inefficient in exploring the vast design space. Recently, machine learning and optimization algorithms have emerged as powerful tools for overcoming these challenges and enabling more efficient and accurate inverse design. We begin by introducing the fundamentals of optical simulations used for forward modeling of metasurfaces and their relevance to inverse design. Next, we explore recent advancements in applying machine learning techniques such as neural networks, Markov decision processes, and Monte Carlo simulations, as well as optimization algorithms, including automatic differentiation, the adjoint method, genetic algorithms, and particle swarm optimizations, and show their potential to revolutionize the metasurface design process. Finally, we conclude with a summary of key findings and insights from this review.

Keywords: Inverse design, Machine learning, Metasurface, Optimization algorithm

OCIS codes: (150.1135) Algorithms; (240.0240) Optics at surfaces

Over the past decades, light-matter interaction has been central to breakthroughs in nanophotonics, driving new technologies such as real-time, on-chip optical interconnects. Advances in nanofabrication have made it possible to manufacture complex structures such as metamaterials and metasurfaces, which manipulate light at the nanoscale using nanostructured surfaces with features at or below the electromagnetic wavelength scale [1–3]. Metasurfaces, which are patterned subwavelength structures, offer distinctive wavefront control over phase and amplitude, which makes them suitable for miniaturized optical devices such as advanced imaging systems [4], optical communication [5], 3D displays [6], augmented reality [7], and invisibility cloaking [8].

The conventional approach to designing metasurfaces involves investigating their geometrical characteristics to understand how they affect optical properties based on physical insights. Additionally, achieving multifunctionality often requires managing complex physical phenomena and interactions, leading to high levels of entanglement when identifying the effective approach for optimal solutions. This process can be time consuming and often requires extensive experimental verifications.

In contrast to conventional intuition-driven design approaches, inverse design techniques use computational algorithms to efficiently optimize or retrieve optical system design based on the desired optical characteristics [9]. Computational optimization approaches have received significant attention in the development of innovative optical devices with a broad variety of applications.

The inverse design of metasurfaces can be addressed using two primary approaches: Machine learning (ML) and computational optimization. These approaches play an efficient role in determining the optical characteristics of proposed metasurfaces, designing new metasurfaces, and optimizing designs to meet required optical specifications [10]. ML excels at rapidly predicting complex optical behaviors when trained on sufficient data and enables the generation of efficient designs without the need for repeated simulations. In contrast, optimization algorithms iteratively refine designs to achieve specific optical criteria, making them complementary tools for inverse design. Together, ML and computational optimization offer powerful methods for addressing the complexities of metasurface design. In the realm of applied mathematics, optimization techniques have laid the foundation for conventional computational algorithms and analytical methods, which are applied to solve complex technological, industrial, and economic optimization problems.

Recently, advances in ML techniques and algorithms have helped address the computational challenges of developing large, high-efficiency photonic devices. Consequently, inverse design has become more efficient with ML, particularly in estimating the solutions to Maxwell’s equations at reduced computational cost. ML employs models trained on data to generate efficient photonic designs without additional simulations. In contrast, computational optimization uses gradient-based or heuristic algorithms to iteratively refine designs through multiple simulation steps [11]. Although ML also often relies on gradients and backpropagation, a major distinction is that conventional computational inverse design techniques are typically defined by a specific optimization target function. This might involve achieving desired optical characteristics such as resonance at a particular wavelength or enhancing a device’s broadband efficiency. With the rapid advancement of ML through the integration of AI and neural networks, it is now possible to address complicated spectral properties such as dual polarization and multiple resonances in photonic structures, as well as improve design procedures. These techniques have driven significant advances in photonics and now provide end-to-end design solutions that surpass conventional forward and inverse design approaches [12].

Optimization algorithms can be classified into gradient based and non-gradient based methods [11, 13]. Gradient-based approaches are directional optimization methods that use gradient information for differentiable functions to find optimal solutions. However, these methods may struggle to find global solutions for nonconvex functions. In contrast, non-gradient methods do not rely on gradients, but explore the solution space with heuristic techniques or random search strategies. These methods are useful for complex problems where gradients are difficult to compute. Each method has different calculation methods and algorithms, which will be explained in Section 3.2. Additionally, non-gradient methods have a higher likelihood of finding the global optimum compared to gradient-based methods, which are more prone to getting stuck in local optima [14]. However, exploring the parameter space without gradient guidance can be slower than following a well-defined gradient descent path. Despite this, non-gradient optimization is a valuable tool when optimizing complex engineering designs where gradients are not readily available or easily calculated.

This tutorial explores the inverse design methods for metasurfaces using ML and optimization algorithms (Fig. 1). It aims to provide a deeper understanding of the functionalities, strengths, and limitations of various inverse design approaches with a detailed examination of their methodologies. The first part of the review covers forward calculation in optical simulations, which forms the basis for analyzing and designing metasurfaces. These include finite-difference time-domain (FDTD), rigorous coupled-wave analysis (RCWA), and finite element method (FEM). Next, we discuss the use of advanced ML approaches for inverse design, such as neural network (NN), Markov decision process (MDP), and Monte Carlo (MC) methods. Finally, we explore various optimization techniques employed to improve and optimize the design process, including automatic differentiation (AD), adjoint method (AM), genetic algorithm (GA), and particle swarm optimization (PSO).

Figure 1. Schematic overview of the inverse design of metasurfaces using machine learning and optimization methods.

Electromagnetism, one of the fundamental forces of nature, is essential to all innovative technologies and natural phenomena. Central to understanding electromagnetism are Maxwell’s equations, which provide a mathematical framework for describing electromagnetic fields. Early research focused on identifying exact solutions of Maxwell’s equations for canonical geometries. Subsequent research introduced approaches including the Sommerfeld integral [15], Rayleigh scattering [16], Mie scattering [17], and the Debye model [18].

The advent of advanced computing has led to more efficient and flexible approaches to solving Maxwell’s equations in complex scenarios. Unlike precise or approximative analytical techniques, numerical methods offer greater adaptability and can handle more intricate situations. However, the scale at which structures can be developed and simulated in computational domains is constrained by available computational resources, which significantly affects the number of simulations that can be conducted.

In this chapter, we introduce key computational techniques such as the FDTD, RCWA, and FEM [19–21].

2.1. Finite-difference Time-domain (FDTD)

The FDTD method is a widely used numerical technique for simulating light-matter interactions in various media. Introduced in 1966, FDTD solves Maxwell’s equations in the time domain by discretizing both time and space into grid units with alternating electromagnetic field components. Figure 2(a) illustrates the FDTD approach. It is particularly effective for full-wave evaluations of electromagnetic (EM) waves in complex media and intricate geometries.

Figure 2. Schematic of forward calculation techniques in optical simulation: (a) Finite-difference time-domain (FDTD), (b) rigorous coupled-wave analysis (RCWA), and (c) finite element method (FEM).

Using the Yee grid, the FDTD method iteratively updates electric (E) and magnetic (H) fields over a segmented simulation volume until the volume is saturated with excitation fields or until they exit. The technique approximates spatial and temporal derivatives using finite differences to solve Maxwell’s equations.

Faraday’s law is as follows:

$$\nabla \times E = -\frac{\partial B}{\partial t}, \tag{1}$$

and the Ampere-Maxwell law is as follows:

$$\nabla \times H = J + \frac{\partial D}{\partial t}, \tag{2}$$

where J represents electric current density, D is electric displacement, and B is magnetic flux density. Material properties are characterized by conductivity (σ), electric permittivity (ε), and magnetic permeability (µ).
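As a minimal sketch of the leapfrog update implied by Eqs. (1) and (2), the following one-dimensional example advances E and H on a staggered Yee grid. It assumes normalized units in free space (so that the only coefficient is the Courant number), J = 0 apart from a soft Gaussian source, and fixed E = 0 at both ends; the function name, grid size, and source parameters are illustrative choices, not values from this tutorial.

```python
import math

def fdtd_1d(n_cells=200, n_steps=150, source_cell=100, courant=0.5):
    """Leapfrog update of Ez and Hy on a staggered 1D Yee grid (normalized units)."""
    ez = [0.0] * n_cells          # electric field sampled at integer grid points
    hy = [0.0] * (n_cells - 1)    # magnetic field sampled at half-integer grid points
    for step in range(n_steps):
        # Faraday's law: H is advanced using the finite-difference curl of E
        for k in range(n_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Ampere-Maxwell law with J = 0: E is advanced using the curl of H.
        # The end points ez[0] and ez[-1] are never updated (perfect conductor).
        for k in range(1, n_cells - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Soft source: a Gaussian pulse in time added to one cell
        ez[source_cell] += math.exp(-((step - 30) / 10.0) ** 2)
    return ez, hy

ez, hy = fdtd_1d()
print(max(abs(v) for v in ez))    # peak field of the two outgoing pulses
```

A Courant number at or below 1 keeps this one-dimensional scheme stable; practical solvers add absorbing boundaries, material coefficients, and three-dimensional field components on top of the same update pattern.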

The FDTD method requires a large number of spatial grid cells and time steps to achieve high accuracy, and typically uses cell sizes smaller than the wavelength of light. This method is computationally intensive, but scaling gradient computation helps optimize time and memory requirements based on independent input and output parameters.

The FDTD method is valued for its accuracy in simulating complex geometries and materials across broad frequency ranges, particularly for time-domain responses, such as broadband or fluctuating signals [22]. However, accurately capturing wave phenomena demands fine spatial discretization, which can substantially increase computational costs, especially in large-scale or broadband applications. Its applicability and adaptability are improved by open-source software such as Ceviche [23] and Meep [24], which provide memory synchronization and full scriptability.

FDTD is applicable in fields from classical to quantum physics and from subatomic to interstellar lengths. It plays a significant role in microstructures and enables simulations of natural wide-spectrum sources and analysis of various parameters such as structural size, morphology, and refractive indices before fabrication.

2.2. Rigorous Coupled-wave Analysis (RCWA)

The RCWA method is a powerful semi-analytical computational method for solving Maxwell’s equations in periodic structures. Initially developed by M. G. Moharam and T. K. Gaylord in the 1980s, RCWA has since become a standard tool in computational electromagnetics, particularly for analyzing diffraction gratings, photonic crystals, and other periodic optical devices.

RCWA, also known as the Fourier modal method (FMM), operates on a plane-wave basis and relies on Fourier-space analysis to efficiently calculate the electromagnetic eigenmodes from which transmittance and reflectance are obtained [25]. The fundamental principle of RCWA involves transforming Maxwell’s equations into Fourier space, where the periodicity of the structure enables an efficient semi-analytical solution through the decomposition of electromagnetic fields into spatial harmonics. RCWA discretizes the spatial domain in the x- and y-dimensions by assuming uniformity in the z-direction within each layer. The RCWA technique is depicted in Fig. 2(b). By applying Bloch’s theorem, it determines the Bloch modes within the diffraction layers, which restrict the electric field responses to a finite set due to the periodic nature of the structure [26]. The material topology of each layer, expressed in Fourier space, is inherently linked to these modes and their Fourier components. The EM field propagation within the structure is then calculated using a modified transfer matrix approach for precise mathematical modeling of the light behavior throughout the system [27]. RCWA is particularly effective in evaluating two-dimensional and three-dimensional periodic structures such as harmonic waveguides [28], photonic crystals [29], and diffraction gratings [30]. Its ability to analyze both the H field profile and potential modes within the structure makes it especially valuable for applications involving 2D silicon meta-gratings.

Compared to fully numerical methods such as FDTD, RCWA offers significant computational advantages due to its selective discretization, and it can efficiently handle complex multilayered structures with varying refractive indices. This efficiency is particularly beneficial when working with large-scale periodic structures. However, the precision and effectiveness of RCWA depend on the choice of Fourier components. Increasing the number of Fourier components enhances simulation accuracy but simultaneously increases computational resource demands and requires a careful balance between accuracy and efficiency.
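The accuracy and cost trade-off of the Fourier truncation can be illustrated with a small sketch. The example below is not an RCWA solver; it only computes the Fourier coefficients of the permittivity of a binary grating (unit period, ridge centered at x = 0) and shows how the truncated series approaches the true permittivity as more harmonics are retained. The function names and the permittivity values are assumptions chosen for illustration.

```python
import math

def grating_fourier_coeffs(eps_ridge, eps_gap, fill, n_harmonics):
    """Fourier coefficients eps_m (m = -N..N) of a binary grating with unit period.

    The ridge of width `fill` is centered at x = 0, so all coefficients are real.
    """
    coeffs = {}
    for m in range(-n_harmonics, n_harmonics + 1):
        if m == 0:
            # Zeroth order is the spatial average of the permittivity
            coeffs[m] = fill * eps_ridge + (1.0 - fill) * eps_gap
        else:
            coeffs[m] = (eps_ridge - eps_gap) * math.sin(math.pi * m * fill) / (math.pi * m)
    return coeffs

def reconstruct(coeffs, x):
    """Evaluate the truncated Fourier series at position x (sine terms cancel by symmetry)."""
    return sum(c * math.cos(2.0 * math.pi * m * x) for m, c in coeffs.items())

for n in (5, 51):
    coeffs = grating_fourier_coeffs(12.0, 1.0, 0.5, n)
    print(n, reconstruct(coeffs, 0.0))   # approaches the ridge permittivity 12.0 as n grows
```

In an actual RCWA calculation the matrices built from these coefficients grow with the number of retained harmonics, which is the source of the accuracy versus efficiency balance described above.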

Ongoing developments continue to improve RCWA’s convergence rates with the aim of more accurate and resource-efficient simulations. Widely used open-source RCWA tools include S4 [31] and RETICOLO [32], which are designed for Python and MATLAB environments, respectively. More recent software, such as MAXIM [33], offers user-friendly graphical interfaces. Additionally, tools such as Meent [34] have introduced advancements in convergence and AD, significantly enhancing the efficiency and capabilities of RCWA for photonics research.

2.3. Finite Element Method (FEM)

FEM first emerged in the 1940s to address challenges in structural and aeronautical engineering [35, 36] and has since expanded to a wide range of engineering problems. It is a powerful numerical technique that approximates solutions to partial differential equations and is widely employed for simulating optical phenomena and optical device characteristics governed by Maxwell’s equations.

The fundamental concept of FEM involves dividing large volumes and complex structures into many small finite elements through a meshing process. These elements, which can be triangles, quadrilaterals, tetrahedra, prisms, or hexahedra, allow the creation of irregular meshes capable of capturing intricate design units, as shown in Fig. 2(c). After meshing, the solution is approximated using a finite number of basis functions, typically low-order polynomials that are non-zero only over a small number of adjacent elements. A key aspect of FEM is the Galerkin method, which aims to minimize the residuals of the differential equation in a weak formulation using test or weighting functions that are often identical to the basis functions [37].
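The meshing and Galerkin steps can be sketched on a deliberately simple stand-in problem. The example below solves the one-dimensional Poisson equation −u″ = 1 on [0, 1] with u(0) = u(1) = 0 using linear elements; this model problem, the function name, and the mesh size are assumptions for illustration and are much simpler than a Maxwell solver, but the assembly pattern is the same.

```python
def fem_poisson_1d(n_elements=8):
    """Galerkin FEM with linear (hat) basis functions for -u'' = 1, u(0) = u(1) = 0."""
    h = 1.0 / n_elements
    n_nodes = n_elements + 1
    diag = [0.0] * n_nodes        # diagonal of the global stiffness matrix
    off = [0.0] * n_elements      # coupling between node e and node e + 1
    load = [0.0] * n_nodes        # global load vector
    # Assembly: each element adds its local stiffness [[1, -1], [-1, 1]] / h
    # and its local load [h / 2, h / 2] to the global system.
    for e in range(n_elements):
        diag[e] += 1.0 / h
        diag[e + 1] += 1.0 / h
        off[e] += -1.0 / h
        load[e] += h / 2.0
        load[e + 1] += h / 2.0
    # Dirichlet boundaries: keep only the interior nodes
    d, o, b = diag[1:-1], off[1:-1], load[1:-1]
    n = len(d)
    # Thomas algorithm for the symmetric tridiagonal system
    for i in range(1, n):
        w = o[i - 1] / d[i - 1]
        d[i] -= w * o[i - 1]
        b[i] -= w * b[i - 1]
    u = [0.0] * n
    u[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (b[i] - o[i] * u[i + 1]) / d[i]
    return [0.0] + u + [0.0]

u = fem_poisson_1d(8)
print(u[4])   # midpoint value; the exact solution x(1 - x)/2 gives 0.125
```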

FEM has proven very effective in solving complex mathematical equations, especially in fields such as aircraft design, and is renowned for its accuracy in handling complex shapes and structures. It is also versatile and applicable to various engineering problems in fluid dynamics, electromagnetics, and more [38]. However, a significant drawback is its high computational demand, particularly when applied to large-scale problems with intricate geometries. Despite this cost, FEM remains invaluable in many disciplines due to its accuracy and flexibility. Commercial software programs such as ANSYS [39] and COMSOL Multiphysics [40] leverage FEM’s capabilities, enabling researchers, scientists, and engineers to perform structural analysis, electromagnetics, fluid dynamics simulations, and other applications.

These three methods, FDTD, RCWA, and FEM, form the foundation for forward calculations in optical simulations. While FDTD offers high accuracy for time-domain simulations, RCWA provides efficiency in handling periodic structures, and FEM excels in complex geometrical analysis. Together, they facilitate detailed simulations of electromagnetic interactions and serve as essential tools for the inverse design of metasurfaces, which will be discussed in the following chapter.

The inverse design of metasurfaces is primarily achieved using ML and optimization algorithms [9]. ML employs data-driven models to learn design patterns and predict outcomes [41], while optimization techniques use gradient-based or evolutionary algorithms to refine designs [11, 42]. These approaches facilitate the creation of complex photonic structures that are often difficult to achieve with conventional methods. In this section, we introduce ML and optimization algorithms, along with key research studies that have effectively applied these methods to design various metasurfaces.

3.1. Machine Learning Techniques for Inverse Design

3.1.1. Neural Networks (NN)

An NN is a computational model inspired by the structure of human brain neurons [43]. Similar to how neurons exchange signals, NNs consist of multiple nodes that process inputs, apply weights, and produce outputs through activation functions. They learn through interconnected layers: the input layer receives data first and sends it to the hidden layer, the hidden layer processes it, and the output layer generates results. By adjusting its weights, an NN can recognize patterns in the input and predict outputs. The framework of the NN is shown in Fig. 3(a).
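The layer-by-layer flow described above can be sketched as a forward pass through one hidden layer. The weights, biases, layer sizes, and the choice of ReLU and sigmoid activations below are hypothetical values picked so the arithmetic is easy to follow; a trained network would learn its weights from data.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: weighted sum of the inputs plus a bias, then an activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hypothetical parameters: 2 inputs -> 2 hidden nodes -> 1 output
W1, B1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
W2, B2 = [[1.0, -1.0]], [0.5]

def forward(x):
    hidden = dense(x, W1, B1, relu)        # input layer -> hidden layer
    return dense(hidden, W2, B2, sigmoid)  # hidden layer -> output layer

print(forward([2.0, 1.0]))   # hidden = [1.0, 1.5], output = [sigmoid(0.0)] = [0.5]
```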

Figure 3. An architecture of neural network (NN) model and nanophotonic device design representation. (a) A framework of NN modeled after human neurons, where each node is connected to others. (b) Schematic for forward and reverse modeling of nanophotonic devices using deep NN. Reprinted from M. H. Tahersima et al. Sci. Rep. 9, 1368 (2019). Copyright © 2019, M. H. Tahersima et al. [54].

A deep neural network (DNN) extends the NN structure by incorporating multiple hidden layers [44] to effectively capture features even in high-dimensional data. DNNs also support transfer learning, allowing them to adapt to different domains. When every node in one layer is connected to every node in the subsequent layer, the network is called a fully connected neural network; its connections exist only between nodes in immediately adjacent layers. Recurrent neural networks (RNNs) [45], on the other hand, use a loop structure to handle sequential data and learn from both input and hidden layers.

For inverse design, convolutional neural networks (CNNs), a widely used DNN variant, have proven to be very effective [46, 47]. Unlike standard NNs, CNNs excel in image processing by using convolution and pooling operations to extract key features. Convolution involves applying a kernel matrix over the input image to create a feature map to capture essential details [48]. Pooling then reduces the size of the feature map by selecting maximum, average, or minimum values, which preserves crucial information and reduces the computational load [49, 50]. This makes CNNs highly effective for tasks such as object recognition and image analysis.
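The two operations can be sketched in a few lines. The example below applies a kernel to a small image without padding (as in most deep learning libraries, the kernel is not flipped, so this is strictly a cross-correlation) and then applies non-overlapping max pooling. The image, the edge-detecting kernel, and the function names are illustrative assumptions.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) to build a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [[max(feature_map[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(feature_map[0]) - size + 1, size)]
            for i in range(0, len(feature_map) - size + 1, size)]

# A 5 x 5 image with a vertical edge between columns 1 and 2
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]           # responds to left-to-right intensity increases

feature_map = conv2d(image, kernel)   # 4 x 4 map, each row is [0, 2, 0, 0]
print(max_pool(feature_map))          # [[2, 0], [2, 0]]
```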

NNs have been successfully applied in nano-manufacturing and material exploration to improve efficiency and enable real-time control and feedback [51]. For example, Li et al. [52] used CNN to map meta-atom structures to electromagnetic reflection spectra, creating a high-fidelity prediction model that accurately predicted amplitude and phase over an ultra-wide frequency band with low errors and short simulation times.

Moreover, traditional design methods are often time-consuming, require specialized expertise, and are associated with high costs. In contrast, NN offers a faster, more efficient, and more accurate alternative for making predictions. For instance, Chen et al. [53] developed a hybrid CNN and RNN model that demonstrated strong adaptability and reduced design time, approximately 4,800 times shorter than designing with the FDTD method, to inversely engineer an all-dielectric nanohole metasurface structure. Tahersima et al. [54] designed a power splitter using DNN and achieved a maximum transmission efficiency of more than 90% and target splitting specification while minimizing reflections [Fig. 3(b)]. These examples illustrate the effectiveness of NNs in the inverse design of metasurfaces.

3.1.2. Markov Decision Processes (MDP)

An MDP is a probabilistic model in ML that represents an agent’s decision-making process under uncertainty [55]. In MDPs, each state depends only on the immediately preceding state, making it a stochastic process with the Markov property, known as a Markov chain. Given the current state $s_t$, the future state $s_{t+1}$ is determined solely by $s_t$ and is independent of past states. This can be expressed mathematically as:

$$P(s_{t+1} \mid s_t) = P(s_{t+1} \mid s_1, s_2, \ldots, s_t). \tag{3}$$

MDPs are built upon Markov processes (MPs) where transitions between states are based on a specified probability distribution [56]. The future state can be expressed by the transition probability matrix, where an MP is defined by a set of states S and transition probabilities P denoted as:

$$\mathrm{MP} = \langle S, P \rangle, \tag{4}$$
$$P_{ss'} = P[S_{t+1} = s' \mid S_t = s]. \tag{5}$$

Here, s represents any state in S, and s′ denotes a future state. If the reward concept is introduced to MP, the model becomes a Markov reward process (MRP) [57], represented as:

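$$\mathrm{MRP} = \langle S, P, R, \gamma \rangle, \tag{6}$$
$$R_s = E[R_{t+1} \mid S_t = s], \tag{7}$$
$$V(s) = R_s + \gamma \sum_{s' \in S} P(s' \mid s)\, V(s'). \tag{8}$$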

In this context, $R_s$ is the expected reward at state s, V(s) is the value of state s, and γ is a discount factor balancing immediate and future rewards. An MDP extends this model by incorporating actions A that an agent can take in each state [58]. The schematic of MDP is depicted in Fig. 4(a). The objective is to find an optimal policy π that guides the agent’s actions to maximize cumulative rewards. Formally, an MDP can be expressed as:

Figure 4. An architecture of the Markov decision process (MDP) and its utility representation. (a) Framework of the MDP, (b) details of the L2DO algorithm proposed by R. Li et al. [59]. This algorithm uses the MDP framework for photonics inverse design, with the environment defined using FDTD-based simulation. Adapted from Li et al. Nanophotonics 12, 319–334 (2023). Copyright © 2023, R. Li et al. [59].

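$$\mathrm{MDP} = \langle S, A, P, R, \gamma \rangle, \tag{9}$$
$$P^{a}_{ss'} = P[S_{t+1} = s' \mid S_t = s, A_t = a], \tag{10}$$
$$R^{a}_{s} = E[R_{t+1} \mid S_t = s, A_t = a], \tag{11}$$
$$\pi(a \mid s) = P[A_t = a \mid S_t = s]. \tag{12}$$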

Here, π(a|s) represents the policy function indicating the probability of choosing action a in state s, and the sum of probabilities for all actions in each state must equal 1.
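One common way to obtain an optimal policy for a small MDP is value iteration, sketched below. The three-state toy environment, its transition probabilities, rewards, and the discount factor of 0.9 are hypothetical and unrelated to any photonic design problem; in an inverse design setting the states and actions would instead encode device parameters and design modifications, with rewards supplied by a simulator.

```python
GAMMA = 0.9   # discount factor (illustrative value)

# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    0: {"stay": [(1.0, 0, 0.0)], "move": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "move": [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},   # absorbing goal state
}

def q_value(values, state, action):
    """Expected reward plus discounted value of the next state for one action."""
    return sum(p * (r + GAMMA * values[s2]) for p, s2, r in TRANSITIONS[state][action])

def value_iteration(tol=1e-10):
    values = {s: 0.0 for s in TRANSITIONS}
    while True:
        # Bellman optimality update: best action value in every state
        new = {s: max(q_value(values, s, a) for a in TRANSITIONS[s]) for s in TRANSITIONS}
        converged = max(abs(new[s] - values[s]) for s in TRANSITIONS) < tol
        values = new
        if converged:
            break
    # Greedy (deterministic) policy with respect to the converged values
    policy = {s: max(TRANSITIONS[s], key=lambda a: q_value(values, s, a))
              for s in TRANSITIONS}
    return values, policy

values, policy = value_iteration()
print(values, policy)
```

Here the resulting policy is deterministic, which is a special case of π(a|s) that places probability 1 on a single action in each state.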

MDP is useful for modeling environments with uncertainty and offers the advantage of evaluating various scenarios. This has led to its application in the inverse design of metamaterials and in wave scattering optimization [59–61]. For instance, Li et al. [59] developed an MDP-based algorithm using FDTD simulations to allow agents to explore the refractive index, spatial arrangement, length, width, and thickness of materials. The framework of the proposed algorithm is shown in Fig. 4(b). Similarly, Park et al. [61] utilized MDP and Q-learning to design a 1D grating, incorporating physical information into reinforcement learning to achieve highly efficient and complex optical designs. This MDP-based framework can also be extended to design more intricate devices, such as two-dimensional meta-gratings or meta-lenses, in combination with other optical simulation tools [62].

3.1.3. Monte Carlo (MC) Methods

The MC method is an algorithm that uses random sampling to approximate solutions for complex problems [63]. Based on the law of large numbers [64], it ensures that the average of a large number of samples closely approximates the true average of the entire population. MC methods are widely used in optimization, numerical integration, and probability distribution analysis, with applications spanning fields such as biotechnology and space engineering.

One of the key applications of MC is the MC integration [65], which estimates the integral of complex expressions using probabilistic sampling. Another variant, the Markov chain Monte Carlo (MCMC) method [66], generates samples using a Markov chain from a probability distribution to estimate characteristics of multidimensional distributions that are otherwise difficult to calculate directly.
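As a minimal sketch of MC integration, the example below estimates a one-dimensional integral by averaging the integrand at uniformly drawn sample points. The integrand, sample count, and seed are arbitrary illustrative choices.

```python
import random

def mc_integrate(func, a, b, n_samples, seed=0):
    """Estimate the integral of func over [a, b] from uniform random samples."""
    rng = random.Random(seed)   # fixed seed so the estimate is reproducible
    total = sum(func(rng.uniform(a, b)) for _ in range(n_samples))
    # Interval length times the sample mean of the integrand
    return (b - a) * total / n_samples

estimate = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000)
print(estimate)   # close to the exact value 1/3
```

The statistical error of such an estimate shrinks in proportion to the inverse square root of the number of samples, independent of the dimension of the integral, which is why the approach remains practical for high-dimensional problems.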

MCMC is particularly useful in Bayesian simulations, where it helps sample prior and posterior probabilities [67, 68]. Bayesian simulations, based on Bayes’ theorem, update the probability of an event as new data becomes available:

$$P(\theta \mid X) = \frac{P(X \mid \theta)\, P(\theta)}{P(X)}. \tag{13}$$

In Eq. (13), P(θ|X) represents the posterior probability of θ given the new data X, P(X|θ) is the likelihood of X given θ, P(θ) is the prior probability of θ, and P(X) is the overall probability of observing X. Bayesian simulation uses this approach to approximate posterior distributions.
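A minimal Metropolis sampler shows how MCMC approximates a posterior without ever evaluating P(X), since only ratios of posterior values enter the acceptance test. The example infers the bias θ of a coin from 7 heads and 3 tails under a uniform prior; the data, step size, chain length, and burn-in fraction are hypothetical choices made for illustration.

```python
import math
import random

def log_posterior(theta, heads, tails):
    """Log of likelihood times a uniform prior on (0, 1), up to an additive constant."""
    if not 0.0 < theta < 1.0:
        return -math.inf   # zero prior probability outside (0, 1)
    return heads * math.log(theta) + tails * math.log(1.0 - theta)

def metropolis(heads, tails, n_samples=20_000, step=0.2, seed=1):
    rng = random.Random(seed)
    theta = 0.5
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)   # symmetric random-walk proposal
        log_ratio = log_posterior(proposal, heads, tails) - log_posterior(theta, heads, tails)
        # Accept with probability min(1, posterior ratio); P(X) cancels in the ratio
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        samples.append(theta)
    return samples[n_samples // 5:]   # discard the first 20% as burn-in

samples = metropolis(heads=7, tails=3)
print(sum(samples) / len(samples))    # near the analytic posterior mean 8/12
```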

In metasurface design, MC methods can be used to model the impact of manufacturing errors of the metasurface on performance [69, 70]. Lin et al. [71] developed an optical inverse design algorithm using the MC tree search algorithm, clustering initial samples and evaluating the average performance index of each cluster to optimize designs. This method effectively solved complex design problems and achieved near-perfect reflection within the red wavelength range. Similarly, Wray et al. [72] used MC integration to design optical filters by sampling general optical properties of atom shapes and generating random metasurfaces with Bayesian optimization. This study demonstrated the capability of designing various fundamental filter types (bandpass, shortpass, longpass, and bandstop) using a single material and a single layer. These examples highlight how MC methods, particularly MCMC, are valuable tools for managing uncertainty and enhancing the inverse design process in metasurface engineering.

3.2. Optimization Algorithms for Inverse Design

3.2.1. Automatic Differentiation (AD)

Gradient-based optimization methods solve optimization problems by leveraging the gradient of an objective function, such as a loss function [73, 74]. Starting from an initial point of the design variables, they iteratively update the design by moving against the gradient, that is, in the direction that decreases the objective function. If the new design meets the convergence criterion or reaches the function’s minimum value, the algorithm terminates; otherwise, it continues to update the design according to a specified learning rate. The update rule is expressed as:

$$x_{i+1} = x_i - \alpha \nabla f(x_i). \tag{14}$$

Here, $x_{i+1}$ represents the updated coordinate, $x_i$ is the current coordinate, α is the learning rate, and $\nabla f(x_i)$ represents the gradient of the function f at $x_i$.
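The update rule can be sketched directly. The example below minimizes a simple convex quadratic, which is a hypothetical stand-in for a design objective; the learning rate, tolerance, and function names are illustrative assumptions.

```python
def gradient_descent(grad, x0, learning_rate=0.1, tol=1e-8, max_iter=10_000):
    """Repeat x <- x - learning_rate * grad(x) until the gradient is nearly zero."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if max(abs(gi) for gi in g) < tol:   # convergence criterion
            break
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x

# Toy objective f(x, y) = (x - 3)^2 + 2 (y + 1)^2 with its analytic gradient
def grad_f(x):
    return [2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)]

print(gradient_descent(grad_f, [0.0, 0.0]))   # converges toward [3.0, -1.0]
```

For a convex objective like this one any starting point leads to the same minimum; for nonconvex objectives the result depends on the initial value, as discussed next.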

Since these methods rely on the gradient, they are very sensitive to the initial value and often fail to find the global optimum, especially in high-dimensional problems. To address these challenges, various algorithms such as stochastic gradient descent (SGD) [75], gradient descent with momentum (GDM) [76], and adaptive moment estimation (Adam) [77] have been developed. Two prominent gradient-based optimization techniques, AD and AM, efficiently calculate gradients in complex systems with numerous variables, making them particularly useful for metasurface design. AD is an algorithm that automatically differentiates functions and accumulates the gradients of each operation efficiently, while AM performs a forward pass to obtain the objective function value and a backward pass to calculate the gradient. In other words, the AD and AM approaches were developed to reduce computational demands by calculating gradients with only one or two simulations, making them particularly useful in high-dimensional parameter spaces [10, 78].

AD is a computational technique that computes the partial differentiation of a function using a chain rule [79, 80]. The chain rule can be expressed as:

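$$y = f(g(h(x))) = f(g(v)) = f(u), \tag{15}$$
$$\frac{\partial y}{\partial x} = \frac{\partial y}{\partial u} \frac{\partial u}{\partial v} \frac{\partial v}{\partial x}, \tag{16}$$

where $v = h(x)$ and $u = g(v)$.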

AD breaks a function into smaller components and automatically calculates derivatives for each part. It operates primarily in two modes: forward mode and reverse mode. In forward mode, derivatives are calculated from input x to output y, which is efficient when the input dimension is smaller than the output dimension. Conversely, reverse mode calculates gradients from output y back to input x, which is efficient when the output dimension is smaller than the input dimension. Figure 5 visually illustrates this calculation method. AD is widely used in ML and optimization problems, where the backpropagation algorithm employs AD to update weights in NNs while propagating the gradient backward [81].

Figure 5. A schematic of automatic differentiation (AD). In forward mode, derivatives are calculated from the input x to the output y, while in reverse (backward) mode, derivatives are calculated from the output y back to the input x.
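Both modes can be sketched with a few small classes. In the example below, forward mode carries a derivative alongside each value (dual numbers) and needs one pass per input, whereas reverse mode records local derivatives and recovers the gradient with respect to all inputs in a single backward sweep. The class names, the supported operations, and the test function sin(x1·x2) + x1 are assumptions made for illustration; production AD libraries are far more general.

```python
import math

class Dual:
    """Forward mode: propagate (value, derivative) through every operation."""
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv
    def __add__(self, other):
        return Dual(self.value + other.value, self.deriv + other.deriv)
    def __mul__(self, other):
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def dual_sin(d):
    return Dual(math.sin(d.value), math.cos(d.value) * d.deriv)

class Var:
    """Reverse mode: store (parent, local derivative) pairs for a later backward sweep."""
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0
    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    def __mul__(self, other):
        return Var(self.value * other.value, ((self, other.value), (other, self.value)))

def var_sin(v):
    return Var(math.sin(v.value), ((v, math.cos(v.value)),))

def backward(output):
    """Visit nodes from the output back to the inputs, applying the chain rule."""
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)       # inputs end up before the nodes that use them
    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad

def f(a, b, sin_fn):
    return sin_fn(a * b) + a         # y = sin(x1 * x2) + x1

x1, x2 = 0.5, 2.0
dy_dx1 = f(Dual(x1, 1.0), Dual(x2, 0.0), dual_sin).deriv   # forward pass seeded on x1
dy_dx2 = f(Dual(x1, 0.0), Dual(x2, 1.0), dual_sin).deriv   # second pass seeded on x2
a, b = Var(x1), Var(x2)
backward(f(a, b, var_sin))                                  # one sweep gives both
print(dy_dx1, dy_dx2, a.grad, b.grad)
```

With many design variables and a single scalar objective, as is typical in inverse design, the single backward sweep of reverse mode is the cheaper option.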

AD can be integrated into various optical simulations to design optimal device structures [61] and offers more efficient computations than other methods, such as the Gerchberg-Saxton algorithm, at high degrees of freedom [82]. Colburn and Majumdar [83] implemented AD for matrices with complex degenerate eigenvalues in optical design for faster gradient computation. AD is also used to design a phase profile of the metasurface with a desired electromagnetic response. For example, So et al. [84] used AD to design a single metasurface for three-dimensional, multi-color RGB holograms. Jin et al. [85] demonstrated that AD combined with RCWA can be used to modulate the scattering properties of nanostructures, which has applications in thermal management, integrated photonics, and miniature mirrors.

3.2.2. Adjoint Methods (AM)

The AM is an optimization algorithm like AD that calculates gradients effectively [86]. It employs adjoint equations to compute gradients for constraints in a given optimization problem, typically represented as g(x, p) = 0 [87]. The adjoint equation can be expressed as:

$$g_x^{T}\lambda = f_x^{T},$$

where g is the constraint function, λ is the adjoint variable, f is the target function to be optimized, and the subscript x denotes a partial derivative with respect to x. For time-dependent problems, the gradients are calculated by evaluating the adjoint equation in reverse time.
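A small worked example may make the adjoint equation concrete. This is a minimal sketch under stated assumptions: the 2×2 constraint g(x, p) = A(p)x − b = 0, the objective f = x₀² + x₁², and all function names are hypothetical toy choices. Differentiating the constraint gives x_p = −g_x⁻¹ g_p, so the total derivative is d_p f = f_p − λᵀ g_p, with λ obtained from a single linear solve of the adjoint equation.

```python
def solve2(M, r):
    """Solve the 2x2 linear system M z = r by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(r[0] * M[1][1] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]


B = [1.0, 2.0]  # right-hand side of the toy constraint (assumed)


def system_matrix(p):
    # A(p): only the top-left entry depends on the design parameter p.
    return [[2.0 + p, 1.0], [1.0, 3.0]]


def state(p):
    # Forward problem: find x such that g(x, p) = A(p) x - b = 0.
    return solve2(system_matrix(p), B)


def objective(p):
    # Toy target function f(x) = x0^2 + x1^2 evaluated at the state x(p).
    x = state(p)
    return x[0] ** 2 + x[1] ** 2


def adjoint_gradient(p):
    x = state(p)
    A = system_matrix(p)
    f_x = [2.0 * x[0], 2.0 * x[1]]               # partial of f w.r.t. x
    g_x_T = [[A[0][0], A[1][0]],
             [A[0][1], A[1][1]]]                 # transpose of g_x = A(p)
    lam = solve2(g_x_T, f_x)                     # adjoint equation: g_x^T lam = f_x^T
    g_p = [x[0], 0.0]                            # (dA/dp) x with dA/dp = [[1, 0], [0, 0]]
    f_p = 0.0                                    # f has no explicit dependence on p
    return f_p - (lam[0] * g_p[0] + lam[1] * g_p[1])
```

The cost is one forward solve plus one adjoint solve regardless of how many design parameters there are, which is why the approach scales to large design spaces. A central finite difference of `objective` can be used to check the result.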

AM is widely used in the inverse design of high-numerical-aperture meta-lenses [88] and for optimizing complex functionalities such as angle-multiplexed metasurface holograms [89-91]. This approach enables the creation of highly efficient metasurfaces with improved accuracy, even in constrained design spaces. For example, Ma et al. [92] used AM to design a double-slot metasurface with a periodic gold structure and achieved a sixfold increase in spontaneous emission efficiency. Similarly, Yin et al. [93] applied AM to optimize metasurface design and analyzed how meta-atom size influences optical properties.

3.2.3. Genetic Algorithms (GA)

Non-gradient optimization, also known as metaheuristic optimization, uses randomness to search for optimal solutions, in contrast to gradient-based optimization, which follows a deterministic path. Heuristic methods are designed to solve problems more quickly and efficiently, and are often inspired by social behavior, natural processes, or phenomena. By randomly selecting data to explore the parameter space, these methods increase the likelihood of finding a global optimum. However, exploring the parameter space without gradient guidance can be slower compared to following a well-defined gradient descent path. Nevertheless, non-gradient optimization remains a valuable tool for optimizing complex engineering designs where gradients are either not readily available or difficult to compute.

GA [94] is an optimization technique inspired by biological evolution that employs operators such as mutation, crossover, and selection to evolve solutions toward an optimal result. In each iteration, GA selects individuals from the current population as parents to produce offspring for the next generation. Through successive iterations, the population gradually evolves toward better solutions. GA typically begins with a randomly generated population in which each individual’s fitness is evaluated using a user-defined objective function. The fittest individuals are selected as parents, ensuring that advantageous traits are passed on to the next generation. A flowchart of GA is shown in Fig. 6(a).

Figure 6. A diagram of genetic algorithms (GA) and their application. (a) A schematic of GA. The algorithm follows a cyclical form, as illustrated, repeatedly evolving generations to find individuals that best satisfy the fitness function. (b) Schematic of a traditional imaging system with a color filter (left) compared to an imaging system with a metasurface (right). (b) is reprinted from X. Zou et al. Nat. Commun. 13, 3288 (2022). Copyright © 2022, X. Zou et al. [106].

In GA, individuals are often represented as fixed-size arrays, which facilitates crossover operations. There are several types of crossover operations [95]. A common one is single-point crossover, in which two parents are split at the same point and their segments are exchanged. For an array of size n, there are n − 1 possible crossover points. Multi-point crossover instead splits the individuals at two or more points, allowing greater diversity but typically resulting in slower convergence. Other crossover methods include uniform crossover [96], which operates probabilistically; cycle crossover [97], which uses permutations; and heuristic crossover [98].
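The crossover variants described above can be sketched on list-encoded individuals. This is an illustrative sketch rather than the formulation of any cited work: the function names, the default of two cut points, and the 0.5 swap probability are assumptions.

```python
import random


def single_point_crossover(a, b, rng):
    # A cut index in 1..n-1 gives the n - 1 possible crossover points.
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def multi_point_crossover(a, b, rng, k=2):
    # Choose k distinct cut points, then swap every second segment.
    cuts = sorted(rng.sample(range(1, len(a)), k))
    child1, child2 = list(a), list(b)
    swap, prev = False, 0
    for cut in cuts + [len(a)]:
        if swap:
            child1[prev:cut], child2[prev:cut] = child2[prev:cut], child1[prev:cut]
        swap = not swap
        prev = cut
    return child1, child2


def uniform_crossover(a, b, rng, p_swap=0.5):
    # Each gene is exchanged independently with probability p_swap.
    child1, child2 = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < p_swap:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps runs reproducible, which is convenient when comparing operators on the same design problem.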

Mutation, another essential GA operation, introduces diversity by randomly altering genes, which can produce gene values that are not present in the parent individuals, thus expanding the search space and helping the search escape local optima. Although mutations occur probabilistically and can sometimes reduce the quality of a solution, they are essential for maintaining the diversity of the population. Various forms of mutation exist, including swapping values or shuffling genes [99]. Non-uniform mutation, where the mutation intensity decreases as the process advances, is another method [100]. The algorithm terminates when a maximum number of generations is reached or a satisfactory fitness level is achieved.
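Putting selection, crossover, mutation, and the termination criteria together, a complete loop can be sketched as follows. This is a minimal illustration under several assumptions: binary genes (for example, a pixelated layout), truncation selection of the fittest individuals, single-point crossover, bit-flip mutation, and one elite copied unchanged into each generation. The function `run_ga` and its parameter values are hypothetical, and in practice the fitness function would call an electromagnetic solver.

```python
import random


def run_ga(fitness, n_genes, pop_size=30, n_parents=10,
           p_mut=0.05, max_gen=200, target=None, seed=0):
    rng = random.Random(seed)
    # Random initial population of binary individuals.
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for gen in range(max_gen):
        pop.sort(key=fitness, reverse=True)
        # Terminate early once a satisfactory fitness level is reached.
        if target is not None and fitness(pop[0]) >= target:
            return pop[0], gen
        parents = pop[:n_parents]            # selection: keep the fittest
        children = [parents[0][:]]           # elitism: carry over the best individual
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_genes - 1)            # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each gene flips with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    # Terminate after the maximum number of generations.
    return max(pop, key=fitness), max_gen
```

As a toy usage, the fitness can count how many genes match a target pattern, for instance `run_ga(lambda ind: sum(g == t for g, t in zip(ind, pattern)), n_genes=len(pattern), target=len(pattern))` for some list `pattern` of zeros and ones.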

GA has been successfully applied to metasurface design to optimize optical properties [101, 102]. For example, Wang et al. [103] used GA to design an ultra-broadband absorptive metasurface with polarization angle insensitivity and stable oblique incidence performance. The absorber demonstrated an adjustable absorption rate between 4% and 100% across an ultra-broadband range. Similarly, the absorptive metasurface developed by Luo et al. [104] has coherent absorption characteristics, with an absorption rate exceeding 90% over a wide range of incident angles, and is insensitive to both TM and TE polarization. Yu et al. [105] used a multi-objective GA to design a metasurface-based microwave filter that achieved high transmittance for a target electromagnetic response in dual passbands. Additionally, Zou et al. [106] designed a color filter for high-intensity imaging using GA and achieved twice the image intensity of commercial color filters, as shown in Fig. 6(b). These examples highlight GA’s versatility and effectiveness in addressing complex optimization challenges in metasurface design.

3.2.4. Particle Swarm Optimization (PSO)

PSO [107] is another metaheuristic algorithm, inspired by the collective movement of organisms such as flocks of birds or schools of fish. Similar to GA, PSO performs optimization without relying on gradients [108]. The process begins with a randomly generated swarm of particles, where each particle represents a potential solution. The fitness of each particle is evaluated, and the swarm is iteratively updated to converge toward optimal solutions. Figure 7(a) shows the key steps in the PSO framework.

Figure 7. A schematic representation of particle swarm optimization (PSO) applications. (a) Diagram of the algorithm with PSO. (b) Structure of waveguide crossing. The configuration of each nanostructure is determined using PSO and finite-difference time-domain (FDTD). (a) and (b) are reprinted from K. Goudarzi and M. Lee, Results Phys. 34, 105268 (2022). Copyright © 2022, K. Goudarzi and M. Lee [110].

Unlike GA, PSO does not use genetic operators such as crossover and mutation. Instead, particles adjust their velocities and positions based on their own experience and that of neighboring particles, allowing them to remember optimal positions found in previous iterations. This self-regulation helps PSO avoid local optima and increases the likelihood of finding the global optimum. While GA relies on interaction between individuals, PSO benefits from information sharing across the swarm, which can make it more efficient in exploring the search space.
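The velocity and position update can be sketched as follows. This is a minimal illustration, assuming the common global-best variant in which every particle shares one swarm-wide best position (rather than a local neighborhood), with an inertia weight `w` and acceleration coefficients `c1` and `c2`. The function name `pso`, the parameter values, and the search bounds are illustrative assumptions, and the search is written as a minimization of a cost function.

```python
import random


def pso(cost, dim, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=0):
    rng = random.Random(seed)
    # Random initial swarm with zero initial velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's own best position
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]   # best position seen by the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost
```

For example, `pso(lambda x: sum(v * v for v in x), dim=2)` searches for the minimum of a simple quadratic bowl; in a design setting the cost would instead compare a simulated response with the target response.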

PSO is particularly valuable for rapidly designing optical devices by reducing computational time and resources. It has been applied to inverse design problems in waveguides [109-111]. For instance, Goudarzi and Lee [110] optimized the design parameters of a binary waveguide crossing, as illustrated in Fig. 7(b), to achieve high performance and efficiency. Wu et al. [112] used PSO to design a metasurface capable of controlling the amplitude and phase of terahertz waves to improve beam control efficiency. This approach improved beam steering and focusing quality by 150% while broadening the operating bandwidth. Additionally, Lee et al. [113] designed a transmissive color filter that covers the full sRGB color space using dielectrics and metals, achieving an efficiency of more than 70%.

In this paper, we explored various ML and optimization algorithms for the inverse design of metasurfaces and highlighted their unique strengths and applications. ML techniques such as NNs, MDP, and MC methods can learn complex patterns from large-scale data, enabling the discovery of innovative designs beyond human intuition. However, these methods often function as a black box, which makes it challenging to interpret the underlying decision-making process.

In contrast, optimization algorithms such as gradient-based methods (AD and AM) and metaheuristic approaches (GA and PSO) provide greater transparency and offer clear pathways to achieving specific target functions. These methods leverage gradients or mimic natural processes to effectively explore design spaces, resulting in precise and interpretable solutions.

While both ML and optimization algorithms are powerful tools on their own, their combined or strategic use can significantly enhance the metasurface design process by pairing the rapid predictive capabilities of ML with the goal-oriented nature of optimization algorithms. This synergy enables not only the development of more efficient and innovative metasurface designs but also the discovery of new physical phenomena and the realization of multifunctional capabilities, surpassing the limitations of traditional empirical methods. This inverse design strategy provides a cost-effective and versatile solution for addressing complex design challenges, such as achieving precise control over optical properties and optimizing transmission and reflection within specific spectral bands.

This work was supported by a Korea University Grant, and “Regional Innovation Strategy (RIS)” through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (Grant No. 2021RIS-004).


The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

  1. X. Zou, R. Lin, Y. Fu, G. Gong, X. Zhou, S. Wang, S. Zhu, and Z. Wang, “Advanced optical imaging based on metasurfaces,” Adv. Opt. Mater. 12, 2203149 (2024).
  2. N. Yu, P. Genevet, M. A. Kats, F. Aieta, J.-P. Tetienne, F. Capasso, and Z. Gaburro, “Light propagation with phase discontinuities: Generalized laws of reflection and refraction,” Science 334, 333-337 (2011).
  3. E. Hasman, V. Kleiner, G. Biener, and A. Niv, “Polarization dependent focusing lens by use of quantized Pancharatnam-Berry phase diffractive optics,” Appl. Phys. Lett. 82, 328-330 (2003).
  4. M. Khorasaninejad and F. Capasso, “Metalenses: Versatile multifunctional photonic components,” Science 358, eaam8100 (2017).
  5. G. Zheng, H. Mühlenbernd, M. Kenney, G. Li, T. Zentgraf, and S. Zhang, “Metasurface holograms reaching 80% efficiency,” Nat. Nanotechnol. 10, 308-312 (2015).
  6. X. Ni, A. V. Kildishev, and V. M. Shalaev, “Metasurface holograms for visible light,” Nat. Commun. 4, 2807 (2013).
  7. L. Huang, X. Chen, H. Mühlenbernd, G. Li, B. Bai, Q. Tan, G. Jin, T. Zentgraf, and S. Zhang, “Dispersionless phase discontinuities for controlling light propagation,” Nano Lett. 12, 5750-5755 (2012).
  8. X. Ni, Z. J. Wong, M. Mrejen, Y. Wang, and X. Zhang, “An ultrathin invisibility skin cloak for visible light,” Science 349, 1310-1314 (2015).
  9. S. So, J. Mun, J. Park, and J. Rho, “Revisiting the design strategies for metasurfaces: Fundamental physics, optimization, and beyond,” Adv. Mater. 35, 2206399 (2023).
  10. J. Noh, T. Badloe, C. Lee, J. Yun, S. So, and J. Rho, “Inverse design meets nanophotonics: From computational optimization to artificial neural network,” Intell. Nanotechnol. 3-32 (2023).
  11. J. S. Jensen and O. Sigmund, “Topology optimization for nano-photonics,” Laser Photon. Rev. 5, 308-321 (2011).
  12. C. Kang, C. Park, M. Lee, J. Kang, M. S. Jang, and H. Chung, “Large-scale photonic inverse design: Computational challenges and breakthroughs,” Nanophotonics (2024).
  13. Z. Li, R. Pestourie, Z. Lin, S. G. Johnson, and F. Capasso, “Empowering metasurfaces with inverse design: Principles and applications,” ACS Photonics 9, 2178-2192 (2022).
  14. O. Sigmund, “On the usefulness of non-gradient approaches in topology optimization,” Struct. Multidiscip. Optim. 43, 589-596 (2011).
  15. H. Pang, S. Yin, Q. Deng, Q. Qiu, and C. Du, “A novel method for the design of diffractive optical elements based on the Rayleigh-Sommerfeld integral,” Opt. Lasers Eng. 70, 38-44 (2015).
  16. R. E. Kleinman and T. B. A. Senior, "Chapter 1- Rayleigh scattering," in Mechanics and Mathematical Methods-Series of Handbooks, (Elsevier, Netherland, 1986), Vol. 2, pp. 1-70.
  17. A. Dorodnyy, J. Smajic, and J. Leuthold, “Mie scattering for photonic devices,” Laser Photon. Rev. 17, 2300055 (2023).
  18. P. Q. Mantas, “Dielectric response of materials: extension to the Debye model,” J. Eur. Ceram. Soc. 19, 2079-2086 (1999).
  19. A. Taflove, S. C. Hagness, and M. Piket-May, "Computational electromagnetics: The finite-difference time-domain method," in The Electrical Engineering Handbook, 3rd ed., R. C. Dorf, Ed (CRC Press, USA, 2005), pp. 629-670.
  20. M. G. Moharam and T. K. Gaylord, “Rigorous coupled-wave analysis of planar-grating diffraction,” J. Opt. Soc. Am. 71, 811-818 (1981).
  21. J. Jianming, The Finite Element Method in Electromagnetics, 3rd ed. (Wiley-IEEE Press, USA, 2014).
  22. İ. R. Çapoğlu, C. A. White, J. D. Rogers, H. Subramanian, A. Taflove, and V. Backman, “Numerical simulation of partially coherent broadband optical imaging using the finite-difference time-domain method,” Opt. Lett. 36, 1596-1598 (2011).
  23. ianwilliamson, “Ceviche Challenges: Photonic Inverse Design Suite: A suite of photonic inverse design challenge problems for topology optimization benchmarking,” (GitHub, Published date: Jul. 21, 2022), https://github.com/google/ceviche-challenges (Accessed date: Dec. 1, 2024)
  24. A. F. Oskooi, D. Roundy, M. Ibanescu, P. Bermel, J. D. Joannopoulos, and S. G. Johnson, “Meep: A flexible free-software package for electromagnetic simulations by the FDTD method,” Comput. Phys. Commun. 181, 687-702 (2010).
  25. L. Li, “New formulation of the Fourier modal method for crossed surface-relief gratings,” J. Opt. Soc. Am. A 14, 2758 (1997).
  26. E. N. Glytsis and T. K. Gaylord, "Review and applications of rigorous coupled-wave analysis of grating diffraction," in Diffraction Optics: Design, Fabrication, and Applications (Optica Publishing Group, 1992), p. paper MD1.
  27. M. G. Moharam, D. A. Pommet, E. B. Grann, and T. K. Gaylord, “Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: Enhanced transmittance matrix approach,” J. Opt. Soc. Am. A 12, 1077-1086 (1995).
  28. G. Quaranta, G. Basset, O. J. F. Martin, and B. Gallinet, “Recent advances in resonant waveguide gratings,” Laser Photon. Rev. 12, 1800017 (2018).
  29. R. Gansch, S. Kalchmair, P. Genevet, T. Zederbauer, H. Detz, A. M. Andrews, W. Schrenk, F. Capasso, M. Lončar, and G. Strasser, “Measurement of bound states in the continuum by a detector embedded in a photonic crystal,” Light Sci. Appl. 5, e16147 (2016).
  30. A. A. Zharov and N. A. Zharova, “Light-Induced diffraction gratings on liquid metamaterial metasurfaces,” J. Exp. Theor. Phys. 135, 808-812 (2022).
  31. V. Liu and S. Fan, “S4: A free electromagnetic solver for layered periodic structures,” Comput. Phys. Commun. 183, 2233-2244 (2012).
  32. J. P. Hugonin and P. Lalanne, “RETICOLO software for grating analysis,” arXiv: 2101.00901v3 (2021).
  33. G. Yoon and J. Rho, “MAXIM: Metasurfaces-oriented electromagnetic wave simulation software with intuitive graphical user interfaces,” Comput. Phys. Commun. 264, 107846 (2021).
  34. Y. Kim, A. W. Jung, S. Kim, K. Octavian, D. Heo, C. Park, J. Shin, S. Nam, C. Park, J. Park, S. Han, J. Lee, S. Kim, M. S. Jang, and C. Y. Park, “Meent: Differentiable electromagnetic simulator for machine learning,” arXiv:2406.12904v1 (2024).
  35. R. W. Clough, “The Finite Element Method in Plane Stress Analysis,” in Proc. 2nd Conference on Electronic Computation (Pittsburgh, Pa, USA, Sep. 8-9, 1960).
  36. S. Levy, “Structural analysis and influence coefficients for delta wings,” J. Aeronaut. Sci. 20, 449-454 (1953).
  37. M. Kronbichler, "The discontinuous galerkin method: Derivation and properties," in Efficient High-order Discretizations for Computational Fluid Dynamics, M. Kronbichler and P.-O. Persson, Eds (Springer Cham, Swiss, 2021), pp. 1-55.
  38. S. Zuo, D. G. Doñoro, Y. Zhang, Y. Bai, and X. Zhao, “Simulation of challenging electromagnetic problems using a massively parallel finite element method solver,” IEEE Access 7, 20346-20362 (2019).
  39. P. C. Kohnke, "ANSYS," in Finite Element Systems, C. A. Brebbia, Ed (Springer Berlin, Germany, 1982), pp. 19-25.
  40. MathWorks, “COMSOL Multiphysics-Interactive multiphysics modeling and simulation,” (MathWorks), https://kr.mathworks.com/products/connections/product_detail/comsol-multiphysics.html (Accessed date: Dec. 1, 2024)
  41. J. Chen, J. Huang, M. An, P. Hu, Y. Xie, J. Wu, and Y. Chen, “Application of machine learning on the design of acoustic metamaterials and phonon crystals: A review,” Smart Mater. Struct. 33, 073001 (2024).
  42. D. Whitley, “A genetic algorithm tutorial,” Stat. Comput. 4, 65-85 (1994).
  43. J. P. S. Rosa, D. J. D. Guerra, N. C. G. Horta, R. M. F. Martins, and N. C. C. Lourenço, "Overview of artificial neural networks," Using Artificial Neural Networks for Analog Integrated Circuit Design Automation, (Springer, USA, 2020), pp. 21-44.
  44. J. Schmidhuber, “Deep Learning in neural networks: An overview,” Neural Netw. 61, 85-117 (2015).
  45. A. Tsantekidis, N. Passalis, and A. Tefas, "Recurrent neural networks," in Deep Learning for Robot Perception and Cognition, A. Iosifidis and A. Tefas, Eds (Academic Press, USA, 2022), Chapter 5, pp. 101-115.
  46. N. Ketkar and J. Moolayil, "Convolutional neural networks," in Deep Learning with Python, 2nd ed., N. Ketkar and J. Moolayil, Eds (Apress Berkeley, USA, 2021), pp. 197-242.
  47. Y. H. Ma and Y. Hao, "Chapter 8-Deep learning in metasurface design and optimization," in Metamaterials-by-Design: Theory, Technologies, and Vision, A. Alù, N. Engheta, A. Massa, and G. Oliveri, Eds (Elsevier, Netherland, 2024), pp. 203-232.
  48. Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou, “A survey of convolutional neural networks: Analysis, applications, and prospects,” IEEE Trans. Neural Netw. Learn. Syst. 33, 6999-7019 (2022).
  49. M. Sun, Z. Song, X. Jiang, J. Pan, and Y. Pang, “Learning pooling for convolutional neural network,” Neurocomputing 224, 96-104 (2017).
  50. A. Zafar, M. Aamir, N. Mohd Nawi, A. Arshad, S. Riaz, A. Alruban, A. K. Dutta, and S. Almotairi, “A comparison of pooling methods for convolutional neural networks,” Appl. Sci. 12, 8643 (2022).
  51. M. Nandipati, O. Fatoki, and S. Desai, “Bridging nanomanufacturing and artificial intelligence-A comprehensive review,” Materials 17, 1621 (2024).
  52. Y. Li, Y. Zhang, Y. Wang, J. Li, X. Jiang, G. Yang, K. Zhang, Y. Yuan, J. Fu, X. G. Di, and C. Wang, “Multifunctional metasurface inverse design based on ultra-wideband spectrum prediction neural network,” Adv. Opt. Mater. 12, 2302657 (2024).
  53. Y. Chen, Q. Wang, D. Cui, W. Li, M. Shi, and G. Zhao, “Inverse design of nanohole all-dielectric metasurface based on deep convolutional neural network,” Opt. Commun. 569, 130793 (2024).
  54. M. H. Tahersima, K. Kojima, T. Koike-Akino, D. Jha, B. Wang, C. Lin, and K. Parsons, “Deep neural network inverse design of integrated photonic power splitters,” Sci. Rep. 9, 1368 (2019).
  55. M. L. Puterman, "Markov decision processes: Discrete stochastic dynamic programming," (Wiley Series in Probability and Statistics) (John Wiley & Sons, USA, 2008).
  56. N. G. van Kampen, "Markov processes," in Stochastic Processes in Physics and Chemistry, 3rd ed., N. G. van Kampen, Ed (Elsevier, Netherland, 2007), pp. 73-95.
  57. Q.-L. Li, "Markov reward processes," Constructive Computation in Stochastic Models with Applications, (Springer Berlin, Germany, 2010), pp. 526-573.
  58. M. L. Puterman, "Markov decision processes," in Handbooks in Operations Research and Management Science, (Elsevier, Netherland, 1990), Vol. 2, pp. 331-434.
  59. R. Li, C. Zhang, W. Xie, Y. Gong, F. Ding, H. Dai, Z. Chen, F. Yin, and Z. Zhang, “Deep reinforcement learning empowers automated inverse design and optimization of photonic crystals for nanoscale laser cavities,” Nanophotonics 12, 319-334 (2023).
  60. L. Rosafalco, J. M. De Ponti, L. Iorio, R. V. Craster, R. Ardito, and A. Corigliano, “Reinforcement learning optimisation for graded metamaterial design using a physical-based constraint on the state representation and action space,” Sci. Rep. 13, 21836 (2023).
  61. C. Park, S. Kim, A. W. Jung, J. Park, D. Seo, Y. Kim, C. Park, C. Y. Park, and M. S. Jang, “Sample-efficient inverse design of freeform nanophotonic devices with physics-informed reinforcement learning,” Nanophotonics 13, 1483-1492 (2024).
  62. H. Wankerl, M. L. Stern, A. Mahdavi, C. Eichler, and E. W. Lang, “Parameterized reinforcement learning for optical system optimization,” J. Phys. D: Appl. Phys. 54, 305104 (2021).
  63. M. J. Fryer, “Simulation and the Monte Carlo method. by R. Y. Rubinstein,” J. R. Stat. Soc. Ser. A 146, 95-96 (1983).
  64. F. M. Dekking, C. Kraaikamp, H. P. Lopuhaä, and L. E. Meester, "Testing hypotheses: Elaboration," in A Modern Introduction to Probability and Statistics: Understanding Why and How (Springer Texts in Statistics Series), (Springer-Verlag London, UK, 2005), pp. 383-397.
  65. C. P. Robert and G. Casella, "Monte Carlo Integration," in Monte Carlo Statistical Methods, 1st ed. (Springer New York, USA, 1999), pp. 71-138.
  66. S. Brooks, “Markov chain Monte Carlo method and its application,” J. R. Stat. Soc.: Series D (The Statistician) 47, 69-100 (1998).
  67. C. J. Geyer, "Introduction to Markov Chain Monte Carlo," in Handbook of Markov Chain Monte Carlo, S. Brooks, A. Gelman, G. Jones, and X.-L. Meng, Eds (Chapman and Hall/CRC, USA, 2011), pp. 3-48.
  68. S. Jackman, “Estimation and inference via Bayesian simulation: An introduction to Markov Chain Monte Carlo,” Am. J. Pol. Sci. 44, 375-404 (2000).
  69. M. Panipinto and J. D. Ryckman, “Effective medium metasurfaces using nanoimprinting of the refractive index: Design, performance, and predictive tolerance analysis,” Opt. Mater. Express 14, 847-861 (2024).
  70. S. Goel, S. Leedumrongwatthanakun, N. H. Valencia, W. McCutcheon, A. Tavakoli, C. Conti, P. W. H. Pinkse, and M. Malik, “Inverse design of high-dimensional quantum optical circuits in a complex medium,” Nat. Phys. 20, 232-239 (2024).
  71. R. Lin, V. Valuckas, T. T. H. Do, A. Nemati, A. I. Kuznetsov, J. Teng, and S. T. Ha, “Schrödinger's red beyond 65,000 pixel-per-inch by multipolar interaction in freeform meta-atom through efficient neural optimizer,” Adv. Sci. 11, 2303929 (2024).
  72. P. R. Wray, E. G. Paul, and H. A. Atwater, “Optical filters made from random metasurfaces using Bayesian optimization,” Nanophotonics 13, 183-193 (2024).
  73. S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv:1609.04747v1 (2016).
  74. J. Zhang, “Gradient descent based optimization algorithms for deep learning models training,” arXiv:1903.03614v1 (2019).
  75. N. Ketkar, "Stochastic gradient descent," Deep Learning with Python, (Springer, USA, 2017), pp. 113-132.
  76. I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the importance of initialization and momentum in deep learning,” Proc. Mach. Learn. Res. 28, 1139-1147 (2013).
  77. D. Yi, J. Ahn, and S. Ji, “An effective optimization method for machine learning based on ADAM,” Appl. Sci. 10, 1073 (2020).
  78. T. Hughes, I. Williamson, M. Minkov, and S. Fan, “Forward-mode differentiation of Maxwell's equations,” ACS Photonics 6, 3010-3016 (2019).
  79. A. Güneş Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind, “Automatic differentiation in machine learning: A survey,” J. Mach. Learn. Res. 18, 1-43 (2018).
  80. R. D. Neidinger, “Introduction to automatic differentiation and MATLAB object-oriented programming,” SIAM Review 52, 545-563 (2010).
  81. A. G. Baydin and B. A. Pearlmutter, “Automatic differentiation of algorithms for machine learning,” arXiv:1404.7456v1 (2014).
  82. P. Georgi, Q. Wei, B. Sain, C. Schlickriede, Y. Wang, L. Huang, and T. Zentgraf, “Optical secret sharing with cascaded metasurface holography,” Sci. Adv. 7, eabf9718 (2021).
  83. S. Colburn and A. Majumdar, “Inverse design and flexible parameterization of meta-optics using algorithmic differentiation,” Commun. Phys. 4, 65 (2021).
  84. S. So, J. Kim, T. Badloe, C. Lee, Y. Yang, H. Kang, and J. Rho, “Multicolor and 3D holography generated by inverse-designed single-cell metasurfaces,” Adv. Mater. 35, 2208520 (2023).
  85. W. Jin, W. Li, M. Orenstein, and S. Fan, “Inverse design of lightweight broadband reflector for relativistic lightsail propulsion,” ACS Photonics 7, 2350-2355 (2020).
  86. C. Kang, C. Park, M. Lee, J. Kang, M. S. Jang, and H. Chung, “Large-scale photonic inverse design: Computational challenges and breakthroughs,” Nanophotonics 13, 3765-3792 (2024).
  87. A. M. Bradley, “PDE-constrained optimization problems and the adjoint method,” (Stanford University, Published date: 2010), https://cs.stanford.edu/~ambrad/adjoint_tutorial.pdf (Accessed date: Oct. 5, 2024)
  88. H. Chung and O. D. Miller, “High-NA achromatic metalenses by inverse design,” Opt. Express 28, 6945-6965 (2020).
  89. M. Zhou, D. Liu, S. W. Belling, H. Cheng, M. A. Kats, S. Fan, M. L. Povinelli, and Z. Yu, “Inverse design of metasurfaces based on coupled-mode theory and adjoint optimization,” ACS Photonics 8, 2265-2273 (2021).
  90. M. Mansouree, A. McClung, S. Samudrala, and A. Arbabi, “Large-scale parametrized metasurface design using adjoint optimization,” ACS Photonics 8, 455-463 (2021).
  91. Y. Zhou, Y. Shao, C. Mao, and J. A. Fan, “Inverse-designed metasurfaces with facile fabrication parameters,” J. Opt. 26, 055101 (2024).
  92. H. Ma, G. Bao, J. Lai, and J. Lin, “Inverse design of a grating metasurface for enhancing spontaneous emission through hyperbolic metamaterials,” J. Opt. Soc. Am. B 41, A79-A85 (2024).
  93. Y. Yin, Q. Jiang, H. Wang, J. Liu, Y. Xie, Q. Wang, Y. Wang, and L. Huang, “Multi-dimensional multiplexed metasurface holography by inverse design,” Adv. Mater. 36, 2312303 (2024).
  94. M. Mitchell, An Introduction to Genetic Algorithms (The MIT Press 221, USA, 1998).
  95. P. W. Poon and J. N. Carter, “Genetic algorithm crossover operators for ordering applications,” Comput. Oper. Res. 22, 135-147 (1995).
  96. G. Syswerda, “Uniform crossover in genetic algorithms,” in Proc. 3rd International Conference on Genetic Algorithms (Fairfax, Virginia, USA, Jun. 1989), pp. 2-9.
  97. A. Hussain, Y. S. Muhammad, M. Nauman Sajid, I. Hussain, A. Mohamd Shoukry, and S. Gani, “Genetic algorithm for traveling salesman problem with modified cycle crossover operator,” Comput. Intell. Neurosci. 2017, 7430125 (2017).
  98. K. F. Pál, “Genetic algorithms for the traveling salesman problem based on a heuristic crossover operation,” Biol. Cybern. 69, 539-546 (1993).
  99. S. Harifi and R. Mohamaddoust, “Zigzag mutation: A new mutation operator to improve the genetic algorithm,” Multimed. Tools Appl. 82, 45411-45432 (2023).
  100. A. Neubauer, “Theoretical analysis of the non-uniform mutation operator for the modified genetic algorithm,” in Proc. 1997 IEEE Int. Conf. Evolutionary Computation-ICEC'97 (Indianapolis, IN, USA, Apr. 13-16, 1997), pp. 93-96.
  101. H. Yang, X. Cao, F. Yang, J. Gao, S. Xu, M. Li, X. Chen, Y. Zhao, Y. Zheng, and S. Li, “A programmable metasurface with dynamic polarization, scattering and focusing control,” Sci. Rep. 6, 35692 (2016).
  102. P. R. Wiecha, A. Arbouet, C. Girard, A. Lecestre, G. Larrieu, and V. Paillard, “Evolutionary multi-objective optimization of colour pixels based on dielectric nanoantennas,” Nat. Nanotechnol. 12, 163-169 (2017).
  103. Y. Wang, G. Wu, J. Zhang, X. Wu, G. Yuan, and J. Liu, “Genetic algorithm-enhanced design of ultra-broadband tunable terahertz metasurface absorber,” Opt. Laser Technol. 170, 110262 (2024).
  104. P. Luo, G. Lan, J. Nong, X. Zhang, T. Xu, and W. Wei, “Broadband coherent perfect absorption employing an inverse-designed metasurface via genetic algorithm,” Opt. Express 30, 34429-34440 (2022).
  105. K. Yu, J. Ge, H. Li, Y. Zhang, H. Dong, and L. Zhang, “Inverse design of dual-bandpass metasurface filters empowered by the multi-objective genetic algorithm,” Opt. Commun. 566, 130695 (2024).
  106. X. Zou, Y. Zhang, R. Lin, G. Gong, S. Wang, S. Zhu, and Z. Wang, “Pixel-level Bayer-type colour router based on metasurfaces,” Nat. Commun. 13, 3288 (2022).
  107. A. Slowik, "Particle swarm optimization," Swarm Intelligence Algorithms, (CRC Press, USA, 2020), pp. 265-277.
  108. D. Wang, D. Tan, and L. Liu, “Particle swarm optimization algorithm: An overview,” Soft Comput. 22, 387-408 (2018).
  109. T. Baba, M. Nakata, R. Shiratori, and K. Hayashi, “Particle swarm optimization of silicon photonic crystal waveguide transition,” Opt. Lett. 46, 1904-1907 (2021).
  110. K. Goudarzi and M. Lee, “Inverse design of a binary waveguide crossing by the particle swarm optimization algorithm,” Results Phys. 34, 105268 (2022).
  111. C. Sun, Y. Yu, G. Chen, and X. Zhang, “Ultra-compact bent multimode silicon waveguide with ultralow inter-mode crosstalk,” Opt. Lett. 42, 3004-3007 (2017).
  112. Q. Wu, W.-H. Fan, C. Qin, and X.-Q. Jiang, “Dual-parameter controlled reconfigurable metasurface for enhanced terahertz beamforming via inverse design method,” Phys. Scr. 99, 065517 (2024).
  113. C. Lee, S. Lee, J. Seong, D. Y. Park, and J. Rho, “Inverse-designed metasurfaces for highly saturated transmissive colors,” J. Opt. Soc. Am. B 41, 151-158 (2024).


Keywords: Inverse design, Machine learning, Metasurface, Optimization algorithm

I. INTRODUCTION

Over the past decades, light-matter interaction has been central to breakthroughs in nanophotonics research and to new technologies such as real-time, on-chip optical interconnects. Advances in nanofabrication have made it possible to manufacture complex structures such as metamaterials and metasurfaces, which manipulate light at the nanoscale using nanostructured surfaces with features at or below the electromagnetic wavelength scale [1–3]. Metasurfaces, which are patterned subwavelength structures, offer distinctive wavefront control over phase and amplitude, which makes them suitable for miniaturized optical devices such as advanced imaging systems [4], optical communication [5], 3D displays [6], augmented reality [7], and invisibility cloaking [8].

The conventional approach to designing metasurfaces involves investigating their geometrical characteristics to understand how they affect optical properties based on physical insights. Additionally, achieving multifunctionality often requires managing complex, strongly coupled physical phenomena and interactions, which makes it difficult to identify an effective route to an optimal solution. This process can be time-consuming and often requires extensive experimental verification.

In contrast to conventional intuition-driven design approaches, inverse design techniques use computational algorithms to efficiently optimize or retrieve optical system design based on the desired optical characteristics [9]. Computational optimization approaches have received significant attention in the development of innovative optical devices with a broad variety of applications.

The inverse design of metasurfaces can be addressed using two primary approaches: Machine learning (ML) and computational optimization. These approaches play an efficient role in determining the optical characteristics of proposed metasurfaces, designing new metasurfaces, and optimizing designs to meet required optical specifications [10]. ML excels at rapidly predicting complex optical behaviors when trained on sufficient data and enables the generation of efficient designs without the need for repeated simulations. In contrast, optimization algorithms iteratively refine designs to achieve specific optical criteria, making them complementary tools for inverse design. Together, ML and computational optimization offer powerful methods for addressing the complexities of metasurface design. In the realm of applied mathematics, optimization techniques have laid the foundation for conventional computational algorithms and analytical methods, which are applied to solve complex technological, industrial, and economic optimization problems.

Recently, advances in ML techniques and algorithms have helped address the computational challenges of developing large-scale, high-efficiency photonic devices. Consequently, inverse design has become more efficient with ML, particularly in estimating the solutions to Maxwell’s equations while reducing computational costs. ML employs models that have been trained on data to generate highly efficient photonic designs, thus eliminating the need for additional simulations. In contrast, computational optimization uses gradient-based or heuristic optimization algorithms to iteratively refine designs through multiple simulation steps [11]. While ML often relies on gradients and backpropagation, a major distinction is that conventional computational inverse design techniques are typically defined by a specific optimization target function. This might involve achieving desired optical characteristics such as resonance at a particular wavelength or enhancing a device’s broadband efficiency. With the rapid advancement of ML through the integration of AI and neural networks, it is now possible to address complicated spectral properties such as dual polarization and multiple resonances in photonic structures, as well as improve design procedures. Many recent advances in photonics are owed to these techniques, which now provide end-to-end design solutions that surpass conventional forward and inverse design approaches [12].

Optimization algorithms can be classified into gradient-based and non-gradient-based methods [11, 13]. Gradient-based approaches are directional optimization methods that use gradient information for differentiable functions to find optimal solutions. However, these methods may struggle to find global solutions for nonconvex functions. In contrast, non-gradient methods do not rely on gradients, but explore the solution space with heuristic techniques or random search strategies. These methods are useful for complex problems where gradients are difficult to compute. Each class uses different calculation procedures and algorithms, which will be explained in Section 3.2. Additionally, non-gradient methods have a higher likelihood of finding the global optimum compared to gradient-based methods, which are more prone to getting stuck in local optima [14]. However, exploring the parameter space without gradient guidance can be slower than following a well-defined gradient descent path. Despite this, non-gradient optimization is a valuable tool when optimizing complex engineering designs where gradients are not readily available or easily calculated.

This tutorial explores the inverse design methods for metasurfaces using ML and optimization algorithms (Fig. 1). It aims to provide a deeper understanding of the functionalities, strengths, and limitations of various inverse design approaches with a detailed examination of their methodologies. The first part of the review covers forward calculation in optical simulations, which forms the basis for analyzing and designing metasurfaces. These include finite-difference time-domain (FDTD), rigorous coupled-wave analysis (RCWA), and finite element method (FEM). Next, we discuss the use of advanced ML approaches for inverse design, such as neural network (NN), Markov decision process (MDP), and Monte Carlo (MC) methods. Finally, we explore various optimization techniques employed to improve and optimize the design process, including automatic differentiation (AD), adjoint method (AM), genetic algorithm (GA), and particle swarm optimization (PSO).

Figure 1. Schematic illustration of an overview of inverse design metasurfaces using machine learning and optimization methods.

II. FORWARD CALCULATION IN OPTICAL SIMULATION

Electromagnetism, one of the fundamental forces of nature, is essential to all innovative technologies and natural phenomena. Central to understanding electromagnetism are Maxwell’s equations, which provide a mathematical framework for describing electromagnetic fields. Early research focused on identifying exact solutions of Maxwell’s equations for canonical geometries. Subsequently, innovative research introduced approaches including the Sommerfeld integral [15], Rayleigh scattering [16], Mie scattering [17], and the Debye model [18].

The advent of advanced computing has led to more efficient and flexible approaches to solving Maxwell’s equations in complex scenarios. Unlike precise or approximative analytical techniques, numerical methods offer greater adaptability and can handle more intricate situations. However, the scale at which structures can be developed and simulated in computational domains is constrained by available computational resources, which significantly affects the number of simulations that can be conducted.

In this chapter, we introduce key computational techniques such as the FDTD, RCWA, and FEM [19–21].

2.1. Finite-difference Time-domain (FDTD)

The FDTD method is a widely used numerical technique for simulating light-matter interactions in various media. Introduced in 1966, FDTD solves Maxwell’s equations in the time domain by discretizing both time and space into grid units with alternating electromagnetic field components. Figure 2(a) illustrates the FDTD approach. It is particularly effective for full-wave evaluations of electromagnetic (EM) waves in complex media and intricate geometries.

Figure 2. Schematic of forward calculation techniques in optical simulation: (a) Finite-difference time-domain (FDTD), (b) rigorous coupled-wave analysis (RCWA), and (c) finite element method (FEM).

Using the Yee grid, the FDTD method iteratively updates electric (E) and magnetic (H) fields over a segmented simulation volume until the volume is saturated with excitation fields or until they exit. The technique approximates spatial and temporal derivatives using finite differences to solve Maxwell’s equations.

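Faraday’s law is as follows:

$$ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \tag{1} $$

and the Ampere-Maxwell law is as follows:

$$ \nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}, \tag{2} $$

where J represents electric current density, D is electric displacement, and B is magnetic flux density. Material properties are characterized by conductivity (σ), electric permittivity (ε), and magnetic permeability (µ).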

The FDTD method requires many spatial pixels and many time steps to achieve high accuracy, and typically uses pixel sizes smaller than the wavelength of light. This method is computationally intensive, but scaling gradient computation helps optimize time and memory requirements based on independent input and output parameters.
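As a minimal illustration of the leapfrog update described above, the following sketch advances a one-dimensional Yee grid in normalized units (vacuum, unit cell size, unit speed of light, no conductivity). The function name `fdtd_1d`, the Gaussian soft source, and all parameter values are illustrative assumptions and are not taken from any of the cited solvers.

```python
import math

def fdtd_1d(n_cells=400, n_steps=160, courant=0.5, src=200):
    """Leapfrog update of Ez and Hy on a staggered 1D Yee grid (normalized units)."""
    ez = [0.0] * n_cells  # electric field sampled at integer grid points
    hy = [0.0] * n_cells  # magnetic field sampled half a cell to the right
    for n in range(n_steps):
        # Faraday's law: advance H using the spatial difference of E
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Ampere-Maxwell law: advance E using the spatial difference of H
        for i in range(1, n_cells):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # soft source: Gaussian pulse in time, added at a single cell
        ez[src] += math.exp(-((n - 30) / 10.0) ** 2)
    return ez

field = fdtd_1d()
```

The Courant number (here 0.5) ties the time step to the cell size and must not exceed 1 in one dimension for the scheme to remain stable, which is one reason fine spatial grids also force many time steps.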

The FDTD method is valued for its accuracy in simulating complex geometries and materials across broad frequency ranges, particularly for time-domain responses, such as broadband or fluctuating signals [22]. However, accurately capturing wave phenomena demands precise spatial discretization, which can substantially increase computational costs, especially in large-scale or broadband-frequency applications. Its applicability and adaptability are improved by open-source software such as Ceviche [23] and Meep [24], which provide memory synchronization and full scriptability.

FDTD is applicable in fields from classical to quantum physics and from subatomic to interstellar lengths. It plays a significant role in microstructures and enables simulations of natural wide-spectrum sources and analysis of various parameters such as structural size, morphology, and refractive indices before fabrication.

2.2. Rigorous Coupled-wave Analysis (RCWA)

The RCWA method is a powerful semi-analytical computational method for solving Maxwell’s equations in periodic structures. Initially developed by M. G. Moharam and Thomas K. Gaylord in the 1980s, RCWA has since become a standard tool in computational electromagnetics, particularly for analyzing diffraction gratings, photonic crystals, and other periodic optical devices.

RCWA, also known as the Fourier modal method (FMM) or the transfer matrix method, operates on a plane-wave basis and relies on Fourier-space analysis to efficiently calculate electromagnetic eigenmodes for transmittance and reflectance [25]. The fundamental principle of RCWA involves transforming Maxwell’s equations into Fourier space, where the periodicity of the structure enables an efficient semi-analytical solution through the division of electromagnetic fields into spatial harmonics. RCWA discretizes the spatial domain in the x- and y-dimensions by assuming uniformity in the z-direction. The RCWA technique is depicted in Fig. 2(b). By applying Bloch’s theorem, it determines the Bloch modes within the diffraction layers, which restrict the electric field responses to a finite set due to the periodic nature of the structure [26]. The material topology of each layer, expressed in Fourier space, is inherently linked to these modes and their Fourier components. The EM field propagation within the structure is then calculated using a modified transfer matrix approach for precise mathematical modeling of the light behavior throughout the system [27]. RCWA is particularly effective in evaluating two-dimensional and three-dimensional periodic structures such as harmonic waveguides [28], photonic crystals [29], and diffraction gratings [30]. Its ability to analyze both the H field profile and potential modes within the structure makes it especially valuable for applications involving 2D silicon meta-gratings.

Compared to fully numerical methods such as FDTD, RCWA offers significant computational advantages due to its selective discretization, and it can efficiently handle complex multilayered structures with varying refractive indices. This efficiency is particularly beneficial when working with large-scale periodic structures. However, the precision and effectiveness of RCWA depend on the choice of Fourier components. Increasing the number of Fourier components enhances simulation accuracy but simultaneously increases computational resource demands and requires a careful balance between accuracy and efficiency.
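The accuracy-cost trade-off can already be seen in the Fourier representation of the permittivity profile that RCWA takes as input. The sketch below is not an RCWA solver; it only truncates the Fourier series of a binary grating and records how the reconstruction at the ridge center approaches the true permittivity as harmonics are added. The function names, the fill factor of 0.5, and the permittivity values are illustrative assumptions.

```python
import math

def grating_fourier_coeff(m, eps_ridge, eps_gap, fill):
    """m-th Fourier coefficient of a binary grating whose ridge is centered at x = 0."""
    if m == 0:
        return eps_ridge * fill + eps_gap * (1.0 - fill)
    return (eps_ridge - eps_gap) * math.sin(math.pi * m * fill) / (math.pi * m)

def reconstruct(x_over_period, n_harmonics, eps_ridge=12.0, eps_gap=1.0, fill=0.5):
    """Permittivity rebuilt from harmonics -n_harmonics..n_harmonics."""
    total = grating_fourier_coeff(0, eps_ridge, eps_gap, fill)
    for m in range(1, n_harmonics + 1):
        coeff = grating_fourier_coeff(m, eps_ridge, eps_gap, fill)
        # coefficients of order +m and -m are equal for this even profile
        total += 2.0 * coeff * math.cos(2.0 * math.pi * m * x_over_period)
    return total

# absolute error at the ridge center (true value 12.0) for increasing truncation
errors = {n: abs(reconstruct(0.0, n) - 12.0) for n in (5, 21, 101)}
```

In a full solver the eigenvalue problem in each layer grows with the number of retained harmonics, so the slow decay of this error for sharp material boundaries translates directly into computational cost.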

Ongoing developments continue to improve RCWA’s convergence rates with the aim of more accurate and resource-efficient simulations. Widely used open-source RCWA tools include S4 [31] and RETICOLO [32], which are designed for Python and MATLAB environments, respectively. More recent software, such as MAXIM [33], offers user-friendly graphical interfaces. Additionally, tools such as Meent [34] have introduced advancements in convergence and AD, significantly enhancing the efficiency and capabilities of RCWA for photonics research.

2.3. Finite Element Method (FEM)

FEM first emerged in the 1940s to address challenges in structural and aeronautical engineering [35, 36] and has since expanded to a wide range of engineering problems. FEM is a powerful numerical technique that approximates solutions to partial differential equations and is widely employed for simulating optical phenomena and optical device characteristics governed by Maxwell’s equations.

The fundamental concept of FEM is meshing: the solution domain, including large volumes and complex structures, is divided into many small, manageable finite elements. These elements, which can be triangles, quadrilaterals, tetrahedra, prisms, or hexahedra, can be used in the creation of irregular meshes capable of capturing intricate design units, as shown in Fig. 2(c). After meshing, the solution is estimated using a finite number of basis functions, typically low-order polynomials that are non-zero over a narrow range of adjacent elements. A key aspect of FEM is the Galerkin method, which aims to minimize the residuals of the differential equation in a weak formulation using test or weighting functions that are often identical to the basis functions [37].
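To make the Galerkin procedure concrete, the following sketch solves the one-dimensional model problem $-u'' = f$ on $[0, 1]$ with $u(0) = u(1) = 0$ using piecewise-linear basis functions on a uniform mesh. The model problem, the function name `solve_poisson_fem`, and the direct tridiagonal solve are assumptions chosen for brevity; an electromagnetic FEM solver would instead discretize a weak form of Maxwell’s equations on an unstructured mesh.

```python
def solve_poisson_fem(n_elems=8, f=1.0):
    """Linear-element Galerkin FEM for -u'' = f on [0, 1], u(0) = u(1) = 0."""
    h = 1.0 / n_elems
    m = n_elems - 1                      # number of interior nodes (unknowns)
    diag = [2.0 / h] * m                 # stiffness matrix: integrals of phi_i' * phi_i'
    off = [-1.0 / h] * (m - 1)           # coupling between neighboring hat functions
    rhs = [f * h] * m                    # load vector: integral of f * phi_i
    # Thomas algorithm: forward elimination on the symmetric tridiagonal system
    for i in range(1, m):
        w = off[i - 1] / diag[i - 1]
        diag[i] -= w * off[i - 1]
        rhs[i] -= w * rhs[i - 1]
    # back substitution
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return [0.0] + u + [0.0]             # add the boundary values

solution = solve_poisson_fem(n_elems=8)
```

For constant `f` the exact solution is $u(x) = f\,x(1 - x)/2$, which gives a simple check on the nodal values returned by the sketch.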

FEM has proven very effective in solving complex mathematical equations, especially in fields such as aircraft design, and is renowned for its accuracy in handling complex shapes and structures. It is also versatile and applicable to various engineering problems in fluid dynamics, electromagnetics, and more [38]. However, a significant drawback is its high computational demand, particularly when applied to large-scale problems with intricate geometries. Despite its computation-hungry nature, FEM remains invaluable in many disciplines due to its accuracy and flexibility. Commercial software programs such as ANSYS [39] and COMSOL Multiphysics [40] leverage FEM’s capabilities, enabling researchers, scientists, and engineers to perform structural analysis, electromagnetics, fluid dynamics simulations, and other applications.

These three methods, FDTD, RCWA, and FEM, form the foundation for forward calculations in optical simulations. While FDTD offers high accuracy for time-domain simulations, RCWA provides efficiency in handling periodic structures, and FEM excels in complex geometrical analysis. Together, they facilitate detailed simulations of electromagnetic interactions and serve as essential tools for the inverse design of metasurfaces, which will be discussed in the following chapter.

III. ADVANCED METHODS FOR INVERSE DESIGN

The inverse design of metasurfaces is primarily achieved using ML and optimization algorithms [9]. ML employs data-driven models to learn design patterns and predict outcomes [41], while optimization techniques use gradient-based or evolutionary algorithms to refine designs [11, 42]. These approaches facilitate the creation of complex photonic structures that are often difficult to achieve with conventional methods. In this section, we introduce ML and optimization algorithms, along with key research studies that have effectively applied these methods to design various metasurfaces.

3.1. Machine Learning Techniques for Inverse Design

3.1.1. Neural Networks (NN)

An NN is a computational model inspired by the structure of human brain neurons [43]. Similar to how neurons exchange signals, NNs consist of multiple nodes that process inputs, apply weights, and produce outputs through activation functions. They learn through interconnected layers: The input layer receives data first and sends it to the hidden layer, the hidden layer processes it, and the output layer generates results. By adjusting weights, an NN can recognize patterns in the input and predict outputs. The framework of the NN is shown in Fig. 3(a).

Figure 3. An architecture of neural network (NN) model and nanophotonic device design representation. (a) A framework of NN modeled after human neurons, where each node is connected to others. (b) Schematic for forward and reverse modeling of nanophotonic devices using deep NN. Reprinted from M. H. Tahersima et al. Sci. Rep. 9, 1368 (2019). Copyright © 2019, M. H. Tahersima et al. [54].

A deep neural network (DNN) extends the NN structure by incorporating multiple hidden layers [44] to effectively capture features even in high-dimensional data. DNNs also support transfer learning, allowing them to adapt to different domains. When every node in one layer is connected to every node in the subsequent layer, it forms a fully connected neural network, in which connections exist only between nodes in immediately adjacent layers. Recurrent neural networks (RNNs) [45], on the other hand, use a loop structure to handle sequential data and learn from both input and hidden layers.

For inverse design, convolutional neural networks (CNNs), a widely used DNN variant, have proven to be very effective [46, 47]. Unlike standard NNs, CNNs excel in image processing by using convolution and pooling operations to extract key features. Convolution involves applying a kernel matrix over the input image to create a feature map to capture essential details [48]. Pooling then reduces the size of the feature map by selecting maximum, average, or minimum values, which preserves crucial information and reduces the computational load [49, 50]. This makes CNNs highly effective for tasks such as object recognition and image analysis.
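The convolution and pooling operations described above can be sketched in a few lines of plain Python. The 5 × 5 binary pattern (a stand-in for a pixelated meta-atom image), the 2 × 2 edge-detecting kernel, and the function names are illustrative assumptions; practical CNNs learn their kernels from data and rely on optimized libraries.

```python
def conv2d(image, kernel):
    """Sliding-window product sum (stride 1, no padding), as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

# binary pattern with a vertical material boundary between columns 1 and 2
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]            # responds to left-to-right increases

feature_map = conv2d(image, kernel)    # four rows of [0, 2, 0, 0]
pooled = max_pool(feature_map)         # [[2, 0], [2, 0]]
```

The feature map is non-zero only where the kernel straddles the boundary, and pooling halves each dimension while keeping that response.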

NNs have been successfully applied in nano-manufacturing and material exploration to improve efficiency and enable real-time control and feedback [51]. For example, Li et al. [52] used CNN to map meta-atom structures to electromagnetic reflection spectra, creating a high-fidelity prediction model that accurately predicted amplitude and phase over an ultra-wide frequency band with low errors and short simulation times.

Moreover, traditional design methods are often time-consuming, require specialized expertise, and are associated with high costs. In contrast, NN offers a faster, more efficient, and more accurate alternative for making predictions. For instance, Chen et al. [53] developed a hybrid CNN and RNN model that demonstrated strong adaptability and reduced design time, approximately 4,800 times shorter than designing with the FDTD method, to inversely engineer an all-dielectric nanohole metasurface structure. Tahersima et al. [54] designed a power splitter using DNN and achieved a maximum transmission efficiency of more than 90% and target splitting specification while minimizing reflections [Fig. 3(b)]. These examples illustrate the effectiveness of NNs in the inverse design of metasurfaces.

3.1.2. Markov Decision Processes (MDP)

An MDP is a probabilistic model in ML that represents an agent’s decision-making process under uncertainty [55]. In MDPs, each state depends only on the immediately preceding state, making it a stochastic process with the Markov property, known as a Markov chain. Given the current state $s_t$, the future state $s_{t+1}$ is determined solely by $s_t$, and is independent of past states. This can be expressed mathematically as:

$$ P(s_{t+1} \mid s_t) = P(s_{t+1} \mid s_1, s_2, \ldots, s_t). \tag{3} $$

MDPs are built upon Markov processes (MPs) where transitions between states are based on a specified probability distribution [56]. The future state can be expressed by the transition probability matrix, where an MP is defined by a set of states S and transition probabilities P denoted as:

$$ \mathrm{MP} \equiv \langle S, P \rangle, \tag{4} $$
$$ P_{ss'} = P[S_{t+1} = s' \mid S_t = s]. \tag{5} $$

Here, s represents any state in S, and s′ denotes a future state. If the reward concept is introduced to MP, the model becomes a Markov reward process (MRP) [57], represented as:

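$$ \mathrm{MRP} \equiv \langle S, P, R, \gamma \rangle, \tag{6} $$
$$ R(s) = \mathbb{E}[R_t \mid S_t = s], \tag{7} $$
$$ V(s) = R(s) + \gamma \sum_{s' \in S} P(s' \mid s)\, V(s'). \tag{8} $$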

In this context, R(s) is the expected reward at state s, and γ is a discount factor balancing immediate and future rewards. An MDP extends this model by incorporating actions A that an agent can take in each state [58]. The schematic of MDP is depicted in Fig. 4(a). The objective is to find an optimal policy π that guides the agent’s actions to maximize cumulative rewards. Formally, an MDP can be expressed as:

$$ \mathrm{MDP} \equiv \langle S, A, P, R, \gamma \rangle, \tag{9} $$
$$ P_{ss'}^{a} = P[S_{t+1} = s' \mid S_t = s, A_t = a], \tag{10} $$
$$ R_{s}^{a} = \mathbb{E}[R_{t+1} \mid S_t = s, A_t = a], \tag{11} $$
$$ \pi(a \mid s) = P[A_t = a \mid S_t = s]. \tag{12} $$

Here, π(a|s) represents the policy function indicating the probability of choosing action a in state s, and the sum of probabilities for all actions in each state must equal 1.

Figure 4. An architecture of the Markov decision process (MDP) and its utility representation. (a) Framework of the MDP, (b) details of the algorithm L2DO proposed by R. Li et al. [59]. This algorithm uses the MDP framework for photonics inverse design, with the environment defined using FDTD-based simulation. Adapted from Li et al. Nanophotonics 12, 319–334 (2023). Copyright © 2023, R. Li et al. [59].
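One standard way to obtain an optimal policy from the quantities defined above is value iteration, sketched here on a toy problem. The five states stand for discrete values of a single design parameter, the three actions decrease, keep, or increase it, and the reward is a made-up figure of merit of the resulting state; in an actual inverse design loop this reward would come from an optical simulation. All names and numbers are illustrative assumptions.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-10):
    """transitions[s][a] is a list of (probability, next_state, reward) triples."""
    def action_value(outcomes, values):
        # expected immediate reward plus discounted value of the next state
        return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)

    values = [0.0] * len(transitions)
    while True:
        delta = 0.0
        for s, actions in enumerate(transitions):
            best = max(action_value(o, values) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # greedy policy with respect to the converged values
    policy = [max(actions, key=lambda a: action_value(actions[a], values))
              for actions in transitions]
    return values, policy

merit = [0.1, 0.3, 0.9, 0.4, 0.2]       # hypothetical figure of merit per state
n_states = len(merit)
transitions = []
for s in range(n_states):
    actions = {}
    for a in (-1, 0, 1):                 # decrease, keep, increase the parameter
        s2 = min(max(s + a, 0), n_states - 1)
        actions[a] = [(1.0, s2, merit[s2])]   # deterministic transition
    transitions.append(actions)

values, policy = value_iteration(transitions)
# policy == [1, 1, 0, -1, -1]: every state moves toward, then stays at, state 2
```

The transitions here are deterministic for readability; stochastic environments are handled by listing several `(probability, next_state, reward)` triples per action.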

MDP is useful for modeling environments with uncertainty and offers the advantage of evaluating various scenarios. This has led to its application in the inverse design of metamaterials and wave scattering optimization [59–61]. For instance, Li et al. [59] developed an MDP-based algorithm using FDTD simulations to allow agents to explore refractive index, spatial arrangement, length, width, and thickness of materials. The framework of the proposed algorithm is shown in Fig. 4(b). Similarly, Park et al. [61] utilized MDP and Q-learning to design a 1D grating, incorporating physical information into reinforcement learning to achieve highly efficient and complex optical designs. This MDP-based framework can also be extended to design more intricate devices, such as two-dimensional meta-gratings or meta-lenses, in combination with other optical simulation tools [62].

3.1.3. Monte Carlo (MC) Methods

The MC method is an algorithm that uses random sampling to approximate solutions for complex problems [63]. Based on the law of large numbers [64], it ensures that the average of a large number of samples closely approximates the true average of the entire population. MC methods are widely used in optimization, numerical integration, and probability distribution analysis, with applications spanning fields such as biotechnology and space engineering.

One of the key applications of MC is the MC integration [65], which estimates the integral of complex expressions using probabilistic sampling. Another variant, the Markov chain Monte Carlo (MCMC) method [66], generates samples using a Markov chain from a probability distribution to estimate characteristics of multidimensional distributions that are otherwise difficult to calculate directly.
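As a minimal sketch of MC integration, the following code estimates a one-dimensional integral by averaging the integrand at uniformly sampled points. The integrand, the sample count, the fixed seed, and the function name `mc_integrate` are illustrative assumptions.

```python
import random

def mc_integrate(func, a, b, n_samples, seed=0):
    """Estimate the integral of func over [a, b] from uniform random samples."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_samples):
        total += func(rng.uniform(a, b))
    # interval length times the sample mean of the integrand
    return (b - a) * total / n_samples

# integral of x^2 over [0, 1]; the exact value is 1/3
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000)
```

The statistical error of such an estimate decreases as the inverse square root of the number of samples regardless of dimension, which is why the approach is attractive for high-dimensional integrals.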

MCMC is particularly useful in Bayesian simulations, where it helps sample prior and posterior probabilities [67, 68]. Bayesian simulations, based on Bayes’ theorem, update the probability of an event as new data becomes available:

$$ P(\theta \mid X) = \frac{P(X \mid \theta)\, P(\theta)}{P(X)}. \tag{13} $$

In Eq. (13), P(θ|X) represents the posterior probability of θ given the new data X, P(X|θ) is the likelihood of X given θ, P(θ) is the prior probability of θ, and P(X) is the overall probability of observing X. Bayesian simulation uses this approach to approximate posterior distributions.
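A short Metropolis sampler, one of the simplest MCMC algorithms, shows how a posterior can be approximated without ever evaluating P(X). The example is deliberately generic: θ is the success probability of a binary experiment with a uniform prior, and the data (7 successes in 10 trials) are invented for illustration. The function names, the proposal width, and the burn-in length are likewise assumptions.

```python
import math
import random

def log_posterior(theta, successes, trials):
    """Unnormalized log posterior: binomial likelihood times a uniform prior on (0, 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)

def metropolis(successes, trials, n_samples=20000, step=0.2, seed=0):
    rng = random.Random(seed)
    theta = 0.5
    logp = log_posterior(theta, successes, trials)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)      # symmetric random-walk proposal
        logp_new = log_posterior(proposal, successes, trials)
        # accept with probability min(1, ratio); P(X) cancels in the ratio
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            theta, logp = proposal, logp_new
        samples.append(theta)
    return samples

chain = metropolis(successes=7, trials=10)
kept = chain[2000:]                                  # discard burn-in samples
posterior_mean = sum(kept) / len(kept)
```

For this particular model the posterior is a Beta(8, 4) distribution with mean 2/3, so the sampled mean can be checked against a closed-form value.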

In metasurface design, MC methods can be used to model the impact of manufacturing errors of the metasurface on performance [69, 70]. Lin et al. [71] developed an optical inverse design algorithm using the MC tree search algorithm, clustering initial samples and evaluating the average performance index of each cluster to optimize designs. This method effectively solved complex design problems and achieved near-perfect reflection within the red wavelength range. Similarly, Wray et al. [72] used MC integration to design optical filters by sampling general optical properties of atom shapes and generating random metasurfaces with Bayesian optimization. This study demonstrated the capability of designing various fundamental filter types (bandpass, shortpass, longpass, and bandstop) using a single material and a single layer. These examples highlight how MC methods, particularly MCMC, are valuable tools for managing uncertainty and enhancing the inverse design process in metasurface engineering.

3.2. Optimization Algorithms for Inverse Design

3.2.1. Automatic Differentiation (AD)

Gradient-based optimizations are methods used to solve optimization problems by leveraging the gradient of an objective function, such as a loss function [73, 74]. Starting from an initial point of the design variables, they iteratively update the design by moving against the gradient, that is, in the direction that decreases the objective function. If the new design meets the convergence criterion or reaches the function’s minimum value, the algorithms terminate; otherwise, they continue to adjust according to a specified learning rate. The update rule is expressed as:

$$ x_{i+1} = x_i - \alpha \nabla f(x_i). \tag{14} $$

Here, $x_{i+1}$ represents the updated coordinate, $x_i$ is the current coordinate, α is the learning rate, and $\nabla f(x_i)$ represents the gradient of function f at $x_i$.
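The update rule can be sketched directly in code. The quadratic objective, the starting point, the learning rate, and the stopping tolerance below are illustrative assumptions; in inverse design the gradient would be supplied by a differentiable simulation rather than a closed-form expression.

```python
def gradient_descent(grad, x0, lr=0.1, tol=1e-8, max_iter=10_000):
    """Repeat x <- x - lr * grad(x) until the gradient norm falls below tol."""
    x = list(x0)
    n_iter = 0
    while n_iter < max_iter:
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:   # convergence criterion
            break
        x = [xi - lr * gi for xi, gi in zip(x, g)]
        n_iter += 1
    return x, n_iter

def grad_f(x):
    """Gradient of f(x, y) = (x - 1)^2 + 2 (y + 0.5)^2, minimized at (1, -0.5)."""
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]

x_opt, n_iter = gradient_descent(grad_f, [3.0, 2.0])
```

Because this objective is convex, any starting point leads to the same minimum; for the nonconvex objectives typical of photonic design, the result depends on the initial point.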

Since these methods rely on the gradient, they are very sensitive to the initial value and often fail to find the global optimum, especially in high-dimensional problems. To address these challenges, various algorithms such as stochastic gradient descent (SGD) [75], gradient descent with momentum (GDM) [76], and adaptive moment estimation (Adam) [77] have been developed. Two prominent gradient-based optimization techniques, AD and AM, efficiently calculate gradients in complex systems with numerous variables, making them particularly useful for metasurface design. AD is an algorithm that automatically differentiates functions and accumulates the gradients of each operation efficiently, while AM performs a forward pass to obtain the objective function value and a backward pass to calculate the gradient. In other words, the AD and AM approaches were developed to reduce computational demands by calculating gradients with only one or two simulations, making them particularly useful in high-dimensional parameter spaces [10, 78].

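AD is a computational technique that computes partial derivatives of a function using the chain rule [79, 80]. The chain rule can be expressed as:

$$ y = f(g(h(x))) = f(g(v)) = f(u), \tag{15} $$
$$ \frac{\partial y}{\partial x} = \frac{\partial y}{\partial u} \frac{\partial u}{\partial v} \frac{\partial v}{\partial x}. \tag{16} $$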

AD breaks a function into smaller components and automatically calculates derivatives for each part, and functions primarily in two modes: Forward mode and reverse mode. In forward mode, derivatives are calculated from input x to output y, which is efficient when the input dimension is smaller than the output dimension. Conversely, reverse mode calculates gradients from output y back to input x, which is efficient when the output dimension is smaller than the input dimension. Figure 5 visually illustrates this calculation method. AD is widely used in ML and optimization problems, where the backpropagation algorithm employs AD to update weights in NNs while propagating the gradient backward [81].
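The two modes can be contrasted on a single chain y = f(g(h(x))). In the sketch below, forward mode is implemented with dual numbers that carry a derivative alongside each value, while the reverse sweep is written out by hand for this one chain to show the order of accumulation; an AD library would record the operations and perform that sweep automatically. The choice h(x) = x², g(v) = sin v, f(u) = exp u and all names are illustrative assumptions.

```python
import math

class Dual:
    """A value paired with its derivative with respect to the seeded input."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __mul__(self, other):
        # product rule
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def dual_sin(d):
    return Dual(math.sin(d.val), math.cos(d.val) * d.dot)

def dual_exp(d):
    return Dual(math.exp(d.val), math.exp(d.val) * d.dot)

def forward_mode(x):
    xd = Dual(x, 1.0)        # seed: dx/dx = 1
    v = xd * xd              # v = h(x) = x^2
    u = dual_sin(v)          # u = g(v) = sin(v)
    y = dual_exp(u)          # y = f(u) = exp(u)
    return y.val, y.dot      # derivative travels from input to output

def reverse_mode(x):
    # forward pass: evaluate and store the intermediate values
    v = x * x
    u = math.sin(v)
    y = math.exp(u)
    # backward pass: accumulate dy/d(.) from the output toward the input
    dy_du = math.exp(u)
    dy_dv = dy_du * math.cos(v)
    dy_dx = dy_dv * 2.0 * x
    return y, dy_dx

y_fwd, grad_fwd = forward_mode(0.7)
y_rev, grad_rev = reverse_mode(0.7)
```

Both functions return the same derivative; the difference lies in cost when there are many inputs or many outputs, as described in the text.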

Figure 5. A schematic of automatic differentiation (AD). In forward mode, derivatives are calculated from the input x to the output y, while in backward mode, derivatives are calculated from the output y back to the input x.

AD can be integrated into various optical simulations to design optimal device structures [61] and offers more efficient computations than other methods, such as the Gerchberg-Saxton algorithm, at high degrees of freedom [82]. Colburn and Majumdar [83] implemented AD for matrices with complex degenerate eigenvalues in optical design for faster gradient computation. AD is also used to design a phase profile of the metasurface with a desired electromagnetic response. For example, So et al. [84] used AD to design a single metasurface for three-dimensional, multi-color RGB holograms. Jin et al. [85] demonstrated that AD combined with RCWA can be used to modulate the scattering properties of nanostructures, which has applications in thermal management, integrated photonics, and miniature mirrors.

3.2.2. Adjoint Methods (AM)

Like AD, the AM is an algorithm that calculates gradients efficiently [86]. It employs adjoint equations to compute the gradient of an objective subject to the constraints of a given optimization problem, typically represented as g(x, p) = 0, where x denotes the state variables and p the design parameters [87]. The adjoint equation can be expressed as:

$$ \left( \frac{\partial g}{\partial x} \right)^{T} \lambda = \left( \frac{\partial f}{\partial x} \right)^{T}, \tag{17} $$

where g is the constraint function, λ is the adjoint variable, and f is the target function to be optimized. The gradients are calculated by evaluating the adjoint equation in reverse time.
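A small linear example illustrates why the adjoint approach is economical. Assume the constraint is g(x, p) = A(p)x − b = 0 and the objective is f = cᵀx. With the sign convention (∂g/∂x)ᵀλ = (∂f/∂x)ᵀ, the gradient is df/dp = ∂f/∂p − λᵀ ∂g/∂p, so one forward solve and one adjoint solve give the derivative with respect to every parameter. The 2 × 2 system, the vectors `B` and `C`, and the function names are illustrative assumptions.

```python
def solve2(a, b):
    """Solve a 2 x 2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

def system_matrix(p):
    # each parameter p[k] enters only the diagonal entry (k, k)
    return [[2.0 + p[0], 1.0], [1.0, 3.0 + p[1]]]

B = [1.0, 2.0]    # right-hand side of the constraint
C = [1.0, -1.0]   # objective weights, f = C . x

def objective(p):
    x = solve2(system_matrix(p), B)
    return C[0] * x[0] + C[1] * x[1]

def adjoint_gradient(p):
    a = system_matrix(p)
    x = solve2(a, B)                                   # forward solve: A x = b
    a_t = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]     # transpose of dg/dx = A
    lam = solve2(a_t, C)                               # adjoint solve: A^T lam = c
    # df/dp_k = -lam^T (dA/dp_k) x, and dA/dp_k is 1 at (k, k), 0 elsewhere
    return [-lam[0] * x[0], -lam[1] * x[1]]

p0 = [0.5, 0.2]
grad = adjoint_gradient(p0)
```

A finite-difference check would need one extra solve per parameter, whereas the adjoint gradient above uses two solves in total regardless of the number of parameters.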

AM is widely used in the inverse design of high-numerical-aperture meta-lenses [88] and for optimizing complex functionalities such as angle-multiplexed metasurface holograms [89–91]. This approach enables the creation of highly efficient metasurfaces with improved accuracy, even in constrained design spaces. For example, Ma et al. [92] used AM to design a double-slot metasurface with a periodic gold structure and achieved a sixfold increase in spontaneous emission efficiency. Similarly, Yin et al. [93] applied AM to optimize metasurface design and analyzed how meta-atom size influences optical properties.

3.2.3. Genetic Algorithms (GA)

Non-gradient optimization, also known as metaheuristic optimization, uses randomness to search for optimal solutions, in contrast to gradient-based optimization, which follows a deterministic path. Heuristic methods are designed to solve problems more quickly and efficiently, and are often inspired by social behavior, natural processes, or phenomena. By randomly selecting data to explore the parameter space, these methods increase the likelihood of finding a global optimum. However, exploring the parameter space without gradient guidance can be slower compared to following a well-defined gradient descent path. Nevertheless, non-gradient optimization remains a valuable tool for optimizing complex engineering designs where gradients are either not readily available or difficult to compute.

GA [94] is an optimization technique inspired by biological evolution that employs operators such as mutation, crossover, and selection to evolve solutions towards an optimal result. In each iteration, GA selects individuals from the current population as parents to produce offspring for the next generation. Through successive iterations, the population gradually evolves toward better solutions. GA typically begins with a randomly generated population where each individual’s fitness is evaluated using a user-defined objective function. The fittest individuals are selected as parents, ensuring that advantageous traits are passed on to the next generation. A flowchart of GA is shown in Fig. 6(a).

Figure 6. A diagram of genetic algorithms (GA) and their application. (a) A schematic of GA. The algorithm follows a cyclical form, as illustrated, repeatedly evolving generations to find individuals that best satisfy the fitness function. (b) Schematic of a traditional imaging system with a color filter (left) compared to an imaging system with a metasurface (right). (b) is reprinted from X. Zou et al. Nat. Commun. 13, 3288 (2022). Copyright © 2022, X. Zou et al. [106].

In GA, individuals are often represented as fixed-size arrays, which facilitates crossover operations. Several types of crossover exist [95]. A common one is single-point crossover, in which two parents are cut at the same position and their segments are exchanged; for an array of size n, there are n − 1 possible crossover points. Multi-point crossover instead cuts the individuals at two or more points, allowing greater diversity but typically resulting in slower convergence. Other methods include uniform crossover [96], which exchanges genes probabilistically; cycle crossover [97], which operates on permutations; and heuristic crossover [98].
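The crossover operators described above can be sketched as follows for list-encoded individuals. This is a minimal illustration under stated assumptions: the function names, the `swap_prob` parameter, and the use of a `random.Random` instance are choices made here and are not taken from the cited references.

```python
import random

def single_point(p1, p2, rng):
    """Cut both parents at one of the n - 1 interior points and swap the tails."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point(p1, p2, rng):
    """Cut at two distinct interior points and swap the middle segment."""
    i, j = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]

def uniform(p1, p2, rng, swap_prob=0.5):
    """Decide gene by gene, with probability swap_prob, whether to swap."""
    c1, c2 = list(p1), list(p2)
    for k in range(len(p1)):
        if rng.random() < swap_prob:
            c1[k], c2[k] = c2[k], c1[k]
    return c1, c2
```

For example, `single_point([0] * 8, [1] * 8, random.Random(0))` returns two children that each start with genes from one parent and end with genes from the other. Note that `two_point` requires individuals with at least three genes.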

Mutation, another essential GA operation, introduces diversity by randomly altering genes, producing values that are not present in the parent individuals. This expands the search space and helps the population escape local optima. Although mutations occur probabilistically and can sometimes degrade a solution, they are essential for maintaining diversity in the population. Various forms of mutation exist, including swapping values or shuffling genes [99]. Another is non-uniform mutation, in which the mutation intensity decreases as the process advances [100]. The algorithm terminates when a maximum number of generations is reached or a satisfactory fitness level is achieved.
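Putting selection, crossover, mutation, and termination together, a complete GA loop might look like the sketch below. The fitness function (counting ones in a bit string) is a placeholder assumption standing in for an electromagnetic solver, and tournament selection, bit-flip mutation at a constant rate, and elitism are illustrative choices rather than the configuration of any cited work.

```python
import random

def fitness(ind):
    """Placeholder objective: number of ones (a real design would call a solver)."""
    return sum(ind)

def tournament(pop, rng, k=3):
    """Select a parent as the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)

def crossover(p1, p2, rng):
    """Single-point crossover producing one child."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:]

def mutate(ind, rng, rate=0.05):
    """Flip each bit independently with probability rate."""
    return [1 - g if rng.random() < rate else g for g in ind]

def run_ga(n_genes=20, pop_size=30, max_gen=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = [fitness(max(pop, key=fitness))]
    for _ in range(max_gen):                 # stop after a maximum number of generations
        best = max(pop, key=fitness)
        if fitness(best) >= n_genes:         # or once a satisfactory fitness is reached
            break
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [best] + children              # elitism: the current best always survives
        history.append(fitness(max(pop, key=fitness)))
    return max(pop, key=fitness), history
```

`run_ga()` returns the best individual found and the best fitness per generation. Because the best individual is copied unchanged into each new generation, the recorded fitness never decreases, although reaching the global optimum is not guaranteed.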

GA has been successfully applied to metasurface design to optimize optical properties [101, 102]. For example, Wang et al. [103] used GA to design an ultra-broadband absorptive metasurface with polarization angle insensitivity and stable oblique incidence performance. The design demonstrated an adjustable absorption rate between 4% and 100% across an ultra-broadband range. Similarly, the absorptive metasurface developed by Luo et al. [104] has coherent absorption characteristics, with an absorption rate exceeding 90% over a wide range of incident angles, and is insensitive to both TM and TE polarization. Yu et al. [105] used a multi-objective GA to design a metasurface-based microwave filter that achieved high transmittance for a target electromagnetic response at dual-bandpass. Additionally, Zou et al. [106] used GA to design a color filter for high-intensity imaging that achieved twice the image intensity of commercial color filters, as shown in Fig. 6(b). These examples highlight GA’s versatility and effectiveness in addressing complex optimization challenges in metasurface design.

3.2.4. Particle Swarm Optimization (PSO)

PSO [107] is another metaheuristic algorithm, inspired by the collective movement of organisms such as flocks of birds or schools of fish. Like GA, PSO performs optimization without relying on gradients [108]. The process begins with a randomly generated swarm of particles, where each particle represents a potential solution. The fitness of each particle is evaluated, and the swarm is updated iteratively to converge toward optimal solutions. Figure 7(a) shows the key steps in the PSO framework.

Figure 7. A schematic representation of particle swarm optimization (PSO) applications. (a) Diagram of the algorithm with PSO. (b) Structure of waveguide crossing. The configuration of each nanostructure is determined using PSO and finite-difference time-domain (FDTD). (a) and (b) are reprinted from K. Goudarzi and M. Lee, Results Phys. 34, 105268 (2022). Copyright © 2022, K. Goudarzi and M. Lee [110].

Unlike GA, PSO does not use genetic operators such as crossover and mutation. Instead, each particle adjusts its velocity and position based on the best position it has found so far and the best position found by neighboring particles. This memory of previously found optima helps the swarm avoid local optima and increases the likelihood of finding the global optimum. While GA relies on interaction between individuals, PSO benefits from information sharing across the swarm, which can make it more efficient in exploring the search space.
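The velocity and position updates can be sketched as below. This is a minimal illustration under assumptions made here: it uses the common global-best variant, in which every particle counts as a neighbor of every other, the inertia weight `w` and acceleration coefficients `c1` and `c2` are typical textbook values, and the sphere function is a placeholder for a simulated figure of merit.

```python
import random

def sphere(x):
    """Placeholder objective to minimize (a real design would call a solver)."""
    return sum(v * v for v in x)

def run_pso(objective=sphere, dim=2, n_particles=15, n_iter=50,
            w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]                             # inertia
                             + c1 * r1 * (pbest[i][d] - pos[i][d])     # own experience
                             + c2 * r2 * (gbest[d] - pos[i][d]))       # swarm experience
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history
```

`run_pso()` returns the best position, its objective value, and the best value after each iteration. The recorded best value can only stay the same or improve, since the swarm memory is overwritten only by better positions.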

PSO is particularly valuable for rapidly designing optical devices by reducing computational time and resources. It has been applied to inverse design problems in waveguides [109–111]. For instance, Goudarzi and Lee [110] optimized the design parameters of the binary waveguide illustrated in Fig. 7(b) to achieve high performance and efficiency. Wu et al. [112] used PSO to design a metasurface capable of controlling the amplitude and phase of terahertz waves to improve beam control efficiency. This approach improved beam steering and focusing quality by 150% while broadening the operating bandwidth. Additionally, Lee et al. [113] designed a transmissive color filter that covers the full sRGB color space using dielectrics and metals, achieving an efficiency of more than 70%.

IV. SUMMARY

In this paper, we explored various ML and optimization algorithms for the inverse design of metasurfaces and highlighted their unique strengths and applications. ML techniques such as NNs, MDP, and MC methods can learn complex patterns from large-scale data, enabling the discovery of innovative designs beyond human intuition. However, these methods often function as a black box, which makes it challenging to interpret the underlying decision-making process.

In contrast, optimization algorithms such as gradient-based methods (AD and AM) and metaheuristic approaches (GA and PSO) provide greater transparency and offer clear pathways to achieving specific target functions. These methods leverage gradients or mimic natural processes to effectively explore design spaces, resulting in precise and interpretable solutions.

While both ML and optimization algorithms are powerful tools on their own, their combined or strategic use can significantly enhance the metasurface design process by pairing the rapid predictive capabilities of ML with the goal-oriented nature of optimization algorithms. This synergy enables not only the development of more efficient and innovative metasurface designs but also the discovery of new physical phenomena and the realization of multifunctional capabilities, surpassing the limitations of traditional empirical methods. This inverse design strategy provides a cost-effective and versatile solution for addressing complex design challenges, such as achieving precise control over optical properties and optimizing transmission and reflection within specific spectral bands.

Acknowledgments

This work was supported by a Korea University Grant, and “Regional Innovation Strategy (RIS)” through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (Grant No. 2021RIS-004).

FUNDING

This work was supported by a Korea University Grant, and Regional Innovation Strategy (RIS) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (Grant No. 2021RIS-004).

DISCLOSURES

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

DATA AVAILABILITY

Data sharing is not applicable to this article, as no new data were created or analyzed in this study.

References

  1. X. Zou, R. Lin, Y. Fu, G. Gong, X. Zhou, S. Wang, S. Zhu, and Z. Wang, “Advanced optical imaging based on metasurfaces,” Adv. Opt. Mater. 12, 2203149 (2024).
  2. N. Yu, P. Genevet, M. A. Kats, F. Aieta, J.-P. Tetienne, F. Capasso, and Z. Gaburro, “Light propagation with phase discontinuities: Generalized laws of reflection and refraction,” Science 334, 333-337 (2011).
  3. E. Hasman, V. Kleiner, G. Biener, and A. Niv, “Polarization dependent focusing lens by use of quantized Pancharatnam-Berry phase diffractive optics,” Appl. Phys. Lett. 82, 328-330 (2003).
  4. M. Khorasaninejad and F. Capasso, “Metalenses: Versatile multifunctional photonic components,” Science 358, eaam8100 (2017).
  5. G. Zheng, H. Mühlenbernd, M. Kenney, G. Li, T. Zentgraf, and S. Zhang, “Metasurface holograms reaching 80% efficiency,” Nat. Nanotechnol. 10, 308-312 (2015).
  6. X. Ni, A. V. Kildishev, and V. M. Shalaev, “Metasurface holograms for visible light,” Nat. Commun. 4, 2807 (2013).
  7. L. Huang, X. Chen, H. Mühlenbernd, G. Li, B. Bai, Q. Tan, G. Jin, T. Zentgraf, and S. Zhang, “Dispersionless phase discontinuities for controlling light propagation,” Nano Lett. 12, 5750-5755 (2012).
  8. X. Ni, Z. J. Wong, M. Mrejen, Y. Wang, and X. Zhang, “An ultrathin invisibility skin cloak for visible light,” Science 349, 1310-1314 (2015).
  9. S. So, J. Mun, J. Park, and J. Rho, “Revisiting the design strategies for metasurfaces: Fundamental physics, optimization, and beyond,” Adv. Mater. 35, 2206399 (2023).
  10. J. Noh, T. Badloe, C. Lee, J. Yun, S. So, and J. Rho, “Inverse design meets nanophotonics: From computational optimization to artificial neural network,” Intell. Nanotechnol. 3-32 (2023).
  11. J. S. Jensen and O. Sigmund, “Topology optimization for nano-photonics,” Laser Photon. Rev. 5, 308-321 (2011).
  12. C. Kang, C. Park, M. Lee, J. Kang, M. S. Jang, and H. Chung, “Large-scale photonic inverse design: Computational challenges and breakthroughs,” Nanophotonics (2024).
  13. Z. Li, R. Pestourie, Z. Lin, S. G. Johnson, and F. Capasso, “Empowering metasurfaces with inverse design: Principles and applications,” ACS Photonics 9, 2178-2192 (2022).
  14. O. Sigmund, “On the usefulness of non-gradient approaches in topology optimization,” Struct. Multidiscip. Optim. 43, 589-596 (2011).
  15. H. Pang, S. Yin, Q. Deng, Q. Qiu, and C. Du, “A novel method for the design of diffractive optical elements based on the Rayleigh-Sommerfeld integral,” Opt. Lasers Eng. 70, 38-44 (2015).
  16. R. E. Kleinman and T. B. A. Senior, "Chapter 1- Rayleigh scattering," in Mechanics and Mathematical Methods-Series of Handbooks, (Elsevier, Netherland, 1986), Vol. 2, pp. 1-70.
  17. A. Dorodnyy, J. Smajic, and J. Leuthold, “Mie scattering for photonic devices,” Laser Photon. Rev. 17, 2300055 (2023).
  18. P. Q. Mantas, “Dielectric response of materials: extension to the Debye model,” J. Eur. Ceram. Soc. 19, 2079-2086 (1999).
  19. A. Taflove, S. C. Hagness, and M. Piket-May, "Computational electromagnetics: The finite-difference time-domain method," in The Electrical Engineering Handbook, 3rd ed., R. C. Dorf, Ed (CRC Press, USA, 2005), pp. 629-670.
  20. M. G. Moharam and T. K. Gaylord, “Rigorous coupled-wave analysis of planar-grating diffraction,” J. Opt. Soc. Am. 71, 811-818 (1981).
  21. J. Jianming, The Finite Element Method in Electromagnetics, 3rd ed. (Wiley-IEEE Press, USA, 2014).
  22. İ. R. Çapoğlu, C. A. White, J. D. Rogers, H. Subramanian, A. Taflove, and V. Backman, “Numerical simulation of partially coherent broadband optical imaging using the finite-difference time-domain method,” Opt. Lett. 36, 1596-1598 (2011).
  23. ianwilliamson, “Ceviche Challenges: Photonic Inverse Design Suite: A suite of photonic inverse design challenge problems for topology optimization benchmarking,” (GitHub, Published date: Jul. 21, 2022), https://github.com/google/ceviche-challenges (Accessed date: Dec. 1, 2024)
  24. A. F. Oskooi, D. Roundy, M. Ibanescu, P. Bermel, J. D. Joannopoulos, and S. G. Johnson, “Meep: A flexible free-software package for electromagnetic simulations by the FDTD method,” Comput. Phys. Commun. 181, 687-702 (2010).
  25. L. Li, “New formulation of the Fourier modal method for crossed surface-relief gratings,” J. Opt. Soc. Am. A 14, 2758 (1997).
  26. E. N. Glytsis and T. K. Gaylord, "Review and applications of rigorous coupled-wave analysis of grating diffraction," in Diffraction Optics: Design, Fabrication, and Applications (Optica Publishing Group, 1992), paper MD1.
  27. M. G. Moharam, D. A. Pommet, E. B. Grann, and T. K. Gaylord, “Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: Enhanced transmittance matrix approach,” J. Opt. Soc. Am. A 12, 1077-1086 (1995).
  28. G. Quaranta, G. Basset, O. J. F. Martin, and B. Gallinet, “Recent advances in resonant waveguide gratings,” Laser Photon. Rev. 12, 1800017 (2018).
  29. R. Gansch, S. Kalchmair, P. Genevet, T. Zederbauer, H. Detz, A. M. Andrews, W. Schrenk, F. Capasso, M. Lončar, and G. Strasser, “Measurement of bound states in the continuum by a detector embedded in a photonic crystal,” Light Sci. Appl. 5, e16147 (2016).
  30. A. A. Zharov and N. A. Zharova, “Light-Induced diffraction gratings on liquid metamaterial metasurfaces,” J. Exp. Theor. Phys. 135, 808-812 (2022).
  31. V. Liu and S. Fan, “S4: A free electromagnetic solver for layered periodic structures,” Comput. Phys. Commun. 183, 2233-2244 (2012).
  32. J. P. Hugonin and P. Lalanne, “RETICOLO software for grating analysis,” arXiv: 2101.00901v3 (2021).
  33. G. Yoon and J. Rho, “MAXIM: Metasurfaces-oriented electromagnetic wave simulation software with intuitive graphical user interfaces,” Comput. Phys. Commun. 264, 107846 (2021).
  34. Y. Kim, A. W. Jung, S. Kim, K. Octavian, D. Heo, C. Park, J. Shin, S. Nam, C. Park, J. Park, S. Han, J. Lee, S. Kim, M. S. Jang, and C. Y. Park, “Meent: Differentiable electromagnetic simulator for machine learning,” arXiv:2406.12904v1 (2024).
  35. R. W. Clough, “The Finite Element Method in Plane Stress Analysis,” in Proc. 2nd Conference on Electronic Computation (Pittsburgh, Pa, USA, Sep. 8-9, 1960).
  36. S. Levy, “Structural analysis and influence coefficients for delta wings,” J. Aeronaut. Sci. 20, 449-454 (1953).
  37. M. Kronbichler, "The discontinuous galerkin method: Derivation and properties," in Efficient High-order Discretizations for Computational Fluid Dynamics, M. Kronbichler and P.-O. Persson, Eds (Springer Cham, Swiss, 2021), pp. 1-55.
  38. S. Zuo, D. G. Doñoro, Y. Zhang, Y. Bai, and X. Zhao, “Simulation of challenging electromagnetic problems using a massively parallel finite element method solver,” IEEE Access 7, 20346-20362 (2019).
  39. P. C. Kohnke, "ANSYS," in Finite Element Systems, C. A. Brebbia, Ed (Springer Berlin, Germany, 1982), pp. 19-25.
  40. MathWorks, “COMSOL Multiphysics-Interactive multiphysics modeling and simulation,” (MathWorks), https://kr.mathworks.com/products/connections/product_detail/comsol-multiphysics.html (Accessed date: Dec. 1, 2024)
  41. J. Chen, J. Huang, M. An, P. Hu, Y. Xie, J. Wu, and Y. Chen, “Application of machine learning on the design of acoustic metamaterials and phonon crystals: A review,” Smart Mater. Struct. 33, 073001 (2024).
  42. D. Whitley, “A genetic algorithm tutorial,” Stat. Comput. 4, 65-85 (1994).
  43. J. P. S. Rosa, D. J. D. Guerra, N. C. G. Horta, R. M. F. Martins, and N. C. C. Lourenço, "Overview of artificial neural networks," Using Artificial Neural Networks for Analog Integrated Circuit Design Automation, (Springer, USA, 2020), pp. 21-44.
  44. J. Schmidhuber, “Deep Learning in neural networks: An overview,” Neural Netw. 61, 85-117 (2015).
  45. A. Tsantekidis, N. Passalis, and A. Tefas, "Recurrent neural networks," in Deep Learning for Robot Perception and Cognition, A. Iosifidis and A. Tefas, Eds (Academic Press, USA, 2022), Chapter 5, pp. 101-115.
  46. N. Ketkar and J. Moolayil, "Convolutional neural networks," in Deep Learning with Python, 2nd ed., N. Ketkar and J. Moolayil, Eds (Apress Berkeley, USA, 2021), pp. 197-242.
  47. Y. H. Ma and Y. Hao, "Chapter 8-Deep learning in metasurface design and optimization," in Metamaterials-by-Design: Theory, Technologies, and Vision, A. Alù, N. Engheta, A. Massa, and G. Oliveri, Eds (Elsevier, Netherland, 2024), pp. 203-232.
  48. Z. Li, F. Liu, W. Yang, S. Peng, and J. Zhou, “A survey of convolutional neural networks: Analysis, applications, and prospects,” IEEE Trans. Neural Netw. Learn. Syst. 33, 6999-7019 (2022).
  49. M. Sun, Z. Song, X. Jiang, J. Pan, and Y. Pang, “Learning pooling for convolutional neural network,” Neurocomputing 224, 96-104 (2017).
  50. A. Zafar, M. Aamir, N. Mohd Nawi, A. Arshad, S. Riaz, A. Alruban, A. K. Dutta, and S. Almotairi, “A comparison of pooling methods for convolutional neural networks,” Appl. Sci. 12, 8643 (2022).
  51. M. Nandipati, O. Fatoki, and S. Desai, “Bridging nanomanufacturing and artificial intelligence-A comprehensive review,” Materials 17, 1621 (2024).
  52. Y. Li, Y. Zhang, Y. Wang, J. Li, X. Jiang, G. Yang, K. Zhang, Y. Yuan, J. Fu, X. G. Di, and C. Wang, “Multifunctional metasurface inverse design based on ultra-wideband spectrum prediction neural network,” Adv. Opt. Mater. 12, 2302657 (2024).
  53. Y. Chen, Q. Wang, D. Cui, W. Li, M. Shi, and G. Zhao, “Inverse design of nanohole all-dielectric metasurface based on deep convolutional neural network,” Opt. Commun. 569, 130793 (2024).
  54. M. H. Tahersima, K. Kojima, T. Koike-Akino, D. Jha, B. Wang, C. Lin, and K. Parsons, “Deep neural network inverse design of integrated photonic power splitters,” Sci. Rep. 9, 1368 (2019).
  55. M. L. Puterman, "Markov decision processes: Discrete stochastic dynamic programming," (Wiley Series in Probability and Statistics) (John Wiley & Sons, USA, 2008).
  56. N. G. van Kampen, "Markov processes," in Stochastic Processes in Physics and Chemistry, 3rd ed., N. G. van Kampen, Ed (Elsevier, Netherland, 2007), pp. 73-95.
  57. Q.-L. Li, "Markov reward processes," Constructive Computation in Stochastic Models with Applications, (Springer Berlin, Germany, 2010), pp. 526-573.
  58. M. L. Puterman, "Markov decision processes," in Handbooks in Operations Research and Management Science, (Elsevier, Netherland, 1990), Vol. 2, pp. 331-434.
  59. R. Li, C. Zhang, W. Xie, Y. Gong, F. Ding, H. Dai, Z. Chen, F. Yin, and Z. Zhang, “Deep reinforcement learning empowers automated inverse design and optimization of photonic crystals for nanoscale laser cavities,” Nanophotonics 12, 319-334 (2023).
  60. L. Rosafalco, J. M. De Ponti, L. Iorio, R. V. Craster, R. Ardito, and A. Corigliano, “Reinforcement learning optimisation for graded metamaterial design using a physical-based constraint on the state representation and action space,” Sci. Rep. 13, 21836 (2023).
  61. C. Park, S. Kim, A. W. Jung, J. Park, D. Seo, Y. Kim, C. Park, C. Y. Park, and M. S. Jang, “Sample-efficient inverse design of freeform nanophotonic devices with physics-informed reinforcement learning,” Nanophotonics 13, 1483-1492 (2024).
  62. H. Wankerl, M. L. Stern, A. Mahdavi, C. Eichler, and E. W. Lang, “Parameterized reinforcement learning for optical system optimization,” J. Phys. D: Appl. Phys. 54, 305104 (2021).
  63. M. J. Fryer, “Simulation and the Monte Carlo method. by R. Y. Rubinstein,” J. R. Stat. Soc. Ser. A 146, 95-96 (1983).
  64. F. M. Dekking, C. Kraaikamp, H. P. Lopuhaä, and L. E. Meester, "Testing hypotheses: Elaboration," in A Modern Introduction to Probability and Statistics: Understanding Why and How (Springer Texts in Statistics Series), (Springer-Verlag London, UK, 2005), pp. 383-397.
  65. C. P. Robert and G. Casella, "Monte Carlo Integration," in Monte Carlo Statistical Methods, 1st ed. (Springer New York, USA, 1999), pp. 71-138.
  66. S. Brooks, “Markov chain Monte Carlo method and its application,” J. R. Stat. Soc.: Series D (The Statistician) 47, 69-100 (1998).
  67. C. J. Geyer, "Introduction to Markov Chain Monte Carlo," in Handbook of Markov Chain Monte Carlo, S. Brooks, A. Gelman, G. Jones, and X.-L. Meng, Eds (Chapman and Hall/CRC, USA, 2011), pp. 3-48.
  68. S. Jackman, “Estimation and inference via Bayesian simulation: An introduction to Markov Chain Monte Carlo,” Am. J. Pol. Sci. 44, 375-404 (2000).
  69. M. Panipinto and J. D. Ryckman, “Effective medium metasurfaces using nanoimprinting of the refractive index: Design, performance, and predictive tolerance analysis,” Opt. Mater. Express 14, 847-861 (2024).
  70. S. Goel, S. Leedumrongwatthanakun, N. H. Valencia, W. McCutcheon, A. Tavakoli, C. Conti, P. W. H. Pinkse, and M. Malik, “Inverse design of high-dimensional quantum optical circuits in a complex medium,” Nat. Phys. 20, 232-239 (2024).
  71. R. Lin, V. Valuckas, T. T. H. Do, A. Nemati, A. I. Kuznetsov, J. Teng, and S. T. Ha, “Schrödinger's red beyond 65,000 pixel-per-inch by multipolar interaction in freeform meta-atom through efficient neural optimizer,” Adv. Sci. 11, 2303929 (2024).
  72. P. R. Wray, E. G. Paul, and H. A. Atwater, “Optical filters made from random metasurfaces using Bayesian optimization,” Nanophotonics 13, 183-193 (2024).
  73. S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv:1609.04747v1 (2016).
  74. J. Zhang, “Gradient descent based optimization algorithms for deep learning models training,” arXiv:1903.03614v1 (2019).
  75. N. Ketkar, "Stochastic gradient descent," Deep Learning with Python, (Springer, USA, 2017), pp. 113-132.
  76. I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the importance of initialization and momentum in deep learning,” Proc. Mach. Learn. Res. 28, 1139-1147 (2013).
  77. D. Yi, J. Ahn, and S. Ji, “An effective optimization method for machine learning based on ADAM,” Appl. Sci. 10, 1073 (2020).
  78. T. Hughes, I. Williamson, M. Minkov, and S. Fan, “Forward-mode differentiation of Maxwell's equations,” ACS Photonics 6, 3010-3016 (2019).
  79. A. Güneş Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind, “Automatic differentiation in machine learning: A survey,” J. Mach. Learn. Res. 18, 1-43 (2018).
  80. R. D. Neidinger, “Introduction to automatic differentiation and MATLAB object-oriented programming,” SIAM Review 52, 545-563 (2010).
  81. A. G. Baydin and B. A. Pearlmutter, “Automatic differentiation of algorithms for machine learning,” arXiv:1404.7456v1 (2014).
  82. P. Georgi, Q. Wei, B. Sain, C. Schlickriede, Y. Wang, L. Huang, and T. Zentgraf, “Optical secret sharing with cascaded metasurface holography,” Sci. Adv. 7, eabf9718 (2021).
  83. S. Colburn and A. Majumdar, “Inverse design and flexible parameterization of meta-optics using algorithmic differentiation,” Commun. Phys. 4, 65 (2021).
  84. S. So, J. Kim, T. Badloe, C. Lee, Y. Yang, H. Kang, and J. Rho, “Multicolor and 3D holography generated by inverse-designed single-cell metasurfaces,” Adv. Mater. 35, 2208520 (2023).
  85. W. Jin, W. Li, M. Orenstein, and S. Fan, “Inverse design of lightweight broadband reflector for relativistic lightsail propulsion,” ACS Photonics 7, 2350-2355 (2020).
  86. C. Kang, C. Park, M. Lee, J. Kang, M. S. Jang, and H. Chung, “Large-scale photonic inverse design: Computational challenges and breakthroughs,” Nanophotonics 13, 3765-3792 (2024).
  87. A. M. Bradley, “PDE-constrained optimization problems and the adjoint method,” (Stanford University, Published date: 2010), https://cs.stanford.edu/~ambrad/adjoint_tutorial.pdf (Accessed date: Oct. 5, 2024)
  88. H. Chung and O. D. Miller, “High-NA achromatic metalenses by inverse design,” Opt. Express 28, 6945-6965 (2020).
  89. M. Zhou, D. Liu, S. W. Belling, H. Cheng, M. A. Kats, S. Fan, M. L. Povinelli, and Z. Yu, “Inverse design of metasurfaces based on coupled-mode theory and adjoint optimization,” ACS Photonics 8, 2265-2273 (2021).
  90. M. Mansouree, A. McClung, S. Samudrala, and A. Arbabi, “Large-scale parametrized metasurface design using adjoint optimization,” ACS Photonics 8, 455-463 (2021).
  91. Y. Zhou, Y. Shao, C. Mao, and J. A. Fan, “Inverse-designed metasurfaces with facile fabrication parameters,” J. Opt. 26, 055101 (2024).
  92. H. Ma, G. Bao, J. Lai, and J. Lin, “Inverse design of a grating metasurface for enhancing spontaneous emission through hyperbolic metamaterials,” J. Opt. Soc. Am. B 41, A79-A85 (2024).
  93. Y. Yin, Q. Jiang, H. Wang, J. Liu, Y. Xie, Q. Wang, Y. Wang, and L. Huang, “Multi-dimensional multiplexed metasurface holography by inverse design,” Adv. Mater. 36, 2312303 (2024).
  94. M. Mitchell, An Introduction to Genetic Algorithms (The MIT Press 221, USA, 1998).
  95. P. W. Poon and J. N. Carter, “Genetic algorithm crossover operators for ordering applications,” Comput. Oper. Res. 22, 135-147 (1995).
  96. G. Syswerda, “Uniform crossover in genetic algorithms,” in Proc. 3rd International Conference on Genetic Algorithms (Fairfax, Virginia, USA, Jun. 1989), pp. 2-9.
  97. A. Hussain, Y. S. Muhammad, M. Nauman Sajid, I. Hussain, A. Mohamd Shoukry, and S. Gani, “Genetic algorithm for traveling salesman problem with modified cycle crossover operator,” Comput. Intell. Neurosci. 2017, 7430125 (2017).
  98. K. F. Pál, “Genetic algorithms for the traveling salesman problem based on a heuristic crossover operation,” Biol. Cybern. 69, 539-546 (1993).
  99. S. Harifi and R. Mohamaddoust, “Zigzag mutation: A new mutation operator to improve the genetic algorithm,” Multimed. Tools Appl. 82, 45411-45432 (2023).
  100. A. Neubauer, “Theoretical analysis of the non-uniform mutation operator for the modified genetic algorithm,” in Proc. 1997 IEEE Int. Conf. Evolutionary Computation-ICEC'97 (Indianapolis, IN, USA, Apr. 13-16, 1997), pp. 93-96.
  101. H. Yang, X. Cao, F. Yang, J. Gao, S. Xu, M. Li, X. Chen, Y. Zhao, Y. Zheng, and S. Li, “A programmable metasurface with dynamic polarization, scattering and focusing control,” Sci. Rep. 6, 35692 (2016).
  102. P. R. Wiecha, A. Arbouet, C. Girard, A. Lecestre, G. Larrieu, and V. Paillard, “Evolutionary multi-objective optimization of colour pixels based on dielectric nanoantennas,” Nat. Nanotechnol. 12, 163-169 (2017).
  103. Y. Wang, G. Wu, J. Zhang, X. Wu, G. Yuan, and J. Liu, “Genetic algorithm-enhanced design of ultra-broadband tunable terahertz metasurface absorber,” Opt. Laser Technol. 170, 110262 (2024).
  104. P. Luo, G. Lan, J. Nong, X. Zhang, T. Xu, and W. Wei, “Broadband coherent perfect absorption employing an inverse-designed metasurface via genetic algorithm,” Opt. Express 30, 34429-34440 (2022).
  105. K. Yu, J. Ge, H. Li, Y. Zhang, H. Dong, and L. Zhang, “Inverse design of dual-bandpass metasurface filters empowered by the multi-objective genetic algorithm,” Opt. Commun. 566, 130695 (2024).
  106. X. Zou, Y. Zhang, R. Lin, G. Gong, S. Wang, S. Zhu, and Z. Wang, “Pixel-level Bayer-type colour router based on metasurfaces,” Nat. Commun. 13, 3288 (2022).
  107. A. Slowik, “Particle swarm optimization,” in Swarm Intelligence Algorithms (CRC Press, USA, 2020), pp. 265-277.
  108. D. Wang, D. Tan, and L. Liu, “Particle swarm optimization algorithm: An overview,” Soft Comput. 22, 387-408 (2018).
  109. T. Baba, M. Nakata, R. Shiratori, and K. Hayashi, “Particle swarm optimization of silicon photonic crystal waveguide transition,” Opt. Lett. 46, 1904-1907 (2021).
  110. K. Goudarzi and M. Lee, “Inverse design of a binary waveguide crossing by the particle swarm optimization algorithm,” Results Phys. 34, 105268 (2022).
  111. C. Sun, Y. Yu, G. Chen, and X. Zhang, “Ultra-compact bent multimode silicon waveguide with ultralow inter-mode crosstalk,” Opt. Lett. 42, 3004-3007 (2017).
  112. Q. Wu, W.-H. Fan, C. Qin, and X.-Q. Jiang, “Dual-parameter controlled reconfigurable metasurface for enhanced terahertz beamforming via inverse design method,” Phys. Scr. 99, 065517 (2024).
  113. C. Lee, S. Lee, J. Seong, D. Y. Park, and J. Rho, “Inverse-designed metasurfaces for highly saturated transmissive colors,” J. Opt. Soc. Am. B 41, 151-158 (2024).