Simulation
In this chapter we describe in some detail the implementation we propose for this simulation. We start with a brief introduction to the general class of Monte-Carlo (MC) methods to which our simulation belongs. We then elaborate on how we would implement the different high-symmetry crystal lattice geometries that are commonly used. Finally, we describe the event engine that would manage the simulation in an efficient and easily extensible way.
(4.3)
that have the same periodicity as the original lattice. It is easy to show that the reciprocal lattice is also a Bravais lattice and that the reciprocal vectors
(4.4)
It's then obvious that:
(4.5)
One should also note that if we calculate the reciprocal of the reciprocal lattice we recover the original lattice. Given the lattice vectors and/or the reciprocal lattice vectors we can completely describe the geometrical arrangement of the atoms or molecules of the crystal. We are, however, interested not in a bulk crystal but in the geometry of a surface.
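In the usual textbook notation, which is assumed here and may differ from the symbols used elsewhere in this chapter, the reciprocal vectors $\mathbf{b}_i$ are built from the primitive vectors $\mathbf{a}_i$ of the direct lattice as follows, and the orthogonality relation between the two sets follows directly from the properties of the cross product:

```latex
\begin{align}
\mathbf{b}_1 &= 2\pi\,\frac{\mathbf{a}_2\times\mathbf{a}_3}
                         {\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}, &
\mathbf{b}_2 &= 2\pi\,\frac{\mathbf{a}_3\times\mathbf{a}_1}
                         {\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}, &
\mathbf{b}_3 &= 2\pi\,\frac{\mathbf{a}_1\times\mathbf{a}_2}
                         {\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}, \\
\mathbf{b}_i\cdot\mathbf{a}_j &= 2\pi\,\delta_{ij}.
\end{align}
```

Applying the same construction to the $\mathbf{b}_i$ returns the $\mathbf{a}_i$, which is the statement that the reciprocal of the reciprocal lattice is the original lattice.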
Experimentally, the substrate surface that is used for deposition is chosen to be a lattice plane, defined to be any plane containing at least three non-collinear Bravais lattice points. Because of the symmetry of the Bravais lattice, any such plane will actually contain infinitely many lattice points, which form a two-dimensional Bravais lattice[60]. As before, the position of the atoms on this lattice is given by the 2D analog of Eq. 4.2.
One should take a moment to consider the usefulness of what was said in the last paragraph. Since all of the surfaces used experimentally can be represented using just two indices, we can map them into a
Before we can proceed to efficiently simulate the geometry of any substrate, we must be able to easily define which surface of the crystal we are using as the substrate's surface. Experimentalists routinely do this by resorting to the Miller indices of the lattice plane. As is well known from basic geometry, we can specify a plane simply by defining a vector that is normal to it. Following this idea, we define the Miller indices of a given lattice plane to be the coordinates of the shortest reciprocal lattice vector normal to that plane. Thus a plane with Miller indices
,
,
, is normal to the reciprocal lattice vector
. Miller indices are commonly given in the literature in parentheses, as
. In the case where one of the indices is negative, this is represented by a bar over the corresponding index. For example, the plane that is normal to the vector with indices
would be written simply as
. One can also use a similar convention using direct lattice vectors instead of the reciprocal ones. In this case, we would use square brackets instead of parentheses. Returning to our previous example, a vector that has indices
in the direct lattice basis would be represented as
.
Now that we have a way to locate every particle in the system (footnote 4.2), we must calculate the full energy of the system. The main contribution to the energy was calculated in detail in Chapter 3, but a few other contributions are of crucial importance to the dynamics of our system.
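Before turning to the energy, the Miller-index construction described earlier in this section can be sketched in a few lines of Python. This is only a minimal illustration: the function names `reciprocal_vectors` and `plane_normal` are hypothetical, the reciprocal vectors are built with the standard cross-product formula, and the indices are taken relative to whichever primitive vectors are passed in.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal_vectors(a1, a2, a3):
    """Reciprocal vectors b1, b2, b3 of the primitive vectors a1, a2, a3."""
    volume = dot(a1, cross(a2, a3))  # signed volume of the primitive cell
    scale = 2.0 * math.pi / volume
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3

def plane_normal(miller, a1, a2, a3):
    """Reciprocal lattice vector normal to the plane with the given Miller indices."""
    h, k, l = miller
    b1, b2, b3 = reciprocal_vectors(a1, a2, a3)
    return tuple(h * b1[i] + k * b2[i] + l * b3[i] for i in range(3))

# Simple cubic lattice with unit lattice constant (illustrative choice).
cubic = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
normal = plane_normal((1, 1, 0), *cubic)  # 2*pi*(1, 1, 0)

# Primitive vectors of an FCC lattice with cubic lattice constant 1.
fcc = ((0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0))
```

A negative index, written with a bar in the usual notation, is simply passed as a negative integer, for example `plane_normal((1, -1, 0), *cubic)`.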
Ehrlich-Schwoebel Instability
In Fig. 4.1 we represent schematically a monatomic step on a surface. We shall concern ourselves only with the probability that atoms at positions
or
move into position
at the step. The two major modifications of the potential may occur at
and
due to the decreased and increased coordination between a surface diffusing atom and the substrate, respectively. If we denote the probability of an adsorbed atom at
moving into the step by
and the probability for an atom at
moving into the step by
, we can easily see that areas ahead or behind the step are more important to step growth depending on whether
or
is larger, respectively. The evolution of step sizes and distances can then be analyzed for several cases, as done by Schwoebel.
It is well known in the literature[47,57,58,59] that this instability is caused by an energy barrier,
, that is present at the island edges. It should be noted that this barrier does not affect the probability of stepping down from the edge; it only makes it more difficult to approach the step-edge site. Two other contributions to the energy must also be taken into consideration: a constant binding energy to the substrate,
, and the energy cost of unbinding from the nearest neighbors. Together with the elastic strain energy,
, and the barrier energy,
, these determine the total energy
of a diffusing adatom at each point, which is then:
where
is the number of nearest neighbors at that given site.
In Fig. 4.2 we represent three possible local environments for adatoms diffusing near a step. These examples illustrate the direction dependence of the contribution of Eq. 4.7 to the diffusion energy. The hopping barrier
is equal to
for hops in all directions for adatom number
. For adatom number
,
is equal to
in the direction of the arrows and, simply, equal to
in the remaining directions. Finally, for adatom number
, we have
in the direction of the arrows and
in the other two directions. As we illustrate with adatom number
, this barrier also has the effect of making upward hops more difficult. This is important both for preventing the unphysical growth behavior that would result from reducing only the downward interlayer transport and for maintaining the ergodicity of the system. It should be noted that we must add to these energies the contribution of the elastic energy,
discussed previously and that, for simplicity's sake, we omitted in these brief examples.
The total energy given by Eq. 4.7 is the energy that will be used to determine the diffusion of the adatoms, as outlined in the Monte-Carlo section.
Material Properties
In order to perform a realistic simulation whose results can be directly compared with experiment, we must take the properties of the materials used into consideration. In this section we describe in some detail which physical properties must be used to reproduce the results described by Moison et al. The values of the constants used are summarized in Table 4.1.
The crystal structure of both
and
is the Zinc Blende structure, which has the same geometry as the diamond structure but with two different atomic species. This structure can be mathematically (and hence, computationally) described as two inter-penetrating Face Centered Cubic (FCC) Bravais lattices shifted by
along all three directions, or, equivalently, as a simple FCC lattice with two particles per site, each of which is shifted from its normal position by the following vectors (footnote 4.3).
(4.8)
The Zinc Blende structure is represented in Fig. 4.3. The FCC structure that is inherent to this construction can be represented by the lattice
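The description of the Zinc Blende structure as two inter-penetrating FCC lattices can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the conventional cubic cell with its four FCC sites, takes the relative shift between the two sublattices to be the standard quarter of the cubic lattice constant along each axis, and the name `zinc_blende_sites` is hypothetical.

```python
import math

# The four FCC sites of the conventional cubic cell, in units of the lattice constant.
FCC_BASIS = ((0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0))
# Assumed shift of the second sublattice: a/4 along each of the three directions.
SHIFT = (0.25, 0.25, 0.25)

def zinc_blende_sites(n_cells, a=1.0):
    """Return the positions of the two sublattices in an n_cells^3 block of cubic cells."""
    sub_a, sub_b = [], []
    for i in range(n_cells):
        for j in range(n_cells):
            for k in range(n_cells):
                for bx, by, bz in FCC_BASIS:
                    x, y, z = i + bx, j + by, k + bz
                    sub_a.append((a * x, a * y, a * z))
                    sub_b.append((a * (x + SHIFT[0]),
                                  a * (y + SHIFT[1]),
                                  a * (z + SHIFT[2])))
    return sub_a, sub_b

def distance(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

sub_a, sub_b = zinc_blende_sites(2)
# Distances from one atom of the second sublattice to every atom of the first.
dists = sorted(distance(sub_b[0], p) for p in sub_a)
```

Each atom of one species ends up with four nearest neighbors of the other species at a distance of sqrt(3)/4 lattice constants, which is the tetrahedral coordination expected for this structure.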
Algorithm
In this section we use all that was said above to compose a high-level description of the algorithm that will govern our simulation. The algorithm is illustrated in the flowchart of Fig. 4.4. Our program consists of three parts: initialization, simulation and termination. The initialization section comprises all the steps used to set up the data structures and communication channels (footnote 4.4). The termination section is where all the clean-up and final output occur. The technical details of these two sections are fairly obvious and will not be discussed further. In the remainder of this section, we concentrate only on what happens between these two stages, during the actual simulation procedure. The four main steps in the simulation process will now be described.
- The first thing to do in order to deposit a particle is to select the location where the particle will first make contact with the growing surface. This can be done by generating two random numbers that we use to index the array that represents our system. The height of the stack of particles above that location is then measured and our new particle is deposited on top.
- After a convenient location for the new particle is found, we add this information to the list of diffusing particles. This list contains the information relative to all the particles in the system that are capable of diffusing (footnote 4.5).
- To simulate the simultaneous diffusion of several particles we will then perform a number
of diffusion steps for each of the diffusing particles. To prevent any unphysical behavior due to the serialization of this process, we will go through the diffusing list in a random order. The number,
, of diffusion steps per deposition attempt represents the ratio between the deposition rate and the diffusion rate. The adatoms move in random directions, as prescribed by the Metropolis algorithm, with probability given by Eq. 4.1, where the energy is calculated using Eq. 4.7.
- Every time we move a particle we must check whether that particle is still free to move. If not, it must be removed from the diffusion list before we can proceed. After going through the entire diffusion list, we check whether there are still diffusion steps left before we can deposit another particle. If not, we simply select a new deposition location and repeat this cycle. The simulation terminates after a specified number of particles have been deposited.
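The four steps above can be sketched as a short Python loop. This is a minimal illustration and not the proposed implementation. Several ingredients are assumptions made only to keep the example self-contained: the surface is a periodic square array of column heights, the energy is a hypothetical stand-in that counts lateral nearest-neighbor bonds (the elastic and step-edge terms are omitted), the acceptance rule is the standard Metropolis form min(1, exp(-ΔE/kT)), and a particle is treated as no longer free to move once it has at least one lateral neighbor.

```python
import math
import random

NEIGHBOR_STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def lateral_neighbors(height, x, y, level):
    """Number of the four neighboring columns that reach at least `level`."""
    size = len(height)
    return sum(1 for dx, dy in NEIGHBOR_STEPS
               if height[(x + dx) % size][(y + dy) % size] >= level)

def attempt_hop(height, site, rng, bond_energy, kT):
    """Try one Metropolis hop of the top particle at `site`; return its new site."""
    size = len(height)
    x, y = site
    dx, dy = rng.choice(NEIGHBOR_STEPS)
    nx, ny = (x + dx) % size, (y + dy) % size
    e_old = -bond_energy * lateral_neighbors(height, x, y, height[x][y])
    height[x][y] -= 1  # lift the particle so it is not counted as its own neighbor
    e_new = -bond_energy * lateral_neighbors(height, nx, ny, height[nx][ny] + 1)
    delta = e_new - e_old
    if delta <= 0 or rng.random() < math.exp(-delta / kT):
        height[nx][ny] += 1  # accepted: the particle lands on the target column
        return (nx, ny)
    height[x][y] += 1  # rejected: put the particle back
    return site

def simulate(size=16, n_particles=200, n_diff=10, bond_energy=0.3, kT=0.05, seed=0):
    """Deposit n_particles, with n_diff diffusion sweeps after each deposition."""
    rng = random.Random(seed)
    height = [[0] * size for _ in range(size)]
    mobile = set()  # columns whose top particle is still free to diffuse
    for _ in range(n_particles):
        # Step 1: pick a random column and deposit on top of its stack.
        x, y = rng.randrange(size), rng.randrange(size)
        height[x][y] += 1
        # Step 2: register the new particle as a diffuser.
        mobile.add((x, y))
        # Step 3: n_diff sweeps over the diffusers, each in a fresh random order.
        for _ in range(n_diff):
            order = sorted(mobile)
            rng.shuffle(order)
            for site in order:
                if site not in mobile:  # covered by a particle that froze there
                    continue
                mobile.discard(site)
                nx, ny = attempt_hop(height, site, rng, bond_energy, kT)
                # Step 4: keep the particle in the list only if it is still free.
                if lateral_neighbors(height, nx, ny, height[nx][ny]) == 0:
                    mobile.add((nx, ny))
                else:
                    mobile.discard((nx, ny))
    return height

heights = simulate(size=8, n_particles=50, n_diff=5, seed=1)
```

Tracking diffusers by column rather than by particle keeps the bookkeeping short, at the price that a particle buried under a newly arrived one silently stops diffusing. A fuller implementation would replace the bond-counting energy by the total energy of Eq. 4.7.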
Parallelization Strategies
The tremendous evolution in CPU power that we have witnessed in the last few years has allowed us to routinely use in our everyday life resources that until
or
years ago were only available to researchers in high-budget national labs. Although this evolution has proceeded at an exponential rate, our need for these resources has grown at an even greater pace. We can overcome the limitations of an individual CPU by using tens, hundreds or even thousands of machines in parallel so as to mimic the power of a single computer with the equivalent amount of resources. In this section we describe the two parallelization strategies that are commonly used in Monte-Carlo simulations similar to the one we are planning. These two strategies allow us to overcome the two most common limitations that simulations are confronted with: the number of runs that can be used for averaging and the system size.
The most commonly used form of parallelization increases the number of times that the simulation is run by performing several runs at the same time, each on a different machine. Results from each of the machines are then combined into a final result. The easiest way to do this is by simply running the same program on several machines simultaneously and then combining the results "by hand". It is this technique that justifies calling this strategy trivial parallelization.
A more efficient and scalable technique, currently used by large-scale distributed processing projects such as SETI@Home and Folding@Home, is commonly referred to as the Client-Server model. In this model, there is a Server machine that acts as a master sending jobs to the Client machines that act as slaves. The server machine is also responsible for combining the results received from the other machines and for managing the resources of all the clients so as to achieve as high a performance as possible. Communication over the network only occurs between Server and Client machines and never between different clients. The communication between the machines can be easily implemented using freely available packages and protocols, such as the Message Passing Interface (MPI)[9], Remote Procedure Call (RPC)[23] and Remote Method Invocation (RMI)[22], among several others.
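The division of labor in the Client-Server model can be sketched as follows. This is only an illustration of the pattern: a thread pool from the Python standard library stands in for the client machines, whereas a real deployment would use separate processes or machines connected through one of the packages mentioned above. The job itself, `run_job`, is a hypothetical toy (random deposition on a line, returning the surface width) chosen so that the example is self-contained.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def run_job(seed, size=32, n_particles=2000):
    """Client role: one independent run, returning the width of the final surface."""
    rng = random.Random(seed)  # each job owns its random stream
    height = [0] * size
    for _ in range(n_particles):
        height[rng.randrange(size)] += 1
    mean = sum(height) / size
    return (sum((h - mean) ** 2 for h in height) / size) ** 0.5

def serve(seeds, n_clients=4):
    """Server role: hand out one seed per job and combine the replies."""
    with ThreadPoolExecutor(max_workers=n_clients) as clients:
        widths = list(clients.map(run_job, seeds))  # results come back in seed order
    return sum(widths) / len(widths), widths

mean_width, widths = serve(range(8))
```

Giving every job its own seed keeps the runs statistically independent and makes the combined result reproducible regardless of which client happens to execute which job.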
As we mentioned above, there are other ways to parallelize our simulation. One such scheme would allow us to simulate a larger substrate and to include the simultaneous deposition of several particles. In this section we give a high-level description of an implementation that uses
machines to simulate a domain
times larger. This description is used as an example and can be expanded to take advantage of a larger number of machines.
We'll use a square lattice as illustrated in Fig. 4.6. Each sub-domain is assigned to a different machine that is responsible for handling all the particles located in it and for transferring particles to neighboring domains if necessary. Due to the long-range interaction derived in Chapter 3, each machine must be able to access the location of every particle in the system, including the ones in the other domains. There are three possibilities that immediately come to mind:
- Direct Query: Each machine can directly query every other machine in the cluster about the positions of all its particles every time it starts depositing a new particle (footnote 4.6).
- Local Cache: Each machine can maintain a local copy of the total system and update it every time it receives a message from another machine saying it successfully deposited a new particle. This implies that a machine broadcasts the position of the deposited particle as soon as it completes a deposition attempt and that all the other machines must be waiting to receive these messages.
- Shared Memory: We could also maintain all the information regarding the system in shared memory accessible to all the nodes in the cluster. This strategy would require either a dedicated machine that shares its memory with all the other machines, or each machine sharing its own memory. A different machine would act as a controller, making sure that all the computational nodes are operating according to our specifications. This node could also be the machine managing the system's shared memory. We represent this idea in Fig. 4.7.
Figure 4.7: Schematic representation of the parallelization strategy for the domain represented in Fig. 4.6
These strategies could be implemented by using the tools mentioned in the previous subsection, or by using freely available software such as openMosix[46], among several others. Finally, we should mention the possibility of combining the two parallelization strategies that we described in order to simultaneously exploit the advantages of both methods. This can be easily achieved by implementing several clusters as described in this subsection and then combining the results as described previously.
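The bookkeeping behind the sub-domain scheme and the Local Cache option can be sketched as follows. This is a minimal single-process illustration with no real network traffic. The names `owner`, `hop`, `LocalCache` and `broadcast` are hypothetical, and the sub-domains are assumed to be equal squares arranged on a periodic `n_side` by `n_side` grid, numbered row by row.

```python
def owner(x, y, sub_size, n_side):
    """Rank of the machine that owns the global site (x, y)."""
    total = sub_size * n_side
    col = (x % total) // sub_size
    row = (y % total) // sub_size
    return row * n_side + col

def hop(x, y, dx, dy, sub_size, n_side):
    """Move a particle; also report (old_rank, new_rank) if it changed sub-domain."""
    total = sub_size * n_side
    nx, ny = (x + dx) % total, (y + dy) % total
    old_rank = owner(x, y, sub_size, n_side)
    new_rank = owner(nx, ny, sub_size, n_side)
    handoff = (old_rank, new_rank) if old_rank != new_rank else None
    return (nx, ny), handoff

class LocalCache:
    """Each machine's private copy of the column heights of the whole system."""
    def __init__(self):
        self.heights = {}

    def apply(self, site):
        """Record a deposition announced by any machine, including this one."""
        self.heights[site] = self.heights.get(site, 0) + 1

def broadcast(caches, site):
    """Stand-in for a network broadcast: every cache receives the same message."""
    for cache in caches:
        cache.apply(site)

# Four machines, each owning an 8 x 8 sub-domain of a 16 x 16 substrate.
caches = [LocalCache() for _ in range(4)]
broadcast(caches, (3, 12))
broadcast(caches, (3, 12))
broadcast(caches, (9, 0))
```

In a real cluster the loop inside `broadcast` would be replaced by a collective operation of the chosen message-passing library, and a non-empty `handoff` would trigger the transfer of the particle to the machine with the new rank.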
When trying to parallelize this type of simulation, several difficulties arise. Many of them have been analyzed in the literature and solutions have been proposed. We will mention just a few of them and give references to where they are analyzed in detail. The artificial constraint of depositing particles at constant time intervals can be relaxed by using a Poisson time distribution as described in [41,42]. The problem of knowing when to move a particle to another sub-domain is similar to the hand-off problem in cellular phone networks as described in [21]. An efficient way to determine where the particles should be deposited and how to minimize the amount of memory necessary is described in [35,34]. Several algorithms that allow us to increase the efficiency of parallel simulations can be found in [20,33].








