Working version with a simple molecule viewer

823cee04 · moritz.buchhorn · 3d91204b · 823cee04 · 823cee04
Commit 823cee04 authored 6 months ago by moritz.buchhorn
--- a/04-dft/04-dft.ipynb
+++ b/04-dft/04-dft.ipynb
@@ -23,6 +23,13 @@
    "import pandas as pd"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Als Anfangsgeometrie verwenden wir die Koordinaten in **hcho-guess.xyz**, die Sie sich in der nächsten Zelle anschauen können."
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": null,
@@ -35,10 +42,33 @@
   ]
  },
  {
-   "cell_type": "markdown",
+   "cell_type": "code",
+   "execution_count": null,
   "metadata": {},
+   "outputs": [],
   "source": [
-    "Als Anfangsgeometrie verwenden wir die Koordinaten in **hcho-guess.xyz**, die Sie sich in der nächsten Zelle anschauen können."
+    "import numpy as np\n",
+    "import plotly.graph_objects as go\n",
+    "point = np.array([0, 0, 0])\n",
+    "v = np.array([0, 2, 1])\n",
+    "radius = 5.0\n",
+    "num_points = 20\n",
+    "\n",
+    "\n",
+    "fig = go.Figure()\n",
+    "\n",
+    "x,y,z = coords.circle_coordinates(point, v, radius, num_points)\n",
+    "fig.add_trace(go.Scatter3d(x=x, y=y, z=z, mode=\"lines\"))\n",
+    "\n",
+    "x1,y1,z1 = coords.circle_coordinates(point+(np.array([0,2,1])*5), v, radius, num_points)\n",
+    "fig.add_trace(go.Scatter3d(x=x1, y=y1, z=z1, mode=\"lines\"))\n",
+    "\n",
+    "xx = np.append(x, x1)\n",
+    "yy = np.append(y, y1)\n",
+    "zz = np.append(z, z1)\n",
+    "fig.add_trace(go.Mesh3d(x=xx, y=yy, z=zz, opacity=1, alphahull=0))\n",
+    "\n",
+    "fig.show()"
   ]
  },
  {
@@ -84,7 +114,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Die fertigen Rechnungen können wir einlesen und in einer Tabelle sortieren. Vergleichen Sie die erhaltenen Bindungslängen für die C=O-Bindung mit dem experimentellen Wert von 1.21 Å."
+    "Die fertigen Rechnungen können wir einlesen und in einer Tabelle sortieren (alle Werte in Å). Vergleichen Sie die erhaltenen Bindungslängen für die C=O-Bindung mit dem experimentellen Wert von 1.21 Å."
   ]
  },
  {
@@ -169,7 +199,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.10"
+   "version": "3.10.13"
  }
 },
 "nbformat": 4,

 %% Cell type:markdown id: tags:

 # B.CTC: DFT
 In der Vorlesung haben sie den Hartree-Fock (HF)-Ansatz und die Dichtefunktionaltheorie (DFT) kennengelernt. Die Implementation als Kohn-Sham-DFT ist dem HF-Ansatz aus technischer Perspektive sehr ähnlich, weshalb jede (uns bekannte) Quantenchemie-Software beide Methoden zur Verfügung stellt.

 In dieser Demonstration wollen wir uns die Ergebnisse einer Geometrieoptimierung für verschiedene Methoden genauer anschauen. Da DFT nicht systematisch verbesserbar ist, wird zu Beginn einer theoretischen Studie immer ein sogenanntes _Benchmarking_ durchgeführt, ähnlich einer Kalibration in der Analytik. Mit verschiedenen Funktionalen wird die geforderte Größe (hier Bindungslängen) einer bekannten Verbindung berechnet. Mit dem Funktional, das die experimentelle Größe am besten wiedergibt, wird dann die entsprechende Größe für eine andere Verbindung berechnet. Dieses Verfahren funktioniert umso besser, je ähnlicher sich die Referenzverbindungen und die untersuchten Verbindungen sind.

 In diesem Notebook verwenden wir Formaldehyd, das hier beste Funktional könnte also zum Beispiel sehr gut für Acetaldehyd verwendet werden.

 %% Cell type:code id: tags:

 ``` python
 import subprocess as sub
 import support
 import pandas as pd
 ```

+%% Cell type:markdown id: tags:
+
+Als Anfangsgeometrie verwenden wir die Koordinaten in **hcho-guess.xyz**, die Sie sich in der nächsten Zelle anschauen können.
+
 %% Cell type:code id: tags:

 ``` python
 coords = support.XYZ(f"hcho-guess.xyz")
 fig = coords.get_molecular_viewer()
 fig.show()
 ```

-%% Cell type:markdown id: tags:
+%% Cell type:code id: tags:

-Als Anfangsgeometrie verwenden wir die Koordinaten in **hcho-guess.xyz**, die Sie sich in der nächsten Zelle anschauen können.
+``` python
+import numpy as np
+import plotly.graph_objects as go
+point = np.array([0, 0, 0])
+v = np.array([0, 2, 1])
+radius = 5.0
+num_points = 20
+
+
+fig = go.Figure()
+
+x,y,z = coords.circle_coordinates(point, v, radius, num_points)
+fig.add_trace(go.Scatter3d(x=x, y=y, z=z, mode="lines"))
+
+x1,y1,z1 = coords.circle_coordinates(point+(np.array([0,2,1])*5), v, radius, num_points)
+fig.add_trace(go.Scatter3d(x=x1, y=y1, z=z1, mode="lines"))
+
+xx = np.append(x, x1)
+yy = np.append(y, y1)
+zz = np.append(z, z1)
+fig.add_trace(go.Mesh3d(x=xx, y=yy, z=zz, opacity=1, alphahull=0))
+
+fig.show()
+```

 %% Cell type:code id: tags:

 ``` python
 # Molecule viewer here
 ```

 %% Cell type:markdown id: tags:

 Bevor wir das Benchmarking durchführen, wollen wir mit dem HF-Ansatz für verschiedene Basissätze Bindungslängen berechnen. Die drei Basissätze sind STO-3g, def2-SVP und def2-TZVP und ihre Größe nimmt in dieser Reihenfolge zu.

 In der nächsten Zelle werden in einer Schleife _input files_ erzeugt und anschließend mit ORCA berechnet.

 %% Cell type:code id: tags:

 ``` python
 # Schleife über die Basissätze
 for basis_set in ("sto-3g", "def2-svp", "def2-tzvp"):
    # öffnet eine Datei, in die geschrieben wird
    with open(f"hcho-hf-{basis_set}-opt.inp", "w") as f:
        # schreibt in die Datei
        f.write(f"""\
 ! RHF {basis_set} Opt

 * xyzfile 0 1 hcho-guess.xyz
 """)
    # Ruft ORCA auf, als wäre es im Terminal ausgeführt worden
    sub.run(f"orca hcho-hf-{basis_set}-opt.inp > hcho-hf-{basis_set}-opt.out",
            shell=True)
 ```

 %% Cell type:markdown id: tags:

-Die fertigen Rechnungen können wir einlesen und in einer Tabelle sortieren. Vergleichen Sie die erhaltenen Bindungslängen für die C=O-Bindung mit dem experimentellen Wert von 1.21 Å.
+Die fertigen Rechnungen können wir einlesen und in einer Tabelle sortieren (alle Werte in Å). Vergleichen Sie die erhaltenen Bindungslängen für die C=O-Bindung mit dem experimentellen Wert von 1.21 Å.

 %% Cell type:code id: tags:

 ``` python
 table = pd.DataFrame(index=["STO-3G", "def2-SVP", "def2-TZVP"])
 column = []
 for basis_set in ("sto-3g", "def2-svp", "def2-tzvp"):
    coords = support.XYZ(f"hcho-hf-{basis_set}-opt.xyz")
    column.append(coords.get_bond_length(0, 1))
 table["HF"] = column
 display(table)
 ```

 %% Cell type:markdown id: tags:

 Für HF-Berechnungen ist ersichtlich, dass die Bindungslänge für größere Basissätze als zu kurz berechnet wird. Da ein größerer Basissatz _immer_ ein besseres (oder gleich gutes) Ergebnis liefert, muss der Fehler in der Vorhersage durch den HF-Ansatz verursacht sein.

 Es mag antiintuitiv erscheinen, dass die Bindungslänge mit einem angeblich schlechteren (kleineren) Basissatz scheinbar besser vorhergesagt wird. Das ist ein klassisches Beispiel für "richtig aus dem falschen Grund", ein in der Quantenchemie recht häufig auftretendes Ergebnis. Durch Fehlerausgleich werden scheinbar korrekte Größen bestimmt, und es liegt an Ihnen als Wissenschaftler, die Plausibilität dieser Größen zu beurteilen.

 %% Cell type:code id: tags:

 ``` python
 for functional in ("bp", "pbe", "tpss", "tpssh"):
    for basis_set in ("sto-3g", "def2-svp", "def2-tzvp"):
        with open(f"hcho-{functional}-{basis_set}-opt.inp", "w") as f:
            f.write(f"""\
 ! RKS {functional} {basis_set} Opt

 * xyzfile 0 1 hcho-guess.xyz
 """)
        sub.run(f"orca hcho-{functional}-{basis_set}-opt.inp > hcho-{functional}-opt.out",
            shell=True)
 ```

 %% Cell type:code id: tags:

 ``` python
 exp = 1.21
 for functional in ("bp", "pbe", "tpss", "tpssh"):
    column = []
    for basis_set in ("sto-3g", "def2-svp", "def2-tzvp"):
        coords = support.XYZ(f"hcho-{functional}-{basis_set}-opt.xyz")
        column.append(coords.get_bond_length(0, 1))
    table[functional.upper()] = column
 table
 ```

 %% Cell type:markdown id: tags:

 Im Gegensatz zur Basissatzgröße sind Dichtefunktionale nicht systematisch verbesserbar, das heißt es gibt keinen Parameter, der mathematisch beweisbar das Ergebnis verbessert. Es ist daher ein plausibles Ergebnis, dass das "anspruchsvollere" Funktional TPSSh die schlechteste Bindungslänge liefert. In diesem Fall dürfen wir uns einfach freuen, dass die Funktionale mit geringen Rechenzeiten so gute Ergebnisse liefern und diese als bevorzugte Funktionale für unsere Rechnung verbuchen.

--- a/04-dft/support.py
+++ b/04-dft/support.py
@@ -23,15 +23,61 @@ class XYZ:
    def get_bond_length(self, i: int, j: int) -> float:
        """
        Returns the bond length between atoms i and j.
-        @param i: Integer. Index of the first atom.
-        @param j: Integer. Index of the second atom.
-        @return: Float. The bond length between the two atoms.
+
+        Args:
+            i (int): Index of the first atom.
+            j (int): Index of the second atom.
+
+        Returns:
+            float: The bond length between the two atoms.
+        """
+        return float(np.linalg.norm(self.xyz[i] - self.xyz[j]))
+
+    def circle_coordinates(self, point: np.ndarray, v: np.ndarray, radius: float, num_points: int) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
+        """
+        Generate cartesian coordinates of a circle in 3D space.
+
+        Args:
+            point (np.ndarray): A point on the circle.
+            v (np.ndarray): A vector perpendicular to the circle.
+            radius (float): The radius of the circle.
+            num_points (int): The number of points to generate on the circle.
+
+        Returns:
+            np.ndarray: A 2D array of shape (num_points, 3) containing the cartesian coordinates of the circle.
+        """
+
+        # Normalize the vector v
+        v = v / np.linalg.norm(v)
+
+        # Create a vector u that is perpendicular to v
+        u = np.cross(v, [1, 0, 0]) if np.linalg.norm(np.cross(v, [1, 0, 0])) > 1e-6 else np.cross(v, [0, 1, 0])
+        u = u / np.linalg.norm(u)
+
+        # Create a vector w that is perpendicular to both u and v
+        w = np.cross(u, v)
+
+        # Generate the cartesian coordinates of the circle
+        theta = np.linspace(0, 2 * np.pi, num_points)
+        circle_coords = point + radius * (u * np.cos(theta)[:, np.newaxis] + w * np.sin(theta)[:, np.newaxis])
+
+        x, y, z = circle_coords[:, 0], circle_coords[:, 1], circle_coords[:, 2]
+        return x, y, z
+
+    def make_fibonacci_sphere(self, center: np.ndarray, radius: float = 0.1, resolution: int = 32) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
+        """
+        Return cartesian coordinates of points evenly distributed on the surface of a sphere.
+
+        Args:
+            center (np.ndarray): The coordinates of the center of the sphere.
+            radius (float, optional): The radius of the sphere. Defaults to 0.1.
+            resolution (int, optional): The number of points to be generated. Defaults to 32.
+
+        Returns:
+            tuple[np.ndarray, np.ndarray, np.ndarray]: The cartesian coordinates of the points
+                on the surface of the sphere. The three arrays are the x, y and z coordinates
+                of the points.
        """
-        return np.linalg.norm(self.xyz[i] - self.xyz[j])
-    
-    def make_fibonacci_sphere(self, center, radius=0.1, resolution = 32):
-        """Return cartesian coordinates of points evenly distributed
-        on the surface of a sphere"""
        num_points = resolution
        indices = np.arange(0, num_points, dtype=float) + 0.5
        phi = np.arccos(1 - 2*indices/num_points)
@@ -42,39 +88,116 @@ class XYZ:
        z = radius * np.cos(phi) + center[2]

        return x, y, z
+    
+    def make_cylinder(self, center_i: int, center_j: int, resolution: int = 32, radius: float = 0.1) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
+        """
+        Return cartesian coordinates of points on the edges of a cylinder.
+
+        The cylinder is defined by the line between atoms i and j, and the radius is the given value. The number of points is determined by the resolution.
+
+        Args:
+            center_i (int): Index of the first atom that defines the cylinder.
+            center_j (int): Index of the second atom that defines the cylinder.
+            resolution (int, optional): The number of points to be generated. Defaults to 32.
+            radius (float, optional): The radius of the cylinder. Defaults to 0.1.

-    def make_atom_mesh(self, i: int) -> go.Mesh3d:
+        Returns:
+            tuple[np.ndarray, np.ndarray, np.ndarray]: The cartesian coordinates of the points on the edges of the cylinder. The three arrays are the x, y and z coordinates of the points.
        """
+        v = self.xyz[center_j] - self.xyz[center_i]
+        x1, y1, z1 = self.circle_coordinates(self.xyz[center_i], v, radius=radius, num_points=resolution//4)
+        x2, y2, z2 = self.circle_coordinates(self.xyz[center_j], v, radius=radius, num_points=resolution//4)
+        return np.concatenate((x1, x2)), np.concatenate((y1, y2)), np.concatenate((z1, z2))
+        
+
+    def make_atom_mesh(self, i: int, resolution: int = 32) -> go.Mesh3d:
+        """
+        Return a Mesh3d object that represents the atom at index i.

+        The atom is represented as a sphere with a radius depending on its van der Waals radius.
+
+        Args:
+            i (int): Index of the atom to be represented.
+            resolution (int, optional): The number of points to be generated. Defaults to 32.
+
+        Returns:
+            go.Mesh3d: A Mesh3d object that represents the atom.
        """
-        if self.sphere_mode == "vdw": # check to see if we are going to use the vdw radii
-            radius = atomp.vdw_radii_dict[self.elements[i]] # if so, then reassign the value
+        if self.sphere_mode == "vdw":
+            radius = atomp.vdw_radii_dict[self.elements[i]]
        elif self.sphere_mode == "ball":
            radius = atomp.vdw_radii_dict[self.elements[i]] * 0.2

-        print(atomp.vdw_radii_dict[self.elements[i]])
-        x, y, z = self.make_fibonacci_sphere(self.xyz[i], radius = radius)
+        x, y, z = self.make_fibonacci_sphere(self.xyz[i], radius=radius, resolution=resolution)
        
-        atom_trace = go.Mesh3d(
+        atom_mesh = go.Mesh3d(
            x=x, 
            y=y, 
            z=z, 
            color=atomp.atom_colors_dict[self.elements[i]], 
            opacity=1, 
            alphahull=0,
-            name=f'{self.elements[i]}',
+            name=f'{self.elements[i]}{i}',
            hoverinfo='name',  # Only show the name on hover
            )
-        return atom_trace
+        return atom_mesh
+    
+    def make_bond_mesh(self, i: int, j: int, resolution: int = 32, radius: float = 0.1) -> go.Mesh3d:
+        """
+        Return a Mesh3d object that represents a bond between atoms i and j.
+
+        The bond is represented as a cylinder with a radius depending on the given radius.
+
+        Args:
+            i (int): Index of the first atom in the bond.
+            j (int): Index of the second atom in the bond.
+            resolution (int, optional): The number of points to be generated. Defaults to 32.
+            radius (float, optional): The radius of the cylinder. Defaults to 0.1.
+
+        Returns:
+            go.Mesh3d: A Mesh3d object that represents the bond.
+        """
+        x, y, z = self.make_cylinder(i, j, resolution=resolution, radius=radius)

+        bond_mesh = go.Mesh3d(
+            x=x, 
+            y=y, 
+            z=z, 
+            color="#444444", 
+            opacity=1, 
+            alphahull=0,
+            # name=f'{self.elements[i]}-{self.elements[j]}',
+            hoverinfo='none',  # Only show the name on hover
+            )
+        return bond_mesh

-    def get_molecular_viewer(self) -> go.Figure:
+    def get_molecular_viewer(self, resolution: int = 64, rel_cutoff: float = 0.5) -> go.Figure:
        """
        Returns a plotly figure with the molecular structure.
+
+        The molecular structure is represented as a collection of spheres at the atomic positions,
+        with the bonds represented as cylinders between the atoms.
+
+        Args:
+            resolution (int, optional): The number of points to be generated. Defaults to 64.
+            rel_cutoff (float, optional): Bond are only shown if the interatomic distance is smaller than the sum of the atomic vdw radii multiplied with this parameter.
+                Defaults to 0.5.
+
+        Returns:
+            go.Figure: A plotly figure with the molecular structure.
        """
        fig = go.Figure()

+        # Add the bonds
+        for i in range(len(self.elements)):
+            for j in range(i+1, len(self.elements)):
+                    bond_length = self.get_bond_length(i, j)
+                    max_bond_length = rel_cutoff * (atomp.vdw_radii_dict[self.elements[i]] + atomp.vdw_radii_dict[self.elements[j]])
+                    if bond_length < max_bond_length:
+                        fig.add_trace(self.make_bond_mesh(i, j, resolution=resolution, radius=0.1))
+
+        # Add the atoms
        for i in range(len(self.elements)):
-            fig.add_trace(self.make_atom_mesh(i))
+            fig.add_trace(self.make_atom_mesh(i, resolution=resolution))

        return fig
\ No newline at end of file