Upload final Notebook and changes in functions

Commit e2c3258f authored 4 months ago by Paula Lanze
--- a/Notebooks/.DS_Store
+++ b/Notebooks/.DS_Store
--- a/Notebooks/MetEngSim/.DS_Store
+++ b/Notebooks/MetEngSim/.DS_Store
--- a/Notebooks/MetEngSim/MetEngSim.html
+++ b/Notebooks/MetEngSim/MetEngSim.html
--- a/Notebooks/MetEngSim.ipynb
+++ b/Notebooks/MetEngSim.ipynb
@@ -40,7 +40,7 @@
    "\n",
    "<div style=\"width: 500px; margin: auto;\">\n",
    "\n",
-    "![Feedforward loop classes](../Figures/Jupyter/Eco_core_met.png)\n",
+    "![Feedforward loop classes](../../Figures/Jupyter/Eco_core_met.png)\n",
    "\n",
    "</div>\n",
    "\n",
@@ -74,7 +74,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -130,7 +130,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
@@ -159,7 +159,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
@@ -223,7 +223,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
@@ -271,16 +271,16 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "[<Metabolite fum_c at 0x7fded5a798e0>, <Metabolite fum_e at 0x7fded5a79910>]"
+       "[<Metabolite fum_c at 0x7fafc5df18b0>, <Metabolite fum_e at 0x7fafc5df18e0>]"
      ]
     },
-     "execution_count": 6,
+     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -292,7 +292,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
@@ -301,14 +301,14 @@
       "[['SUCDi',\n",
       "  'Succinate dehydrogenase (irreversible)',\n",
       "  'q8_c + succ_c --> fum_c + q8h2_c'],\n",
+       " ['FUM', 'Fumarase', 'fum_c + h2o_c <=> mal__L_c'],\n",
       " ['FUMt2_2',\n",
       "  'Fumarate transport via proton symport (2 H)',\n",
       "  'fum_e + 2.0 h_e --> fum_c + 2.0 h_c'],\n",
-       " ['FRD7', 'Fumarate reductase', 'fum_c + q8h2_c --> q8_c + succ_c'],\n",
-       " ['FUM', 'Fumarase', 'fum_c + h2o_c <=> mal__L_c']]"
+       " ['FRD7', 'Fumarate reductase', 'fum_c + q8h2_c --> q8_c + succ_c']]"
      ]
     },
-     "execution_count": 7,
+     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -332,7 +332,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
@@ -372,7 +372,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {

 %% Cell type:markdown id: tags:

 # Microbial metabolism analysis

 <hr>

 %% Cell type:markdown id: tags:

 ## Introduction
 <hr>

 %% Cell type:markdown id: tags:

 $\color{darkblue}{\textbf{Learning Outcome}}$
 - Memorize 5 metabolites and 3 reactions from the *E. coli* metabolism.
 - Discuss the metabolic pathway correlated to the substrate-product pair regarding length and efficiency.
 - Compare and evaluate metabolic pathway databases.
 <hr>

 Genome-scale metabolic models (GSMMs) collect the known metabolic information of a biological system and convert it into a mathematical model. This gives in-depth insight into the molecular mechanisms of organisms and enables the analysis of metabolic pathways, because the model breaks them down into their respective reactions and enzymes.\
 Metabolites from pathways such as glycolysis or the TCA-cycle are used in this Notebook to simulate the *E. coli* metabolism via reaction rates.

 $\color{darkblue}{\text{Glycolysis:}}$\
 Glycolysis is one component of the central metabolism and is an amphibolic pathway (it involves both anabolism and catabolism) because it can reversibly produce hexoses from various low-molecular-weight molecules. It produces six precursor metabolites that are starting materials for the biosynthesis of building blocks for macromolecules and other needed small molecules. Its functioning under all conditions is therefore essential.

 $\color{darkblue}{\text{TCA-cycle:}}$\
 The TCA-cycle is another component of the central metabolism and is a catabolic pathway of aerobic respiration. It generates energy (ATP) and also precursors for biosynthesis.

 You are provided with an overview of the *E. coli* core metabolism. The overview shows that the metabolism consists of more than glycolysis and the TCA-cycle. Many metabolic byproducts play a role in the energy management of the cell, and even fermentation reactions can be part of a pathway. The larger labels (Glyc, PPP, etc.) refer to the major components of the metabolism: (OxP) oxidative phosphorylation, (Glyc) glycolysis pathway, (PPP) pentose phosphate pathway, (TCA) tricarboxylic acid cycle, (Ana) glyoxylate cycle, gluconeogenesis, and anaplerotic reactions, (Ferm) fermentation, and (N) nitrogen metabolism.

 <div style="width: 500px; margin: auto;">

-![Feedforward loop classes](../Figures/Jupyter/Eco_core_met.png)
+![Feedforward loop classes](../../Figures/Jupyter/Eco_core_met.png)

 </div>

 The model makes metabolic pathway analysis easy because it breaks down the pathways into their respective reactions.
 The *E. coli* core model contains only 137 genes, 95 reactions and 72 metabolites, which makes working with it easier and more time-efficient than working with the full *E. coli* model.
 The related flux balance analysis (FBA) mathematically simulates metabolism in genome-scale reconstructions of metabolic networks.
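
 For reference, flux balance analysis is commonly written as the linear program below. The symbols are standard textbook notation and are assumptions here, not names taken from the notebook: $S$ is the stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ the objective weights (for example, a 1 for the product-forming reaction), and $v^{\text{lb}}$, $v^{\text{ub}}$ the lower and upper flux bounds.

 $$
 \begin{aligned}
 \max_{v}\quad & c^{\top} v \\
 \text{subject to}\quad & S\,v = 0, \\
 & v^{\text{lb}} \le v \le v^{\text{ub}}
 \end{aligned}
 $$

 The constraint $S\,v = 0$ expresses the steady-state assumption that no internal metabolite accumulates. Tightening a bound on a single reaction can therefore lower the maximal achievable product flux.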

 **Comparing database maps:**
 - $\color{darkblue}{\text{KEGG:}}$ The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a bioinformatics database containing information about genes, proteins, reactions and pathways. The 'KEGG Pathway' option provides pathway maps in which a metabolite ID can be searched and is then highlighted in the map.
 - $\color{darkblue}{\text{BioCyc/EcoCyc:}}$ A collection of pathway/genome databases for model organisms, together with software tools to explore them. With the 'Metabolic Map' option, metabolic pathways can be found quickly by typing in the 'map-name'.
 - $\color{darkblue}{\text{BiGG:}}$ A bioinformatics database that uses the ESCHER map. This map visualizes metabolic pathways and provides information about metabolites and reactions. It displays different metabolic processes in one map, for instance, Glycolysis, TCA-Cycle, and electron transport chain.
 - $\color{darkblue}{\text{ChatGPT:}}$ An AI chat assistant that answers questions about a broad range of topics. It does not provide a map or any other form of visualization, so it gives less of an overview of the metabolic reactions.

 | Database | Enzymes | Genes | Reactions | Pathways | Metabolites |
 |:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|
 | KEGG | X | - | X | XX | XXX |
 | BioCyc/EcoCyc | XX | XX | XX | XX | XX |
 | BiGG/ESCHER | - | - | XX | XXX | XX |
 | ChatGPT | X | - | X | - | X |

 <hr>

 %% Cell type:markdown id: tags:

 Set up the computational environment by running this cell.\
 Make sure all necessary files (MetEngSim_functions, images) are loaded.

 %% Cell type:code id: tags:

 ``` python
 # install requirements
 # %pip install -r ../requirements.txt
 ```

 %% Cell type:code id: tags:

 ``` python
 # this code-cell sets up the computational environment for the notebook

 # file system and path operations
 # import os
 import numpy as np
 import random
 from random import *
 # import matplotlib.pyplot as plt

 # load the cobra toolbox (it must already be installed, e.g. via requirements.txt)
 from cobra.io import read_sbml_model
 print('Cobra toolbox is already installed')

 # load functions from the MetEngSim toolbox
 from MetEngSim_functions import *

 print('Done')
 ```

 %% Output

    Cobra toolbox is already installed
    Done

 %% Cell type:markdown id: tags:

 ## Set up model
 In this cell you generate your individual pair of two metabolites that will operate as substrate and product.\
 The Output displays the metabolite ID, the maximal possible product formation rate and the currently generated product formation rate.

 **Input:**
 - your student ID

 %% Cell type:code id: tags:

 ``` python
 Student_ID = 412501
 model, selected_pair, all_reactions, product, product_lim, _ = make_metabolite_combination(Student_ID)
 print(selected_pair, product_lim)
 ```

 %% Output

    Loading existing file e_coli_core.xml.gz
    ['fum_e', 'glu__L_e'] [0.5]

 %% Cell type:markdown id: tags:

 <hr>
 You can generate a bar chart by executing this cell.\
 It displays the maximal possible product flux and your current product flux.

 %% Cell type:code id: tags:

 ``` python
 create_bar_chart(all_reactions, product_lim)
 ```

 %% Output



 %% Cell type:markdown id: tags:

 ## Databases
 <hr>

 ### Exercise:

 Have a look at the listed databases to find a pathway from substrate to product and the associated reactions.
 Choose the database that seems to work best for your combination, and look at the reactions to choose one you would like to test in the Experiment.

 Enter the metabolite ID of your substrate or product in place of 'None'. \
 The output gives you the database ID for your substrate and product.

 $\color{darkblue}{\text{TIPS:}}$\
 Every database link leads to a map where metabolites, enzymes and reactions are displayed. Hovering the cursor over an element, for example a metabolite, shows more information such as the metabolite ID. Often the reaction is displayed as a number, except in the Escher map.
 Sometimes the database ID is not the right ID for this model; use the two cells below to get the right ID for the model.

 $\color{darkblue}{\text{KEGG:}}$\
 [KEGG TCA-cycle](https://www.genome.jp/pathway/map00020)\
 [KEGG glycolysis](https://www.genome.jp/pathway/map00010)\
 Click on the link for the TCA-cycle or glycolysis and append +KEGG_ID directly after the link, without any spaces. This highlights the metabolite associated with the KEGG_ID and makes the search for substrate and product more efficient.
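
 The URL pattern described above can be sketched as a small helper. The function name `kegg_highlight_url` is hypothetical and not part of the notebook or its toolbox. Appending several IDs in a row is an assumption about how the KEGG viewer handles the URL; the text above only describes a single ID.

```python
# Hypothetical helper, not part of MetEngSim_functions.
KEGG_PATHWAY_BASE = "https://www.genome.jp/pathway/"


def kegg_highlight_url(map_id, kegg_ids):
    """Return the KEGG pathway URL with '+KEGG_ID' appended for each ID."""
    # Accept a single ID as a plain string as well as a list of IDs.
    if isinstance(kegg_ids, str):
        kegg_ids = [kegg_ids]
    # Remove stray whitespace, because the URL must not contain spaces.
    cleaned = [kegg_id.strip() for kegg_id in kegg_ids if kegg_id.strip()]
    return KEGG_PATHWAY_BASE + map_id + "".join("+" + k for k in cleaned)


# TCA-cycle map with fumarate (KEGG compound C00122) highlighted
print(kegg_highlight_url("map00020", "C00122"))
# prints https://www.genome.jp/pathway/map00020+C00122
```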

 $\color{darkblue}{\text{BiGG/Escher:}}$\
 [ESCHER-map](https://escher.github.io/#/app?map=e_coli_core.Core%20metabolism&tool=Builder&model=e_coli_core)\
 Under 'View' and then 'Find', you can also search for individual metabolites and display them by entering the metabolite ID or name.

 $\color{darkblue}{\text{EcoCyc:}}$\
 [EcoCyc TCA-cycle](https://biocyc.org/pathway?orgid=ECOLI&id=TCA&detail-level=2)\
 [EcoCyc glycolysis](https://biocyc.org/pathway?orgid=ECOLI&id=GLYCOLYSIS&detail-level=2)

 $\color{darkblue}{\text{ChatGPT:}}$\
 [ChatGPT](https://chatgpt.com)\
 Give precise instructions and provide the AI with all the information you have gathered up to this point. This helps you reach your goal more time-efficiently. Keep in mind that there is no map or visualization like in the other databases.


 **Input:**

 - metabolite ID for substrate or product in (' ')

 %% Cell type:code id: tags:

 ``` python
 # get KEGG ID for substrate
 print(model.metabolites.get_by_id('fum_e').annotation['kegg.compound'])
 # get KEGG ID for product
 print(model.metabolites.get_by_id('glu__L_e').annotation['kegg.compound'])

 # get BiGG ID for substrate
 print(model.metabolites.get_by_id('fum_e').annotation['bigg.metabolite'])
 # get BiGG ID for product
 print(model.metabolites.get_by_id('glu__L_e').annotation['bigg.metabolite'])

 # get EcoCyc ID for substrate
 print(model.metabolites.get_by_id('fum_e').annotation['biocyc'])
 # get EcoCyc ID for product
 print(model.metabolites.get_by_id('glu__L_e').annotation['biocyc'])
 ```

 %% Output

    C00122
    ['C00025', 'C00302']
    fum
    glu__L
    META:FUM
    ['META:Glutamates', 'META:GLT']
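
 %% Cell type:markdown id: tags:

 The output above shows that an annotation entry can be either a single string (`C00122`) or a list of strings (`['C00025', 'C00302']`). A minimal sketch for handling both cases uniformly follows; the helper name `as_id_list` is hypothetical and not part of the notebook or of cobra.

```python
# Hypothetical helper, not part of MetEngSim_functions or cobra.
def as_id_list(annotation_value):
    """Return annotation IDs as a list, whether stored as a string or a list."""
    if annotation_value is None:
        return []
    if isinstance(annotation_value, str):
        return [annotation_value]
    return list(annotation_value)


# Values copied from the notebook output for fum_e and glu__L_e
kegg_annotations = {"fum_e": "C00122", "glu__L_e": ["C00025", "C00302"]}
for met_id, value in kegg_annotations.items():
    print(met_id, as_id_list(value))
```

 With a loaded model, the same helper could be applied to a value such as `model.metabolites.get_by_id('glu__L_e').annotation['kegg.compound']`, so that each ID can be looked up separately.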

 %% Cell type:markdown id: tags:

 <hr>
 The next two code cells might help you find the reaction ID you are looking for. By searching for the metabolite and the reactions it participates in, you get the reaction ID, the reaction name and the reaction equation. Capitalize the first letter of the search term, as in 'Fum'.\
 Have a close look at your metabolite pair, especially the pathway, and consider different components of the metabolism.

 **Input:**
 - search_met: the metabolite name or any short form to find the metabolite ID
 - met_reactions: metabolite ID you extracted from previous cell to get information about the reactions

 %% Cell type:code id: tags:

 ``` python
 search_met = 'Fum'
 model.metabolites.query(search_met, 'name')
 ```

 %% Output

-    [<Metabolite fum_c at 0x7fded5a798e0>, <Metabolite fum_e at 0x7fded5a79910>]
+    [<Metabolite fum_c at 0x7fafc5df18b0>, <Metabolite fum_e at 0x7fafc5df18e0>]

 %% Cell type:code id: tags:

 ``` python
 met_reactions = 'fum_c'
 # list comprehension to show all reactions that contain the metabolite
 [[reaction.id, reaction.name, reaction.reaction] for reaction in model.metabolites.get_by_id(met_reactions).reactions]
 ```

 %% Output

    [['SUCDi',
      'Succinate dehydrogenase (irreversible)',
      'q8_c + succ_c --> fum_c + q8h2_c'],
+     ['FUM', 'Fumarase', 'fum_c + h2o_c <=> mal__L_c'],
     ['FUMt2_2',
      'Fumarate transport via proton symport (2 H)',
      'fum_e + 2.0 h_e --> fum_c + 2.0 h_c'],
-     ['FRD7', 'Fumarate reductase', 'fum_c + q8h2_c --> q8_c + succ_c'],
-     ['FUM', 'Fumarase', 'fum_c + h2o_c <=> mal__L_c']]
+     ['FRD7', 'Fumarate reductase', 'fum_c + q8h2_c --> q8_c + succ_c']]

 %% Cell type:markdown id: tags:

 ## Experiment
 <hr>

 **Input:**
 - target_reaction: any reaction ID of your choice. Consider using the code cells above to find the right reaction ID for this model.

 %% Cell type:code id: tags:

 ``` python
 # reaction ID of choice
 target_reaction = 'CS'

 # add reaction ID to list of all reactions
 all_reactions.append(target_reaction)
 print(all_reactions)

 # optimize the model with reaction ID
 model = optimize_reaction(model, target_reaction)
 product_lim.append(round(model.slim_optimize(), 2) / product)
 ```

 %% Output

    ['fum_e --> glu__L_e limited', 'CS', 'CS', 'CS']

 %% Cell type:markdown id: tags:

 <hr>
 Here you can extend the bar chart to show the reaction ID you entered above and the product flux associated with that reaction. Compare the old product flux with the newly generated one and decide whether your chosen reaction ID is correct.

 If the reaction ID is correct, try to increase the product flux even more; you may be able to reach the original production rate.

 $\color{darkblue}{\text{TIP:}}$
 If no bar chart is created because the code cell returns an ERROR, restart the kernel, set up the computational environment and load the model as well as your metabolic pair again. Afterwards the bar chart should be generated again.
 An ERROR may occur if the reaction ID is entered incorrectly; for instance, make sure there are no spaces added to the reaction ID.
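
 One way to guard against these input problems is sketched below. Both helper names (`clean_reaction_id`, `append_once`) are hypothetical and not part of the notebook toolbox, and `known_ids` is a hand-written stand-in containing only reaction IDs that appear in this notebook.

```python
# Hypothetical helpers, not part of MetEngSim_functions.
def clean_reaction_id(raw_id, known_ids):
    """Strip stray whitespace and confirm that the ID is a known reaction."""
    candidate = raw_id.strip()
    if candidate not in known_ids:
        raise ValueError(f"Unknown reaction ID: {candidate!r}")
    return candidate


def append_once(items, new_item):
    """Append new_item only if absent, so re-running a cell adds no duplicates."""
    if new_item not in items:
        items.append(new_item)
    return items


# Stand-in for the reaction IDs of the model (assumption for illustration)
known_ids = {"SUCDi", "FUM", "FUMt2_2", "FRD7", "CS"}
all_reactions = ["fum_e --> glu__L_e limited"]

# Simulate running the Experiment cell three times with sloppy input
for attempt in (" CS", "CS ", "CS"):
    append_once(all_reactions, clean_reaction_id(attempt, known_ids))

print(all_reactions)
# prints ['fum_e --> glu__L_e limited', 'CS']
```

 With a loaded cobra model, `known_ids` could be built as `{reaction.id for reaction in model.reactions}`. If the reaction list is guarded this way, `product_lim` would need the same guard so that both lists keep the same length for the bar chart.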

 %% Cell type:code id: tags:

 ``` python
 # check if reaction ID is correct and model is optimized
 create_bar_chart(all_reactions, product_lim)
 ```

 %% Output



--- a/Notebooks/MetEngSim/MetEngSim.pdf
+++ b/Notebooks/MetEngSim/MetEngSim.pdf
--- a/Notebooks/MetEngSim_functions.py
+++ b/Notebooks/MetEngSim_functions.py
@@ -17,8 +17,8 @@ def load_model():
    '''
    # define both paths where the model can be stored
    ModelFiles = [
-        os.path.join('..', '53_Models', 'e_coli_core.xml.gz'),
-        os.path.join('..', 'models', 'e_coli_core.xml.gz')
+        os.path.join('..', '..', '..', 'metengsim', 'iamb-folder-template', '50-59_Code', '53_Models', 'e_coli_core.xml.gz'),
+        #os.path.join('..', 'models', 'e_coli_core.xml.gz')
    ]
    model = None