    Extracting and Analyzing Object-Centric Game Data

    This repository contains the full code base for a bachelor's thesis.

    It extracts, transforms, and analyzes object-centric event data from StarCraft II replay files and converts it into the OCEL 2.0 (Object-Centric Event Log) format for process mining and data science.

    ✨ Goal

    The goal of this thesis is to bridge video game telemetry and process mining by transforming rich in-game interactions (units, players, abilities, locations) into structured, object-aware event logs.

    These logs can be used with tools like PM4PY to perform advanced analysis of game strategies, behaviors, and object lifecycles.

    📁 Repository Structure

    .
    ├── src/
    │   ├── Pipeline/
    │   │   ├── raw2structured.py        # Extracts and groups structured events from SC2Replay files
    │   │   ├── structured2file.py       # Writes structured events to JSON
    │   │   ├── json2sql.py              # Converts grouped JSON to OCEL 2.0-compliant SQLite
    │   │   ├── constants.py             # Paths to replays/output folders
    │   │   └── __init__.py              # Entry point for end-to-end pipeline
    │   └── analysis/                    # (optional) downstream analysis and visualization
    ├── data/
    │   ├── replays/                     # Raw .SC2Replay files
    │   └── output/                      # JSON and SQLite OCEL outputs
    └── README.md

    🧪 Technologies Used

    • sc2reader: to parse .SC2Replay files
    • Python 3.11 and standard libraries (e.g., sqlite3, json, collections)
    • OCEL 2.0 Schema: relational model for object-centric event logs
    • PM4PY: for OCEL importing and process mining analysis

    🚀 Pipeline Overview

    1. Extract Events from Replays (raw2structured.py)

    • Parses every event in a StarCraft II replay
    • Converts it into a flat Python dictionary
    • Adds metadata: timestamps, player, unit info, location, etc.
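The flattening step can be sketched as follows. This is a minimal illustration, not the code in raw2structured.py: the function name `flatten_event`, the field list, and the stand-in `UnitBornEvent` class are assumptions made for the example. The attribute names are modeled on what sc2reader tracker events typically expose, and the real script may collect a different set.

```python
from types import SimpleNamespace

# Attribute names are assumptions modeled on sc2reader tracker events;
# the repository's raw2structured.py may use a different set.
CANDIDATE_FIELDS = ("unit_id", "unit_type_name", "control_pid", "x", "y")


def flatten_event(event, replay_name):
    """Turn one event object into a flat dictionary of plain values."""
    record = {
        "replay": replay_name,
        "event_type": type(event).__name__,
        "frame": getattr(event, "frame", None),
        "second": getattr(event, "second", None),
    }
    # Copy only the fields this particular event actually carries.
    for field in CANDIDATE_FIELDS:
        value = getattr(event, field, None)
        if value is not None:
            record[field] = value
    # Derive a simple location key when both coordinates are present.
    if "x" in record and "y" in record:
        record["location"] = f"{record['x']},{record['y']}"
    return record


class UnitBornEvent(SimpleNamespace):
    """Stand-in for a parsed replay event, used only for this demo."""


demo = UnitBornEvent(frame=160, second=10, unit_id=42,
                     unit_type_name="Zergling", control_pid=1, x=30, y=55)
flat = flatten_event(demo, "example.SC2Replay")
print(flat)
```

Using `getattr` with a default keeps the function tolerant of event types that lack unit or position data, at the cost of silently skipping misspelled field names.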

    2. Write Structured Events to JSON (structured2file.py)

    • Processes all replays in a folder
    • Saves each grouped event log to *_events.json
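A rough sketch of the folder-level step is shown below. The helper names `write_event_logs` and `extract_events` are hypothetical, and the placeholder extractor groups events by event type, which is an assumption about what "grouped" means here. Only the *_events.json naming comes from the description above.

```python
import json
import tempfile
from pathlib import Path


def extract_events(replay_path):
    """Hypothetical placeholder for the real extraction step."""
    return {"UnitBornEvent": [{"replay": replay_path.name, "second": 0}]}


def write_event_logs(replay_dir, output_dir, extractor=extract_events):
    """Write one <replay stem>_events.json file per .SC2Replay file."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for replay_path in sorted(Path(replay_dir).glob("*.SC2Replay")):
        grouped = extractor(replay_path)
        target = output_dir / f"{replay_path.stem}_events.json"
        with target.open("w", encoding="utf-8") as handle:
            json.dump(grouped, handle, indent=2)
        written.append(target)
    return written


# Demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    replays = Path(tmp) / "replays"
    replays.mkdir()
    (replays / "game_a.SC2Replay").touch()
    (replays / "game_b.SC2Replay").touch()
    (replays / "notes.txt").touch()  # ignored: wrong extension
    written_names = [p.name for p in write_event_logs(replays, Path(tmp) / "output")]

print(written_names)
```

Passing the extractor as a parameter makes it easy to swap the placeholder for the real parsing function without touching the file handling.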

    3. Convert to OCEL SQLite (json2sql.py)

    • Transforms grouped JSON into an OCEL 2.0-compliant relational database

    • Creates:

      • event, object, event_object, object_object tables
      • Event-specific and object-type-specific tables (event_UnitBornEvent, object_Unit, etc.)
    • Tracks metadata such as:

      • Unit type, owner, position
      • Player identity
      • Control group relationships
      • Complex inter-object links: attacks, kills, casts, gathering, healing
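To make the table layout concrete, here is a small in-memory sketch of the kind of schema described above. The column names follow my reading of the OCEL 2.0 relational layout (`ocel_id`, `ocel_type`, `ocel_qualifier`, and so on). The attribute columns (`x`, `y`, `unit_type`, `name`), the qualifier strings, and the mapping of game time to a 1970-based timestamp are assumptions for illustration and may differ from what json2sql.py produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_map_type  (ocel_type TEXT PRIMARY KEY, ocel_type_map TEXT);
CREATE TABLE object_map_type (ocel_type TEXT PRIMARY KEY, ocel_type_map TEXT);
CREATE TABLE event  (ocel_id TEXT PRIMARY KEY, ocel_type TEXT);
CREATE TABLE object (ocel_id TEXT PRIMARY KEY, ocel_type TEXT);
CREATE TABLE event_object  (ocel_event_id TEXT, ocel_object_id TEXT, ocel_qualifier TEXT);
CREATE TABLE object_object (ocel_source_id TEXT, ocel_target_id TEXT, ocel_qualifier TEXT);
-- One table per event type and per object type (attribute columns are assumed).
CREATE TABLE event_UnitBornEvent (ocel_id TEXT PRIMARY KEY, ocel_time TEXT, x REAL, y REAL);
CREATE TABLE object_Unit   (ocel_id TEXT, ocel_time TEXT, ocel_changed_field TEXT, unit_type TEXT);
CREATE TABLE object_Player (ocel_id TEXT, ocel_time TEXT, ocel_changed_field TEXT, name TEXT);
""")

# Register the types, then store one "unit born" event with its two objects.
conn.execute("INSERT INTO event_map_type VALUES ('UnitBornEvent', 'UnitBornEvent')")
conn.executemany("INSERT INTO object_map_type VALUES (?, ?)",
                 [("Unit", "Unit"), ("Player", "Player")])
conn.execute("INSERT INTO event VALUES ('e1', 'UnitBornEvent')")
conn.execute("INSERT INTO event_UnitBornEvent VALUES ('e1', '1970-01-01 00:00:10', 30.0, 55.0)")
conn.executemany("INSERT INTO object VALUES (?, ?)",
                 [("unit_42", "Unit"), ("player_1", "Player")])
conn.execute("INSERT INTO object_Unit VALUES ('unit_42', '1970-01-01 00:00:10', NULL, 'Zergling')")
conn.execute("INSERT INTO object_Player VALUES ('player_1', '1970-01-01 00:00:00', NULL, 'PlayerOne')")
conn.executemany("INSERT INTO event_object VALUES (?, ?, ?)",
                 [("e1", "unit_42", "born_unit"), ("e1", "player_1", "owner")])
conn.execute("INSERT INTO object_object VALUES ('unit_42', 'player_1', 'owned_by')")

# A single event is linked to two objects of different types.
linked = conn.execute(
    "SELECT ocel_object_id, ocel_qualifier FROM event_object "
    "WHERE ocel_event_id = 'e1' ORDER BY ocel_object_id"
).fetchall()
print(linked)
```

The generic `event` and `object` tables hold identity and type, while the type-specific tables hold timestamps and attributes, which is why a converter has to create one extra table for every event type and object type it encounters.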

    🤔 What Makes This Pipeline Object-Centric?

    • Multiple object types: Player, Unit, ControlGroup, Location, etc.
    • Multi-object relations: each event can involve several objects
    • Lifecycles & interactions: Unit creation, movement, killing, casting, gathering, etc.
    • Richer semantics: not just activities, but structured interactions
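The difference from a classic single-case event log can be illustrated with a toy example. The event dictionaries and the `flatten` helper below are hypothetical and not taken from the repository. The point is only that forcing multi-object events onto one case notion duplicates events, which an object-centric log avoids.

```python
from collections import defaultdict

# Toy object-centric events: each one references objects of several types.
events = [
    {"id": "e1", "type": "UnitBornEvent", "objects": {"Unit": ["u1"], "Player": ["p1"]}},
    {"id": "e2", "type": "UnitBornEvent", "objects": {"Unit": ["u2"], "Player": ["p2"]}},
    {"id": "e3", "type": "UnitDiedEvent", "objects": {"Unit": ["u1", "u2"], "Player": ["p1", "p2"]}},
]


def flatten(events, object_type):
    """Project the log onto one object type, as a classic case-based log would."""
    traces = defaultdict(list)
    for event in events:
        for object_id in event["objects"].get(object_type, []):
            traces[object_id].append(event["id"])
    return dict(traces)


unit_traces = flatten(events, "Unit")
# The kill event e3 involves both units, so it appears once in each unit trace.
copies_of_e3 = sum(trace.count("e3") for trace in unit_traces.values())
print(unit_traces, copies_of_e3)
```

In the object-centric representation `e3` is stored once and linked to both units and both players. After flattening onto units it appears in two traces, which inflates event counts in any frequency-based analysis.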

    ⚙️ How to Use

    1. Place .SC2Replay files into the data/replays/ directory.
    2. Run the structured extraction:
    python src/Pipeline/structured2file.py
    3. Convert JSON to OCEL SQLite:
    python src/Pipeline/json2sql.py  # Or call convert_to_ocel_sqlite from __init__.py
    4. Analyze using PM4PY or other tools:
    import pm4py
    ocel = pm4py.read_ocel2_sqlite('data/output/example.sqlite')

    🔹 Example Use Cases

    • Compare strategies of different players across replays
    • Mine object lifecycles (e.g., when are Zerglings most commonly killed?)
    • Detect behavior patterns across units (e.g., movement -> cast -> attack)
    • Train predictive models using OCEL data
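As an example of the lifecycle question above, the following sketch asks in which game minute Zerglings die most often, using plain sqlite3 on a tiny in-memory database. The table `event_UnitDiedEvent`, its `second` column, the `unit_type` column, and the `victim` qualifier are all assumptions for illustration. The actual names depend on what json2sql.py writes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_UnitDiedEvent (ocel_id TEXT PRIMARY KEY, ocel_time TEXT, second INTEGER);
CREATE TABLE event_object (ocel_event_id TEXT, ocel_object_id TEXT, ocel_qualifier TEXT);
CREATE TABLE object_Unit (ocel_id TEXT, unit_type TEXT);
""")

# Hand-made demo data: four Zergling deaths and one Marine death.
deaths = [("d1", 130, "z1"), ("d2", 150, "z2"), ("d3", 170, "z3"),
          ("d4", 400, "z4"), ("d5", 140, "m1")]
for event_id, second, unit_id in deaths:
    conn.execute("INSERT INTO event_UnitDiedEvent VALUES (?, NULL, ?)", (event_id, second))
    conn.execute("INSERT INTO event_object VALUES (?, ?, 'victim')", (event_id, unit_id))
    unit_type = "Zergling" if unit_id.startswith("z") else "Marine"
    conn.execute("INSERT INTO object_Unit VALUES (?, ?)", (unit_id, unit_type))

# Bucket deaths by game minute (integer division) and rank the buckets.
rows = conn.execute("""
    SELECT d.second / 60 AS minute, COUNT(*) AS deaths
    FROM event_UnitDiedEvent AS d
    JOIN event_object AS eo ON eo.ocel_event_id = d.ocel_id
    JOIN object_Unit  AS u  ON u.ocel_id = eo.ocel_object_id
    WHERE u.unit_type = 'Zergling' AND eo.ocel_qualifier = 'victim'
    GROUP BY minute
    ORDER BY deaths DESC, minute ASC
""").fetchall()
print(rows)
```

If the object table stores several rows per unit (for example one per attribute change), the join would count a death once per row, so a real query would first reduce `object_Unit` to one row per `ocel_id`.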

    📝 Thesis Context

    This repository supports a bachelor's thesis on object-centric process mining from video game data, using StarCraft II as a case study. The approach is guided by academic sources such as:

    • A Framework for Extracting Real-World Object-Centric Event Logs from Game Data
    • Object-centric process mining dealing with divergence and convergence in event data

    🚫 Limitations & Warnings

    • PM4PY may raise warnings if foreign keys or OCEL constraints are not perfectly satisfied. Some are harmless, but others may indicate schema issues.
    • Some metadata may be missing for destroyed units or legacy replays.
    • .xes export is not supported yet.
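One way to investigate such warnings before importing is to look for links that point at missing rows. The sketch below is a hypothetical helper (`find_dangling_links` is not part of the repository or of PM4PY) and assumes the generic `event`, `object`, and `event_object` tables use the column names shown.

```python
import sqlite3


def find_dangling_links(conn):
    """Return event and object ids that event_object references but that do not exist."""
    missing_events = conn.execute("""
        SELECT DISTINCT eo.ocel_event_id
        FROM event_object AS eo
        LEFT JOIN event AS e ON e.ocel_id = eo.ocel_event_id
        WHERE e.ocel_id IS NULL
        ORDER BY eo.ocel_event_id
    """).fetchall()
    missing_objects = conn.execute("""
        SELECT DISTINCT eo.ocel_object_id
        FROM event_object AS eo
        LEFT JOIN object AS o ON o.ocel_id = eo.ocel_object_id
        WHERE o.ocel_id IS NULL
        ORDER BY eo.ocel_object_id
    """).fetchall()
    return {
        "events": [row[0] for row in missing_events],
        "objects": [row[0] for row in missing_objects],
    }


# Demo: one valid link and one link to a unit that was never stored.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event  (ocel_id TEXT PRIMARY KEY, ocel_type TEXT);
CREATE TABLE object (ocel_id TEXT PRIMARY KEY, ocel_type TEXT);
CREATE TABLE event_object (ocel_event_id TEXT, ocel_object_id TEXT, ocel_qualifier TEXT);
INSERT INTO event  VALUES ('e1', 'UnitDiedEvent');
INSERT INTO object VALUES ('unit_1', 'Unit');
INSERT INTO event_object VALUES ('e1', 'unit_1', 'killer');
INSERT INTO event_object VALUES ('e1', 'unit_99', 'victim');
""")
report = find_dangling_links(conn)
print(report)
```

An empty report does not guarantee a warning-free import, but a non-empty one points directly at the rows worth inspecting, such as units that were destroyed before their metadata was recorded.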

    ✏️ Future Work

    • Add XOCEL or XES export
    • Integrate lifecycle transitions (e.g., created, moved, destroyed)
    • Add process visualizations
    • Explore transformer-based models on OCEL event logs

    🎓 Author

    Fabian Gries, B.Sc. in Computer Science, RWTH Aachen University

    For questions or academic references, feel free to open an issue or reach out.

    ❤️ Acknowledgements

    • Lukas Liß (Supervisor)
    • Prof. Dr. Wil van der Aalst (Object-Centric Process Mining)
    • sc2reader developers
    • OCEL standardization community
    • PM4PY project