Data Models in Power Systems Software
The most critical design aspect you have never heard of in Power Systems. Excerpt from PowerSystems.jl: A power system data management package for large scale modeling
As the power grid evolves and demand grows, modeling enhances decision-making. News and forums highlight delays, high costs, and links to utility and state processes. Many organizations now focus on streamlined workflows and interoperability in power systems modeling.
The main obstacle in power systems modeling is the absence of a shared, precise technical vocabulary. Wittgenstein noted that language limits understanding; I argue that our field’s limited vocabulary restricts progress. Sienna’s goal is to address this constraint.
The Sienna platform addresses this by making its main contribution both technical and philosophical: creating a shared language for power systems modeling. By starting with a schema in PowerSystems.jl 1 to organize key concepts, later focusing on operational simulations 2 and finally on dynamics 3, Sienna aims to build an actionable ontology that embodies the clarity of language Wittgenstein advocated, recognizing that structured vocabulary shapes the domain’s progress.
The limits of my language mean the limits of my world.
Ludwig Wittgenstein
Power systems modeling is defined by a central tension: the need for precise, unambiguous definitions while contending with inconsistent vocabularies across tools and contexts. This patchwork undermines shared understanding and constrains the field’s growth — a problem that Sienna explicitly seeks to address.
Sienna draws from both Investigations and Tractatus. It sees meaning as arising from modeling practice, formalized into a coherent vocabulary. Terms like Device, Service, and Formulation capture modelers’ realities and become precise, machine-actionable definitions.
Separate terms create artificial boundaries. Time-domain analysis is probably the area where the dysfunction is most extreme. For decades, electromagnetic transient (EMT) and phasor representations were seen as distinct modeling domains rather than as modeling choices, which led to endless debates about their applicability. This made questions involving both timescales structurally inaccessible just when modeling inverter-based sources became critical. Boundaries between models became boundaries on what could be modeled.
Sienna extends its semantic framework to encompass dynamic and EMT formulations within the same type system; it proposes that the boundary is not ontological but representational, a choice of language game, not a difference in underlying reality 3. A converter, a line, a machine: these are the same objects whether we describe them in phasor or full electromagnetic terms. What changes is the formulation, not the object or the component. This is a Wittgensteinian investigation at work: the world does not change when we change our language, but what we can see in it does.
Extending what EMT means, from a narrow simulation modality to a first-class member of a unified modeling vocabulary, is not a software engineering decision; it is a conceptual one. It enlarges the language game and, in doing so, expands the world in which power systems researchers can work. In practice, this supports the argument for dq0 modeling as an additional alternative for scaling up complex simulations that require the representation of electromagnetic phenomena.
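To make the claim that "what changes is the formulation, not the object" concrete, here is a minimal sketch. Sienna itself is written in Julia; this example uses Python purely for illustration, and every name in it (`Line`, `PhasorFormulation`, `EMTFormulation`, `build_model`) is hypothetical rather than part of any Sienna package. It assumes a simple series R-L branch and shows one component description yielding either an algebraic phasor relation or a time-domain differential relation, depending only on the formulation chosen.

```python
from dataclasses import dataclass
from functools import singledispatch
import math


@dataclass(frozen=True)
class Line:
    """Hypothetical component: a series R-L branch. Not a Sienna type."""
    name: str
    resistance_ohm: float
    inductance_h: float


@dataclass(frozen=True)
class PhasorFormulation:
    """Quasi-static description at a fixed electrical frequency."""
    frequency_hz: float = 60.0


@dataclass(frozen=True)
class EMTFormulation:
    """Time-domain description that keeps the inductor dynamics."""


@singledispatch
def build_model(formulation, line):
    # Fallback for formulations that have no registered model.
    raise NotImplementedError(f"no model for {type(formulation).__name__}")


@build_model.register
def _(formulation: PhasorFormulation, line: Line):
    # Algebraic relation between phasors: I = V / (R + j*omega*L)
    omega = 2 * math.pi * formulation.frequency_hz
    z = complex(line.resistance_ohm, omega * line.inductance_h)
    return lambda v_phasor: v_phasor / z


@build_model.register
def _(formulation: EMTFormulation, line: Line):
    # Differential relation in time: di/dt = (v - R*i) / L
    return lambda v, i: (v - line.resistance_ohm * i) / line.inductance_h


# The same Line object is described in two different "languages".
line = Line("line-1", resistance_ohm=1.0, inductance_h=0.01)
phasor_current = build_model(PhasorFormulation(50.0), line)(complex(10.0, 0.0))
current_derivative = build_model(EMTFormulation(), line)(10.0, 2.0)
```

The `Line` data never changes; only the function selected by the formulation type does. In this sketch, that is the sense in which the boundary is representational rather than ontological.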
The same logic applies, with equal force, to the modeling of power system operations. Here, too, the field has operated with a vocabulary that obscures a distinction it badly needs. Optimization models used in practice conflate two fundamentally different epistemic situations: a decision-maker acting under uncertainty, relying on forecasts, choosing actions that will shape the future, and a system acting in real time, responding to conditions as they actually unfold. These are not the same thing, and treating them with the same vocabulary has consequences. It makes it harder to reason clearly about feedback, the gap between anticipated and realized outcomes, and how decisions propagate over time.
PowerSimulations adds a semantic distinction between the DecisionModel and the EmulationModel, introduced in the paper on scientific computing for power systems 2. The DecisionModel takes forecasts, optimizes decisions, and produces actions. The EmulationModel simulates how the system responds to those actions. This is not just an implementation detail; it’s a claim: planning and response are distinct language games, and conflating them impairs software and thinking alike.
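The separation between deciding and emulating can be illustrated with a toy example. This is a sketch under stated assumptions, not the PowerSimulations API: the functions `decide` and `emulate`, the `Generator` class, and all the numbers are hypothetical, and the "optimization" is reduced to a single capped set point so that the two roles stay visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generator:
    """Hypothetical generator with a single capacity limit."""
    name: str
    max_mw: float


def decide(gen: Generator, forecast_load_mw: float) -> float:
    """Decision step: choose a set point using only the forecast."""
    return min(forecast_load_mw, gen.max_mw)


def emulate(gen: Generator, setpoint_mw: float, realized_load_mw: float) -> dict:
    """Emulation step: given an earlier decision, report what happens
    when the realized load differs from the forecast."""
    delivered = min(setpoint_mw, gen.max_mw)
    return {
        "delivered_mw": delivered,
        "imbalance_mw": realized_load_mw - delivered,
    }


gen = Generator("g1", max_mw=100.0)
forecast = [80.0, 95.0, 120.0]   # what the decision-maker believes
realized = [82.0, 90.0, 125.0]   # what actually unfolds

# Decisions never see realized values; emulation never re-optimizes.
setpoints = [decide(gen, f) for f in forecast]
outcomes = [emulate(gen, s, r) for s, r in zip(setpoints, realized)]
```

Keeping the two functions apart makes the gap between anticipated and realized outcomes an explicit quantity (`imbalance_mw` here) instead of something hidden inside a single model.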
This distinction, rooted in Wittgenstein’s Investigations, formalizes and clarifies tacit knowledge. Sienna’s vocabulary shapes which problems are visible and solvable, and it also directly changes how modelers work: it enables researchers to define problems more precisely, reduces errors caused by ambiguity, and creates new pathways for collaborative modeling and technological innovation. Each abstraction that Sienna formalizes unlocks new types of analysis and collaboration. By clarifying conceptual boundaries, Sienna makes reproducibility and interoperability truly achievable, allowing practitioners to articulate, evaluate, and share choices that were once inaccessible or ambiguous.
The downstream consequences of establishing semantic clarity are not simply intellectual — they are critical to the effectiveness of power system modeling. Two practical consequences, reproducibility and interoperability, directly support the argument for a structured shared vocabulary.
With Sienna, reproducibility improves as models use explicit, shared vocabularies. This makes it easier to validate and compare research, since structure and meaning are universally understandable instead of hidden in individual implementation details.
Interoperability follows the same logic. Tools interoperate not when they share file formats, but when they share concepts. A data exchange that moves numbers between two systems without a common understanding of what those numbers represent is fragile — it depends on conventions that are never fully documented and frequently violated. Sienna’s semantic layer makes the concepts explicit, which is a precondition for genuine interoperability: not just data flowing between tools, but meaning flowing between them.
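A small illustration of the difference between moving numbers and moving meaning: the bare value 0.9 could be megawatts or per-unit on some base, and nothing in a plain float says which. The sketch below is hypothetical (the `ActivePower` and `Unit` names are assumptions made for this example, not Sienna types) and shows one way a value can carry its own interpretation so that a receiving tool does not have to guess the convention.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Unit(Enum):
    MW = "MW"
    PER_UNIT = "p.u."


@dataclass(frozen=True)
class ActivePower:
    """Hypothetical quantity that carries its unit and, if needed, its base."""
    value: float
    unit: Unit
    base_mva: Optional[float] = None

    def to_mw(self) -> float:
        if self.unit is Unit.MW:
            return self.value
        # A per-unit value is meaningless without the base it refers to.
        if self.base_mva is None:
            raise ValueError("per-unit value requires base_mva")
        return self.value * self.base_mva


# Two tools exchange the "same" number with explicit meaning attached.
from_tool_a = ActivePower(0.9, Unit.PER_UNIT, base_mva=100.0)
from_tool_b = ActivePower(90.0, Unit.MW)
same_power = abs(from_tool_a.to_mw() - from_tool_b.to_mw()) < 1e-9
```

A per-unit value without a base raises an error in this sketch rather than silently passing through, which is the kind of undocumented convention the paragraph above describes as fragile.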
There is, however, a cost to this approach — and it is worth being direct about it. A shared vocabulary is necessarily opinionated. To define what a Device is, or to insist that DecisionModel and EmulationModel are distinct, is to take a position. It rules things out. Users who have organized their thinking differently, or whose use cases sit at the edges of the definitions, will encounter friction. Some will find that Sienna’s vocabulary does not cleanly map onto their problem, and they will have to either adapt their problem to the framework or extend the framework to accommodate it.
Let me be clear: this is not a failure of design, it is an inherent feature of any serious attempt to build a shared language. Wittgenstein understood this: a language game has rules, and rules constrain. The alternative is a framework so flexible that it imposes no structure, and it is not a neutral choice. It is a choice to leave the semantic work undone, to defer the hard decisions about what things are, and to pay for that deferral in every downstream task that requires coordination. Sienna’s opinionated design is not a limitation to be apologized for. It is the mechanism by which the semantic contribution becomes real.
History offers a warning about what happens when semantic efforts are broad but not actionable: the Common Information Model (CIM). Created under IEC standards, it aimed to be a universal language for utility communication, covering assets, topology, and operations. The intent was that utilities, vendors, and tools could exchange information without relying on proprietary formats or implicit conventions.
CIM developed a detailed ontology with thousands of classes, attributes, and relationships for describing power systems. But detail alone doesn’t ensure actionability. The model expanded to accommodate every use case and convention, eventually allowing almost anything and thus constraining almost nothing. Two utilities could both claim CIM compliance and still need major manual work to compare their systems. Interoperability remained aspirational, not practical.
This is a Tractarian failure of a particular kind. In the Tractatus, a proposition pictures a fact; but a proposition that is compatible with every possible state of affairs pictures nothing. The CIM, in striving for universality, lost the precision that gives a vocabulary its power to act. The lesson is not that standardization is futile. Rather, a vocabulary must be designed to support action as well as description. In Wittgenstein’s later terms, a language game is defined by its capacity to do, not just to say. A modeling framework must be executable and usable for real engineering tasks; otherwise, it is just a taxonomy, not a language game. Taxonomies have their place, but only actionable vocabularies transform the field.
Sienna draws this distinction deliberately; the vocabulary and the computation are designed together, so that the meaning of a term is inseparable from what the software does when it encounters that term. This is what it means for a semantic contribution to be actionable: not that it describes a domain clearly, but that the description is itself the mechanism of computation.
The current wave of enthusiasm for AI in power systems rests on a largely unexamined assumption: that machine learning models, given enough data, will discover the domain’s structure on their own. This assumption has proven productive in domains where the relevant structure is latent and must be inferred from observations because it cannot be stated explicitly. But power systems analysis is not that kind of domain. Although the physics is well characterized and the decision problems have been formulated for decades, the field lacks a representation of that structure that is both human-readable and machine-actionable. Reliance on human intervention to make applications work is pervasive across domains, and the vocabularies are as diverse as the organizations that use them.
When AI is applied to power systems without that representation, the results are predictable. Models trained on data from one system fail to transfer to another because the data carries implicit semantic assumptions that are never made explicit: assumptions about what the variables mean, how the topology is organized, and which constraints are active. Policies that perform well in simulation fail in deployment because the simulation’s vocabulary does not match the operational system’s vocabulary. Research results that look impressive in isolation resist integration because there is no common semantic substrate onto which they can be assembled.
This is, again, a Wittgensteinian problem. A machine learning model is, in a precise sense, a participant in a language game. It learns to produce outputs that are appropriate given inputs, within the context established by its training distribution. But if the language game is not well-defined, the model learns a private language, and generalization remains elusive. Wittgenstein was unambiguous about private languages: they cannot function as genuine language at all, because meaning requires the possibility of checking, of correction, of shared use.
A well-defined semantic framework for power systems is therefore not merely useful for AI; it is a precondition for AI. Models trained against Sienna’s type system inherit its semantic commitments, removing the need for translations between private languages. A ThermalStandard is the same object whether it appears in a power flow case in Chile or a production system in Texas. The semantic layer does for machine learning what it does for human researchers: it makes knowledge transferable by making meaning consistent.
The field is at an inflection point. The ambition to apply AI to power system operations, planning, and control is real and growing. But ambition without semantics produces tools that are impressive in demonstration and brittle in practice. The infrastructure for trustworthy, generalizable AI in power systems will not be built solely from data. It will be built from carefully chosen and consistently applied definitions. Sienna is an argument, in working code, that this infrastructure is both necessary and achievable.
1. Lara, José Daniel, et al. “PowerSystems.jl: A power system data management package for large scale modeling.” SoftwareX 15 (2021): 100747.
2. Lara, José Daniel, et al. “Computational experiment design for operations model simulation.” Electric Power Systems Research 189 (2020): 106680.
3. Lara, José Daniel, et al. “Revisiting power systems time-domain simulation methods and models.” IEEE Transactions on Power Systems 39.2 (2023): 2421-2437.