
Standard to Spec Part 1: Using Large Language Models to Transform Product Carbon Footprinting Methodologies into Functional Specifications.

Using LLMs to Create Clear Software Specifications from PCF Standards

In the September issue of the ZeroTwentyFifty Newsletter, I crudely outlined an idea that has been a non-productive tenant of my mental real estate for quite some time. The idea is not particularly groundbreaking, but I believe it would be enormously beneficial to have the common carbon accounting practices and methodologies represented in a software-engineering-friendly format; in this case, a functional specification. This work is akin to an extension of the PACT approach of the last few years, which has culminated in the creation of the PACT Methodology and the PACT Network/Technical Spec, which I have explored at varying depths in a few previous articles.

Much of my thinking on the successful use and creation of specifications has been heavily influenced by Joel Spolsky’s writing on Functional Specifications. I am a not-so-secret admirer of Joel; I enjoy his writing, finding it approachable while also being backed by the success of Fog Creek and the many acquisitions that have come out of their stable. I would wager money on many readers not knowing that Fog Creek is the progenitor of Manuscript, Trello, Stack Overflow and Glitch. Their processes have enabled the creation of software with substantial value, and I think some of that experience and expertise can be leaned on in the pursuit of better Product Carbon Footprint software. Now, with the veracity of these claims settled, let's move on.

This’ll be easy… right?

A foolproof way to create code

Regarding my own thoughts on the creation of a functional specification for a PCF Methodology, the way forward always seemed fairly straightforward: I would simply take the GHG Protocol Product Standard, copy-paste it into ChatGPT and, voila, I would have a perfectly workable functional specification. But is it really that simple?

During the development of ZeroTwentyFifty (our eponymous Open Source solution), the approach to learning about PCF data exchange and developing software for this use case has changed a few times. There were one or two false starts and I will admit that there was a lot of learning to do. Trying to tackle a new domain by writing code for it makes sense; I am a software engineer after all, so modelling concepts with code has been a very useful way to develop new understanding. Pairing this with the most recent advancements in AI and the impacts they are having on development workflows is also the sensible thing to do.

However, both writing code solo and using the assistance of AI tooling such as ChatGPT would often result in progress being halted in order to rethink the approach. I believe this boiled down to a few underlying issues:

  1. Bad data formats (PDF is mordor-tier black magic)
  2. The general machinations of LLMs
  3. Difficulty with creating good-quality prompts, and whether the LLM is having a bad day or not (not me though, although my friends and family may disagree with my opinion on this one).
  4. No clear idea of what the output would be: is it a library, a web app, does it have a user interface?

This is what it feels like to try to build something when you don't know what it is. For example, when I write articles without thinking about the structure first, they always come out a mess: the timelines blow out and I'm left hating what I've produced.

This is not just down to me either; when you read accounts from across the industry, you’ll find case after case of development teams wishing for a specification before writing code, horror stories of badly managed projects and products, and general bad-timery.

y tho?

I believe that much of the benefit is actually fairly public-facing; I probably wouldn’t be writing a blog series on this if I didn’t think so. To explain the reasoning behind this work, it makes sense to frame it in terms of who I wish to impact:

  • Users: Documentation of what a functional solution looks like for an end product gives them a clear idea and understanding of what they can expect to see on their screens from ZeroTwentyFifty in the future.
  • Contributors: I have had a few conversations with developers wondering what I’m working on, and have at various times wished for a much clearer roadmap, so that I could show off some of the cool stuff I have planned for development, bring more people into the incredibly cool world of decarbonisation, and give them some assurance that their time will be respected thanks to an amount of up-front work.
  • Investors: If you are reading this, you are exactly the type of investor that I want to talk with.
  • Other providers/wider community: Ultimately, if there is a common set of specs, organisations have a common place to develop from, enabling them to create far more interoperability between solutions, boosting the network effect and ensuring that all participants gain the benefits of increasing PCF transmission.

With all this in mind, let’s move on to what I mean by a functional specification, and how we go from a PCF Methodology to a technology artefact useful for creating Open Source software.

What are functional specifications and how do they differ from technical specifications?

I am going to lean on the Wikipedia article for Functional Specifications quite a bit in this section, and much of what I say will simply be paraphrased from that article, because the community of writers there have done a stellar job of it. If you wish to dig into this enthralling topic further, you can do so using the link provided above, and I will also provide direct links in the writing when I have used it as a source.

What is a functional specification?

The Wikipedia definition of “Functional Specification” is as follows:

A functional specification (also, functional spec, specs, functional specifications document (FSD), functional requirements specification) in systems engineering and software development is a document that specifies the functions that a system or component must perform (often part of a requirements specification)

And to provide an extra definition just for illustrative purposes, Joel Spolsky defines it as such:

A functional specification describes how a product will work entirely from the user’s perspective. It doesn’t care how the thing is implemented. It talks about features. It specifies screens, menus, dialogs, and so on.

So the functional specification defines what functionality the system can be expected to perform, without defining how it will perform these functions. Taking directly from Wikipedia, the primary purposes are:

  1. To let the developers know what to build.
  2. To let the testers know what tests to run.
  3. To let stakeholders know what they are getting.

As an example, a functional requirement in our functional specification might read as follows:

When the user clicks the “Create Scope” button, the form presents a field which allows the user to enter “additional GHGs” to be included, beyond the “required GHGs” automatically included as per the requirements.
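To make the “what, not how” distinction concrete, here is a small, purely illustrative Python sketch of a test that a developer or tester might derive from that requirement. The function and field names, and the set of “required GHGs” (I have assumed the seven GHG Protocol gases), are my own placeholders, not anything taken from a specification or from ZeroTwentyFifty itself.

```python
# A purely illustrative sketch of a test derived from the requirement above.
# The form model, field names and gas list are assumptions, not taken from
# any specification.

REQUIRED_GHGS = {"CO2", "CH4", "N2O", "HFCs", "PFCs", "SF6", "NF3"}  # assumed default set


def build_create_scope_form(additional_ghgs=None):
    """What a hypothetical UI would submit when the user clicks "Create Scope"."""
    return {
        "required_ghgs": sorted(REQUIRED_GHGS),
        "additional_ghgs": list(additional_ghgs or []),
    }


def test_additional_ghgs_are_kept_separate():
    form = build_create_scope_form(additional_ghgs=["biogenic CO2"])
    # The required gases are included automatically, as per the requirement...
    assert set(form["required_ghgs"]) == REQUIRED_GHGS
    # ...and anything the user enters lands in the separate "additional GHGs" field.
    assert form["additional_ghgs"] == ["biogenic CO2"]


test_additional_ghgs_are_kept_separate()
```

Notice that the functional requirement itself says nothing about dictionaries or Python; that is exactly the gap the technical specification, discussed next, is meant to fill.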

What is a technical specification?

Moving on to the next stage of software development, and one that is commonly confused, combined or generally misunderstood: the technical specification. Joel Spolsky has this to say:

A technical specification describes the internal implementation of the program. It talks about data structures, relational database models, choice of programming languages and tools, algorithms, etc. 

This is ultimately where the rubber hits the road for the winged monkeys you keep caged, the programmers, who in our situation would use the functional specification to generate a technical specification, from which they will have a realistic idea of the exactness of the technology they intend to produce. For an example of what such a technical specification would, could and should look like, there is no better example than the PACT Technical Specification.

If you clicked on the PACT Technical Spec and were left wondering, “what is going on here?”, then we have arrived at a useful illustration of the difference between the functional specification and the technical specification: the audience and intended readership. If you looked at the technical specification and thought, “I have no clue what this is talking about”, that would be because the intended users are software practitioners. That is the purpose of this process: converting the various outputs into another output that can be understood by the relevant stakeholders of the software development process.

A complete, detailed and easy-to-understand overview of Functional Specifications

If you wish to dig deeper into this, here are links to the entire series of articles written by Spolsky on this topic; they are an excellent supplement to this blog and what I intend to do with this series.

  1. Painless Functional Specifications – Part 1: Why Bother?
  2. Painless Functional Specifications – Part 2: What’s a Spec?
  3. Painless Functional Specifications – Part 3: But… How?
  4. Painless Functional Specifications – Part 4: Tips

Choosing an appropriate methodology for PCF calculation

My intention with this blog series is to detail the steps taken in selecting a common Product Carbon Footprint methodology and producing a format useful for software engineers and developers to develop commodity code from. The goal is a usable functional specification that can be read and understood by anyone with an interest, and that I can then feed into a form of technical specification for the creation and extension of ZeroTwentyFifty’s Product Carbon Footprint solution.

The current state of PCF use and exchange

As a result of the most recent round of conferences such as NY Climate Week, data highlighting trends within the PCF exchange space has been made more readily available, from both solutions providers and PACT itself. Here are some of the realities of PCF use right now across the PACT Network.

PACT Statistics

Some stats and figures:

  • It currently takes months to calculate a PCF, which explains the migration from the synchronous model of PCF exchange to the asynchronous model present in version 2.3 of the technical specification (a conceptual sketch of the difference follows this list).
  • Much of the exchange of PCF data is not happening via API, but in other shared formats. On how long it takes companies to calculate a PCF, there is only anecdotal evidence so far, but typically it is on the order of months. Many companies do not already have a calculation engine and/or process set up; the process usually involves manual calculations via spreadsheets or modelling using tools like Sphere/Gabi, and they are often working with external consultants because they lack the internal expertise or capacity. The goal is to support an ecosystem of solutions that can calculate PCFs more rapidly.
  • Solution providers such as Altruistiq noted that the total number of PCFs calculated within their solution is around 83x the number of PCFs exchanged, indicating that organisations are interested in creating more granular product footprint data.
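To make the synchronous-versus-asynchronous point a little more concrete, here is a rough conceptual sketch in Python. None of the names below come from the PACT Technical Specification; they are placeholders to illustrate why a calculation that takes months forces an “acknowledge now, deliver later” model rather than a single request/response exchange.

```python
# A conceptual sketch only: the function and field names are placeholders,
# not the PACT API.

from dataclasses import dataclass


@dataclass
class FootprintRequest:
    request_id: str
    product_id: str
    callback_url: str  # where the recipient wants the finished PCF delivered


# Synchronous model: only works if a finished PCF already exists somewhere.
def handle_request_sync(request: FootprintRequest, pcf_store: dict) -> dict:
    # If the calculation takes months, there is simply nothing to return yet.
    return pcf_store[request.product_id]


# Asynchronous model: acknowledge the request now, deliver the PCF whenever
# the calculation eventually completes.
pending_requests = []


def handle_request_async(request: FootprintRequest) -> dict:
    pending_requests.append(request)
    return {"status": "accepted", "requestId": request.request_id}


def on_calculation_complete(product_id: str, pcf: dict) -> None:
    for req in [r for r in pending_requests if r.product_id == product_id]:
        print(f"POST {req.callback_url} -> delivering PCF for {req.product_id}")
        pending_requests.remove(req)
```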

The need for realistic PCF data

During the development of ZeroTwentyFifty and its PACT capabilities, it has been a constant battle to create realistic-looking data, because the software does not currently have the creational capabilities of a standalone carbon accounting solution. I’ve gotten around this issue by using fake data generated by hand, untethered to any real product or service. This has been an enduring issue primarily because of the nature of the PACT Network and the underlying Data Exchange Protocol: it has been built with the intention of being an interface to a pre-existing solution, which in our situation would be a fully fledged Carbon Accounting/Product Carbon Footprinting platform or suite with the additional requirement of PCF data exchange across a supply chain.
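For a sense of what that placeholder data looks like, here is a small Python sketch of the kind of hand-rolled stand-in records I mean. The field names loosely echo PCF concepts but are not the PACT data model, and the numbers are random.

```python
# Generates stand-in PCF-like records, untethered to any real product.
# Field names loosely echo PCF concepts; this is not the PACT data model.

import random
import uuid


def fake_pcf(product_name: str) -> dict:
    return {
        "id": str(uuid.uuid4()),
        "productName": product_name,
        "declaredUnit": "kilogram",
        # An entirely made-up emissions figure.
        "pcfExcludingBiogenic": round(random.uniform(0.1, 25.0), 3),
        "referencePeriodStart": "2024-01-01T00:00:00Z",
        "referencePeriodEnd": "2024-12-31T00:00:00Z",
    }


sample_footprints = [fake_pcf(name) for name in ("Widget A", "Widget B", "Widget C")]
```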

I started building ZeroTwentyFifty largely as a way of learning more about and understanding PACT, PCFs, Carbon Accounting and Decarbonisation. When I had something complete regarding the PACT Network, like our PACT Conformance, I figured it could be a useful tool for organisations looking to add PACT Network capabilities to their own software, as it could be used as a proxy or a small service that handled the communication. However, this hypothesis ended up not really working out, I think due to a combination of things: one being that it’s just not enough to use on its own, along with the space being so nascent and communicating this as a feature set being a bit difficult.

Filling a gap in the solution

Over progressive months of conversations with potential investors, customers and users, a common theme has revealed itself: organisations and people actually need to create PCFs. There aren’t large databases of PCFs just sitting around waiting for users to ask for them, which makes sense given the state of the space: it is growing rapidly, but the volumes are still ultimately “low” in the sense of global commerce and how far we have to decarbonise.

It also makes sense that this is the way it currently works because of the migration to the asynchronous method of communication encouraged by recent changes to the specification; this is highlighted fairly clearly in some of the points above.

Choosing between the options

In light of these conversations, it became clear that I needed to support a carbon accounting methodology that could be used to produce data. The GHGP Product Standard is one of the accepted methodologies listed in the cross-sectoral-standards DataAttribute (GHGP Product), and given that GHGP is the most dominant family of methodologies, it made sense to start here, especially given the pre-existing work in PACT.

In the case of the PACT Methodology, there are currently three cross-sectoral standards defined for representing the calculation of results:

  • GHG Protocol Product Standard
  • ISO Standard 14067
  • ISO Standard 14044

And in coming versions, the set of methodologies is being expanded to support the following (a rough sketch of how these values might be modelled in code follows the list):

  • ISO 14067
  • PACT Methodology v1
  • PACT Methodology v2
  • GHG Protocol Product Standard
  • PAS 2050
  • ISO 14040-44
  • PEF
  • Other
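For a rough idea of how these values end up being represented by a developer, here is a minimal Python sketch of an enumeration covering both the current and coming sets. The enum and member names are my own invention; only the standards themselves come from the PACT documentation.

```python
# A minimal sketch of how the cross-sectoral standards listed above might be
# modelled in code. Enum and member names are illustrative, not from PACT.

from enum import Enum


class CrossSectoralStandard(str, Enum):
    GHG_PROTOCOL_PRODUCT = "GHG Protocol Product Standard"
    ISO_14067 = "ISO 14067"
    ISO_14044 = "ISO 14044"
    # Values slated for coming versions:
    PACT_METHODOLOGY_V1 = "PACT Methodology v1"
    PACT_METHODOLOGY_V2 = "PACT Methodology v2"
    PAS_2050 = "PAS 2050"
    ISO_14040_44 = "ISO 14040-44"
    PEF = "PEF"
    OTHER = "Other"
```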

I originally planned to talk about choosing a PCF Methodology in this post; however, the writing became quite large and there were a variety of concepts and thoughts I wanted to address that I didn’t feel I’d do justice to by rushing it out and shortening it. So maybe subscribe to the newsletter so you’ll be alerted when I release it.

In the next part of this series, I will be outlining a process that we (because you’ll be able to replicate it yourself if you feel the need) can use to generate a functional specification through interacting with an LLM, I’ll be covering the basic structure of the process, how we want our inputs and outputs to look and how we define success. If you’d like to be emailed directly with this when it is released, sign up to our newsletter.

If this article has resonated with you, I’d really appreciate you sharing it on whatever platforms you use. Alternatively, you can follow ZeroTwentyFifty or add me on LinkedIn. I release all writing on our free newsletter. You can also book a 30 minute no-obligation call with me.
