Skip to main content

What Can Category Theory Do for Me?

Category Theory is one of the hot topics in computer science. There are many blogs, youtube videos and books about it. It is an elusive subject, with the potential to be the ultimate unifier of everything (math, quantum physics, social justice), to allow us to write the best possible programs and many other lofty goals. I decided to explore the field and see how it can help me. Below is a summary of 12 weeks of reading and watching presentations about Category Theory.

It took me some time to select the study material. In the end I decided to use David Spivak’s book (Category Theory for Sciences, David Spivak, 2014) and the youtube recording of a 2017 workshop. These provided both a rigorous approach and a lighter version in the youtube videos. In parallel we explored other sources for a more practical perspective.

The CTfS book is a systematic introduction to Category Theory, with definitions and proofs. I liked that, it is a solid foundation. The examples are mostly from math (Set, Monoids, Graphs, Ologs, POSet, … ) and they are very dry to me. I had to look in other places for more practical examples.

A practical, engineer’s perspective that I relate to, is in this Topos Institute video: Angeline Aguinaldo: Diary of a software engineer using categories. In the Benefits and Challenges slide Angeline mentioned Graphical Visualization as a benefit. I am a visual person, I like diagrams for everything, but what is a play there? Why are diagrams helping to understand a problem or solution? Because there is a Category of Wire Diagrams with a subcategory of UML Class Diagrams, there is a Category of Java Classes and there is a Functor between them. We can move between these two domains easily because when we look at one we can immediately see the other (Angeline refers to this as “Parallel Universes”).. This works because composition is associative in both - we create bigger diagrams from smaller diagrams by connecting the wires, and similarly we can create complex Java types by associating “smaller” Java classes.

The same video lists a number of challenges in using CT. For example, hard to intuit. Things start easy, with the category of Set - most people understand this, but then there is a category of all small categories (Cat), and there is a category of all the functors between two categories Fun(C,D), and natural transformations are morphisms in Fun(C,D) - one can lost quickly as the abstraction progresses. There is this famous saying: “A monad is just a monoid in the category of endofunctors, what's the problem?”. What does it mean? I had to create a diagram to deconstruct this statement:


While confusing, the strength of the CT comes from the levels of abstractions that maintain some of the basic principles: we can compose functions, but also categories (product) and functors, and natural transformations. This allows us to use the same language at different levels of abstraction. This is (somewhat) similar to the concepts of “meta” in UML: we can have a UML diagram for UML. Also in the Set theory: there is a set, and functions can be seen as sets (f: A -> B is a subset of AxB) and relations are sets, … . I found useful the metaphor that through abstraction a large thing becomes a “point” in the next universe. For example, all the elements of a set become “one point” (i.e. one object) in the category Set. And the category of Set, with all the objects and morphisms inside, becomes a “point” (i.e. one object) in the category Cat. An intuition of this is looking at a world map: a city map will show buildings and streets, but if we zoom out cities become dots and roads between cities appear. Note: this intuition has limitations because CT is more concerned with the streets and roads (arrows) than buildings and cities (objects).

These are the levels of abstraction that I covered (but there are higher levels):
  0 - sets with functions between sets
  1 - the category of Set, Graph, Poset, Monoid
  2 - the category of Categories (Cat), Functors
  3 - Natural Transformations, Adjunctions

Another point from Angeline’s video is that CT can help to formalize Cognitive Activities. Functors allow us to map solutions between problem spaces: Word Problems -> Equations, Physical World -> Equations (isn’t this what physics is?), UML Diagrams -> Java Code, Servers -> Infrastructure Diagrams. Some of these are isomorphisms, some are just one-way. I think that Natural Transformations will be something like - the rules for mapping UML diagrams to Java classes can also be used to map UML Diagrams to ER-Diagrams to Database Schemas.

Conclusion

After spending some time studying Category Theory, I see three areas where I can use it. In my personal life, thinking about things, CT can provide a way to recognize patterns, formulate arguments as “this is the same like saying”, recognizing inconsistencies. I think that the ability to see that two things that look very differently are in fact similar is a strong indicator of (intellectual) intelligence. Eugenia Cheng has books and videos in this direction.

In my professional life, as a solution architect, I often move up and down between levels of abstraction. CT offers a framework to be explicit and precise about what level of abstraction is a conversation, document or diagram. It also allows me to move horizontally, between domains that are at the same level of abstraction using “design patterns.” Further reading: John Baez.

Finally, as a developer, the idea is that by moving some logic at a higher level of abstraction we can organize the code to be more readable, reusable and easier to test. Recent extensions of OO languages - like Java Optional, TypeScript async-await, React useEffect, useState - are applications of CT in the category of Abstract Data Types. Having an introduction in CT helps better understand the reasons and the ways to use these extensions. Further reading: pointfree.co.

Notes

I find it hard to provide examples for concepts in CT. This is because in examples we already select a category and it seems as if we are expressing concepts in that problem space rather than CT. For me, the categories that I know are: OO classes (loosely isomorphic with Hask, Abstract Data Types), UML Diagrams (a subcategory of Wire Diagrams), Equations. Inevitably it will seem that I am talking inside these categories rather than across categories. This is a problem with other sources as well. David Spivak’s book takes all the examples from Math which look very dry. The Computer Science examples are from Haskell or Scala and they seem very specific to that domain. In one of Spivak’s videos a person from the audience asks, can you give us an example of X and DS says: yes, here is a POSet, and the guy asks, can’t you give an example with cars and trucks and D says, no, those are not categories!

Popular posts from this blog

Performance Testing a New CRM

Performance testing  is challenging, frustrating, often underestimated typically getting attention only after an incident. How did we the performance test and what did we learn during the development and implementation of  web-services for a new CRM system?

On Defining Messages

“Defining Message Formats” is the title of a message posted on the Service Oriented Architecture mailing list [1]  which attracted a lot of attention.  The post summarizes  the dilemma faced by solution architects when they have to define a service interface: 1. Base the internal message format on the external message standard (MISMO for this industry). This message set is bloated, but is mature, supports most areas of the business, and is extensible for attributes specific to this company and its processes. 2. Create an XML-based message set based on the company's enterprise data model, which the company has invested a great amount of money into, and includes a very high percentage of all attributes needed in the business. In a nutshell, we've generated an XML Schema from ER Studio, and will tinker with that construct types that define the payloads for messages. 3. Use MISMO mainly for its entity definitions, but simplify the structure to improve usability. We benefit from the