Metaspex, the Holy Grail: Definitive Answers to the Hardest Software Problems

The Clay Mathematics Institute has published a series of unresolved mathematical conjectures called the Millennium Problems. Whether each conjecture is true or not constitutes an independent open problem. These problems are reputed to be both the most significant and the hardest ones that mathematics is currently facing. Unlocking them would translate into massive breakthroughs with important ramifications. There are 7 of them, and so far only one has been solved: the Poincaré conjecture. It is not yet known whether the other problems will ever be answered.

Similarly, 8 key Software problems have obsessed Computer Scientists for quite a while now. All of them remain unresolved to this date, and finding a solution to them would shake up the entire Technology Industry, as many people now believe they have no solution.

Metaspex provides a striking, original, and possibly unique answer to each of them.

As a homage to Douglas Adams’ The Hitchhiker’s Guide to the Galaxy, Metaspex makes extensive use of the prefix “hx2a”: 2a is the hexadecimal value of 42, the answer given in the Hitchhiker’s Guide to the Galaxy to the Ultimate Question of Life, the Universe, and Everything. For Software, these are the 8 questions.

So, what are these 8 questions? Explaining them is the object of this page. Giving the actual answers is beyond the scope of this small text, but we at Metaspex will be happy to explain them in detail should you be interested. Most of these answers rely on a great deal of (yet unpublished) science, and even if you were to look at Metaspex’s internal code, it would not clarify why it works. Only theorems do.

Here are the 8 ultimate millennium questions:

1. How can we automatically produce modern online Enterprise Applications from a high level of abstraction, with better performance than hand-written code?

2. How can we automatically and efficiently manage the lifecycle of objects (memory chunks), with an immediate-destruction guarantee and without forbidding cycles?

3. What information modeling technique should be used to replace the relational model and unleash the object-oriented approach in Enterprise Applications?

4. How can we eliminate SQL from online Enterprise Applications and replace it with a more efficient and intuitive rule-based system?

5. How can we define and manage general information referential integrity, and check it and maintain it automatically?

6. How can we make Enterprise Applications highly serviceable, including around data model customizations?

7. How can we engineer reusable Enterprise Application business components, so that we can put together a components catalog for each industry and build solutions simply by selecting from it?

8. How can we specify, in a rule-based manner, multiple complex structured representations of information, and recalculate them incrementally?

Metaspex gives an answer to each of these questions. The answer in each case is not only surprising and enlightening, it might very well be the only answer. The reason is that these questions are so incredibly hard that there is possibly only one answer to each of them. Coming up with these answers required thinking creatively and independently, sometimes writing off decades of accepted habits.

Let’s have a look at these questions.

1. How can we automatically produce modern online Enterprise Applications from a high level of abstraction, with better performance than hand-written code?

4GLs have tried. Flow Charts of the 1970s and 1980s have tried. UML has tried hard. Design Patterns have tried. All have failed. Each of these has left some legacy, sometimes a great one (UML and Design Patterns for instance, in distinct ways, interestingly enough with Grady Booch present in both, respectively as a co-inventor of UML and as the author of the foreword of the Gang of Four book).
Every time, the goal was to be concise, to avoid unnecessary complications, and to let the machine perform the tedious implementation. Except for Design Patterns, which became a very useful low-level tool for developers, all the other attempts either yielded very low-quality output, slow and clunky, or made it impossible to produce an application with the expected level of fidelity (e.g. with UML). Why so?
This is because the path to conciseness is relatively easy, but being precise at the same time makes it extremely difficult. Being concise and precise at the same time is key. When conciseness was achieved, precision was lost. And when precision was pursued, things got complicated and conciseness was lost. The solution, if it existed, was proving very elusive.
The same thing happened with performance: the more concise the input, the more sluggish the automatic implementation became. There is actually a little-known and very deep theorem explaining why this is the case: Rice’s Theorem. At the risk of oversimplifying it, let’s say that it is impossible for a machine which automatically implements a high-level specification to “invent” what is not already explicitly present in it. This sheds light, in particular, on why some programming languages are intrinsically and hopelessly slow, and others faster at implementing some algorithms. Information is missing, and Rice’s Theorem (1951), which has a brilliant lineage from Georg Cantor to Albert Einstein‘s friend Kurt Gödel and to the father of computing, Alan Turing, shows that there is no way to overcome this. Dreaming of fighting theorems is, as we know, a futile endeavor.
So what does it take to follow the narrow and fragile path through conciseness and precision towards performance? It requires creating the proper abstractions that allow the compiler to find in its input everything it needs to perform implementation tasks perfectly.
Metaspex comes with these abstractions, which are new, original, and powerful.
But how is it possible that a compiler can take a technical specification and beat a programmer at his own game? Well, machines are already quite capable of literally beating human beings at their own games of Chess or Go. What it took to achieve this were breakthroughs in algorithmics. And although many were skeptical that it would ever happen, it did happen.
In the field of Software programming we are actually all aware of examples of machines doing a better job than programmers. There is no reason to believe in an absolute Abstraction Penalty. You want an example? Modern C compilers give excellent ones. Today, in the overwhelming majority of cases (well beyond 99.999% of them), a C compiler does a much better job of writing assembler code than a professional assembler programmer given a reasonable amount of time and quality objectives. Why? It is easy to understand: a C compiler has “factored in” innumerable optimization techniques, most of which are completely inhuman, and beyond the capabilities of a programmer to apply systematically and accurately. In contrast to a human being, a machine is restless, tireless, accurate, and does not care whether the assembler it produces can be read or maintained. As a result it produces assembler code that no human being would write by hand. Think of Tetris: it is the machine that identifies patterns and “collapses” them. A human being would not do that reliably over and over again. A machine does.
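To make this concrete, here is a tiny illustration in plain C++ (it is not Metaspex code, and the function name is ours). A naive loop like the one below is typically recognized by mainstream optimizing compilers such as GCC or Clang at -O2, which replace it with a closed-form computation roughly equivalent to n * (n - 1) / 2, producing assembler that few programmers would think of writing by hand from this source:

    // Naive loop summing the integers 0 .. n-1.
    // Mainstream optimizing compilers typically spot the idiom and emit a
    // closed-form computation instead of a loop.
    unsigned long long sum_to(unsigned long long n) {
        unsigned long long total = 0;
        for (unsigned long long i = 0; i < n; ++i)
            total += i;
        return total;
    }
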
Keep in mind that the goal of the folks who invented C was not to make C capable of describing anything an assembler programmer might write. It is not about imitating human work, it is about being superhuman. So when Metaspex does a better job than a programmer, it does not mean it does the exact same job as a programmer, of course. It does it differently, and better. Admittedly, the level of abstraction granted by the C programming language is low. But it shows that it is quite possible to increase abstraction and get better performance at the same time. There is no curse.
So there is no limit to stacking abstractions on top of one another, while being very careful that each of them can be properly implemented by a compiler. Stack a sufficient number of innovative abstractions, and inexorably you reach a specification-level of abstraction. And if you were careful, that specification is both concise and precise, while capable of being implemented automatically far better than what a human being would do.

2. How can we automatically and efficiently manage the lifecycle of objects (memory chunks), with an immediate-destruction guarantee and without forbidding cycles?

The Software Industry has struggled and continues to struggle with that excruciating problem. Managing object destruction explicitly is known to be error-prone, and to lead to bugs which are extremely difficult to find, surpassed in this domain only by multithreading issues. For a long time, practitioners were divided between those fond of garbage collectors and those skeptical about them, finding them costly, lazy, and an utterly imperfect solution to the problem. Initially, Java was promoted with its garbage collector as the solution to the problem, only to let programmers discover that it did not take care of ever-growing memory leaks in global datastructures. When programmers were performance-constrained, they were told that they were better off not using it, and avoiding object creation by pooling objects in a poor man’s, naive implementation of the lower levels of malloc (a technique long known in the Lisp community as the “no-cons” advice, contradicting the virtues of garbage collection that contributed to bringing people in in the first place). On top of that, garbage collection offers no guarantee whatsoever about when an object is destroyed, and in software stress situations it requires cumbersome if not esoteric tuning.
For a long while, it seemed that we were condemned either to the evils of garbage collection and its essentially blind navigation of the heap, hampering vertical (multi-processor and multi-core) scalability, or to manual memory management and its stressful threats to software quality.
Interestingly enough, C# made a number of breakthroughs in its CLR virtual machine, mixing garbage collection with reference-counting. This feature is known to have contributed (alongside its generics implementation doing reification like C++ rather than type erasure like Java) to its somewhat better performance.
For a long time, reference counting was considered evil, as it does not cope well with isolated reference cycles in the heap: objects end up uncollected in a fatal deadlock, none of the objects involved in the cycle being willing to set the next one free.
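A minimal C++ sketch of the problem (again, not Metaspex code): two shared_ptr objects pointing at each other form an isolated cycle whose reference counts never drop to zero, and the classic workaround is to demote one side to a non-owning weak_ptr, i.e. the programmer has to decide by hand where the cycle is broken.

    #include <memory>

    struct Node {
        std::shared_ptr<Node> next;   // owning reference
        std::weak_ptr<Node>   owner;  // non-owning back reference: breaks the cycle
    };

    int main() {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next  = b;   // a owns b
        b->owner = a;   // b merely observes a; had this been a shared_ptr,
                        // a and b would form a cycle and leak
        return 0;       // both nodes are destroyed when a and b go out of scope
    }
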
Today, to improve performance, the Rust programming language has surrendered completely to reference counting, leaving garbage collection behind and simply telling users to avoid reference cycles. Alas, cycles are part of life, and this choice is very restrictive.
So we are now caught between a rock and a hard place, with no satisfactory solution. Either software is slow and automatic object destruction offers no guarantee, or we are prevented from describing the world with its natural cycles.
Metaspex proposes a subtle solution in between these two techniques, which is far more efficient than garbage collection, and even more efficient than the restrictive approach proposed by Rust. It does not use any garbage collection, and does not prevent cycles from forming.
The secret comes from referential integrity, which gives automatic object lifecycle management clear guidance on what to do, and breaks cycles in an intuitive and logical manner whenever needed. This more refined solution results from the fact that referential integrity operates at an abstraction level much higher than the mere memory-pointing techniques used by all programming languages, which have barely evolved from machine language, adding merely a type, and sometimes polymorphism. Metaspex is much more precise, and from concise input from the user, derives automatic referential integrity.

3. What information modeling technique should be used to replace the relational model and unleash the object-oriented approach in Enterprise Applications?

If you are developing an enterprise application today, chances are that you either use a relational database system and its associated schemas, or a schema-less database such as a modern document database. In the first case you use what is called a “relational model”, with schemas that can be derived from an entity-relationship diagram (as proposed by Peter Chen in 1976). In the second case you are left with mapping your own business model to JSON or JSON-like tree datastructures.
The issue with the relational model is that it is a closed one, insulated from whatever you describe with your programming language (which is probably object-oriented). Maintaining a mapping is costly, error-prone, and shocking from a design standpoint. On top of that, relational databases are known to struggle with horizontal scalability, as they have cultivated an obsession with consistency guarantees, at the expense of the more useful (and, for enterprise applications, more revenue-friendly) properties of scalability, availability, and resilience. Additionally, every attempt so far to implement an Object-Relational Mapping automatically has failed miserably, as such mappings are unable to efficiently reconcile the object-oriented world in which we program with the relational world, which insists on its quirky ways.
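As a rough, plain C++ illustration of that impedance mismatch (the class and table names are ours, purely hypothetical): an object that naturally nests its parts has to be flattened into several tables related by foreign keys, and reassembled with joins and hand-maintained (or ORM-generated) mapping code every time it crosses the boundary.

    #include <string>
    #include <vector>

    // In the object-oriented world, an order simply contains its lines.
    struct OrderLine { std::string sku; int quantity; };
    struct Order     { std::string id; std::vector<OrderLine> lines; };

    // A relational schema flattens the same information into two tables
    // related by a foreign key, e.g.:
    //
    //   CREATE TABLE orders      (id TEXT PRIMARY KEY);
    //   CREATE TABLE order_lines (order_id TEXT REFERENCES orders(id),
    //                             sku TEXT, quantity INT);
    //
    // Loading an Order back means a join (or two queries) plus mapping code
    // that has to be kept in sync with both the classes and the schema.
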
The issue with the document model is that it provides very little help. Document databases offer very nice distributed-filesystem facilities with additional indexing, but they offer only that. They are extremely good at doing this, but they fail at offering any kind of modeling. MongoDB for instance offers its DBRef, which is affected by very stringent restrictions, and which is barely used by MongoDB itself. This weakness can be perceived when looking at relational database schemas: the relationships described in a schema are completely absent from document databases, which only offer intra-document relationships, which the relational model in turn ignores completely.
Metaspex combines both harmoniously in a more sophisticated, unified approach. It offers a wide portfolio of pre-programmed, original macro-patterns to formalize a domain’s datamodel. Metaspex calls this an Ontology, a term which originates in Philosophy and which found its way into Information Science and more specifically Artificial Intelligence. When deeply rethinking the way to approach information modeling, and parting ways with the vague and flimsy entity-relationship diagrams and their relational implementation, what can be better than adopting long-known knowledge-acquisition techniques? Metaspex pushes these concepts several notches further, offering to structure information along the long-known lines of syntax and semantics. These are well grounded in science, and predate by a long stretch the relational model, entity-relationship diagrams, and even the object-oriented approach (while being in complete harmony with the latter).

4. How can we eliminate SQL from online Enterprise Applications and replace it with a more efficient and intuitive rule-based system?

SQL is a programming language. It falls short of being fully Turing-complete by not including loops. It smells of COBOL, and operates by cross-cutting the entire datamodel. Evolving the datamodel means, most of the time, evolving a lot of the hard-to-read, hard-to-maintain, and hard-to-tune SQL snippets scattered across the application. Sometimes these snippets grow uncontrolled, and reach multiple horrendous pages in size. On top of that, using an object-oriented language means coping with clunky SQL statements which create a difficult impedance issue.
All this means that reaching a high level of abstraction prohibits the use of SQL. Alas, to date, no replacement has been proposed, and we are carrying the burden of that ancient beast.
From a performance standpoint, practitioners know that you need to fight SQL to make it perform efficiently. It is a never-ending, exhausting struggle. You give it hints, and pray that they are going to be followed. SQL is a 4GL which propagated everywhere, like a virus, and it is affected by the same issues as other 4GLs: poor performance, clunkiness, and great sensitivity to datamodel evolutions. It is quite clear that something is wrong, and that SQL is not addressing the problem the right way.
To fix this once and for all, Metaspex proposes a drastically different model, based on adding Semantics to Ontologies. Metaspex Semantic Rules are attached to the types of the Ontology (the syntax); they are purely rule-based, completely object-oriented, compiled, and verified. They execute automatically and incrementally, and business analysts can review and validate them. From a performance standpoint they outperform SQL, while not being affected by SQL’s opacity.
Metaspex Semantic Rules can even perform some joins in constant time!
Freed from SQL, Metaspex technical specifications can truly be called specifications, and be concise, while being easy to read and to understand, including by a business analyst.
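Metaspex’s actual rule syntax is not shown here; the following plain C++ sketch (with hypothetical names) only illustrates the general idea behind such rules: a derived value is maintained incrementally at each mutation, instead of being recomputed from scratch by something like SELECT SUM(amount) FROM invoice_lines WHERE invoice_id = ? on every read.

    #include <cstddef>
    #include <vector>

    // Hypothetical illustration of an incrementally maintained derived value.
    class Invoice {
    public:
        void add_line(double amount) {
            lines_.push_back(amount);
            total_ += amount;                 // incremental update: O(1) per change
        }
        void remove_line(std::size_t i) {
            total_ -= lines_[i];              // still O(1), no full rescan
            lines_.erase(lines_.begin() + static_cast<std::ptrdiff_t>(i));
        }
        double total() const { return total_; }
    private:
        std::vector<double> lines_;
        double total_ = 0.0;                  // the derived, rule-maintained value
    };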

5. How can we define and manage general information referential integrity, and check it and maintain it automatically?

Relational databases have claimed the “referential integrity” expression for far too long. With their convoluted constraints (which very few people dare to use in real-world applications), added to their very “soft” schemas (graph databases exhibit the same “spineless” data-modeling issue), they achieve some level of information integrity in an awkward manner. However, they fuse together two distinct levels of abstraction (the physical lifecycle of information and its logical maintenance), creating a great deal of confusion, and making their poor man’s version of referential integrity impractical.
Metaspex requires no more work from the user than specifying their ontology in order to perform a wide superset of what relational databases call referential integrity. Referential integrity checks and automatic maintenance are performed by Metaspex without intervention. Referential integrity operates at a logical level of abstraction, maintaining documents in a proper state, conforming to the specification. The physical lifecycle of objects, and in particular their automatic destruction, is handled at a distinct, lower level of abstraction, avoiding the design mistake that RDBMSs made.
It is thanks to its carefully stacked levels of abstraction, all captured in pre-programmed macro patterns, that Metaspex is capable of implementing real referential integrity: across documents, across database buckets, across database instances, across distinct database products (Couchbase and CouchDB for instance), across machines, and even across datacenters a world apart. This is truly revolutionary. Referential integrity is at last open, and not subject to clunky techniques within a closed relational database system.
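To fix ideas, here is a deliberately crude, hand-rolled C++ sketch of what “checking and maintaining” referential integrity can mean on plain data (the names and the cascade-delete policy are ours; in Metaspex this behavior is meant to be derived automatically from the ontology rather than written by hand):

    #include <algorithm>
    #include <string>
    #include <unordered_set>
    #include <vector>

    struct Order { std::string id; std::string customer_id; };

    // Check: does every order refer to an existing customer?
    bool check_integrity(const std::unordered_set<std::string>& customers,
                         const std::vector<Order>& orders) {
        for (const auto& o : orders)
            if (customers.count(o.customer_id) == 0) return false;
        return true;
    }

    // Maintain: repair the model by dropping orders whose customer has been
    // deleted (one possible policy among several).
    void maintain_integrity(const std::unordered_set<std::string>& customers,
                            std::vector<Order>& orders) {
        orders.erase(std::remove_if(orders.begin(), orders.end(),
                         [&](const Order& o) {
                             return customers.count(o.customer_id) == 0;
                         }),
                     orders.end());
    }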

6. How can we make Enterprise Applications highly serviceable, including around data model customizations?

Software vendors (whether traditional ones with customer-side implementations or modern SaaS ones) are very familiar with the issue. They build a product, sometimes hand-in-hand with a first customer; the idiosyncrasies of that customer are factored into the software (at the process and datamodel levels), and when a new prospect is identified, large chunks of the product do not fit and need to be reworked. The vendor is left to choose between three evils:

  • Force-feeding the product with the new requirements, and making a new version of the product combining incompatible old and new requirements. Unavoidably, this leads to iterating, customer after customer, and making the product gradually more complex, until it reaches a situation where it cannot grow, cannot be evolved, and dies of featuritis. Expert systems help a little in postponing the inevitable, but they are only brief relief. More standardized domains (e.g. the Airline Industry) also delay the inevitable a bit. The product ends up bloated, with a human wave of programmers struggling just to keep it afloat.
  • Force-feeding the customers into the existing product, and keeping on saying no to their requests. Customers are upset, and the vendor runs the risk of seeing them defect to the competition. Nothing is learnt from additional customers that could feed the evolution of the product.
  • Implementing the new features for the prospect while lying to them, hiding the fact that the code has actually been copied and that a new, irreconciled branch of the code is being implemented for them. When the prospect, now a customer, realizes that they do not benefit from the new features offered to other customers, the vendor has to spill the beans, reveal the secret, and ask for a hefty amount of money to carry out the very costly reconciliation project. The vendor has to pray that the customer, being tied into the product, will stay and forgive.

All this is awful; none of these options is satisfactory. You would like to be able to implement a product with variations from customer to customer at a very deep level, touching not only processes, lightly or deeply, but even the datamodel itself. You would like to be able to maintain both the product and the implementations completely independently, while retaining the link between the product and its implementations. New features added to the product (community functionalities) must make their way automatically to existing customers. Behaviors, down to datamodel specializations, must still be applicable, or reworkable at minimal cost, when a new version of the product is released.
Philosophy has long been debating the problem; it is not specific to the Software Industry. The precious link between specific implementations and the product is called by Philosophy supervenience. In philosophical terms, then, the issue with the three options above is that no supervenience is preserved between the customizations and the product: either the customizations do not exist, or supervenience is severed upfront.
Here also, Metaspex proposes a more sophisticated solution than the three crude ones listed above. Metaspex allows supervenience to be preserved, down to datamodel customizations. To do that, Metaspex uses very sophisticated methods, combining metaprogramming and object-oriented programming in a unique fashion. Metaspex allows the creation of meta-ontologies and polymorphic ones, or a mix of both. At the same time, the good news is that there is no need whatsoever to understand that level of complexity. Metaspex hides it safely inside its macro patterns, and the complexity does not seep through to users. Customizations are simple and intuitive, all operating at a high level of abstraction. No need to resort to chisel and hammer. That technique, preserving supervenience, is possibly the only proper answer to this difficult Millennium problem, and Metaspex is the only system that implements it.
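The real mechanism combines metaprogramming with object orientation and is hidden inside Metaspex’s macro patterns; as a deliberately simplified analogy in plain C++ (hypothetical names), object-oriented specialization alone already shows the kind of link one wants to preserve: the customer-specific model extends the product model instead of forking it, so a feature later added to the product flows to the customization without any reconciliation work.

    #include <string>

    // Product-level model, maintained by the vendor.
    struct ProductInvoice {
        std::string id;
        double amount = 0.0;
        // A "community feature" added in a later product release; it appears
        // here once and is inherited by every customization below.
        std::string currency = "USD";
    };

    // Customer-specific model, maintained separately: it extends the product
    // model rather than copying it, so the link between the customization and
    // the product (the supervenience) is never severed.
    struct AcmeInvoice : ProductInvoice {
        std::string acme_cost_center;   // customer-specific datamodel extension
    };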

7. How can we engineer reusable Enterprise Application business components, so that we can put together a components catalog for each industry and build solutions simply by selecting from it?

Business-oriented components catalogs have been a dream since the 1980s. A components catalog for Telecoms Applications, for Travel, for ERP, or for any domain-specific Enterprise Application would be extremely useful. Building applications would merely require finding the set of components needed and assembling them in a straightforward manner, quickly producing sophisticated applications. This is what Integrated Circuits have delivered to the Electronics Industry since the late 1960s, with Apollo 11’s guidance computer being one of their little-known and spectacular early achievements.
The lack of appropriate abstractions (for instance the absence, so far, of a technique allowing us to part ways with SQL, which drags abstraction down, or of flexible frameworks allowing very high serviceability) has prevented the Software Industry from achieving this dream.
Metaspex’s ontologies allow components catalogs to exist. The combination of all the previous solutions to the “Millenium” problems opens up this new horizon. The Software Integrated Circuits are born.

8. How can we specify, in a rule-based manner, multiple complex structured representations of information, and recalculate them incrementally?

Software developers have been familiar with the Model-View-Controller design pattern since its inception in the 1970s. It helps structure a software program’s architecture by distinguishing a model from its view(s). But MVC fails to capture the subtleties of the dependencies between a model and its views. Models and views are structured pieces of information, and these structures are interdependent. Modifying the model (nowadays also known as the “document”) should trigger the modification of all its views, themselves aptly described as documents. View updates should be incremental, meaning that limited modifications of the model should trigger limited and minimal modifications of each of its views.
If we want to stay at a very high level of abstraction, the rules governing the way views are calculated and recalculated based on modifications of the model they represent should be declarative.
There are striking parallels between this goal and what Relational Databases call “materialized views”. Unfortunately, Relational Databases do not offer any powerful way to describe datamodels, and even less so declarative rules, capable of incrementality, tying relational data to their materialized views.
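As a generic illustration of incremental view maintenance, outside Metaspex and without its declarative rules (the class names are ours): the model notifies its view of the smallest possible change, and the view updates only the affected part instead of being rebuilt from scratch.

    #include <string>
    #include <vector>

    // A view that is patched row by row rather than re-rendered entirely.
    class ListView {
    public:
        void on_item_added(const std::string& item) {
            rendered_rows_.push_back("<li>" + item + "</li>");   // touch one row only
        }
        const std::vector<std::string>& rows() const { return rendered_rows_; }
    private:
        std::vector<std::string> rendered_rows_;
    };

    // The model pushes minimal change notifications to its view(s).
    class ListModel {
    public:
        explicit ListModel(ListView& view) : view_(view) {}
        void add(const std::string& item) {
            items_.push_back(item);
            view_.on_item_added(item);    // incremental, not a full refresh
        }
    private:
        std::vector<std::string> items_;
        ListView& view_;
    };
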
Metaspex treats views and models in the same uniform way. Domain-specific types are described in ontologies, and views follow the same principle: views are documents which follow the exact same modeling techniques. The link between the two, which articulates the way views are calculated and, whenever necessary, recalculated, is expressed using semantic rules. In essence, Metaspex says that representing information is a semantic calculation.
As views are ontologies, it is possible to build portfolios of reusable views that can be leveraged whenever a document needs to be represented. As an illustration, Metaspex’s built-in Foundation Ontology offers a “boxes” ontology which closely resembles CSS boxes, with automatically calculated position, height, and width. Specifying how a document is represented as boxes is a matter of laying out a few semantic rules. The geometrical properties of boxes are themselves semantic attributes, which are also incrementally recalculated whenever the underlying document is updated.
A given document can have multiple views belonging to as many representation ontologies as needed, all at the same time. Whenever the document is modified, each view is updated incrementally.
Multiple views can materialize in the GUI of a workstation, across workstations, or even just for the use of different systems without any GUI whatsoever. Or any combination of these…
Philosophically, what Metaspex does is distinguish between information’s noumenon (the “thing in itself”) and its phenomena (the views), and add on top that the relationship between the noumenon and its phenomena is of a semantic nature.
Of course, Metaspex does not place any limit on the number of views for a given document, nor any limit on the ability to have views on views, with unlimited numbers of semantic layers stacked on top of each other, always preserving incrementality.