BindX issues and Jam vulnerabilities :roll_of_paper:

If there's one lesson to be learned from issue #502, in my opinion, it's that the `bindrule` function was used naively, and that's because it was designed in an irrational way.

Let me preface this by saying that I agree with Bjarne Stroustrup on one thing: I don't like the semantics of `operator[]` in the standard library's associative containers, specifically that when a key doesn't exist, an element is created in the container anyway, just so that a valid reference can be returned. This behavior, which was chosen for efficiency reasons, is often a source of problems that are difficult to detect.
b2 uses the same paradigm (once very popular) in several functions: `bindmodule`, `bindtarget`, and `bindrule`. When any of these functions doesn't find the requested object, it creates a new one and returns it.
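To make the `operator[]` behavior concrete, here is a minimal sketch using `std::map`. The two helper names are made up for illustration; only the container calls are standard.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Looks a key up with operator[]: a missing key is value-initialized
// and inserted as a side effect of the lookup.
std::size_t size_after_subscript(std::map<std::string, int>& m,
                                 const std::string& key)
{
    (void)m[key];
    return m.size();
}

// Looks a key up with find(): a missing key leaves the container untouched.
std::size_t size_after_find(const std::map<std::string, int>& m,
                            const std::string& key)
{
    (void)m.find(key);
    return m.size();
}
```

Calling `size_after_find` on an empty map leaves it empty, while `size_after_subscript` with the same key grows it by one element.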

In b2, a search-only interface was never implemented. I added the `find_module` and `find_target` functions in #559.
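The difference between the two kinds of interface can be sketched as follows. This is not b2's code: `Object`, `Registry`, `bind` and `find` are hypothetical names, and the sketch uses `std::unordered_map` where b2 uses its own hash tables.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an engine object such as a module or a target.
struct Object { std::string name; };

// Hypothetical registry; b2 keeps its objects in its own hash tables.
class Registry {
public:
    // bindX-style: always returns an object, creating it when it is missing.
    Object& bind(const std::string& name) {
        std::unique_ptr<Object>& slot = table_[name];
        if (!slot) slot.reset(new Object{name});
        return *slot;
    }

    // find_X-style: never creates anything; returns nullptr when missing.
    Object* find(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : it->second.get();
    }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_map<std::string, std::unique_ptr<Object>> table_;
};
```

With this split, a caller that only wants to query uses `find` and must handle the `nullptr` case, while `bind` is reserved for places where creation on demand is intended.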

In code where only a lookup is needed, the call to `hash_find` is hardwired to the relevant hash table (`grep hash_find *.cpp | wc -l` currently reports 18).

The main problem with bindX functions isn't how they're written but how/where they're used.

All built-in rules are called by passing them data from user scripts, which can therefore create (more or less unintentionally) objects in the process's memory given the lack of any checks! It may seem like the usual problem of not trusting data coming from the user, but the issue is more subtle.

I won't list the dangerous rules, but as a demonstration, here's a script that will quickly run out of memory, assuming the operating system doesn't somehow limit the amount of memory a process can use. Try monitoring it while it runs.

```
#|
bomb.jam

use with
b2 -f bomb.jam

|#

local i = 0 ;
while forever
{
    i = [ CALC $(i) + 1 ] ;
    ECHO $(i) ;

    # NOTE: creates a new module on each call regardless
    #       of whether the module exists, this will run out of memory
    #       unless limited in some way by the operating system
    IMPORT $(i) ;

    # NOTE: creates a new target on each call regardless
    #       of whether the target exists, this will run out of memory
    #       unless limited in some way by the operating system
    NOCARE $(i) ;
}
```

Even more important than the vulnerability, which probably no one cares about, is the fact that this behavior of bindX can help hide design errors. When an object that was expected to exist is missing and bindX synthesizes it instead, there is no guarantee that subsequent processing will produce an error, as fortunately happened in the case of issue #502.

Unfortunately, precisely because of the semantics of the functions, it's impossible to consider modifying them, for example, to issue a warning every time an object is not found, so as to check for unexpected cases that no one has noticed until now, which probably represent errors.

In fact, we must assume that all calls to bindX are legitimate, meaning that whoever wrote them knew what they were doing. For example, during jamfile parsing, when b2 is building the dependency graph, it's natural for the objects not to exist yet, and when they do exist, it's correct to retrieve them with a bindX. I've also seen calls to `bindtarget` that, during the subsequent update phase, look for a semaphore that doesn't necessarily exist and should be created on demand <sup>[1]</sup>.

But in the general case, who can assure us that all the objects requested by bindX must be found? Or that if they aren't found, it's correct to synthesize new ones?

**Nobody, and the validity of every single call must be verified one by one! This is the main drawback of using functions like bindX.**

Conversely, a call to find_X makes the caller's expectation explicit: the object may be missing, and the caller must handle that case.
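As an illustration of how a bind-style query can hide an error that a find-style query exposes, consider this minimal sketch. The `Flags` table and both function names are hypothetical and unrelated to b2's actual data structures.

```cpp
#include <map>
#include <string>

// Hypothetical table: target name -> "has been built" flag.
using Flags = std::map<std::string, bool>;

// bindX-style query: a missing name is synthesized with a default value,
// so the caller cannot tell "not built" apart from "never declared".
bool is_built_bind(Flags& flags, const std::string& name)
{
    return flags[name];
}

// find_X-style query: a missing name is reported as nullptr,
// so the caller has to decide what absence means.
const bool* is_built_find(const Flags& flags, const std::string& name)
{
    auto it = flags.find(name);
    return it == flags.end() ? nullptr : &it->second;
}
```

If a caller misspells a name, `is_built_bind` quietly answers `false` and leaves a spurious entry behind, whereas `is_built_find` returns `nullptr` and the mistake can be reported.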

To eliminate the Jam vulnerability, suppose we now want to modify the builtin rule implementations to replace bindX calls with find_X calls and issue warnings when the requested objects are not found.

Let's take for example one of the many built-in rules, `IMPORT`.
Currently, if `IMPORT` is asked to import from/to a nonexistent module, it has no problem calling `bindmodule` on all the named modules (silently creating all the nonexistent ones), but if any of the rules it is asked to import is not found, it considers this an error and terminates the script. Obviously, it would be more correct to raise the error earlier, as soon as a module turns out to be missing, but this could cause backward compatibility issues, so we will have to make do with issuing a warning.
And even if we decided to only issue a warning when a module does not exist, we would have to limit ourselves to checking only the source module. This is because some of the Jam code accumulated over time now relies on the possibility of creating modules in this way, such as `tools/types/register.jam`, which creates a module for each type registered in the same directory (currently 18) with nothing more than an `IMPORT`:
```
# A loop over all modules in this directory
for m in $(.sibling-modules)
{
    m = [ path.basename $(m) ] ;
    m = types/$(m) ;

    # Inject the type rule into the new module
    IMPORT $(__name__) : type : $(m:B) : type ;
    import $(m) ;
}
```
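The policy described above (warn about a missing source module, keep creating modules for backward compatibility, never complain about a new target module) could look roughly like this. It is a sketch under stated assumptions, not b2's `IMPORT` implementation: `ModuleSet`, `import_sketch` and the warning text are all hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical set of the names of modules that currently exist.
using ModuleSet = std::set<std::string>;

// Hypothetical sketch, not b2's IMPORT code: warn when the source module
// is missing, but still create it afterwards so existing scripts keep
// working; the target module is created silently.
std::vector<std::string> import_sketch(ModuleSet& modules,
                                       const std::string& source,
                                       const std::string& target)
{
    std::vector<std::string> warnings;
    if (modules.find(source) == modules.end())  // find_X-style check
        warnings.push_back("warning: IMPORT from unknown module '" + source + "'");
    modules.insert(source);  // bindX-style behavior is preserved
    modules.insert(target);  // target modules may be new by design
    return warnings;
}
```

Importing from an existing module into a new one yields no warnings, which keeps scripts like `register.jam` quiet, while importing from a module that was never defined yields exactly one.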


##### [1]
See the implementation of `OPT_SEMAPHORE` in [`make.cpp`](https://github.com/bfgroup/b2/blob/995da7c32f17ca22d7fb9f1b51ca0cb8d17468c7/src/engine/make.cpp#L365).

BindX issues and Jam vulnerabilities 🧻 #565
