Re: [tsf-dev] TSF process feedback

Re: [tsf-dev] TSF process feedback - part 1

From: Derek M Jones <derek@xxxxxxxxxxxx>
Date: Wed, 29 Apr 2026 16:05:59 +0100
Delivered-to: tsf-dev@xxxxxxxxxxx
Organization: Knowledge Software, Ltd
User-agent: Mozilla Thunderbird

Nathan,

> An estimate of a claim's completeness does exist in the scoring algorithm, there is just no way to express it on the
> nodes currently. I see no reason why this shouldn't be a feature, apart from a belief that humans won't be particularly
> good at estimating this, so I have opened an issue: https://gitlab.eclipse.org/eclipse/tsf/tsf/-/issues/515.

The available evidence for short tasks,
https://shape-of-code.com/2022/06/19/over-under-estimation-factor-for-most-estimates/ ,
which is replicated across many projects, finds that 33% of estimates are accurate,
66% are within a factor of two (higher/lower),
and 95% are within a factor of 4 (higher/lower).

> The score in current TSF has no "meaning" (I think the docs state it's confidence/probability in a statement but in
> my view it does not function like this at all). That's not to say it's useless, it just functions as more of an
> indicator: it goes up if evidence is, in general, good, and down otherwise. It can help indicate where to direct your
> attention during continuous evaluation.


You have created a single number that encapsulates lots of complicated stuff.
Pointy-haired bosses are going to use this single number, whether you
like it or not.

Why all these complicated calculations to create a meaningless number?

> Ssam: Trudag generates a score based on some maths that I don’t fully understand
> The maths is actually quite simple behind all the symbols:
> score(claim) = claim_completeness * mean([score(child) for child in claim.children])
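
The quoted formula can be sketched in a few lines of Python. This is only an illustration, not Trudag's actual implementation: the `Claim` class, its field names, and the rule that a leaf scores its own completeness are all assumptions made here.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Claim:
    # Hypothetical node type; the names are illustrative, not TSF's.
    completeness: float
    children: list["Claim"] = field(default_factory=list)


def score(claim: Claim) -> float:
    # Assumption: a leaf (no children) scores its own completeness.
    if not claim.children:
        return claim.completeness
    # Completeness of this claim times the mean of its children's scores.
    return claim.completeness * mean(score(child) for child in claim.children)


parent = Claim(1.0, [Claim(0.5), Claim(0.5)])
print(score(parent))  # 0.5
```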


If a node has two children each with a score of 0.5,
then the score of the parent is:

   1) 0.25 if both children are required to be true
   2) 0.75 if at least one child is required to be true
   3) a more complicated formula for cases 1) and 2) if a
      correlation exists between the truth values of the children.
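
Cases 1) and 2) can be checked with a short sketch, assuming the children's truth values are independent. The function names are illustrative only, and the mean is included purely for comparison.

```python
def both_true(p: float, q: float) -> float:
    # P(A and B) when A and B are independent.
    return p * q


def at_least_one_true(p: float, q: float) -> float:
    # P(A or B) when A and B are independent: 1 - P(neither).
    return 1 - (1 - p) * (1 - q)


def mean_of(p: float, q: float) -> float:
    # Arithmetic mean of the two child scores.
    return (p + q) / 2


print(both_true(0.5, 0.5))          # 0.25
print(at_least_one_true(0.5, 0.5))  # 0.75
print(mean_of(0.5, 0.5))            # 0.5
```

With two children at 0.5 the mean lands between the two probabilistic answers and matches neither.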

How might taking the mean be interpreted?

> Ssam: All of these mails are building to the key selling point of TSF and the reason we use it: our assurance case is
> stored in Git, alongside the product we are building, and the two evolve together.


Storing meaningless numbers in Git is not a selling point.

> I agree, the novelty of TSF comes from how the assurance case is continually managed and the accompanying processes.


All assurance cases are continually managed, and
every assurance case can claim to be novel.

I always treat novelty with suspicion.  It's not a selling point.

--
Derek M. Jones           Evidence-based software engineering
blog:https://shape-of-code.com

Follow-Ups:
- Re: [tsf-dev] TSF process feedback - part 1
  - From: Paul Sherwood
- Re: [tsf-dev] TSF process feedback - part 1
  - From: Derek M Jones

References:
- [tsf-dev] TSF process feedback - part 1
  - From: Nathan Warren
