@lucyleeow
Member

Reference Issues/PRs

Noticed while working on #32755

What does this implement/fix? Explain your changes.

Adds a reference to the glossary term https://scikit-learn.org/dev/glossary.html#term-label-indicator-matrix for classification metrics.

Not sure about the best term to use.

  • Avoided 'binary' (as in 'binary indicator'). Here 'binary' means that each indicator value is 0 or 1, not that the data is binary. This could be confusing, and the term is not listed as an alias in the glossary.
  • When talking about binary data, I used "label indicator format" / "label indicator matrix" (whichever fits the sentence better), to avoid 'multilabel' (as the data is not multilabel).
  • Used 'multilabel indicator matrix' when talking about multilabel data.

AI usage disclosure

I used AI assistance for:

  • Code generation (e.g., when writing an implementation or fixing a bug)
  • Test/benchmark generation
  • Documentation (including examples)
  • Research and understanding

Any other comments?

Comment on lines -505 to +507
- True labels or binary label indicators. The binary and multiclass cases
+ True labels or :term:`label indicator matrix`. The binary and multiclass cases
  expect labels with shape (n_samples,) while the multilabel case expects
- binary label indicators with shape (n_samples, n_classes).
+ :term:`multilabel indicator matrix` with shape (n_samples, n_classes).
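
For illustration, the two input shapes this docstring describes can be sketched as follows. Using `roc_auc_score` here is an assumption, since the excerpt does not name the function being documented, and the arrays are made-up toy data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Binary case: labels with shape (n_samples,).
y_true_binary = np.array([0, 0, 1, 1])
y_score_binary = np.array([0.1, 0.4, 0.35, 0.8])
auc_binary = roc_auc_score(y_true_binary, y_score_binary)  # 0.75

# Multilabel case: a multilabel indicator matrix with shape
# (n_samples, n_classes). Each column holds the 0/1 labels for one class.
y_true_multilabel = np.array(
    [[1, 0, 1],
     [0, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
)
# Scores have the same (n_samples, n_classes) shape as the labels.
y_score_multilabel = np.array(
    [[0.9, 0.2, 0.7],
     [0.3, 0.8, 0.1],
     [0.6, 0.7, 0.4],
     [0.2, 0.1, 0.8]]
)
# With the default average="macro", the score is averaged over the columns.
auc_multilabel = roc_auc_score(y_true_multilabel, y_score_multilabel)

print(y_true_binary.shape, y_true_multilabel.shape)
```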
Member Author

I've used both glossary aliases: 'label indicator matrix' (when talking about binary data) and 'multilabel indicator matrix' (when talking about multilabel data). I think this is okay. I amended the glossary entry to clarify that the format can be used for both binary and multilabel data...

This format can be used to represent binary or multilabel data. Each row of
a 2d array or sparse matrix corresponds to a sample, each column
corresponds to a class, and each element is 1 if the sample is labeled
with the class and 0 if not.
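
A short sketch of what this format looks like in practice, in the spirit of the `MultiLabelBinarizer` example in the user guide. The genre labels are made up for illustration and are not part of the proposed glossary text.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical multilabel targets: one set of labels per sample.
y = [{"sci-fi", "thriller"}, {"comedy"}, {"comedy", "sci-fi"}]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(y)

# Columns follow the sorted class order stored in mlb.classes_:
# ['comedy', 'sci-fi', 'thriller']
print(mlb.classes_)

# Each row is a sample, each column a class; an element is 1 if the
# sample is labeled with that class and 0 if not.
print(Y)
# [[0 1 1]
#  [1 0 0]
#  [1 1 0]]
```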
Member Author

@lucyleeow lucyleeow Dec 12, 2025

I also wonder whether we should add a short example, as we do here: https://scikit-learn.org/dev/modules/preprocessing_targets.html#multilabelbinarizer
