This repository was archived by the owner on Apr 1, 2026. It is now read-only.

perf: Improve isin performance #1203

Merged
TrevorBergeron merged 10 commits into main from isin_join
Jan 29, 2025

Conversation

@TrevorBergeron
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@product-auto-label product-auto-label bot added the "size: l" (Pull request size is large.) and "api: bigquery" (Issues related to the googleapis/python-bigquery-dataframes API.) labels Dec 11, 2024
@TrevorBergeron TrevorBergeron requested a review from tswast January 17, 2025 00:16
@TrevorBergeron TrevorBergeron marked this pull request as ready for review January 17, 2025 00:16
@TrevorBergeron TrevorBergeron requested review from a team January 17, 2025 00:16
Comment thread bigframes/core/nodes.py Outdated


class AdditiveNode:
    """Definition of additive - if you drop added_fields, you end up with the descendent."""
Contributor

A picture might help :-)

Suggested change
-    """Definition of additive - if you drop added_fields, you end up with the descendent."""
+    """Definition of additive - if you drop added_fields, you end up with the descendent.
+
+    .. code-block:: text
+
+        AdditiveNode (fields: a, b, c; added_fields: c)
+            |
+            | additive_base
+            V
+        BigFrameNode (fields: a, b)
+    """

See https://stackoverflow.com/a/50956831/101923 for creating a plain text block
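
To make the "additive" property concrete, here is a minimal sketch. It is not the code from bigframes/core/nodes.py: `ToyBaseNode`, `ToyAdditiveNode`, and `drop_added_fields` are hypothetical stand-ins, and fields are modelled as plain tuples of names. It only illustrates the invariant from the docstring, namely that dropping `added_fields` leaves the fields of the node below.

```python
import dataclasses
from typing import Tuple


@dataclasses.dataclass(frozen=True)
class ToyBaseNode:
    # Hypothetical stand-in for a BigFrameNode: only tracks field names.
    fields: Tuple[str, ...]


@dataclasses.dataclass(frozen=True)
class ToyAdditiveNode:
    # Hypothetical stand-in for a node satisfying the AdditiveNode contract.
    additive_base: ToyBaseNode
    added_fields: Tuple[str, ...]

    @property
    def fields(self) -> Tuple[str, ...]:
        # An additive node exposes everything from its base plus what it adds.
        return self.additive_base.fields + self.added_fields


def drop_added_fields(node: ToyAdditiveNode) -> Tuple[str, ...]:
    # Remove the added fields; the result should equal the base's fields.
    return tuple(f for f in node.fields if f not in node.added_fields)


base = ToyBaseNode(fields=("a", "b"))
node = ToyAdditiveNode(additive_base=base, added_fields=("c",))
assert drop_added_fields(node) == base.fields
```

This mirrors the diagram in the suggestion: the additive node has fields a, b, c, and removing c recovers the a, b of the node it points to.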

Comment thread bigframes/core/nodes.py
Comment on lines +480 to +481
def replace_additive_base(self, node: BigFrameNode):
    return dataclasses.replace(self, left_child=node)
Contributor

Thinking aloud: I wonder if there's some way we can organize these sorts of tree transformations so that it's easier to reason about which can be applied in which order?
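
For readers unfamiliar with the pattern in the snippet under review, here is a small sketch of how `dataclasses.replace` swaps one child of an immutable node. `ToyLeaf` and `ToyTwoChildNode` are hypothetical names for illustration only; the thread does not show which class owns `replace_additive_base`, only that it replaces `left_child`.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class ToyLeaf:
    # Hypothetical leaf node identified by a name.
    name: str


@dataclasses.dataclass(frozen=True)
class ToyTwoChildNode:
    # Hypothetical node with two children; the left one is the additive base.
    left_child: ToyLeaf
    right_child: ToyLeaf

    def replace_additive_base(self, node: ToyLeaf) -> "ToyTwoChildNode":
        # dataclasses.replace builds a new instance with one field changed,
        # leaving the frozen original untouched.
        return dataclasses.replace(self, left_child=node)


original = ToyTwoChildNode(left_child=ToyLeaf("left"), right_child=ToyLeaf("values"))
rewritten = original.replace_additive_base(ToyLeaf("new_left"))
```

Because the nodes are frozen, each transformation returns a new tree rather than mutating the old one, so `original` can still be compared against `rewritten` when reasoning about the order in which rewrites are applied.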

@TrevorBergeron TrevorBergeron enabled auto-merge (squash) January 29, 2025 18:50
@TrevorBergeron TrevorBergeron merged commit db087b0 into main Jan 29, 2025
@TrevorBergeron TrevorBergeron deleted the isin_join branch January 29, 2025 19:42

Labels

api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.

3 participants