This repository was archived by the owner on Apr 1, 2026. It is now read-only.

docs: add sample for getting started with BQML #141

Merged
tswast merged 35 commits into main from bqml_tutorial on Dec 12, 2023

Conversation

@DevStephanie
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@DevStephanie DevStephanie requested a review from a team as a code owner October 25, 2023 15:31
@DevStephanie DevStephanie requested review from a team, ashleyxuu and ohmayr October 25, 2023 15:31
@product-auto-label product-auto-label bot added the size: m, api: bigquery, and samples labels Oct 25, 2023
@tswast tswast mentioned this pull request Oct 25, 2023
4 tasks
@snippet-bot

snippet-bot bot commented Oct 25, 2023

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.

Comment thread samples/snippets/bqml_getting_started_test.py Outdated
Comment on lines +70 to +71
# When writing a DataFrame to a BigQuery table, include destination table
# and parameters, index defaults to "True".
Contributor

This comment has nothing to do with BigQuery ML models. Please fix.

Note: The important thing here is that we're taking our trained model and writing it to a permanent location.

Contributor Author

Agreed, corrected.

Contributor

FYI: The comment is still talking about tables not models. I'll make a comment with a suggested edit.

@DevStephanie DevStephanie requested a review from tswast November 3, 2023 17:26
Comment thread samples/snippets/bqml_getting_started_test.py
@@ -0,0 +1,13 @@
# Copyright 2023 Google LLC
Contributor

We can remove this file for now. Let's do a separate PR for the K-Means tutorials.

Contributor Author

Corrected.

Contributor

This file still needs to be deleted.

@DevStephanie DevStephanie requested a review from tswast November 6, 2023 18:13
Comment thread samples/snippets/bqml_getting_started_test.py Outdated
Comment thread samples/snippets/bqml_getting_started_test.py Outdated
@tswast tswast added and removed the automerge label Nov 16, 2023

# The model.fit() call above created a temporary model.
# Use the to_gbq() method to write to a permanent location.
model.to_gbq("bqml_tutorial.sample_model", replace=True)
Contributor

FYI: We're getting

E           google.api_core.exceptions.BadRequest: 400 Concurrent update on same model: bigframes-dev:bqml_tutorial.sample_model is not supported. Share your usecase with the BigQuery DataFrames team at the https://bit.ly/bigframes-feedback survey.

failure in our test suite: https://fusion2.corp.google.com/invocations/8a7513c8-e7c9-4b5b-82fe-9a83c176fbc1/targets/bigframes%2Fpresubmit%2Fe2e/log

I think we'll need a test fixture for this to create a temporary place for the model and clean it up when the test finishes.

  1. Create a file called samples/snippets/conftest.py.

  2. In the conftest.py file you create, add a fixture called random_model_id, similar to this one: https://github.com/googleapis/python-bigquery/blob/f804d639fe95bef5d083afe1246d756321128b05/samples/snippets/conftest.py#L101-L111 except it'll call delete_model(...) instead of delete_table(...).

    You'll also need to add "prefixer" https://github.com/googleapis/python-bigquery/blob/f804d639fe95bef5d083afe1246d756321128b05/samples/snippets/conftest.py#L21 and bigquery_client fixture https://github.com/googleapis/python-bigquery/blob/f804d639fe95bef5d083afe1246d756321128b05/samples/snippets/conftest.py#L33-L36

  3. Update your code sample to use the new random_model_id fixture.

    Look how we do it in the remote functions test:

        your_gcp_project_id = project_id
        # [START bigquery_dataframes_remote_function]
        import bigframes.pandas as bpd
        # Set BigQuery DataFrames options
        bpd.options.bigquery.project = your_gcp_project_id

    but instead you'll be setting your_model_id = random_model_id and calling

        model.to_gbq(
            your_model_id,  # "project.dataset.model_id" or "dataset.model_id"
            replace=True,
        )
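
A rough sketch of the cleanup logic behind step 2, in case it helps. This is not the linked python-bigquery fixture: the helper name `temporary_model_id` and the `prefix` default are hypothetical, and it uses `uuid` for uniqueness in place of the "prefixer" helper. It assumes `client` is a `google.cloud.bigquery.Client`, whose `delete_model` method accepts `not_found_ok`.

```python
import contextlib
import uuid


@contextlib.contextmanager
def temporary_model_id(client, dataset_id, prefix="bqml_getting_started"):
    """Yield a unique model ID and delete that model on exit.

    Assumption: `client` is a google.cloud.bigquery.Client. Any object with
    a compatible delete_model(model_id, not_found_ok=...) method also works.
    """
    # A random suffix keeps concurrent test runs from writing to the same model.
    model_id = f"{dataset_id}.{prefix}_{uuid.uuid4().hex}"
    try:
        yield model_id
    finally:
        # not_found_ok=True avoids an error if the sample never created the model.
        client.delete_model(model_id, not_found_ok=True)


# In samples/snippets/conftest.py this could back the fixture, for example
# (fixture argument names here are assumptions):
#
#     @pytest.fixture
#     def random_model_id(bigquery_client, project_id, dataset_id):
#         full_dataset = f"{project_id}.{dataset_id}"
#         with temporary_model_id(bigquery_client, full_dataset) as model_id:
#             yield model_id
```

Because each test gets its own model ID, parallel runs should no longer hit the "Concurrent update on same model" error, and the cleanup step removes the model once the test finishes.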
    



def test_bqml_getting_started():
    # [START bigquery_getting_started_bqml_tutorial]
Contributor

Let's use bigquery_dataframes_bqml_getting_started for our region tags.

Sounds great, will make edits now.

@DevStephanie DevStephanie requested a review from a team December 12, 2023 20:34
@product-auto-label product-auto-label bot added the size: l label and removed the size: m label Dec 12, 2023
Comment thread third_party/geopandas/LICENSE.txt Outdated
@@ -0,0 +1,25 @@
Copyright (c) 2013-2022, GeoPandas developers.
Contributor

Revert this

@product-auto-label product-auto-label bot added the size: m label and removed the size: l label Dec 12, 2023
@tswast tswast added the automerge label Dec 12, 2023
@tswast tswast merged commit fb14f54 into main Dec 12, 2023
@tswast tswast deleted the bqml_tutorial branch December 12, 2023 22:25
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge label Dec 12, 2023

Labels

api: bigquery, samples, size: m

2 participants