-
Notifications
You must be signed in to change notification settings - Fork 724
Add tensorzero_uint_to_uuid ClickHouse function and update queries #4414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add tensorzero_uint_to_uuid ClickHouse function and update queries #4414
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR renames the ClickHouse user-defined function from uint_to_uuid to tensorzero_uint_to_uuid to avoid naming conflicts with the ClickHouse built-in function introduced in ClickHouse version 24.8. The changes include:
- Creating a new migration (migration_0041) that defines the
tensorzero_uint_to_uuidfunction - Updating all SQL queries across the codebase to use
tensorzero_uint_to_uuidinstead ofuint_to_uuid - Incrementing the migration count from 34 to 35 and adding test coverage for the new migration
Reviewed Changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs | New migration file that creates the tensorzero_uint_to_uuid function |
| tensorzero-core/src/db/clickhouse/migration_manager/mod.rs | Registers the new migration and updates NUM_MIGRATIONS constant |
| tensorzero-core/src/db/clickhouse/migration_manager/migrations/mod.rs | Adds the new migration module |
| tensorzero-core/tests/e2e/clickhouse.rs | Adds migration index 34 to the rollback test array |
| tensorzero-core/src/db/clickhouse/select_queries.rs | Updates queries to use tensorzero_uint_to_uuid |
| tensorzero-core/src/db/clickhouse/dataset_queries.rs | Updates queries to use tensorzero_uint_to_uuid |
| tensorzero-core/src/db/clickhouse/test_helpers.rs | Updates queries to use tensorzero_uint_to_uuid |
| tensorzero-core/src/endpoints/feedback/mod.rs | Updates queries to use tensorzero_uint_to_uuid |
| tensorzero-core/tests/load/feedback/src/main.rs | Updates queries to use tensorzero_uint_to_uuid |
| ui/app/utils/clickhouse/inference.server.ts | Updates queries to use tensorzero_uint_to_uuid |
| ui/app/utils/clickhouse/inference.test.ts | Updates queries to use tensorzero_uint_to_uuid |
| ui/app/utils/clickhouse/workflow_evaluations.server.ts | Updates queries to use tensorzero_uint_to_uuid |
| ui/app/utils/clickhouse/function.ts | Updates queries to use tensorzero_uint_to_uuid |
| ui/app/utils/clickhouse/evaluations.server.ts | Updates queries to use tensorzero_uint_to_uuid |
| ui/e2e_tests/observability.episodes.spec.ts | Updates queries to use tensorzero_uint_to_uuid |
| examples/integrations/cursor/feedback/src/clickhouse.rs | Updates queries to use tensorzero_uint_to_uuid |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…o_uuid-named-tenso
TensorZero CI Bot Automated CommentThanks for the PR! The ClickHouse E2E test failures are caused by the new migration (Migration0041) not being recognized as “applied” by the test harness. Root cause:
Fix:
No changes to GitHub Actions are required. Warning I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-KiAbIs/repo/tensorzero.patch failed: error: corrupt patch at line 102 The patch I tried to generate is as follows: diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
index d41c79b30..f0d2d0b7e 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
@@ -1,6 +1,7 @@
use crate::db::clickhouse::migration_manager::migration_trait::Migration;
use crate::db::clickhouse::ClickHouseConnectionInfo;
use crate::error::{Error, ErrorDetails};
+use crate::result::ResultExt;
use async_trait::async_trait;
pub struct Migration0041<'a> {
@@ -10,22 +11,54 @@ pub struct Migration0041<'a> {
#[async_trait]
impl Migration for Migration0041<'_> {
async fn can_apply(&self) -> Result<(), Error> {
Ok(())
}
async fn should_apply(&self) -> Result<bool, Error> {
- let query =
- "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
- let result = self
- .clickhouse
- .run_query_synchronous_no_params(query)
- .await?;
- if result.response.contains('1') {
- return Ok(false);
- }
- if result.response.trim().is_empty() || result.response.contains("0") {
- return Ok(true);
- }
- Err(Error::new(ErrorDetails::ClickHouseMigration {
- id: "0041".to_string(),
- message: format!(
- "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
- result.response.trim()
- ),
- }))
+ // Check if the SQL UDF already exists. On ClickHouse, SQL UDFs are exposed via
+ // `system.sql_functions` (not `system.functions`). Fall back to `system.functions`
+ // defensively in case of version differences.
+ async fn function_exists(clickhouse: &ClickHouseConnectionInfo) -> Result<bool, Error> {
+ // Prefer system.sql_functions if available
+ let sql_fn_check = "SELECT 1 FROM system.sql_functions WHERE name = 'tensorzero_uint_to_uuid'";
+ match clickhouse
+ .run_query_synchronous_no_params(sql_fn_check.to_string())
+ .await
+ {
+ Ok(res) => {
+ if !res.response.trim().is_empty() && res.response.contains('1') {
+ return Ok(true);
+ }
+ }
+ // If the system table doesn't exist on this server, ignore and try the fallback.
+ Err(_e) => {}
+ }
+
+ // Fallback: some builds expose user-defined SQL functions under system.functions as well.
+ let fallback_check =
+ "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'";
+ let res = clickhouse
+ .run_query_synchronous_no_params(fallback_check.to_string())
+ .await?;
+ Ok(!res.response.trim().is_empty() && res.response.contains('1'))
+ }
+
+ let exists = function_exists(self.clickhouse).await?;
+ Ok(!exists)
}
async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
let on_cluster_name = self.clickhouse.get_on_cluster_name();
let query = format!(
r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
concat(
substring(reinterpretAsString(x), 9, 8),
substring(reinterpretAsString(x), 1, 8)
)
);",
);
self.clickhouse
.run_query_synchronous_no_params(query)
.await?;
Ok(())
}
fn rollback_instructions(&self) -> String {
"DROP FUNCTION IF EXISTS tensorzero_uint_to_uuid;".to_string()
}
async fn has_succeeded(&self) -> Result<bool, Error> {
- let should_apply = self.should_apply().await?;
- Ok(!should_apply)
+ // Reuse the existence logic from should_apply: if we should not apply anymore,
+ // then the migration has succeeded.
+ let should_apply = self.should_apply().await?;
+ Ok(!should_apply)
}
} |
TensorZero CI Bot Automated CommentThanks for the PR! The ClickHouse CI is failing because the new user-defined function tensorzero_uint_to_uuid introduced in migration 0041 isn't being detected reliably in tests, and the migration sometimes doesn't appear to run or doesn't report success. Root causes observed:
Fix summary:
These changes should:
No GitHub Actions changes are required. Warning I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-1NbBe9/repo/tensorzero.patch failed: error: corrupt patch at line 100 The patch I tried to generate is as follows: diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
index d41c79b30..10f36e579 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
@@ -1,63 +1,83 @@
use crate::db::clickhouse::migration_manager::migration_trait::Migration;
use crate::db::clickhouse::ClickHouseConnectionInfo;
use crate::error::{Error, ErrorDetails};
use async_trait::async_trait;
pub struct Migration0041<'a> {
pub clickhouse: &'a ClickHouseConnectionInfo,
}
#[async_trait]
impl Migration for Migration0041<'_> {
async fn can_apply(&self) -> Result<(), Error> {
Ok(())
}
async fn should_apply(&self) -> Result<bool, Error> {
- let query =
- "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
- let result = self
- .clickhouse
- .run_query_synchronous_no_params(query)
- .await?;
- if result.response.contains('1') {
- return Ok(false);
- }
- if result.response.trim().is_empty() || result.response.contains("0") {
- return Ok(true);
- }
- Err(Error::new(ErrorDetails::ClickHouseMigration {
- id: "0041".to_string(),
- message: format!(
- "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
- result.response.trim()
- ),
- }))
+ // Deterministically check if the function exists by returning a row only when it does.
+ // If no row is returned, we should apply the migration.
+ let query = r"
+ SELECT 1 AS exists
+ WHERE EXISTS(
+ SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'
+ )
+ FORMAT JSONEachRow
+ "
+ .to_string();
+ let result = self.clickhouse.run_query_synchronous_no_params(query).await?;
+ Ok(result.response.trim().is_empty())
}
async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
let on_cluster_name = self.clickhouse.get_on_cluster_name();
let query = format!(
r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
- concat(
- substring(reinterpretAsString(x), 9, 8),
- substring(reinterpretAsString(x), 1, 8)
- )
- );",
+ toFixedString(
+ concat(
+ substring(reinterpretAsString(x), 9, 8),
+ substring(reinterpretAsString(x), 1, 8)
+ ),
+ 16
+ )
+ );",
);
self.clickhouse
.run_query_synchronous_no_params(query)
.await?;
Ok(())
}
fn rollback_instructions(&self) -> String {
"DROP FUNCTION IF EXISTS tensorzero_uint_to_uuid;".to_string()
}
async fn has_succeeded(&self) -> Result<bool, Error> {
- let should_apply = self.should_apply().await?;
- Ok(!should_apply)
+ // Succeeded if the function now exists (i.e., the query returns a row).
+ let query = r"
+ SELECT 1 AS exists
+ WHERE EXISTS(
+ SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'
+ )
+ FORMAT JSONEachRow
+ "
+ .to_string();
+ let result = self.clickhouse.run_query_synchronous_no_params(query).await?;
+ Ok(!result.response.trim().is_empty())
}
} |
|
Please snooze this PR 1 day. I would like to get Migration0041 in from my tool work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if result.response.trim().is_empty() || result.response.contains("0") { | ||
| return Ok(true); | ||
| } |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for checking if the function exists is inconsistent with similar migrations (see migration_0013 line 113 and migration_0020 line 126). When the function doesn't exist, the response simply won't contain "1", so you should just return Ok(true) without checking for "0" or empty response. The current check for contains("0") could match unintended values. Consider simplifying to:
if result.response.contains('1') {
return Ok(false);
}
Ok(true)| if result.response.trim().is_empty() || result.response.contains("0") { | |
| return Ok(true); | |
| } | |
| Ok(true) |
TensorZero CI Bot Automated CommentThanks for the PR. The CI ClickHouse test suite is failing due to the new Migration0042 (which creates the tensorzero_uint_to_uuid function). Two issues surfaced:
This also impacts the concurrent migrations test: since the function is global and your migration bases should_apply on system.functions, runs can under-apply from the perspective of a single database (the test reports “TooFew”). Fix: Make Migration0042 always return true from should_apply so the migration is applied/logged for each database, and verify success via has_succeeded by checking system.functions for the presence of tensorzero_uint_to_uuid. The actual apply is idempotent thanks to CREATE FUNCTION IF NOT EXISTS, so this is safe and satisfies concurrency and re-apply expectations. No workflow/action changes are required. Warning I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-bx7TMI/repo/tensorzero.patch failed: error: corrupt patch at line 92 The patch I tried to generate is as follows: diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs
index 1c3c5e55b..d2a5e9c0a 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs
@@ -1,65 +1,69 @@
use crate::db::clickhouse::migration_manager::migration_trait::Migration;
use crate::db::clickhouse::ClickHouseConnectionInfo;
use crate::error::{Error, ErrorDetails};
use async_trait::async_trait;
pub struct Migration0042<'a> {
pub clickhouse: &'a ClickHouseConnectionInfo,
}
#[async_trait]
impl Migration for Migration0042<'_> {
async fn can_apply(&self) -> Result<(), Error> {
Ok(())
}
async fn should_apply(&self) -> Result<bool, Error> {
- let query =
- "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
- let result = self
- .clickhouse
- .run_query_synchronous_no_params(query)
- .await?;
- if result.response.contains('1') {
- return Ok(false);
- }
- if result.response.trim().is_empty() || result.response.contains("0") {
- return Ok(true);
- }
- Err(Error::new(ErrorDetails::ClickHouseMigration {
- id: "0042".to_string(),
- message: format!(
- "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
- result.response.trim()
- ),
- }))
+ // Always apply. The function is global in ClickHouse (not per-database), and tests expect
+ // each database to record and log this migration application. The CREATE statement below
+ // uses IF NOT EXISTS, so this is idempotent and safe under concurrency.
+ Ok(true)
}
async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
let on_cluster_name = self.clickhouse.get_on_cluster_name();
let query = format!(
r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
concat(
substring(reinterpretAsString(x), 9, 8),
substring(reinterpretAsString(x), 1, 8)
)
);",
);
self.clickhouse
.run_query_synchronous_no_params(query)
.await?;
Ok(())
}
fn rollback_instructions(&self) -> String {
// Note - we deliberately don't include any rollback instructions.
// User-defined functions are global, so dropping them would interfere with concurrent migration runs.
String::new()
}
async fn has_succeeded(&self) -> Result<bool, Error> {
- let should_apply = self.should_apply().await?;
- Ok(!should_apply)
+ // Consider the migration successful if the function exists.
+ let query =
+ "SELECT count() FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
+ let result = self
+ .clickhouse
+ .run_query_synchronous_no_params(query)
+ .await?;
+ // ClickHouse returns a single row with the count, usually as plain text.
+ // If the function exists, the count will be >= 1.
+ let exists = result
+ .response
+ .lines()
+ .next()
+ .and_then(|l| l.trim().parse::<u64>().ok())
+ .unwrap_or(0)
+ >= 1;
+ Ok(exists)
}
} |
Summary
Testing
nextest)https://chatgpt.com/codex/tasks/task_b_690a36bd8b1483298c44e06b45012aa2
Important
Add
tensorzero_uint_to_uuidClickHouse function and update queries to use it, including migration and test updates.tensorzero_uint_to_uuidClickHouse UDF inmigration_0042.rs.clickhouse.rs,dataset_queries.rs,select_queries.rs,test_helpers.rs,mod.rs,evaluations.server.ts,function.ts,inference.server.ts,workflow_evaluations.server.ts, andinference.test.tsto usetensorzero_uint_to_uuidinstead ofuint_to_uuid.0042to registertensorzero_uint_to_uuidfunction.mod.rs.clickhouse.rs,inference.test.ts, andobservability.episodes.spec.tsto reflect new function usage.0042.This description was created by
for 505e12c. You can customize this summary. It will automatically update as commits are pushed.