Skip to content

Conversation

@Aaron1011
Copy link
Member

@Aaron1011 Aaron1011 commented Nov 4, 2025

Summary

  • add migration 0041 to install the tensorzero_uint_to_uuid ClickHouse UDF and register it with the migration runner
  • update runtime ClickHouse queries and tests in Rust/TypeScript to call tensorzero_uint_to_uuid instead of uint_to_uuid
  • bump the migration count/test harness to include migration 0041

Testing

  • cargo fmt
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test-unit (fails: no such command nextest)
  • pnpm --filter tensorzero-node run build
  • pnpm run format
  • pnpm run lint
  • pnpm run typecheck

https://chatgpt.com/codex/tasks/task_b_690a36bd8b1483298c44e06b45012aa2


Important

Add tensorzero_uint_to_uuid ClickHouse function and update queries to use it, including migration and test updates.

  • Behavior:
    • Add tensorzero_uint_to_uuid ClickHouse UDF in migration_0042.rs.
    • Update queries in clickhouse.rs, dataset_queries.rs, select_queries.rs, test_helpers.rs, mod.rs, evaluations.server.ts, function.ts, inference.server.ts, workflow_evaluations.server.ts, and inference.test.ts to use tensorzero_uint_to_uuid instead of uint_to_uuid.
  • Migration:
    • Add migration 0042 to register tensorzero_uint_to_uuid function.
    • Update migration count in mod.rs.
  • Testing:
    • Update tests in clickhouse.rs, inference.test.ts, and observability.episodes.spec.ts to reflect new function usage.
  • Misc:
    • Bump migration count/test harness to include migration 0042.

This description was created by Ellipsis for 505e12c. You can customize this summary. It will automatically update as commits are pushed.

Copilot AI review requested due to automatic review settings November 4, 2025 18:13
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR renames the ClickHouse user-defined function from uint_to_uuid to tensorzero_uint_to_uuid to avoid naming conflicts with the ClickHouse built-in function introduced in ClickHouse version 24.8. The changes include:

  • Creating a new migration (migration_0041) that defines the tensorzero_uint_to_uuid function
  • Updating all SQL queries across the codebase to use tensorzero_uint_to_uuid instead of uint_to_uuid
  • Incrementing the migration count from 34 to 35 and adding test coverage for the new migration

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs New migration file that creates the tensorzero_uint_to_uuid function
tensorzero-core/src/db/clickhouse/migration_manager/mod.rs Registers the new migration and updates NUM_MIGRATIONS constant
tensorzero-core/src/db/clickhouse/migration_manager/migrations/mod.rs Adds the new migration module
tensorzero-core/tests/e2e/clickhouse.rs Adds migration index 34 to the rollback test array
tensorzero-core/src/db/clickhouse/select_queries.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/src/db/clickhouse/dataset_queries.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/src/db/clickhouse/test_helpers.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/src/endpoints/feedback/mod.rs Updates queries to use tensorzero_uint_to_uuid
tensorzero-core/tests/load/feedback/src/main.rs Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/inference.server.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/inference.test.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/workflow_evaluations.server.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/function.ts Updates queries to use tensorzero_uint_to_uuid
ui/app/utils/clickhouse/evaluations.server.ts Updates queries to use tensorzero_uint_to_uuid
ui/e2e_tests/observability.episodes.spec.ts Updates queries to use tensorzero_uint_to_uuid
examples/integrations/cursor/feedback/src/clickhouse.rs Updates queries to use tensorzero_uint_to_uuid

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2025

TensorZero CI Bot Automated Comment

Thanks for the PR! The ClickHouse E2E test failures are caused by the new migration (Migration0041) not being recognized as “applied” by the test harness.

Root cause:

  • Migration0041’s should_apply/has_succeeded checks use `system.functions` to determine whether the SQL UDF `tensorzero_uint_to_uuid` exists. On ClickHouse, SQL UDFs are listed in `system.sql_functions`, not `system.functions`. As a result, after creating the function, the follow-up check still doesn’t find it, so the migration is deemed not applied, which triggers:
    • test_rollback_up_to_migration_index_34: fails with “Migration Migration0041 should have succeeded”
    • test_clickhouse_migration_manager: fails because it expects all migrations to be applied/logged
    • test_concurrent_clickhouse_migrations: fails with “TooFew” since the last migration is skipped

Fix:

  • Update Migration0041 to detect the function in `system.sql_functions` first, falling back to `system.functions` (for future compatibility), and reuse the same check in has_succeeded. This makes the migration reliable across ClickHouse versions.

No changes to GitHub Actions are required.

Warning

I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-KiAbIs/repo/tensorzero.patch failed: error: corrupt patch at line 102
.

The patch I tried to generate is as follows:

diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
index d41c79b30..f0d2d0b7e 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
@@ -1,6 +1,7 @@
 use crate::db::clickhouse::migration_manager::migration_trait::Migration;
 use crate::db::clickhouse::ClickHouseConnectionInfo;
 use crate::error::{Error, ErrorDetails};
+use crate::result::ResultExt;
 use async_trait::async_trait;
 
 pub struct Migration0041<'a> {
@@ -10,22 +11,54 @@ pub struct Migration0041<'a> {
 #[async_trait]
 impl Migration for Migration0041<'_> {
     async fn can_apply(&self) -> Result<(), Error> {
         Ok(())
     }
 
     async fn should_apply(&self) -> Result<bool, Error> {
-        let query =
-            "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
-        let result = self
-            .clickhouse
-            .run_query_synchronous_no_params(query)
-            .await?;
-        if result.response.contains('1') {
-            return Ok(false);
-        }
-        if result.response.trim().is_empty() || result.response.contains("0") {
-            return Ok(true);
-        }
-        Err(Error::new(ErrorDetails::ClickHouseMigration {
-            id: "0041".to_string(),
-            message: format!(
-                "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
-                result.response.trim()
-            ),
-        }))
+        // Check if the SQL UDF already exists. On ClickHouse, SQL UDFs are exposed via
+        // `system.sql_functions` (not `system.functions`). Fall back to `system.functions`
+        // defensively in case of version differences.
+        async fn function_exists(clickhouse: &ClickHouseConnectionInfo) -> Result<bool, Error> {
+            // Prefer system.sql_functions if available
+            let sql_fn_check = "SELECT 1 FROM system.sql_functions WHERE name = 'tensorzero_uint_to_uuid'";
+            match clickhouse
+                .run_query_synchronous_no_params(sql_fn_check.to_string())
+                .await
+            {
+                Ok(res) => {
+                    if !res.response.trim().is_empty() && res.response.contains('1') {
+                        return Ok(true);
+                    }
+                }
+                // If the system table doesn't exist on this server, ignore and try the fallback.
+                Err(_e) => {}
+            }
+
+            // Fallback: some builds expose user-defined SQL functions under system.functions as well.
+            let fallback_check =
+                "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'";
+            let res = clickhouse
+                .run_query_synchronous_no_params(fallback_check.to_string())
+                .await?;
+            Ok(!res.response.trim().is_empty() && res.response.contains('1'))
+        }
+
+        let exists = function_exists(self.clickhouse).await?;
+        Ok(!exists)
     }
 
     async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
         let on_cluster_name = self.clickhouse.get_on_cluster_name();
         let query = format!(
             r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
             concat(
                 substring(reinterpretAsString(x), 9, 8),
                 substring(reinterpretAsString(x), 1, 8)
             )
         );",
         );
         self.clickhouse
             .run_query_synchronous_no_params(query)
             .await?;
 
         Ok(())
     }
 
     fn rollback_instructions(&self) -> String {
         "DROP FUNCTION IF EXISTS tensorzero_uint_to_uuid;".to_string()
     }
 
     async fn has_succeeded(&self) -> Result<bool, Error> {
-        let should_apply = self.should_apply().await?;
-        Ok(!should_apply)
+        // Reuse the existence logic from should_apply: if we should not apply anymore,
+        // then the migration has succeeded.
+        let should_apply = self.should_apply().await?;
+        Ok(!should_apply)
     }
 }

@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2025

TensorZero CI Bot Automated Comment

Thanks for the PR! The ClickHouse CI is failing because the new user-defined function tensorzero_uint_to_uuid introduced in migration 0041 isn't being detected reliably in tests, and the migration sometimes doesn't appear to run or doesn't report success.

Root causes observed:

  • The migration’s should_apply() checks system.functions without a deterministic output format and simply string-matches for '1'/'0'. This is brittle and can cause false positives/negatives, leading to “TooFew” migrations applied in concurrency/migration-manager tests.
  • The migration’s apply() likely succeeds, but has_succeeded() relies on the same fragile check, so tests conclude it didn’t succeed (“Migration0041 should have succeeded”).
  • As a result, tests that expect to see “Applying migration: Migration0041” in the logs don’t, and e2e migration tests fail.

Fix summary:

  • Make should_apply() and has_succeeded() robust by using a deterministic query with FORMAT JSONEachRow and EXISTS to check for the function’s presence, then interpreting an empty vs non-empty response reliably.
  • Slightly harden the CREATE FUNCTION body to ensure the reinterpretation gets a fixed-length buffer (toFixedString(..., 16)), which is safer across ClickHouse versions.

These changes should:

  • Ensure the migration applies exactly once when needed.
  • Ensure has_succeeded() deterministically reports success after creation.
  • Unblock the failing Rust ClickHouse tests (test_concurrent_clickhouse_migrations, test_clickhouse_migration_manager, test_rollback_apply_rollback).

No GitHub Actions changes are required.

Warning

I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-1NbBe9/repo/tensorzero.patch failed: error: corrupt patch at line 100
.

The patch I tried to generate is as follows:

diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
index d41c79b30..10f36e579 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0041.rs
@@ -1,63 +1,83 @@
 use crate::db::clickhouse::migration_manager::migration_trait::Migration;
 use crate::db::clickhouse::ClickHouseConnectionInfo;
 use crate::error::{Error, ErrorDetails};
 use async_trait::async_trait;
 
 pub struct Migration0041<'a> {
     pub clickhouse: &'a ClickHouseConnectionInfo,
 }
 
 #[async_trait]
 impl Migration for Migration0041<'_> {
     async fn can_apply(&self) -> Result<(), Error> {
         Ok(())
     }
 
     async fn should_apply(&self) -> Result<bool, Error> {
-        let query =
-            "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
-        let result = self
-            .clickhouse
-            .run_query_synchronous_no_params(query)
-            .await?;
-        if result.response.contains('1') {
-            return Ok(false);
-        }
-        if result.response.trim().is_empty() || result.response.contains("0") {
-            return Ok(true);
-        }
-        Err(Error::new(ErrorDetails::ClickHouseMigration {
-            id: "0041".to_string(),
-            message: format!(
-                "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
-                result.response.trim()
-            ),
-        }))
+        // Deterministically check if the function exists by returning a row only when it does.
+        // If no row is returned, we should apply the migration.
+        let query = r"
+            SELECT 1 AS exists
+            WHERE EXISTS(
+                SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'
+            )
+            FORMAT JSONEachRow
+        "
+        .to_string();
+        let result = self.clickhouse.run_query_synchronous_no_params(query).await?;
+        Ok(result.response.trim().is_empty())
     }
 
     async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
         let on_cluster_name = self.clickhouse.get_on_cluster_name();
         let query = format!(
             r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
-            concat(
-                substring(reinterpretAsString(x), 9, 8),
-                substring(reinterpretAsString(x), 1, 8)
-            )
-        );",
+                toFixedString(
+                    concat(
+                        substring(reinterpretAsString(x), 9, 8),
+                        substring(reinterpretAsString(x), 1, 8)
+                    ),
+                    16
+                )
+            );",
         );
         self.clickhouse
             .run_query_synchronous_no_params(query)
             .await?;
 
         Ok(())
     }
 
     fn rollback_instructions(&self) -> String {
         "DROP FUNCTION IF EXISTS tensorzero_uint_to_uuid;".to_string()
     }
 
     async fn has_succeeded(&self) -> Result<bool, Error> {
-        let should_apply = self.should_apply().await?;
-        Ok(!should_apply)
+        // Succeeded if the function now exists (i.e., the query returns a row).
+        let query = r"
+            SELECT 1 AS exists
+            WHERE EXISTS(
+                SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'
+            )
+            FORMAT JSONEachRow
+        "
+        .to_string();
+        let result = self.clickhouse.run_query_synchronous_no_params(query).await?;
+        Ok(!result.response.trim().is_empty())
     }
 }

@virajmehta
Copy link
Member

Please snooze this PR 1 day. I would like to get Migration0041 in from my tool work

Copilot AI review requested due to automatic review settings November 6, 2025 15:37
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +26 to +28
if result.response.trim().is_empty() || result.response.contains("0") {
return Ok(true);
}
Copy link

Copilot AI Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for checking if the function exists is inconsistent with similar migrations (see migration_0013 line 113 and migration_0020 line 126). When the function doesn't exist, the response simply won't contain "1", so you should just return Ok(true) without checking for "0" or empty response. The current check for contains("0") could match unintended values. Consider simplifying to:

if result.response.contains('1') {
    return Ok(false);
}
Ok(true)
Suggested change
if result.response.trim().is_empty() || result.response.contains("0") {
return Ok(true);
}
Ok(true)

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Contributor

github-actions bot commented Nov 6, 2025

TensorZero CI Bot Automated Comment

Thanks for the PR. The CI ClickHouse test suite is failing due to the new Migration0042 (which creates the tensorzero_uint_to_uuid function). Two issues surfaced:

  1. test_clickhouse_migration_manager expects every migration to actually “apply” (and be logged) for a DB even if the global function already exists. Because the function is global in ClickHouse (not scoped to a database), your current should_apply checks system.functions and returns false when the function already exists. That causes the migration to be skipped (no “Applying migration: Migration0042 …” log line), so the test fails.

  2. test_rollback_apply_rollback also fails for the same reason: during the “re-apply” phase (after a rollback that drops the database), the function still exists globally, so should_apply returns false and the migration is skipped. The test expects it to succeed for the (new) database.

This also impacts the concurrent migrations test: since the function is global and your migration bases should_apply on system.functions, runs can under-apply from the perspective of a single database (the test reports “TooFew”).

Fix: Make Migration0042 always return true from should_apply so the migration is applied/logged for each database, and verify success via has_succeeded by checking system.functions for the presence of tensorzero_uint_to_uuid. The actual apply is idempotent thanks to CREATE FUNCTION IF NOT EXISTS, so this is safe and satisfies concurrency and re-apply expectations.

No workflow/action changes are required.

Warning

I encountered an error while trying to create a follow-up PR: Failed to create follow-up PR using remote https://x-access-token:***@github.com/tensorzero/tensorzero.git: git apply --whitespace=nowarn /tmp/tensorzero-pr-bx7TMI/repo/tensorzero.patch failed: error: corrupt patch at line 92
.

The patch I tried to generate is as follows:

diff --git a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs
index 1c3c5e55b..d2a5e9c0a 100644
--- a/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs
+++ b/tensorzero-core/src/db/clickhouse/migration_manager/migrations/migration_0042.rs
@@ -1,65 +1,69 @@
 use crate::db::clickhouse::migration_manager::migration_trait::Migration;
 use crate::db::clickhouse::ClickHouseConnectionInfo;
 use crate::error::{Error, ErrorDetails};
 use async_trait::async_trait;
 
 pub struct Migration0042<'a> {
     pub clickhouse: &'a ClickHouseConnectionInfo,
 }
 
 #[async_trait]
 impl Migration for Migration0042<'_> {
     async fn can_apply(&self) -> Result<(), Error> {
         Ok(())
     }
 
     async fn should_apply(&self) -> Result<bool, Error> {
-        let query =
-            "SELECT 1 FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
-        let result = self
-            .clickhouse
-            .run_query_synchronous_no_params(query)
-            .await?;
-        if result.response.contains('1') {
-            return Ok(false);
-        }
-        if result.response.trim().is_empty() || result.response.contains("0") {
-            return Ok(true);
-        }
-        Err(Error::new(ErrorDetails::ClickHouseMigration {
-            id: "0042".to_string(),
-            message: format!(
-                "Unexpected response when checking for tensorzero_uint_to_uuid function: {}",
-                result.response.trim()
-            ),
-        }))
+        // Always apply. The function is global in ClickHouse (not per-database), and tests expect
+        // each database to record and log this migration application. The CREATE statement below
+        // uses IF NOT EXISTS, so this is idempotent and safe under concurrency.
+        Ok(true)
     }
 
     async fn apply(&self, _clean_start: bool) -> Result<(), Error> {
         let on_cluster_name = self.clickhouse.get_on_cluster_name();
         let query = format!(
             r"CREATE FUNCTION IF NOT EXISTS tensorzero_uint_to_uuid{on_cluster_name} AS (x) -> reinterpretAsUUID(
             concat(
                 substring(reinterpretAsString(x), 9, 8),
                 substring(reinterpretAsString(x), 1, 8)
             )
         );",
         );
         self.clickhouse
             .run_query_synchronous_no_params(query)
             .await?;
 
         Ok(())
     }
 
     fn rollback_instructions(&self) -> String {
         // Note - we deliberately don't include any rollback instructions.
         // User-defined functions are global, so dropping them would interfere with concurrent migration runs.
         String::new()
     }
 
     async fn has_succeeded(&self) -> Result<bool, Error> {
-        let should_apply = self.should_apply().await?;
-        Ok(!should_apply)
+        // Consider the migration successful if the function exists.
+        let query =
+            "SELECT count() FROM system.functions WHERE name = 'tensorzero_uint_to_uuid'".to_string();
+        let result = self
+            .clickhouse
+            .run_query_synchronous_no_params(query)
+            .await?;
+        // ClickHouse returns a single row with the count, usually as plain text.
+        // If the function exists, the count will be >= 1.
+        let exists = result
+            .response
+            .lines()
+            .next()
+            .and_then(|l| l.trim().parse::<u64>().ok())
+            .unwrap_or(0)
+            >= 1;
+        Ok(exists)
     }
 }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants